Software is, without any doubt, one of the most complex products ever made by humans. No literary work, scientific program, or architectural work can match the person-hours or person-years needed to produce highly complex software, such as the software that runs a modern aircraft, a latest-generation car, the computer or smartphone you are using, or a social network like Facebook.
And software is so complex that it often fails … or rather, it kills.
Software kills people, burns huge amounts of capital, destroys companies, and causes irreparable damage to reputations
In 1982, the CIA is suspected of having deliberately introduced a bug (a software error) into the control code of the Trans-Siberian gas pipeline, in what was then the Soviet Union. For counter-intelligence purposes, the US allegedly decided to blow up the pipeline once it was operational, reportedly causing the largest non-nuclear explosion in history.
Between 1985 and 1987, a radiation therapy machine called Therac-25 caused several deaths in hospitals. Based on a previous design, the new device was equipped with a software-based safety mechanism instead of a mechanical one, but unfortunately it was programmed by a technician without formal training and without safety criteria. In some cases of unusual operator input, a very rare bug known as a “race condition” caused the machine to emit high-powered radiation without the protective shield in place, hitting the patient directly and killing or seriously injuring them.
In 1996, the Ariane 5 rocket was destroyed after it veered completely off course and out of control. The inertial navigation software of the previous version, Ariane 4, was considered so reliable that it was reused without modification and, above all, without being tested against the speed and acceleration requirements of the new, more powerful launcher. A trivial conversion error from 64 to 16 bits caused an unhandled overflow that led to self-destruction, with damage of 370 million dollars.
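To make this kind of failure concrete, here is a minimal Python sketch of a narrowing conversion from a 64-bit floating-point value to a 16-bit signed integer. It only illustrates the general hazard and is not the Ariane flight code; the function names and the sample value are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_to_int16(value: float) -> int:
    """Keep only the low 16 bits, as an unchecked narrowing conversion might."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def checked_to_int16(value: float) -> int:
    """Reject values that do not fit, so the caller is forced to handle them."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return truncated


def saturating_to_int16(value: float) -> int:
    """Clamp out-of-range values to the nearest representable limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    reading = 40000.0  # hypothetical sensor value, larger than INT16_MAX
    print("wrapped:  ", wrap_to_int16(reading))
    print("saturated:", saturating_to_int16(reading))
    try:
        checked_to_int16(reading)
    except OverflowError as exc:
        print("rejected: ", exc)
```

Which strategy is right depends on the system: wrapping silently yields a meaningless value, saturating keeps the value plausible but wrong, and a checked conversion only helps if something actually handles the error it raises.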
In 1999, NASA’s Mars Climate Orbiter missed its target altitude when it tried to enter orbit around the red planet, and it disintegrated in the atmosphere. The reason: one group of engineers working on the spacecraft used metric units while another team used Imperial units.
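One common defense against this class of error is to carry the unit in the type instead of passing bare numbers around. The following Python sketch shows the idea, assuming the quantity being exchanged is an impulse; the `Impulse` class and its method names are hypothetical and are not taken from any mission software.

```python
from dataclasses import dataclass

# 1 lbf = 0.45359237 kg * 9.80665 m/s^2 = 4.4482216152605 N (exact by definition)
NEWTON_SECONDS_PER_LBF_SECOND = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse, always stored internally in newton-seconds (SI)."""

    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert once, at the boundary, so no Imperial value leaks inside.
        return cls(float(value) * NEWTON_SECONDS_PER_LBF_SECOND)

    def __add__(self, other: "Impulse") -> "Impulse":
        # Refuse bare numbers: their unit is unknown.
        if not isinstance(other, Impulse):
            return NotImplemented
        return Impulse(self.newton_seconds + other.newton_seconds)


if __name__ == "__main__":
    metric_team = Impulse.from_newton_seconds(10.0)
    imperial_team = Impulse.from_pound_force_seconds(10.0)
    total = metric_team + imperial_team
    print(f"{total.newton_seconds:.2f} N*s")
```

With this design, adding a plain number such as `metric_team + 10.0` raises a `TypeError`, so a unit mismatch surfaces as an immediate failure rather than as a silently wrong trajectory.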
On August 1, 2012, Knight Capital's computers began to generate thousands of incorrect stock trading orders per second, trading at market prices instead of respecting the bid/ask spread and systematically losing money on every transaction. In about 45 minutes, the company lost $440 million and was pushed to the brink of bankruptcy.
On October 19, 2016, the Schiaparelli lander crashed on the Martian surface during the ExoMars mission, a collaboration between ESA and Russia. The parachute was released too early and the braking thrusters fired for only about 3 seconds instead of the planned 30, so the lander hit the ground too fast and was destroyed on impact. In this case too, the software was unable to handle the exceptional conditions that occurred.
And these are only the large-scale disasters, the ones that cost hundreds of millions and became famous for it. In reality, there are certainly many less striking episodes in which companies lose not only millions of dollars or euros but, above all, brand value, reputation, reliability, market position, user confidence, and so on. These losses can cause not only immediate problems but also a long, inexorable decline, if not outright failure.
Don't worry, it gets worse: in most companies, including the best, the software production process is badly managed at best, if not out of control, and sooner or later the consequences will become striking and plain for all to see.
But what happens in the meantime?
Like multiple internal hemorrhages, the software life cycle drains “blood” from the company's organs, unnoticed and relentlessly: it consumes enormous (and almost never measured) amounts of money, lengthens time to market, silently increases the dissatisfaction of end customers, and so on.
This goes on for years before anyone notices. And usually, by the time a company realizes that its internal “organs” are bleeding, it is almost too late.
SO WHAT SHOULD WE DO?
The actions to be taken are essentially:
- Spread a Continuous Culture of Software throughout the company, so that there is a growing awareness of the risks of software and of the incredible opportunities that managing it intelligently offers
- Adopt a Software Life Cycle Method based on EFFICIENT DEVELOPMENT, integrated with the company's traditional life cycle for the products it sells
- Introduce an ecosystem of Tools and Third-party libraries into software production, carefully, where they are really useful and developing them in-house is not strategic
- Activate Verification, Monitoring and Control procedures, so that you are aware at all times of the health of your software and, consequently, of your company
- Evaluate the appropriate Metrics to calculate the Return on Investment of each action undertaken in this process, so as to make the development process more efficient and secure
I will write more articles, here and on my blog, dedicated to exactly these cornerstones, which will make the difference between saving and growing your company on the one hand and inexorable decline and exponentially increasing risk on the other.
And you, which side do you want to be on?