Recently I was cleaning out my attic, and I came across a copy of Software Safety Assessment During Program Maintenance by Scott Stephen Sheppard, a thesis presented in partial fulfillment of the requirements for the degree Master of Science at Arizona State University. Though defended in November of 1988, much of it applies to today. Here are some excerpts from Chapter 1.
The study of system safety is a subdiscipline of system engineering that involves the application of scientific, management, and engineering principles to optimize safety within the constraints of operational effectiveness, time, and cost throughout the system life cycle.
Software safety involves the application of those principles for the software portion of a safety-critical system.
A safety-critical system is one that could cause injury, property damage, or environmental harm.
The consequences of an accident involving a safety-critical system can range from minor annoyance to death.
The software portion of a safety-critical system can be considered safe if:
- The software never produces output that will transform the system into an unsafe state.
- If factors beyond the control of the computer place the system into an unsafe state, the software takes steps to eliminate or minimize the risk of accident.
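The two conditions above can be illustrated with a small sketch. This is a hypothetical example, not from the thesis: all names and limits (the heater controller, `MAX_SAFE_TEMP_C`) are invented for illustration.

```python
# Hypothetical sketch of the two safe-software conditions, using an
# invented heater controller. Names and limits are assumptions.

MAX_SAFE_TEMP_C = 90.0  # assumed hardware safety limit for this example

def command_heater(requested_power: float, sensed_temp_c: float) -> float:
    """Return the power level (0.0 to 1.0) actually sent to the heater."""
    # Condition 2: if factors beyond the software's control have placed
    # the system in an unsafe state, act to minimize the risk of an
    # accident (here: shut the heater off).
    if sensed_temp_c >= MAX_SAFE_TEMP_C:
        return 0.0
    # Condition 1: never produce an output that would transform the
    # system into an unsafe state (here: clamp the request to a valid
    # range rather than pass an out-of-range command through).
    return min(max(requested_power, 0.0), 1.0)
```

The point of the sketch is that both conditions show up as explicit, reviewable lines of code, which is exactly the kind of code the thesis argues deserves extra scrutiny.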
A safe system is one that is accident-free. This is not equivalent to failure-free since failures are not necessarily synonymous with accidents.
Since software failures are only a portion of the risk involved, evaluating system safety requires a merging of (among other things) reliable hardware, reliable (although not totally correct) software, fault tolerance, security, privacy, and system integrity. It is therefore not surprising that the study of software safety is not a stand-alone engineering subdiscipline.
The takeaway from my thesis is that there are specific sections of code that, if they contain a bug, can lead to injury, property damage, or environmental harm. For those pieces of code, extra care (e.g., peer code review) and analysis (e.g., test coverage tools) should be applied.
In our internal newsletter, the POV Dispatch, our VP of Corporate Strategy, Jon Pittman, defined the Internet of Things as:
"...let's get clear about what the Internet of Things actually is. The simplest way to think about it is that we can now have the capability to put connectivity and intelligence into things and places. As these things and places become more intelligent, they are no longer inert lumps of physical material, but have computational power, sensors, actuators, and internet connectivity. It means they can share data with us but also learn by themselves and be far more automated than they could in the past. Over time, it means that they can communicate and interact with each other and change in response to what they sense from us, from their environment, and from their peers."
Given the Internet of Things and our eventual evolution to trillions of devices being woven into the fabric of our everyday lives, what does it mean to be safe? What now constitutes a safety-critical system? Things like self-driving cars immediately come to mind as requiring software safety. Pacemakers also come to mind. But what about things like robotic assistants, home thermostats, vacuum cleaners, or elevators? This topic is important because it is woven into the future of making things. As software provides more and more of the functionality of everyday things, the applications and services used to design those things have to give designers the ability to make safety assessments and make design decisions accordingly. Autodesk has been studying the Internet of Things from a "What do our tools need to provide?" and "What has to be taught in schools?" perspective. We're on it.
Safety is alive in the lab.