(Inter)Connected Conundrums

Today, you can’t open a newspaper (let alone a technical journal) without reading something about how the future will be ‘connected’ – one only needs to think of something before it is done. I am not sure that this is quite the unalloyed blessing the articles make it out to be. Don’t get me wrong; I am no Luddite who doesn’t ‘get’ the convenience of a connected world.

It is a trope of mythology that downfall is brought about by one’s greatest strength; in a similar vein, I see several problems that arise precisely because of the interconnectedness of things. By its very nature, a connected world will suffer from a barrage of serious unintended consequences.

One of the simplest is the readiness of a connected system to tip towards catastrophic failure. Since everything is connected in the, well, connected world, the architecture necessarily has to have built-in fire-lines that prevent failures from becoming calamitous or system-wide. The problem is compounded for IoT eco-systems because, by design, they are expected to work with bad input: sensor failure is expected and is compensated for at every layer of the architecture. However, current IoT systems are not designed to discriminate reliably between a local disaster and a cascaded system failure. The latter is not really a catastrophic event but a case of the system reacting automatically to a local event, causing adjacent sensors to react, which in turn triggers the next layer of sensors, and so on, ad infinitum. If you thought the brownouts in the eastern USA were bad, imagine how bad it will be when everything is connected to everything else. A true case of the cure being worse than the disease!
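To make the mechanism concrete, here is a minimal sketch (in Python) of how a purely local fault can propagate through a connected system that has no fire-lines. The devices, the graph and the reaction threshold are hypothetical and chosen only to illustrate the “shutting down in sympathy” effect described above.

```python
# Minimal sketch of a cascaded failure: each node reacts (shuts down) once
# enough of its neighbours have failed.  The topology and threshold below
# are invented for illustration, not taken from any real deployment.
from collections import deque

neighbours = {
    "thermostat":  ["hvac", "smoke"],
    "hvac":        ["thermostat", "power"],
    "smoke":       ["thermostat", "power"],
    "power":       ["hvac", "smoke", "car_charger"],
    "car_charger": ["power"],
}
REACT_THRESHOLD = 1  # with no fire-lines, one failed neighbour is enough

def cascade(initial_failure: str) -> set[str]:
    """Return the set of nodes that end up failed after one local fault."""
    failed = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        node = queue.popleft()
        for nbr in neighbours[node]:
            failed_nbrs = sum(1 for n in neighbours[nbr] if n in failed)
            if nbr not in failed and failed_nbrs >= REACT_THRESHOLD:
                failed.add(nbr)      # the neighbour reacts "in sympathy"
                queue.append(nbr)
    return failed

print(cascade("thermostat"))  # one local fault takes the whole graph down
```

With a threshold of one and no partitioning, a single faulty thermostat is enough to take every node in this toy graph offline – which is exactly the failure mode a fire-line is meant to interrupt.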

The most serious unintended consequences, of course, are bound to be those surrounding security. Security experts have long warned us about the gaping holes created by devices running low-footprint software. By architecture, an IoT system’s intelligence is centralized and most of its nodes have low or no intelligence. This creates a veritable rabbit warren of security flaws, allowing access from multiple interconnected points at various access levels, which makes it impossible to guarantee that a system is ‘secure’.

This is further compounded by needless convenience features that create further vulnerabilities. Take, for instance, the darling of the connected-world proponents – the connected car. Today, most cars come with a Keyless Entry and Start System (KESS), which enables you to open, start and drive a car without going to the trouble of taking the key out of your pocket. This creates a vulnerability that enables a hacker to do the same merely by being present when you legitimately operate your car. Subsequently, he does not even need to be physically present to access the innards of your car. (See the paper “Lock It and Still Lose It” by Garcia et al., https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/garcia, or watch https://www.wired.com/2016/08/jeep-hackers-return-high-speed-steering-acceleration-hacks/.) Please note, this is not a code failure but essentially a man-in-the-middle attack.

Now imagine the same car connected to your home security system or even just your home AC system (presumably to keep your home at the right temperature when you arrive – a favorite use case!). You now have a vulnerability in your home that is enabled purely because everything is connected.  This one is a zinger as vulnerabilities go – it does not even require the intruder to be physically present!

The old bromide, “Make it idiot-proof and someone will make a better idiot,” applies equally well to connected systems. Designers, and consequently testers, usually take good care to design safeguards for normal use of systems and for conceivable use with malicious intent or incompetence. However, the complexity and interconnectedness of IoT systems make the threat of unintended consequences of far higher import. Creating a test plan that validates these effects is like looking for the proverbial black cat in a coal cellar, with the added complication that the cat may not even be there!

Conventional approaches to testing IoT systems will only provide confirmatory evidence. Testing of IoT systems has to approach the issue completely differently, as most of the behavior is not visible but is under wraps, as it were. Take the issue of security, for instance: intrusion prevention would be very close to impossible, as there will be too many vulnerabilities. A cascaded failure would be difficult to simulate, as mere load does not trigger it; rather, it takes a condition where a single node’s failure signals the nodes connected to it to also shut down in sympathy, leading to a system-wide failure. How, then, should such a system be tested?

Testing of connected systems should be partitioned into Sufficiency Testing and Implication Testing.

Sufficiency Testing needs to consider the problem in a slightly retrograde fashion – “If I am to make this decision, what is the quality of the data that I have?” For example: “Is this a real intrusion? Why does it appear so? Are the sensors that are triggering in working order? Is it one sensor or many sensors signaling these failures? Are they adjacent or distributed? Is there disproving evidence?”

If the data are good, then the adequacy of the data needs to be ascertained before the decision is finalized. On the other hand, if the quality of the data is not appropriate, then the decision has to be arrived at via secondary sources or, perforce, be abandoned. Testing at this stage has to focus on the quality of the data, the various ways in which the data can be degraded, and the effect of that degradation on the correctness of the outputs at each stage.
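As a rough illustration of what such a sufficiency check might look like, here is a small sketch. The SensorReading fields, the corroboration threshold and the three outcomes are assumptions made purely for illustration, not a prescribed design.

```python
# Sketch of a sufficiency check: before acting on an alarm, ask whether the
# data are trustworthy, corroborated and distributed enough to decide on.
# Field names, the 0.8 threshold and the outcome labels are hypothetical.
from dataclasses import dataclass

@dataclass
class SensorReading:
    sensor_id: str
    value: float
    healthy: bool        # did the sensor pass its own self-test?
    adjacent: bool       # is it adjacent to the sensor that raised the alarm?

def sufficient(readings: list[SensorReading], min_corroborating: int = 2) -> str:
    """Decide whether the data are good enough to act on an alarm."""
    working = [r for r in readings if r.healthy]
    if not working:
        return "abandon"                      # no trustworthy data at all
    corroborating = [r for r in working if r.value > 0.8]
    distributed = {r.adjacent for r in corroborating} == {True, False}
    if len(corroborating) >= min_corroborating and distributed:
        return "decide"                       # quality and adequacy both met
    if corroborating:
        return "consult_secondary_sources"    # quality doubtful; look elsewhere
    return "abandon"                          # disproving evidence dominates

readings = [
    SensorReading("door-1", 0.9, healthy=True, adjacent=True),
    SensorReading("window-3", 0.85, healthy=True, adjacent=False),
    SensorReading("motion-7", 0.1, healthy=False, adjacent=True),
]
print(sufficient(readings))   # -> "decide"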

When the sufficiency criteria have been met, testing can move on to the much more complex and nebulous field of Implication Testing. This is the phase in which the system’s behavior, both designed and emergent, is to be tested. The system behavior is conceptualized as a probability curve, and testing becomes an exercise in plotting the actual behavior against the expected probability. Any significant variation between expected and realized probability needs explanation and, most probably, rework. This probability-based testing implies the ability to repeat the testing in statistically meaningful numbers, with its attendant requirements in automation, test-data creation and so on. The testing also has to allow for, and simulate, black swan events. Finally, it has to provide for impacts from causes outside the system boundaries, targeted at various levels of the IoT eco-system architecture.
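One way to picture such probability-based testing is to run the system many times, bin the outcomes, and flag any outcome whose realized frequency strays too far from its expected probability. The sketch below assumes a placeholder run_system_once() and invented expected probabilities purely for illustration; a real harness would drive the actual system with chaotic input.

```python
# Sketch of "Implication Testing" as described above: compare realised
# outcome frequencies against an expected probability curve and flag any
# significant deviation.  EXPECTED and run_system_once() are stand-ins.
import math
import random
from collections import Counter

EXPECTED = {"nominal": 0.95, "degraded": 0.04, "failover": 0.01}

def run_system_once() -> str:
    # Placeholder for exercising the real IoT system once.
    return random.choices(list(EXPECTED), weights=list(EXPECTED.values()))[0]

def implication_test(runs: int = 10_000, z: float = 3.0) -> dict[str, bool]:
    observed = Counter(run_system_once() for _ in range(runs))
    verdict = {}
    for outcome, p in EXPECTED.items():
        expected = p * runs
        # Allow z standard deviations of slack for a binomial count.
        tolerance = z * math.sqrt(runs * p * (1 - p))
        verdict[outcome] = abs(observed[outcome] - expected) <= tolerance
    return verdict   # any False entry needs explanation and, probably, rework

print(implication_test())
```

Black swan events would then be injected deliberately (forcing rare outcomes or out-of-boundary impacts) and the same comparison repeated, which is why the statistically meaningful run counts and automation mentioned above become non-negotiable.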

The standard tester of today, unfortunately, is ill-equipped to do this kind of testing. Implication Testing requires a deep understanding of statistics on the tester’s part, along with the ability to simulate chaotic input. While the gap in knowledge can be addressed, the requisite free-association thinking will require a lot of unlearning. Would it spell the demise of the tester as we know him? Would it finally bring testing to the forefront of the SDLC since, in systems with emergent capabilities, testing is the only way to understand what has been delivered? Or would it finally remove the tester-developer dichotomy and make them one?

Who knows? Only one thing is sure: the times ahead, where testing of interconnected systems is going to be the norm, are going to be interesting. It would be well to remember the warning from the cartographers of yore – “Here be dragons”!