May 27, 2008

Serendipitous Data Collection

In 1969 the Firesign Theatre recorded "How Can You Be in Two Places at Once When You're Not Anywhere at All".

People who diagnose and repair networks have long experienced the truth of that title - no matter where you happen to be, the test data you need can only be acquired by being somewhere else.

In my own experience at Wells Fargo in the 1980's I more than once had to run back and forth through the streets of the San Francisco financial district, often at 3am, to check circuits and devices on a malfunctioning network path.

Telco people long ago learned to incorporate "remote loopback" and remote testing capabilities into their devices.  Internet people have not been as smart.

Today's state of the art in network troubleshooting is an individual practitioner: a person who has deep knowledge and experience with the net from the bottom to the top, who carries his/her own ad hoc toolkit of favored hardware widgets and software packages, and who is tired of being called at 3am to fix some routine network outage.

Today's net is filled with sorry excuses for equipment that can't recover from routine outages.  How many home ADSL modems lock up every few days, or whenever the telco drops the phone circuit for a few moments?  The usual repair is the old-fashioned, brute force, but quite effective power cycling of the unit.  Why aren't these devices designed so that they recycle themselves when they fail to perceive a flow of IP packets for an extended period of time?  A little self-introspection and self-recovery could go a long way.
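The self-recovery idea is simple enough to sketch.  A minimal, hypothetical watchdog - the packet counter and the idle limit are illustrative stand-ins for whatever a real modem's firmware exposes - might look like this:

```python
IDLE_LIMIT = 300  # seconds with no IP packets before we assume a lock-up


class Watchdog:
    """Watches a device's received-packet counter and decides when the
    device should power-cycle itself rather than wait for a human."""

    def __init__(self, idle_limit=IDLE_LIMIT):
        self.idle_limit = idle_limit
        self.last_count = 0     # packet counter at the last observed change
        self.last_change = 0.0  # timestamp of that change

    def observe(self, packet_count, now):
        """Feed the current packet counter and clock; return True when
        the counter has been frozen longer than the idle limit."""
        if packet_count != self.last_count:
            self.last_count = packet_count
            self.last_change = now
            return False
        return (now - self.last_change) > self.idle_limit
```

In firmware, the `observe()` result would trigger the same relay a human would otherwise reach by pulling the power plug.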

I've long been on a quest to build the Internet Buttset.  My 1993 product Dr. Watson, The Network Detective's Assistant (DWTNDA) was a first step.  (The sad story of why that product disappeared is something for another day.)

It strikes me that one thing that could be incorporated into internet devices, particularly devices used for testing and diagnostics, is something that I call "Serendipitous Data Collection".

The basic idea is quite simple - when a device needs a chunk of data, it publishes a request in a well known directory.  Other devices periodically look at the published requests.  If one of those other devices happens to be in, or later happens to travel to, a part of the net where the requested data can be obtained, it collects the data and holds it.  When that device next comes near the directory, it deposits the data there.  The original device may (or may not) subsequently notice the published data, pick it up, and use it.
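The request/collect/deposit cycle above can be sketched as a tiny bulletin board.  This is an illustration only - the class and method names are my own invention, not part of any real product or protocol:

```python
class Board:
    """A well-known directory of data requests and serendipitously
    collected answers, per the bulletin-board model described above."""

    def __init__(self):
        self.requests = {}  # request_id -> description of the data wanted
        self.answers = {}   # request_id -> collected data

    def publish(self, request_id, description):
        """A device that needs data posts a request."""
        self.requests[request_id] = description

    def pending(self):
        """Roaming devices periodically scan the still-open requests."""
        return {rid: desc for rid, desc in self.requests.items()
                if rid not in self.answers}

    def deposit(self, request_id, data):
        """A device that happened to collect the data drops it off."""
        self.answers[request_id] = data

    def pick_up(self, request_id):
        """The original requester may (or may not) later fetch the answer."""
        return self.answers.get(request_id)
```

A device that wanted, say, a round-trip-time measurement from a wiring closet it can't reach would `publish()` that request; any handheld that later wanders into that closet, sees the pending request, and measures the RTT would `deposit()` the result for the requester to `pick_up()` whenever it next checks.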

The word "directory" tends to imply something more glorified than is really necessary.  Consider handheld network testing devices that live in a charging/docking unit when not in use and that, when in use, are carried by network operations staff to various parts of an Enterprise network.  The charging/docking station then becomes the interface to a simple repository of requests and answers.

It is a simple bulletin-board model.  Security is not strong - which is why I tend to think of this in the context of network testing and diagnostic devices.

Network diagnostic and repair tools must often be exempt from the constraints of network security.  This means that many of these tools would have to be designed to engage in intrusive or risky operations only when used by people who are both trustworthy and skilled.

A surgeon's scalpel must be very sharp.  A surgeon's scalpel can cause a lot of harm if misused.  Network troubleshooting and repair require invasive tools that are able to cut into the network to reveal its inner workings.  These tools could cause a lot of harm if misused.  None of us would want a surgeon to operate with a dull scalpel; we accept that the value outweighs the risks.  Similarly, we should not want the internet to be denied good repair tools just because those tools, if misused, could cause harm.

Posted by karl at May 27, 2008 1:00 AM