The Zeroth Law of Debugging

We recently had a very eye-opening experience while helping a customer with a problem. They were having issues with our ATS1936 and TM1936-SFP+ in their engineering lab. Their ATCA nodes and external nodes would link to the ATS1936, and the ATS1936 reported the links as up, but they could not get traffic to pass between the nodes. We checked spanning-tree settings. We checked port physical settings. We checked various other switch settings. We checked SFP+ module numbers. We checked fiber cable types. We checked E-Keying settings. We ran various tests and could not find anything wrong. It was getting frustrating. We were close to issuing an RMA when we remembered the Zeroth Law of Debugging.
Zeroth Law of Debugging: Never assume anything.
or
Zeroth Law of Debugging (rephrased): Verify all of your assumptions.
I call it the Zeroth Law of Debugging because it is just like the Zeroth Law of Thermodynamics: it is the most fundamental rule in any list of ‘laws’ for debugging. Debugging itself could be defined as making a list of assumptions and verifying them.
What is the most basic assumption of a layer 2 switch? What is something you would rarely check? What is absolutely required for a layer 2 switch to connect two nodes? Do you think you have figured it out? Well, if you answered “unique MAC addresses,” you solved this problem.
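To see why a layer 2 switch depends on unique MAC addresses, here is a minimal sketch of a generic learning switch in Python. It is a simplified textbook model, not the ATS1936's actual forwarding logic, and the class name, port numbers, and MAC addresses are all made up for illustration. The switch learns which port each source MAC lives on and forwards by looking up the destination MAC in that table.

```python
FLOOD = "flood"    # destination unknown: send out every other port
FILTER = "filter"  # destination learned on the ingress port: do not forward


class LearningSwitch:
    """Simplified model of a layer 2 learning switch (illustrative only)."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then decide what to do with the frame."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            return FLOOD
        if out_port == in_port:
            # The switch believes the destination sits on the same port
            # the frame arrived on, so there is nothing to forward.
            return FILTER
        return out_port


def ping_round_trip(switch, mac_a, mac_b):
    """Node A (port 1) sends to node B (port 2), then B answers A."""
    request = switch.receive(1, mac_a, mac_b)
    reply = switch.receive(2, mac_b, mac_a)
    return request, reply


# Hypothetical addresses: two distinct MACs, then the same MAC on both nodes.
unique = ping_round_trip(LearningSwitch(), "02:00:00:00:00:0a", "02:00:00:00:00:0b")
duplicate = ping_round_trip(LearningSwitch(), "02:00:00:00:00:0a", "02:00:00:00:00:0a")
print(unique)     # ('flood', 1)
print(duplicate)  # ('filter', 'filter')
```

With unique addresses the request is flooded and the reply is forwarded to port 1. With identical addresses the table entry simply moves between ports, and in this model every frame looks like it is already on the right port, so nothing is forwarded even though both links are up.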
Obviously, all nodes need unique MAC addresses and would have unique MAC addresses in the field, but this was testing in an engineering lab with prototype nodes. The MAC addresses were the same on all the nodes. The only way we figured it out was by simplifying the test down to just 2 nodes pinging each other. The customer saw a peculiar behavior: ping worked from node A to node B or from node B to node A, but not both at the same time. That raised a red flag, and we quickly figured it out from there once we verified our most basic assumption.
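Verifying this particular assumption can be as simple as comparing the MAC addresses collected from each node. The sketch below assumes you have gathered them by hand into a dictionary; the node names and addresses are hypothetical, and the helper functions are illustrative rather than part of any switch tooling.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Lowercase and strip separators so 00-1A-.. and 00:1a:.. compare equal."""
    return mac.lower().replace("-", "").replace(":", "").replace(".", "")


def find_duplicate_macs(inventory):
    """Return {normalized MAC: [node names]} for every MAC used more than once."""
    nodes_by_mac = defaultdict(list)
    for node, mac in inventory.items():
        nodes_by_mac[normalize_mac(mac)].append(node)
    return {
        mac: sorted(nodes)
        for mac, nodes in nodes_by_mac.items()
        if len(nodes) > 1
    }


# Hypothetical lab inventory: two prototype nodes share one address.
LAB_INVENTORY = {
    "node-a": "02:00:00:00:00:01",
    "node-b": "02-00-00-00-00-01",
    "node-c": "02:00:00:00:00:03",
}

for mac, nodes in find_duplicate_macs(LAB_INVENTORY).items():
    print(f"MAC {mac} is shared by: {', '.join(nodes)}")
```

Normalizing before comparing matters because different tools print the same address with different separators and letter case.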
Every once in a while, a problem like this comes up to remind you that not all issues are hard to fix. As long as you keep the Zeroth Law of Debugging in mind, most (all?) product issues can be solved.
JP Landry
Network Division Manager
