On Race Conditions

A long time ago I worked for ICL on the development of the 3900 range. The system could be seen as organised according to a series of priority levels. The OS had levels 15 down to 2, lower ones being for hardware and testing. So the raw instruction set could be seen as level 1, and the microcode (the internal instructions out of which programmer-level instructions were built) could be seen as level 0. I worked on the address translation microcode, which could be seen as level -1, a yet more primitive set of instructions that performed service functions for the main level 0 microcode. Conceptually, each level could pre-empt higher levels, but had the responsibility of leaving the system in a state that higher levels were capable of understanding.

One day, while I was completing the coding of my subsystem, there was certain functionality that my code had to provide. It turned out to be impossible as stated, because part of the data involved was inaccessible under the circumstances. I consulted my immediate boss. He said to complete the functionality using a so-called 'auxiliary function'. This involved planting an interrupt to the level 0 microcode and getting it to invoke the required auxiliary function in my code.

I could see that this would work on its own terms, but I was very uneasy about doing it this way. In between the start of the task and its completion via the auxiliary function, the system would be left in an inconsistent state. On the other hand, the design was such that interrupts from my subsystem had the very highest priority among all the things that could interrupt at level 0, so, when it occurred, my interrupt would run first, and the inconsistency would be cleaned up before any other part of the system could be exposed to it. In the event, that's what we did. Nothing could go wrong, right?

Things continued swimmingly. The development was completed, machines began to be sold. Everyone was happy.

We moved on to develop the multi-node version of the machines. In these, several machines would be tightly coupled at the memory level. To preserve consistency, some memory writes (including things called semaphores) needed to be strictly serialised, i.e. there had to be global agreement among all the coupled machines about the order in which they took place. There was provision for these in the design of the single machines, but it proved to be hopelessly inefficient (basically because the OS people had not managed to rewrite the OS so as to drastically reduce the number of these semaphores ... eventually they did this, and the original design was reverted to, but something had to be done NOW). A new piece of hardware was conceived to handle the high number of semaphores efficiently. It was coupled directly to the memory, in effect introducing a new level -2, with even higher pre-emption capability than my level -1 code. The technically savvy can imagine the rest at this point...

Everything went swimmingly. The development was completed, testing revealed no problem, multi-node machines began to be sold. Everyone was happy. At that point I left ICL to join the university.

British Gas were buying machines as fast as they could be built to keep ICL in business. (From their point of view, if there was anything worse than running their code on ICL hardware, it was not having any ICL hardware to run their code on.) One day there was an urgent call to the development team. One of their multi-node systems had seized up over a period of about 15 minutes. A semaphore had gone missing, and as a consequence, the entire system had ended up in a tree of queues waiting for the semaphore that was never going to come back. Of course all the internal hardware logs that retained a bit of diagnostic information had been overwritten millions of times during this 15-minute death rattle, so there was no clue. It was flagged as a 'red alert' problem (which means 'fix within 24hrs ...
or else'). No one had any idea what was going on. Various noises were made, ruffled feathers were soothed, etc. No one got too upset though. Most people involved had seen it all before.

About once a month a semaphore would go missing, and the system would go into its 15-minute death rattle. The BG guys just shrugged their shoulders and restarted the system from the last checkpoint.

Back at base, the commissioning team carried on investigating. One day a test program stopped dead. Unlike an operating system, which is designed to soldier on at all costs, a good test program will stop at the first whiff of trouble. A semaphore had gone missing. The lads eagerly picked over the steaming entrails of the fresh kill. It seemed that just prior to the semaphore going west, the system had been doing a guard page interrupt (it doesn't matter what that is). The lads wrote new tests that did more or less nothing except guard page interrupts and semaphores. Pretty soon, semaphores were being lost to order.

They looked more closely. It was the functionality mentioned earlier. Because the new multi-node hardware had higher priority than my code, a semaphore write could sneak in between the two parts of the functionality, see the inconsistent state, and consequently get written to a spurious location. The rest of the system meanwhile waited for the semaphore in the place it should have gone.

When this was realised, the functionality was redesigned (in a much less efficient, though now correct, manner), and machines stopped losing semaphores. It had taken 18 months from the original red alert.

That's what race conditions are like.