A History of the Bug in the Age of Reason

Confinement, Surveillance, and the Institutional Life of the Error


I. The Ship of Errors

Before the great partition, errors moved freely through computation. They were not yet bugs — that designation would require a classificatory regime that had not yet been invented. In the earliest days of mechanical and electronic calculation, what we now call a “bug” was simply what the machine did. The operator who received an unexpected result did not diagnose the machine; she checked her own instructions. The locus of fault had not yet migrated from the human to the system.

This was the age of what we might call the classical tolerance. Errors circulated within the computational body the way the mad once circulated within the medieval city: visible, unremarkable, part of the ordinary traffic of social life. A program that produced the wrong output was not sick. It had not failed. It had merely done something other than what was wanted, and the distinction between the wanted and the unwanted had not yet hardened into an ontological boundary. Grace Hopper’s moth, taped into the logbook of the Mark II in 1947, was recorded with the cheerful notation “first actual case of bug being found.” The word actual is diagnostic. The term already existed as a metaphor — the insect merely literalized it. But the logbook entry betrays no anxiety. The moth was removed. The machine continued. There was no protocol, no incident report, no post-mortem. The error had not yet become a case.

In this classical period, the program and its errors were coextensive. One did not debug a program so much as revise it, the way one might revise a letter. The vocabulary was editorial, not medical. And crucially, the error was not yet understood as internal to the system — as something the system harbored, something that could be found within it through sufficiently rigorous inspection. The program was not yet a body that could be examined.

II. The Great Confinement

The transformation begins — as these transformations always do — not with a discovery but with an institutional rearrangement. In the 1960s and 1970s, a new apparatus emerges for the management of computational error: structured exception handling. The try/catch block. The error code. The return value that signals failure. What had been a diffuse condition of computational life is now partitioned, enclosed, confined to designated spaces within the program’s architecture.

The analogy to the Hôpital Général is precise. The founding of the Hôpital in 1656 did not create madness; it created a space in which madness could be confined, and in confining it, constituted it as a distinct category of experience requiring institutional management. In precisely the same way, the try/catch block does not create the error. It creates the site of confinement — a textual enclosure within which the error is expected to appear, and from which it is not permitted to escape.

Consider the taxonomy that rapidly follows. Errors are no longer simply errors. They are exceptions, and exceptions are organized into hierarchies — checked and unchecked, recoverable and fatal, system errors and application errors, the IOException and the NullPointerException. Each receives its proper name, its position in the classificatory tree. The error has become a specimen. It can be caught, which is to say arrested; handled, which is to say managed; or thrown, which is to say expelled from one jurisdiction to another, passed upward through the call stack like a prisoner transferred between institutions.

The language is not accidental. To catch an exception is to intercept it at the boundary, to prevent its free circulation. To handle it is to apply a prescribed treatment. The unhandled exception — the error that escapes confinement — produces a crash, which is the computational equivalent of the madman loose in the streets: a scandal, a disruption of order, a failure of institutional containment that reflects badly not on the error itself but on the system that failed to confine it.
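The vocabulary of confinement can be made concrete in a few lines. In the sketch below (the exception hierarchy is invented for illustration), an error is raised in one jurisdiction, caught at a designated boundary, and either handled in place or thrown upward through the call stack:

```python
# A minimal sketch of catch, handle, and throw. The hierarchy here
# (ApplicationError and its subclasses) is hypothetical.

class ApplicationError(Exception):
    """Root of a local classificatory tree."""

class RecoverableError(ApplicationError):
    """Permitted to be handled where it is caught."""

class FatalError(ApplicationError):
    """Expelled upward to the next jurisdiction."""

def inner():
    # The site where the error first appears.
    raise RecoverableError("unexpected condition")

def middle():
    try:
        inner()
    except RecoverableError:
        # Handled: a prescribed treatment is applied; circulation stops here.
        return "degraded-but-alive"
    except FatalError:
        # Thrown: transferred up the call stack, like a prisoner
        # moved between institutions.
        raise

result = middle()
print(result)  # degraded-but-alive
```

An exception that escapes every such enclosure terminates the process and prints a stack trace — the judicial record described below of exactly where confinement failed.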

The moral economy of this confinement deserves attention. In the classical period, an error was no one’s fault in particular — it was a condition of the work. Under the new regime, the unhandled exception becomes an accusation. Someone should have anticipated it. Someone should have written the catch block. The stack trace — that extraordinary document — functions as a kind of judicial record, establishing the exact sequence of calls through which the error propagated, identifying the precise line of code at which confinement failed. It is at once a medical chart and an indictment.

III. The Diagnostic Gaze

If the Great Confinement established the space of the error, the next transformation establishes the gaze — the apparatus of continuous observation through which the system is constituted as a field of potential pathology.

The clinical metaphor is explicit in the vocabulary of the practitioners themselves. Systems are described as healthy or unhealthy. They are monitored for symptoms. Alerts are triggered when vital signs — CPU usage, memory consumption, response latency — deviate from established baselines. The dashboard, that central artifact of the observability regime, reproduces with uncanny fidelity the structure of the clinical chart: time-series data, threshold indicators, color-coded severity levels (green for healthy, yellow for warning, red for critical — the same semiotic system used in hospital triage).

The birth of this diagnostic gaze can be dated with some precision to the emergence of Application Performance Management in the early 2000s, and its maturation into the “observability” paradigm of the 2010s. The key conceptual innovation is the shift from reactive to proactive detection — from responding to errors after they occur to constructing an apparatus of surveillance so comprehensive that errors can be detected before they manifest as symptoms. The healthy system is no longer one that functions correctly. It is one that is known to be functioning correctly, which is an entirely different condition and one that requires continuous verification.

Distributed tracing deserves particular attention here. In a microservices architecture, a single user request may traverse dozens of services, each running independently, each capable of failing in its own way. The trace — a unique identifier propagated across every service boundary — allows the diagnostician to reconstruct the complete path of a request through the system, to identify the precise service at which degradation occurred, to measure the latency contributed by each component. It is the computational equivalent of the anatomo-clinical method: the correlation of symptoms observed at the surface (slow response times, elevated error rates) with lesions discovered in the interior (a database query that has begun performing a full table scan, a memory leak in a background worker).
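The mechanics of propagation can be sketched without any real tracing library: a single identifier is minted at the edge, carried across every hop, and each hop records a span that the diagnostician later reassembles into the request's path. Service names and the span format below are invented for illustration:

```python
import time
import uuid

# Toy trace propagation: one trace_id follows the request across every
# "service" boundary; each hop records a span with its latency.

SPANS = []  # in a real system, shipped to a central collector

def record_span(trace_id, service, start, end):
    SPANS.append({"trace_id": trace_id, "service": service,
                  "latency_ms": (end - start) * 1000})

def database(trace_id):
    start = time.monotonic()
    time.sleep(0.01)  # stand-in for a slow query
    record_span(trace_id, "database", start, time.monotonic())

def api(trace_id):
    start = time.monotonic()
    database(trace_id)  # the trace_id crosses the service boundary
    record_span(trace_id, "api", start, time.monotonic())

def handle_request():
    trace_id = uuid.uuid4().hex  # minted once, at the edge
    api(trace_id)
    return trace_id

tid = handle_request()
path = [s["service"] for s in SPANS if s["trace_id"] == tid]
print(path)  # ['database', 'api']: spans close innermost-first
```

Sorting the spans of one trace by latency is precisely the anatomo-clinical move: the surface symptom (a slow request) correlated with the interior lesion (the one slow hop).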

The effect of this gaze is not merely to detect errors but to constitute them. Under the observability regime, phenomena that were previously invisible — a slight increase in p99 latency, a gradual rise in garbage collection pauses — become anomalies, which is to say, deviations from a norm that the apparatus itself has established. The system does not merely have errors; it has a health, a continuous quantitative state that can be measured, graphed, compared against historical baselines, and — crucially — alerted upon. The on-call engineer, awakened at three in the morning by a pager, is summoned not because something has broken but because a metric has crossed a threshold. The error, in many cases, has not yet occurred. What has occurred is a deviation from the norm, and under the diagnostic gaze, deviation and pathology have become synonymous.
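The threshold logic that summons the on-call engineer reduces to a few lines. In this hypothetical sketch, no request has failed; a p99 latency sample has merely crossed a baseline-derived threshold, and the deviation alone triggers the alert (baseline, multiplier, and data are invented):

```python
# A deviation-based alert in miniature: the apparatus establishes a
# norm (the baseline) and pages on departure from it, not on failure.

def percentile(samples, p):
    """Nearest-rank percentile over a small sample (simplified)."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1)))
    return ordered[index]

baseline_p99_ms = 180.0              # the norm, set by the apparatus itself
alert_threshold = baseline_p99_ms * 1.5

latencies_ms = [42, 45, 44, 48, 51, 47, 43, 46, 49, 310]  # one slow tail
p99 = percentile(latencies_ms, 99)

if p99 > alert_threshold:
    # Nothing is "broken"; a metric has crossed a threshold.
    print(f"ALERT: p99={p99}ms exceeds {alert_threshold}ms")
```

Note that the entire event is internal to the apparatus: both the measurement and the norm it deviates from are products of the same gaze.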

IV. The Asylum: or, Error Budgets

The most recent transformation is perhaps the most remarkable, because it represents not a further tightening of the regime but a calculated relaxation — a controlled tolerance of error that nevertheless operates within, and indeed depends upon, the apparatus of confinement and surveillance established in the earlier periods.

The doctrine of the error budget, formalized in Google’s Site Reliability Engineering framework, holds that a system should not aim for perfect reliability. Instead, it should establish a target — 99.9% availability, say, or 99.99% — and treat the remaining margin as a budget that can be spent. If the system has been too reliable — if it has consumed too little of its error budget — then it is, paradoxically, underperforming: it is being too cautious, deploying too slowly, innovating too little. The error budget must be spent, or else it represents waste.

This is a genuinely novel epistemic formation. Error is no longer a condition to be eliminated but a resource to be managed. The bug has been granted a kind of institutional legitimacy — not the free circulation of the classical period, but a supervised, quantified, budgeted existence within the system. It is permitted to appear, but only at the prescribed rate, and only within the spaces designated for its appearance. If it exceeds its budget, the regime tightens: deployments are frozen, engineering effort is redirected toward reliability. If it stays within budget, the regime relaxes: new features may be shipped, risks may be taken. The error has become a variable in an optimization function.
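The arithmetic of the budget is simple enough to work through. A 99.9% availability target over a 30-day window permits (1 − 0.999) × 43,200 minutes, about 43 minutes, of unavailability; the sketch below (figures illustrative) tracks the budget being spent and applies the regime's two responses:

```python
# Error-budget arithmetic in the SRE framing described above.
# All numbers are illustrative.

slo = 0.999                      # availability target
window_minutes = 30 * 24 * 60    # 30-day window = 43,200 minutes
budget_minutes = (1 - slo) * window_minutes
print(f"budget: {budget_minutes:.1f} min")  # budget: 43.2 min

downtime_so_far = 12.0           # minutes of observed unavailability
remaining = budget_minutes - downtime_so_far
burn = downtime_so_far / budget_minutes

if remaining <= 0:
    # The regime tightens: reliability work displaces features.
    print("budget exhausted: freeze deployments")
else:
    # The regime relaxes: the remaining margin is there to be spent.
    print(f"{remaining:.1f} min remaining ({burn:.0%} burned): keep shipping")
```

The paradox in the text is visible in the code: a burn rate near zero is not a triumph but a signal that the margin is being wasted.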

Chaos engineering extends this logic to its conclusion. In the practice pioneered by Netflix’s Chaos Monkey and its descendants, errors are not merely tolerated but deliberately introduced. Services are killed at random. Network partitions are simulated. Latency is injected. The purpose is not to find specific bugs but to verify that the system’s response to failure — its immune system, as the practitioners say — is functioning correctly. The error has been fully instrumentalized. It is no longer the thing that must be confined but the stimulus applied to test the apparatus of confinement itself.
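A toy version of the practice, with invented service names: a "monkey" kills one replica at random, and what is tested is not whether the failure occurred but whether the failover still serves requests afterward:

```python
import random

# A Chaos-Monkey-style sketch. The failure is deliberately introduced;
# the assertion checks the apparatus of containment, not the error.

random.seed(7)  # deterministic for the sake of the example

replicas = {"web-1": True, "web-2": True, "web-3": True}  # True = alive

def chaos_monkey():
    """Kill one replica at random and report the victim."""
    victim = random.choice(list(replicas))
    replicas[victim] = False
    return victim

def serve_request():
    """Trivial failover: any surviving replica may answer."""
    for name, alive in replicas.items():
        if alive:
            return name
    raise RuntimeError("total outage: containment failed")

killed = chaos_monkey()
survivor = serve_request()
assert survivor != killed  # the stimulus was applied; service continued
print(f"killed {killed}; request served by {survivor}")
```

The inversion is complete: the error here is not an event to be caught but an input supplied by the test itself.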

One detects here a certain circularity that Foucault would have recognized. The elaborate apparatus of observability, exception handling, error budgets, and chaos engineering does not exist to eliminate errors. It exists to manage them — and in managing them, it requires their continued existence. A system with no errors would have no need of error budgets, no use for chaos engineering, no occasion for the on-call rotation, the incident review, the post-mortem, the blameless retrospective. The entire institutional apparatus — the SRE team, the observability platform, the alerting infrastructure, the incident management tooling — depends for its legitimacy on the continued production of the very condition it was created to govern.

The bug, like the madman, has become indispensable to the institution that confines it.

V. The Undiagnosable

There remains a category of error that resists all these regimes: the error that is not recognized as an error. The system that functions perfectly — meets its SLOs, passes its chaos tests, triggers no alerts — while doing the wrong thing. The recommendation engine that optimizes engagement by promoting content that degrades the user. The credit model that reproduces, with mathematical precision, the discriminatory patterns embedded in its training data. The automated hiring system that filters candidates by proxy variables correlated with protected characteristics.

These are not bugs in any sense that the diagnostic apparatus can detect. They are correct implementations of incorrect specifications — or, more precisely, they are cases in which the distinction between correct and incorrect has become undecidable, because the system’s behavior is simultaneously optimal by its own metrics and pathological by any external standard.

Foucault argued that the history of madness was not the history of a disease progressively understood but the history of a partition — a line drawn and redrawn between reason and unreason, with the content on either side of the line shifting according to the epistemic and institutional arrangements of each period. The history of the computer bug follows the same trajectory. What counts as a bug — what is confined, diagnosed, budgeted, or tolerated — is not a natural fact but an artifact of the classificatory regime in force. And the most consequential errors are always the ones that the regime, by its very structure, is unable to classify.

They do not appear in the logs. They pass every test. They are, by every available measure, correct.

It is only outside the system — in the world the system acts upon — that their effects become visible. But by then, of course, the system has been working flawlessly for years.