Today in Tedium: Love them or hate them, relational databases are one of the most fundamental building blocks of modern computing. Organizing data in tables and accessing it later on? While it often gets more complex than that—we’ve, after all, figured out numerous ways to access data over the years—databases work largely like this, all these years later. They are boring, but they make lots of money. After all, there’s a reason Larry Ellison is the fourth-richest person in the world. When you mention databases, you do not think about Halloween. But yet … it turns out that one of the most famous examples of a database bug is named after Halloween. Today’s Tedium explains why. — Ernie @ Tedium
We accept advertising, too! Check out this page to learn more.
The boring database problem that showed itself on Halloween
It was normal day in a dark, spooky cubicle in 1976 when a group of database engineers working at the IBM Almaden Research Center discovered a problem with the database they were attempting to test.
On the surface, the query was fairly straightforward: They set up a query in database designed to increase the salary for every employee making less than $25,000 by 10 percent. On the surface, the database change seemed to work, and reported no errors. But like a shadowy figure dressed up like a skeleton, appearances were deceiving.
See, it turns out that the query worked too well. It didn’t just update the salaries by 10 percent—it kept looping through the database, continually increasing every modest salary by 10 percent until everyone made at least $25,000 a year. Essentially, the database was not going through once, as intended, but was pushing the change to the front of the queue, ensuring it would keep hitting it again and again until it was no longer valid in the system.
In the real world, a mistake of this nature would be a massive windfall for employees—but a nightmare for executives. Nonetheless, the date the mistake was discovered immortalized the database problem as the Halloween problem.
In a retrospective joint interview dating to 1995, Mike Blasgen, one of the systems managers who worked at IBM Almaden back in the 1970s, stated that it was one of many phantoms that were discovered during this era:
It’s interesting because all these odd-ball things had names: there were phantoms, and there were other things, and those had to do with names that were somehow representative of what you were observing, right? So the phantom was because it was something that was sort of there, but not there; the name was descriptive. And this was called the Halloween problem not because it surprises you, or it’s spooky, or trick-or-treat or anything; this is because it happened to be discovered on Halloween day. But I think most people think it’s the other; I think most people think it’s called Halloween because it’s so surprising. But it’s not.
Don Chamberlin, who was part of the team that uncovered the problem, noted that the term took on a life of its own.
“It’s famous in the industry—everybody knows the Halloween problem,” he recalled.
Over the years, this problem, thanks in part to how early it happened in the history of technology and where it happened, evolved from a silly example into a classic discussion point of SQL database books. It’s a great way to touch on a whole lot of problems with querying databases thoughtlessly.
And in many ways, the nature of the Halloween problem is relatively easy to understand for a database bug, which is why it sometimes shows itself in mainstream popular culture—albeit not under its proper name. For example, the movie Office Space is built around a failed attempt at white-collar crime that failed because it ran into a problem very similar to the Halloween problem. (To explain it I’m going to have to spoil the movie, but you’ve had 25 years to watch it.)
Late in the film, the three main characters—Peter Gibbons, Michael Bolton, and Samir Nagheenanajar—decide to exact revenge on their company Initech by infecting the company’s accounting system with a virus that removes fractions of pennies from the company’s systems and puts it into a bank account. It’s intended to work slowly over a long period, but because of a bug in the code, the virus ends up taking more than $300,000 in the span of a single weekend—a classic example of overquerying if you’ve ever seen one.
Suddenly, what was intended to be a passive, nearly invisible database query had instead hit the system hundreds of millions of times in just a few days.
While Bolton (David Herman) suggests he might have messed up a decimal point, a query error in a handwritten virus almost feels more likely, knowing what we know about the Halloween problem.
“I always do that. I always mess up some mundane detail,” he says.
(Good thing the stapler enthusiast Milton set the office on fire, ensuring all records of the gaffe were lost in the blaze.)
“It happened to be on a Friday, and we said, ‘Listen, we are not going to be able to solve this problem this afternoon. Let’s just give it a name. We’ll call it the Halloween Problem and we’ll work on it next week.’ And it turns out it has been called that ever since.”
— Don Chamberlin, one of the key figures involved in the Halloween problem saga, on how the term got its name. “It has absolutely nothing to do with Halloween,” he said in the 2001 oral history. In other words, they literally named it after Halloween because it was inconvenient to fix at the time they had discovered it.
Why the Halloween problem is kind of a headache, even if it’s not particularly scary
So now that we’ve laid out that this is the most boring Halloween-related thing to happen, ever, let’s talk about why it’s a problem for database management, in layperson terms.
Essentially, because you cannot just straight-query an SQL database when trying to change a number of fields, lest you risk changing data you don’t want to change, you have to attack it using another method. The problem is, these methods are generally more complex than the obvious one, and that means they can either be slower or harder to implement.
In a 2013 piece on the website SQLPerformance.com, author Paul White explained the many nuances of this problem, as well as the challenges it creates from a performance standpoint to query an SQL database without changing data accidentally.
With regards to the UPDATE query, the approach used in the classic 1976 example, there are a couple of ways to do this, per White:
- Query the database using a read-only search, then change only the records that are affected, doing an analysis on the backend. “Implementing these three phases literally in a database engine would produce correct results, but performance might not be very good,” White writes. “The intermediate results at each stage will require system memory, reducing the number of queries the system can execute concurrently. ”
- Only do updates one row at a time, to ensure that changes are isolated from one another. While more performant, this process is more complex, per White: “The challenge for the query optimizer is to find an iterative (row by row) execution plan that satisfies the UPDATE semantics required by the SQL standard, while retaining the performance and concurrency benefits of pipelined execution.”
(If you want to get into the weeds of this concept, I recommend White’s four-part series, as it explains the nuances of this better than I can. Warning though: It may go over your head.)
Perhaps it’s for this reason that attempts have been made over the years to mitigate the problem. In particular, Microsoft has a number of patent filings to its name that exist specifically to mitigate the Halloween problem. (Which makes sense, as Microsoft sells an SQL server tool.) The first, granted in 2000, notes that the goal is to simply avoid the problem outright:
There exists a need for a more efficient approach to avoiding the Halloween problem. In particular, there exists a need for a query optimizer that is capable of interleaving record-at-a-time pipelining and set-at-a-time pipelining in order to achieve more efficient performance by permitting as many efficient update plans as possible while still maintaining the correct semantics of search and then update.
Over the years, Microsoft has developed a variety of solutions to take on this issue. In 2008, the company’s Craig Freedman, who is responsible for some of the patents credited to Microsoft, noted that SQL Server has to separate processes to ensure that changes of this nature occur with a minimum of damage, a concept called “Halloween protection.”
“Logically speaking, SQL Server must execute the read cursor and write cursor of an update plan in two separate steps or phases,” he wrote. “To put it another way, the actual update of rows must not affect the selection of which rows to update.”
To put it all another way, the Halloween problem is essentially the result of a common database architecture working contrary to the way you’d logically expect it to, creating numerous extra hoops along the way.
Talk about cursed.
So, it took us nearly nine years, but we finally found an example of Halloween culture that is truly scary—because of how dull and boring it is. It is devoid of any true cultural relevance to the holiday.
The best you could hope for is that maybe a black cat walked into the IBM Almaden Research Center on that fateful day in 1976 and gave the database query a fateful stroke of bad luck, but something tells me the vibe of the building is less Hocus Pocus, more Severance.
When I told my wife Cat about this story, she told me that a small part of her died as I explained it, that the idea that something as interesting as Halloween could be associated with something so inherently dull and boring as a database querying error just feels antithetical to the spirit of this costume-friendly hobby. Simply put, you cannot dress up a logic gap in database design to be much more friendly than it already is.
That may be the real trick. Whether or not it was the true goal, the Halloween problem, by not actually being about the holiday for which it is named, is far scarier than any man with a knife in a reconstituted William Shatner mask.
It is what happens when hope and terror give way to true drudgery. It is truly something to fear when something borrows an association but absolutely none of the spirit that creates that thing.
It’s enough to make you want to drop a table.
Find this one an interesting read? Share it with a pal!