Software lore
CSV as code
At one job, I was working on a system that ingested a lot of data and classified it according to certain attributes. The core function of the system – the classification logic – was not in code at all, but in a large CSV file which was essentially executed like a program: each row represented a classification outcome, and each input was compared row-by-row and column-by-column until a row matched in full. There was special syntax for a column to match all values or a fixed set of values, as well as some custom values that, depending on the column, would match a set of inputs determined at runtime. The CSV had blank rows, blank columns, and even a column titled "meaningless description".
All of this was implemented by a module which compiled the CSV into an anonymous function and executed it on each line of input. It was a sight to behold. I think the original intention was to allow non-technical people to update the classification logic, but the file got to be so gnarly that no one really understood it anymore.
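To make the mechanism concrete, here is a minimal sketch in Python of the general idea, not the original module. The syntax is invented for illustration: `*` as the match-all marker, `|` separating a fixed set of values, `@name` for a set supplied at runtime, and the outcome in the last column are all assumptions, as are the names `compile_rules` and `classify`.

```python
import csv
import io


def compile_rules(csv_text, runtime_sets=None):
    """Compile rule rows into a single first-match classify(record) closure."""
    runtime_sets = runtime_sets or {}
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Assumption: the last header column names the outcome, the rest are inputs.
    *columns, _outcome_header = [h.strip() for h in rows[0]]

    rules = []
    for row in rows[1:]:
        cells = [c.strip() for c in row]
        if not any(cells):
            continue  # skip blank rows
        *patterns, outcome = cells
        matchers = {}
        for column, pattern in zip(columns, patterns):
            if pattern == "*":
                continue  # hypothetical wildcard: no constraint on this column
            if pattern.startswith("@"):
                # hypothetical runtime set, looked up by name at compile time
                matchers[column] = frozenset(runtime_sets[pattern[1:]])
            else:
                matchers[column] = frozenset(pattern.split("|"))
        rules.append((matchers, outcome))

    def classify(record):
        # Rows are tried in file order; the first full match wins.
        for matchers, outcome in rules:
            if all(record.get(col) in allowed for col, allowed in matchers.items()):
                return outcome
        return None  # no row matched

    return classify


RULES = """\
kind,region,outcome
invoice,EU|UK,vat
,,
invoice,*,standard
*,@embargoed,blocked
"""

classify = compile_rules(RULES, runtime_sets={"embargoed": {"XX"}})
print(classify({"kind": "invoice", "region": "UK"}))  # vat
print(classify({"kind": "invoice", "region": "US"}))  # standard
print(classify({"kind": "refund", "region": "XX"}))   # blocked
print(classify({"kind": "refund", "region": "US"}))   # None
```

Because the first matching row wins, row order is part of the logic, which is a large part of why a file like this gets hard to reason about as it grows.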
I tamed it by getting rid of some of the cruft (MM/DD/YY date formats, inconsistent syntax for boolean values, long-unused classification rules), writing a comprehensive test suite and validation checks that looked for unreachable rules, and documenting everything.
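One way such an unreachable-rule check could work, sketched in Python under simplifying assumptions: each rule is a dict mapping a column to a set of allowed values, with `None` standing in for "match anything", and a rule is flagged only when a single earlier rule matches everything it matches. The names `covers` and `find_shadowed` are made up for this example, and a rule that is unreachable only because several earlier rules jointly cover it would slip through.

```python
def covers(earlier, later, columns):
    """Return True if every input matched by `later` is also matched by `earlier`."""
    for col in columns:
        e, l = earlier.get(col), later.get(col)
        if e is None:
            continue  # earlier rule accepts anything in this column
        # Assumption: a wildcard always matches more than any fixed set.
        if l is None or not l <= e:
            return False
    return True


def find_shadowed(rules, columns):
    """Return (rule_index, shadowing_index) pairs for rules that can never fire."""
    shadowed = []
    for i, (later, _outcome) in enumerate(rules):
        for j in range(i):
            if covers(rules[j][0], later, columns):
                shadowed.append((i, j))
                break  # one shadowing rule is enough to report
    return shadowed


columns = ["kind", "region"]
rules = [
    ({"kind": {"invoice"}, "region": None}, "standard"),
    ({"kind": {"invoice"}, "region": {"EU", "UK"}}, "vat"),  # shadowed by rule 0
    ({"kind": None, "region": {"XX"}}, "blocked"),
]

print(find_shadowed(rules, columns))  # [(1, 0)]
```

Here the "vat" rule can never fire because the broader "standard" rule sits above it, which is the kind of ordering mistake a check like this is meant to surface.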
Others
- "Why are the Microsoft Office file formats so complicated? (And some workarounds)" (Joel on Software, 2008)