I’m officially an old dog.
I’ve been in the tech business since punched cards were the only way to provide input to a mainframe–whether that was a program or the data for a program.
I’ve used paper tape systems to bootstrap minicomputers.
I’ve designed I/O hardware for core memory systems.
I’ve used cassette tapes to read data into early personal microcomputers.
Throughout all that time, I’ve understood the value of seeing the genesis of new technologies from the ashes of the old. It’s the natural order of the world: old technologies give way to new ones, and while they continue to exist for a while until the systems that needed them to go out of service, they eventually disappear.
Punched cards are hard to find. Paper tape readers are seen as relics of a bygone age. Core memory systems–what are those?
But I should know better than to think it’s as clear cut at that.
From time to time, I run into an old technology–be it hardware, software, protocols–that still seems to be hanging on.
The most recent case: working with a client to migrate an application written in Cobol and running on a mainframe to AWS, I had to delve into the files produced by the program so we could recreate them in Python.
Here’s what I saw:
Yes, it’s a column-oriented format with 80-character line lengths.
Not JSON. Not XML. Not even CSV.
Bloody punched card formatted data.
Well, a lot of old systems–ones that use mainframes and transport financial data–haven’t been changed in years. Why bother? They work and making changes would be expensive–all that re-testing, for instance–for little apparent gain.
No big deal.
And then I run across a case where the choice to keep alive an old data format has real-world consequences (though in this particular case, it’s rather more silly than serious).
The American Kennel Club is the US organization that maintains standards for purebred dog pedigrees. As part of that process, the AKC maintains a registry where owners of pure-bred dogs can register their dogs, which helps maintain the breed line and keep unscrupulous breeders from passing off “fake” dogs as pure-bred.
The registration for a given dog requires the assignment of a name for the dog–it can range from the common (e.g., Spot) to the more interesting (e.g., Pablo Escobark).
To avoid “name collision” the AKC adds a number after each non-unique name.
For instance, the first owner to register “Lady Luck” would be given that name.
The second owner would be given “Lady Luck II”. And so on.
It’s been known for years that a maximum of 37 dogs with the same name and breed could be registered. Why 37? What a strange limit. Why not 10, or 100, or some multiple of tens or powers of two?
Notice that the number is expressed in Roman numerals. There-in lies the story.
The system used by the AKC–defined in the long-ago technology past–permits no more than 6 characters for the number extension to the name.
If you look at the list of Roman numbers starting from “I” and going up–“II”, then “III”, then “IV”, and so on–you will see that up to the value of 37–“XXXVII” no more than 6 characters are required.
But, at 38–“XXXVIII” in Roman numerals–6 characters is no longer enough.
Rather than skip to the next set of Roman numerals that would fit–starting at 39, “XXXIX”–the AKC chose to stop and not allow any further numbering.
And there it stands. Apparently the AKC made the choice, in all the decades since then, to not change the data format for storage of this information. We can only guess why–it could be that they don’t run into this problem often, and this is supported by the fact that if you pay more money they will extend past 37.
Not earth-shattering. But an interesting reminder that some technologies–including data formats–may not be retired simply because there are better ones available. Institutional inertia can be overwhelming.
I am convinced that, even decades from now, we will still see that some dinosaurs still roam the earth.
And I will still know how to ride them.