Can we measure how much more complicated computing is?
More than 30 years ago, someone asked a simple question about storage, access times, and economics, and the result was a simple paper. Roughly every ten years since then, an updated paper has been written to answer the same question. Comparing those papers is not a terribly good measure of complexity, but it is enlightening.
In 1985, Jim Gray and Franco Putzolu asked a simple question:
When does it make economic sense to make a piece of data resident in main memory and when does it make sense to have it resident in secondary memory (disc) where it must be moved to main memory prior to reading or writing?
I’m not going to go into the exact calculations. They’re pretty simple, and you can read the paper for that (link below). Roughly every ten years since then, someone has written an updated paper on how things have changed. The most recent update was written in 2017.
I did a rough calculation on these papers to show how much more complicated the world of computing has gotten. It’s an unscientific measure, since it doesn’t account for things like changing standards for scientific papers or different audiences, but it’s still informative.
The TL;DR of what has changed in 30 years:
- Measured by the number of words needed to answer the same question, the topic is now about 4.5x more complicated to talk about.
For some context:
- According to Moore’s Law (one doubling every two years, so 15 doublings in 30 years), computing has gotten 32,768x more efficient.
- Personal computers were mostly 8-bit in 1985, and are mostly 64-bit in 2017.
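The arithmetic behind the 4.5x and 32,768x figures can be reproduced with a short sketch. It uses the approximate word counts given for the two papers in this post and assumes a two-year doubling period for Moore’s Law; the function names are made up for illustration.

```python
# Approximate word counts quoted in this post for the two papers.
WORDS_1985 = 1750
WORDS_2017 = 7850


def word_ratio(old_words: int, new_words: int) -> float:
    """How many times longer the newer paper is than the older one."""
    return new_words / old_words


def moore_factor(years: int, doubling_period_years: int = 2) -> int:
    """Growth factor after `years`, assuming one doubling per period.

    The two-year doubling period is an assumption; other statements of
    Moore's Law use 18 months, which would give a much larger factor.
    """
    return 2 ** (years // doubling_period_years)


print(round(word_ratio(WORDS_1985, WORDS_2017), 1))  # 4.5
print(moore_factor(30))                              # 32768
```

So the "complexity" measured in words grew by a single-digit factor while the hardware side grew by about fifteen doublings.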
The original 1985 paper
The 5 Minute Rule for Trading Memory for Disc Accesses and the 5 Byte Rule for Trading Memory for CPU Time by Jim Gray and Franco Putzolu
This 1985 paper has two authors, ~1750 words, and zero references.
The storage hierarchy in 1985 was DRAM, then HDD.
The answer they come up with is simple:
- Pages referenced every five minutes should be in DRAM.
The 2017 update
The Five-minute Rule Thirty Years Later and its Impact on the Storage Hierarchy by Raja Appuswamy, Renata Borovica-Gajic, Goetz Graefe, and Anastasia Ailamaki
This 2017 update has four authors, ~7850 words, and 33 references.
The storage hierarchy is DRAM, SSD, and HDD.
Note 1: The authors acknowledge VTL and tape backups for long-term storage, but they run different calculations for those, since they just aren’t used for dynamic access.
Note 2: This paper only handles the first half of the original paper; it doesn’t bother with the 5 byte rule. But if you remove the section on long-term storage from this paper, and the section on the 5 byte rule from the first paper, the resulting ratio of words is the same.
By adding just one more layer, SSD, the answer now comes in three parts:
- Between DRAM and HDD, pages referenced every four hours should be in DRAM.
- Between DRAM and SSD, pages referenced every seven minutes should be in DRAM.
- Between SSD and HDD, pages referenced once a day should be in SSD.
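To make the three-part answer concrete, here is a minimal sketch that turns the break-even intervals above into a placement decision. The thresholds are the ones quoted in this post. The function `choose_tier` and its `has_ssd` flag are hypothetical, and treating each interval as a hard cutoff is a simplification, not the paper’s method.

```python
from datetime import timedelta

# Break-even intervals as quoted in this post (2017 figures).
DRAM_VS_SSD = timedelta(minutes=7)
SSD_VS_HDD = timedelta(days=1)
DRAM_VS_HDD = timedelta(hours=4)


def choose_tier(reference_interval: timedelta, has_ssd: bool = True) -> str:
    """Pick a tier for a page that is referenced once per `reference_interval`.

    Assumption: each break-even interval is used as a hard cutoff.
    """
    if has_ssd:
        if reference_interval <= DRAM_VS_SSD:
            return "DRAM"
        if reference_interval <= SSD_VS_HDD:
            return "SSD"
        return "HDD"
    # Two-tier system: only the DRAM/HDD break-even applies.
    return "DRAM" if reference_interval <= DRAM_VS_HDD else "HDD"


print(choose_tier(timedelta(minutes=5)))                # DRAM
print(choose_tier(timedelta(hours=1)))                  # SSD
print(choose_tier(timedelta(days=2)))                   # HDD
print(choose_tier(timedelta(hours=1), has_ssd=False))   # DRAM
```

Note how the same page, referenced once an hour, lands in DRAM on a system without SSD but in SSD once that extra layer exists.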