Five-minute Rule
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, the five-minute rule is a
rule of thumb In English language, English, the phrase ''rule of thumb'' refers to an approximate method for doing something, based on practical experience rather than theory. This usage of the phrase can be traced back to the 17th century and has been associat ...
for deciding whether a data item should be kept in
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
, or stored on disk and read back into memory when required. It was first formulated by Jim Gray and Gianfranco Putzolu in 1985, and then subsequently revised in 1997 and 2007 Free version in ''
ACM Queue ACM ''Queue'' (stylized ''acmqueue'') is a bimonthly computer magazine, targeted to software engineer Software engineering is a branch of both computer science and engineering focused on designing, developing, testing, and maintaining softwar ...
''
September 2008
to reflect changes in the relative cost and performance of memory and persistent storage. The rule is as follows:
The 5-minute random rule: cache randomly accessed disk pages that are re-used every 5 minutes or less.
Gray also issued a counterpart one-minute rule for sequential access:
The 1-minute rule: cache sequentially accessed disk pages that are re-used every 1 minute or less.
Although the 5-minute rule was invented in the realm of databases, it has also been applied elsewhere, for example, in
Network File System Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems (Sun) in 1984, allowing a user on a client computer to access files over a computer network much like local storage is accessed. NFS, like ...
cache capacity planning. The original 5-minute rule was derived from the following cost-benefit computation: :BreakEvenIntervalinSeconds = (PagesPerMBofRAM / AccessesPerSecondPerDisk) × (PricePerDiskDrive / PricePerMBofRAM) Applying it to 2007 data yields approximately a 90-minutes interval for magnetic-disk-to-DRAM caching, 15 minutes for
SSD A solid-state drive (SSD) is a type of solid-state storage device that uses Integrated circuit, integrated circuits to store data persistence (computer science), persistently. It is sometimes called semiconductor storage device, solid-stat ...
-to-DRAM caching and 2 hours for disk-to-SSD caching. The disk-to-DRAM interval was thus a bit short of what Gray and Putzolu anticipated in 1987 as the "five-hour rule" was going to be in 2007 for RAM and disks. According to calculations by
NetApp NetApp, Inc. is an American data infrastructure company that provides unified data storage, integrated data services, and cloud operations (CloudOps) solutions to enterprise customers. The company is based in San Jose, California. It has ranked ...
engineer David Dale as reported in
The Register ''The Register'' (often also called El Reg) is a British Technology journalism, technology news website co-founded in 1994 by Mike Magee (journalist), Mike Magee and John Lettice. The online newspaper's Nameplate_(publishing), masthead Logo, s ...
, the figures for disc-to-DRAM caching in 2008 were as follows: "The 50KB page break-even was five minutes, the 4KB one was one hour and the 1KB one was five hours. There needed to be a 50-fold increase in page size to cache for break-even at five minutes." Regarding disk-to-SSD caching in 2010, the same source reported that "A 250KB page break even with SLC was five minutes, but five hours with a 4KB page size. It was five minutes with a 625KB page size with MLC flash and 13 hours with a 4KB MLC page size." In 2000, Gray and Shenoy applied a similar calculation for web page caching and concluded that a browser should "cache web pages if there is any chance they will be re-referenced within their lifetime."Jim Gray, Prashant Shenoy,
Rules of Thumb in Data Engineering
, MS-TR-99-100


References

{{Reflist, 30em Memory management algorithms Rules of thumb Cache (computing)