Cache Hit Ratio

From Wikistix

The Cache Hit Ratio is the ratio of the number of cache hits to the number of lookups, usually expressed as a percentage. Depending on the nature of the cache, expected hit ratios can vary from 60% to greater than 99%.

Caches are used in many parts of computer systems - from CPU level 1 and level 2 caches, translation look-aside buffers (TLBs), operating system file system caches, and database (block) buffer caches (Oracle, Sybase, DB2, PostgreSQL, MySQL using InnoDB, etc). In all cases, the cache attempts to keep recently used data in a small area that is faster than the large, slow primary storage area, with the hope that the data will be accessed again, soon. The system then benefits from the faster access times.

Cache Hit Ratio vs Relative Performance

Cache Hit Ratios are inherently logarithmic; the closer to 100%, the exponentially greater the gains. A simple way of visualising the nature of cache hit ratios, is to attempt to convert a ratio to a relative performance metric (ie. "transactions" or "operations" per second), by estimating the relative costs of a cache hit and a cache miss. This can be expressed as:

[math]\displaystyle{ \begin{align} a & = \mathit{cachehitcost}\\ b & = \mathit{cachemisscost}\\ r & = \mathit{cachehitratio}\\ p & = \mathit{relativeperformance}\\ p & = \frac{1}{a r + b(1 - r)}\\ \end{align} }[/math]

Graphically, given a cache miss cost of 0.005 s (5 ms) and a hit cost of 0.000001 s (1 μs), which may be the case for a database engine (disk I/O vs virtual memory overheads), the exponential behaviour is clear.

It can also be seen, that the more disparate the hit and miss costs, as is the case in modern computer systems, the relative performance quickly approaches:

[math]\displaystyle{ p = \frac{1}{1 - r} }[/math]

Therefore the difference between two relative cache hit ratios, with a large difference between hit and miss costs, can be given by:

[math]\displaystyle{ \frac{1 - r_{1}}{1 - r_{2}} }[/math]

Example: The difference between 98% cache hit ratio and 95% cache hit ratio is a factor of 2.5.

[math]\displaystyle{ \frac{1 - 0.95}{1 - 0.98} = 2.5 }[/math]

Prefer measuring the cache miss ratio

Cache Miss Ratio (log10) vs Relative Performance
  • cache hit ratio asymptotes towards 100%, making displaying high dynamic range challenging.
  • miss ratio may be graphed with a logarithmic scale, expanding the dynamic range. Given that the relationship is exponential, this is appropriate.
  • there may be a psychological aspect, in that 85% hit ratio may sound much better to some observers than a 15% miss ratio, even though they are equivalent.

To the right is a graph of the same data as above, presented as a logarithmic miss ratio.