Disk Caching Technology
Are you using MS-DOS? If so, do you notice anything different after loading
SMARTDRV.EXE? There should be a difference, as long as your hard disk is not as
fast as the RAM (and of course, computers today are like this). Are you using
Windows? If so, do you notice that the system runs faster while there is still
plenty of free physical memory than it does when physical memory runs short? I
guess you may answer yes to both questions. What is the difference? Files are
accessed faster, the disk LED flashes less, or something else? Also, if you
start the same application twice in a row, you will find that it starts faster
the second time. This is an example of trading space for time. If you run
SMARTDRV.EXE, you may have noticed that it has a "cache size" option, which
means that this amount of RAM will be occupied. Likewise, when you are using
Windows, a certain amount of RAM is used as a disk cache by default. Try
running "System Monitor" or "Performance Monitor" in Windows, then add the item
"disk cache size" from the "memory" category, and you can see that some of the
RAM is being used as the disk cache.
If you are using MS-DOS, you can specify how much memory the disk cache
uses by giving parameters to SMARTDRV.EXE when it starts. If you are using
Windows 9x, you can specify the maximum and minimum disk cache sizes; the
actual size is then adjusted automatically by Windows between those two limits.
By default, the two sizes are not specified. To set them, open your Windows
SYSTEM.INI file and find the "[vcache]" section. To specify the maximum, add
the line "MaxFileCache=nnnn", where "nnnn" is a number of kilobytes in decimal;
for the minimum, add "MinFileCache=nnnn". I don't suggest that a novice do
so, but if you know the proper cache size for your computer, specifying it
may enhance your computer's performance.
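For example, here is roughly what the two settings could look like. This is
only a sketch: I am assuming the DOS 6.x form of SMARTDRV, where the first
numeric parameter is the cache size in kilobytes, and the numbers themselves
are placeholders you should replace with values that suit your own machine.

    REM In AUTOEXEC.BAT: start SMARTDRV with a 2048 KB cache (example value)
    C:\DOS\SMARTDRV.EXE 2048

    ; In SYSTEM.INI: let the Windows 9x disk cache use 4 MB to 16 MB (examples)
    [vcache]
    MinFileCache=4096
    MaxFileCache=16384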
There are several different ways to speed up disk access through
caching:
1. Store part of a file, or other information that has just been read, in RAM.
The next time the system reads that file or information, the part kept in RAM
can be fetched quickly. This is called read caching. To speed up directory
access, Windows also provides a pathname cache, which is another kind of read
cache. On MS-DOS, the pathname cache is implemented by fastopen. However, even
if you do not use fastopen, smartdrv still boosts the speed of file name
access, because it implements physical block caching. (A small code sketch of
the three techniques in this list appears after the list.)
2. Store information that is going to be written to the disk in RAM. The next
time the system reads or writes that information, the operation can be done in
RAM. To keep the information on the disk up to date with the information in
RAM, the system flushes the data to the disk regularly. This is called write
caching. Because data is not written to the disk immediately, this kind of disk
cache is also called a "lazy write" or "write-behind" cache. This cache is
somewhat risky, as data may be lost; it is riskier on unstable systems such as
MS-DOS or Windows 9x.
3. When the disk is accessed, data is read in units of sectors. Reading sectors
sequentially is much faster than reading them at random, because the time spent
moving the heads of the hard disk and waiting for the correct sector to rotate
under the head is often long. In most situations a file is stored contiguously
on the disk, so reading a run of consecutive sectors and keeping them in RAM
can usually speed up disk access to a large extent. This is called prefetch
caching. This sort of cache is especially useful during sequential file access,
and it will also speed up copying files on the same disk considerably.
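Below is a minimal sketch, in C, of how these three ideas can fit together in
one sector cache. It is only an illustration under simplifying assumptions:
the "disk" is an in-memory array, the replacement policy is plain round-robin,
and all the names, sizes and counts are invented for the example; a real cache
would call the disk driver and handle errors.

    /* A toy sector cache illustrating the three ideas above: read caching,
       write-behind ("lazy write"), and prefetch.  The "disk" is just an array
       in memory and the replacement policy is plain round-robin.             */
    #include <stdio.h>
    #include <string.h>

    #define SECTOR_SIZE  512
    #define DISK_SECTORS 1024            /* size of the fake disk        */
    #define CACHE_SLOTS  16              /* number of cached sectors     */
    #define PREFETCH     4               /* sectors read ahead on a miss */

    static char disk[DISK_SECTORS][SECTOR_SIZE];  /* stands in for the disk */

    struct slot {
        int  valid;                      /* does this slot hold a sector?  */
        int  dirty;                      /* written but not yet on "disk"? */
        int  sector;                     /* which sector it holds          */
        char data[SECTOR_SIZE];
    };

    static struct slot cache[CACHE_SLOTS];
    static int next_victim = 0;          /* round-robin replacement        */

    static struct slot *find_slot(int sector)
    {
        int i;
        for (i = 0; i < CACHE_SLOTS; i++)
            if (cache[i].valid && cache[i].sector == sector)
                return &cache[i];
        return NULL;
    }

    static struct slot *take_slot(void)
    {
        /* Pick a victim; if it holds unwritten data, flush it first. */
        struct slot *s = &cache[next_victim];
        next_victim = (next_victim + 1) % CACHE_SLOTS;
        if (s->valid && s->dirty)
            memcpy(disk[s->sector], s->data, SECTOR_SIZE);
        s->valid = 0;
        s->dirty = 0;
        return s;
    }

    /* Read caching + prefetch: on a miss, pull in a run of sectors. */
    void cache_read(int sector, char *buf)
    {
        struct slot *s = find_slot(sector);
        if (s == NULL) {
            int n;
            for (n = 0; n < PREFETCH && sector + n < DISK_SECTORS; n++) {
                if (find_slot(sector + n) != NULL)
                    continue;            /* already cached */
                s = take_slot();
                s->sector = sector + n;
                memcpy(s->data, disk[sector + n], SECTOR_SIZE);
                s->valid = 1;
            }
            s = find_slot(sector);       /* assumes a valid sector number */
        }
        memcpy(buf, s->data, SECTOR_SIZE);
    }

    /* Write-behind: the data only goes into RAM here. */
    void cache_write(int sector, const char *buf)
    {
        struct slot *s = find_slot(sector);
        if (s == NULL) {
            s = take_slot();
            s->sector = sector;
            s->valid = 1;
        }
        memcpy(s->data, buf, SECTOR_SIZE);
        s->dirty = 1;                    /* the disk copy is now stale */
    }

    /* Called regularly (and before shutdown) to bring the disk up to date. */
    void cache_flush(void)
    {
        int i;
        for (i = 0; i < CACHE_SLOTS; i++)
            if (cache[i].valid && cache[i].dirty) {
                memcpy(disk[cache[i].sector], cache[i].data, SECTOR_SIZE);
                cache[i].dirty = 0;
            }
    }

    int main(void)
    {
        char msg[SECTOR_SIZE] = "hello"; /* rest of the sector is zeros   */
        char buf[SECTOR_SIZE];
        cache_write(7, msg);             /* goes into RAM only            */
        cache_read(7, buf);              /* served from the cache         */
        printf("%s\n", buf);
        cache_flush();                   /* now the "disk" has it too     */
        return 0;
    }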
As mentioned above, smartdrv implements physical block caching. That is,
smartdrv caches data blocks below the file system level and above the hard
disk level. Windows NT, on the other hand, implements logical disk caching:
it caches data blocks belonging to specific files, although it also needs to
know where those file blocks lie on the disk. This cache is implemented as a
library for the file system to call.
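To make the distinction concrete, here is a conceptual sketch. The struct
names are invented for illustration and are not taken from either system; the
point is only what a cached block is keyed by in each design.

    /* smartdrv-style physical block cache: sits below the file system,
       so it only knows drives and physical sector numbers.             */
    struct phys_cache_key {
        int           drive;       /* e.g. 0 = first hard disk  */
        unsigned long sector;      /* physical sector number    */
    };

    /* Windows NT-style logical file cache: called by the file system,
       so it is keyed by a file and an offset within that file, and
       the file system tells it where those bytes live on the disk.     */
    struct file_cache_key {
        unsigned long file_id;     /* identifies an open file   */
        unsigned long offset;      /* byte offset into the file */
    };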
The ways of caching listed above are only general concepts. In actual
systems, we need to find good ways to make better use of the disk cache. There
have been many disk caching tools over the years, and some of them are very
good. A traditional cache replacement technique is the LRU (Least Recently
Used) list, which helps determine which page should be discarded first.
However, there are many newer techniques that work better. I am interested in
this field, although I don't know very much about it yet; I would like to
research these techniques several years from now.
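To close, here is a minimal sketch of the LRU list itself, again in C. The
cache size and the access pattern are made up for the example; the point is
only the policy: whenever a block is used it moves to the front of a doubly
linked list, so the block at the tail is always the least recently used one
and is the first to be discarded.

    /* A minimal LRU list: touch(sector) reports a hit or, on a miss,
       discards the least recently used block to make room.            */
    #include <stdio.h>

    #define NBLOCKS 4                    /* tiny cache, example size */

    struct block {
        int sector;                      /* which sector this block caches */
        struct block *prev, *next;
    };

    static struct block blocks[NBLOCKS];
    static struct block *head, *tail;    /* most / least recently used */

    static void unlink_block(struct block *b)
    {
        if (b->prev) b->prev->next = b->next; else head = b->next;
        if (b->next) b->next->prev = b->prev; else tail = b->prev;
    }

    static void push_front(struct block *b)
    {
        b->prev = NULL;
        b->next = head;
        if (head) head->prev = b; else tail = b;
        head = b;
    }

    /* Access one sector; returns 1 on a cache hit, 0 on a miss. */
    int touch(int sector)
    {
        struct block *b;
        for (b = head; b != NULL; b = b->next)
            if (b->sector == sector) {   /* hit: move to the front */
                unlink_block(b);
                push_front(b);
                return 1;
            }
        b = tail;                        /* miss: discard the LRU block */
        unlink_block(b);
        b->sector = sector;
        push_front(b);
        return 0;
    }

    int main(void)
    {
        int i, refs[] = { 1, 2, 3, 4, 1, 5, 2 };
        for (i = 0; i < NBLOCKS; i++) {  /* fill the list with dummy sectors */
            blocks[i].sector = -1 - i;
            push_front(&blocks[i]);
        }
        for (i = 0; i < (int)(sizeof refs / sizeof refs[0]); i++)
            printf("sector %d: %s\n", refs[i], touch(refs[i]) ? "hit" : "miss");
        return 0;
    }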