Interpreting File System Fragmentation:

   In my experience, when a file system approaches 15-20% fragmentation,
   performance has already degraded sufficiently to warrant tuning it.
   On the other hand, my system has very slow disks.  Faster disks may be
   able to absorb the additional overhead without much noticeable degradation.

   Remember also that the most fragmented files are most likely those which
   have been created most recently (and therefore most likely to be used
   frequently).  So even though the total fragmentation seems low, the files
   that are used most frequently may be very fragmented.  Fsanalyze can tell
   you which files are most fragmented, making it possible to fix them
   without taking the time to rebuild the entire file system.  For hints on
   how to do this, I have included a message at the end of this file that
   was recently posted to the net by Charles Hedrick
   (hedrick@athos.rutgers.edu).

   The BSD file system is designed to minimize fragmentation at the expense
   of CPU cycles.  Therefore, unless the file system is too full, BSD file
   systems should never fragment beyond a few percent, and average seek
   distance should always be very small.  The BSD file system's allocation
   strategy may be set to minimize allocation time or disk fragmentation.
   Fsanalyze could be used to determine the relative benefits/costs of
   each strategy.

   On standard file systems, fsanalyze sometimes will report fragmentation
   even for clean (i.e., tuned) file systems.  The telltale symptom is a
   large fragmentation percentage but an average seek distance of exactly
   1.0 cylinders.  There are a couple of reasons for this behavior.

   First of all, on standard file systems at least, mkfs is not very
   smart in figuring out the optimum order of blocks on the free list.
   Specifically, mkfs assumes that the first data block is at the
   beginning of a cylinder, whether it is or not.  If it is not (because
   the user wasn't very careful about choosing the number of inodes
   in the file system), the interleaving algorithm mkfs uses may end
   up causing "sequential" data blocks to jump back and forth across
   a cylinder boundary.  Fsanalyze correctly detects this suboptimal
   block placement as excess disk seeks.  Re-mkfs-ing the file system
   with a number of inodes that uses disk cylinders more effectively
   will improve things a great deal.

   Another source of excess disk seeks (with distance of 1 cylinder) is
   when a file is placed such that it spans a cylinder even though it
   is physically capable of residing entirely on one cylinder.  This
   is just a fact of life on standard file systems, but it shouldn't
   show up very often on BSD systems.  I don't see any way of improving
   file placement at this level except through the use of a more intelligent
   disk tuning strategy.

Porting fsanalyze to other systems:
   Fsanalyze takes its information on the structure of the file system
   from <sys/param.h>, <sys/filsys.h> and <sys/ino.h> (or their
   equivalents).  As long as these files (or equivalents) are present on
   your system, porting should be straightforward.  When looking for the
   equivalent of these files on your system, be sure to get the
   disk-based versions, rather than the in-memory structures -- they are
   different.  You may have to modify the #include directives in
   fsanalyze.h to reflect the structure of your /usr/include directory.

   In the file fsconfig.h are a number of macro definitions which provide
   common access to file system structures for various systems.  The
   vast majority of the changes required to port fsanalyze should be made
   here.  There are currently two major sets of macros, those for
   BSD-derived file systems, and those for AT&T-derivatives.  Choose the
   macro set which most closely reflects your file system, and make
   OS-specific changes based on the _OS_TYPE macro.  Feel free to
   create new os-types as needed!

   I attempted to use types defined in /usr/include/sys/types.h wherever
   possible to ensure portability.  I don't think I made any assumptions
   about sizeof(int) == sizeof(long) == sizeof(char *), so there should
   be no problems there.

   Please try to make changes to the source using #defines and #ifdefs
   wherever possible, and please email me the changes you make along with
   a description of the system you ported it to (and problems encountered),
   so I can merge all the changes into a single copy.

   Good luck!

------------------------------------------------------------------------------
The following is a message posted to the net recently that gives details
of how to defragment individual files under System V.  This method should
also work for any version of unix which has the fsck -s option (everything
since Version 7, I think).  This won't work for BSD file systems.

From mrst!apollo!ulowell!dandelion!necntc!ames!ncar!boulder!sunybcs!rutgers!aramis.rutgers.edu!athos.rutgers.edu!hedrick Mon Apr 18 10:23:50 EDT 1988
Article 459 of comp.unix.microport:
Path: sdti!mrst!apollo!ulowell!dandelion!necntc!ames!ncar!boulder!sunybcs!rutgers!aramis.rutgers.edu!athos.rutgers.edu!hedrick
>From: hedrick@athos.rutgers.edu (Charles Hedrick)
Newsgroups: comp.unix.microport
Subject: disk fragmentation
Message-ID: <Apr.12.21.47.44.1988.21023@athos.rutgers.edu>
Date: 13 Apr 88 01:47:00 GMT
Organization: Rutgers Univ., New Brunswick, N.J.
Lines: 63


Today I called the Microport support people to see if they could do
something to help the slow startup I'm seeing for long programs,
particularly Emacs (actually Jove).  I suspected that the file system
was badly fragmented, and thus that the file was inefficiently ordered
on the disk.  They gave me some advice that I thought is worth passing
on.  Apparently System V doesn't bother to keep the free list in
order.  So after time, it begins to get randomized.  The result is
that reading a long file requires the disk drive to skip all around
the disk.  The only way to completely solve this problem is to
reorganize the disk: either do a backup, recreate the file system
(i.e. do the mkfs again) and reload from the backup, or use dcopy to
copy it to a fresh file system.  However fsck's -s option provides a
fairly good way of defragmenting a live file system.  Note that this
doesn't reorder files that are already randomized.  What it does do is
sort the free list.  So any new files you create after doing this are
reasonably organized.  What I tried was doing fsck -s, then copying a
few programs that seemed to be loading slowly, and finally renaming
the copy to the original name.  (Warning: copy all the files first,
before you get rid of any of the originals.  As soon as you remove any
of the originals, you'll be putting those random blocks right back
onto the free list.)  The result is that reading through emacs went
from 18 seconds to 4 seconds.  I used getblks (a program from the bbs
that shows the block numbers of all the blocks in a file) on both
copies of the file, and the difference is drastic.  I wouldn't quite
use the block numbers of the original as a random number generator,
but they do skip around alarmingly.  The copy has only a few jumps.
Presumably if I did a backup and restore things would be slightly
better yet, as those remaining jumps would be gone.  But it's much
easier to run fsck regularly than to dump the file system and restore
it regularly.

For those who find my wording ambiguous, here is roughly what I did to
defragment files /usr/local/bin/emacs and /usr/local/bin/kermit.  (I
didn't write the steps down as I did them, so something could be
missing, but this is fairly close.)

1) reboot from the standalone boot floppy
2) fsck /dev/rdsk/0s2.  Do this until it runs without errors
3) fsck -s /dev/rdsk/0s2.  This sorts the free list.
4) /etc/mount /dev/dsk/0s2 /mnt    (this is /usr)
5) cd /mnt/local/bin
6) cp emacs emacs.new
   cp kermit kermit.new
  ... etc. for the group of files I wanted to fix
   mv emacs.new emacs
   mv kermit.new kermit
  ... note that it is critical to do all the cp's before any of
  the mv's.
7) cd /
8) /etc/umount /dev/dsk/0s2
9) fsck -s /dev/rdsk/0s2.  (This is because the process above
   freed a bunch of files with random junk in them.  I wanted
   to start off with my free list in good shape.  This isn't
   strictly necessary.)
10) sync
11) reboot from hard disk normally

This worked because there were just a few big files that needed to be
defragmented.  If I had a huge number of files that all were causing
performance problems, it would be easier to dump the disk and recreate
it.  Of course if I did the fsck -s regularly enough, I might be
able to prevent the problem in the first place.

