Subject: v12i068: Public domain TAR, Part01/03
Newsgroups: comp.sources.unix
Sender: sources
Approved: rs@uunet.UU.NET

Submitted-by: John Gilmore
Posting-number: Volume 12, Issue 68
Archive-name: pdtar/part01

[ See the first two paragraphs of the README for information on what
  this fine program does. --r$ ]

: To unbundle, sh this file
echo README
cat >README <<'@@@ Fin de README'
This is the Nov87 release of a public domain tar(1) replacement. It
implements the 'c', 'x', and 't' commands of Unix tar, and many of the
options. It creates P1003 "Unix Standard" [draft 6] tapes by default,
and can read and write both old and new formats. It can compress or
decompress tar archives "on the fly" (using the 'z' option) as well as
accessing remote tape drives or files by specifying "host:/dev/tapedrive".
It lets you set the default tape drive by setting TAPE in your environment.
Its verbose output looks more like "ls -l" than the Unix tar, the
columns line up, and you can get verbose listings from the 'cvv' option
as well as from 'xvv' and 'tv'. It does shell-globbing (regular
expressions) for listing and extraction. It is a little better at
reading damaged tapes than Unix tar. There is a half-baked "diff"
option for comparing a tape against the file system. And it's free.

It is designed to be a lot more efficient than the standard Unix tar;
it does as little bcopy-ing as possible, and does file I/O in large
blocks. On the other hand, it has not been timed or performance-tuned;
it's just *designed* to be faster.

On SunOS 3.3, the tar archives it creates under the 'old' option are
byte-for-byte the same as those created by /bin/tar, except the trash
at the end of each file and at the end of the archive has been
replaced by zeroes.

It was written and initially debugged on a Sun Workstation running
4.2BSD. It has been run on Xenix, Unisoft, Vax 4.2BSD, utzoonix, USG,
Masscomp, Minix, and MSDOS systems.

I'm interested in finding people who will port it to other types of
(Unix and non-Unix) systems, use it, and send back the changes; and
people who will add the obscure tar options that they happen to use and
I don't. In particular, VMS, Mac, Atari and Amiga versions would be
handy.

It still has a number of loose ends, marked by "FIXME" comments in the
source. Fixes to these things are also welcome.

I am the author of all the code in this program, except some of the
subroutines, which are from contributors listed below. I hereby place
it in the public domain. If you modify it, or port it to another
system, please send me back a copy, so I can keep a master source.

This program is much better than it started, due to the effort and care
put in by Henry Spencer, Fred Fish, Ian Darwin, Geoff Collyer, Stan
Barber, Guy Harris, Dave Brower, Richard Todd, Michael Rendell, Stu
Heiss, and Rich $alz. Thank you, one and all.

    John Gilmore
    Nebula Consultants
    PO Box 170608
    San Francisco, California, USA 94117-0608
    hoptoad!gnu or gnu@toad.com

Hoptoad talks to sun, ptsfa, ihnp4, utzoo, ucsfcgl.

@(#)README 1.14 87/11/11
@@@ Fin de README
echo PORTING
cat >PORTING <<'@@@ Fin de PORTING'
    Porting hints for public domain tar
    John Gilmore, ihnp4!hoptoad!gnu
    @(#)PORTING 1.13 87/11/11

The Makefile should be edited to comment out all the undesired
versions, and create the following configuration lines for the
system you are compiling it on:

    DEFS = the proper #define's to conditionally compile for your system.
    LIBS = the system libraries and/or object modules to link with the program.
    LINT = the lint program (or the compiler with extra checking turned on)
    LINTFLAGS = a good strong way to invoke 'lint' on your system.
    DEF_AR_FILE = the name of the default archive file on your system.
                  It should be enclosed in quoted quotes, e.g. \"/dev/foo\" .
    DEFBLOCKING = the default blocking factor on your system.
    O = the suffix for object files ('o', except 'obj' for MSDOS).
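
As a rough illustration of why DEF_AR_FILE needs the quoted quotes, here
is a sketch (not code from this distribution) of how a compiled-in
default might be consulted. The quoting makes the compiler see a C
string literal; the order of precedence (the -f argument, then the TAPE
environment variable, then the compiled-in default) is the one the
tar.1 manual page describes. The function names choose_archive and
default_archive are hypothetical, invented only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Normally supplied by the Makefile as -DDEF_AR_FILE=\"/dev/rmt8\". */
#ifndef DEF_AR_FILE
#define DEF_AR_FILE "/dev/rmt8"
#endif

/*
 * Hypothetical helper: pick the archive name in the order tar.1 gives:
 * the -f argument first, then $TAPE, then the compiled-in default.
 * A NULL argument means "not supplied".
 */
const char *choose_archive(const char *f_arg, const char *tape_env,
                           const char *compiled_default)
{
    assert(compiled_default != NULL);
    if (f_arg != NULL)
        return f_arg;
    if (tape_env != NULL)
        return tape_env;
    return compiled_default;
}

/* Hypothetical wrapper that consults the real environment. */
const char *default_archive(const char *f_arg)
{
    return choose_archive(f_arg, getenv("TAPE"), DEF_AR_FILE);
}
```

If the quotes were left off, the compiler would see a bare /dev/rmt8
where a string is expected and the build would fail, which is why the
Makefile entries all carry the backslashed quotes.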

A copy of "getopt", the standard argument parser, is required. It's in
libc on Missed'em V systems and 4.3BSD; on most other systems, you'll
need a copy of a public domain getopt, available through the
comp.sources.unix archives, or from the AT&T Toolchest if you can't
find it elsewhere.

A copy of the Berkeley directory access routines is also required.
These are in libc and on Berkeley systems. A public domain version is
available through comp.sources.unix. There is an #include you have to
change in create.c for this, to set the name of the include file you
have. Some systems have the include file in . You'll have to find it
on your system, or get the public domain one and place it somewhere.

For MSDOS, I have supplied these directory routines in msd_dir.c and
msd_dir.h, since it's likely that your system doesn't have them. To
permanently install these into your MSC 3.0 library, do the following:

    copy msd_dir.h c:\c\include\sys\dir.h
    cl -A$(MODEL) -c msd_dir.c
    lib $(MODEL)dir.lib msd_dir.obj;

Change c:\c\include to wherever your standard include directory is.
You might have to modify this procedure if you aren't using MSC 3.0.

Grep for FIXME to find places that aren't finished or which have
portability problems. Also see the file TODO.

The MSDOS port was done under the Microsoft C 3.0 compiler and
libraries. In the Makefile, COPTS should be changed to -Zi or nothing;
and there is a special link command for making tar.exe, which you will
have to uncomment, since MSDOS can't handle command lines longer than
128 bytes. Also, clean and install will not work unless you change /
in path names to \.

On Minix, there are a bunch of problems. "V7 compatible" my ass.
* "make" doesn't expand macros in the Makefile properly. You will
  probably have to expand them by hand. Better to go in and fix Minix
  "make" though...
* The directory access library is nonexistent. It wasn't in V7 but
  anybody who writes code without it, even on V7 systems, is a fool.
* Various other library routines are broken, e.g. printf() doesn't take
  "%*s" or "%.*s"; no which Unix requires, ctime(), getopt().
@@@ Fin de PORTING
echo Makefile
cat >Makefile <<'@@@ Fin de Makefile'
# Makefile for public domain tar program.
# @(#)Makefile 1.30 87/11/11

# Berserkeley version
DEFS = -DBSD42
LDFLAGS =
LIBS =
LINT = lint
LINTFLAGS = -abchx
DEF_AR_FILE = \"/dev/rmt8\"
DEFBLOCKING = 20
O = o

# USG version
#DEFS = -DUSG
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -p
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# UniSoft's Uniplus SVR2 with NFS
#DEFS = -DUSG -DUNIPLUS -DNFS -DSVR2
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# MASSCOMP version
#CC = ucb cc
#DEFS = -DBSD42
#LDFLAGS =
#LIBS =
#LINT = lint
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt0\"
#DEFBLOCKING = 20
#O = o

# (yuk) MS-DOS (Microsoft C) version
#MODEL = S
#DEFS = -DNONAMES -A$(MODEL) -nologo
#LDFLAGS =
#LIBS = $(MODEL)dir.lib
#LINT = $(CC)
#LINTFLAGS = -W3
#DEF_AR_FILE = \"tar.out\"
#DEFBLOCKING = 20
#O = obj

# V7 version
# Pick open3 emulation or nonexistence. See open3.h, port.c.
##DEFS = -DV7 -DEMUL_OPEN3 -Dvoid=int
##DEFS = -DV7 -DNO_OPEN3 -Dvoid=int
#LDFLAGS =
#LIBS = -lndir
#LINT = lint
#LINTFLAGS = -abchx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

# Minix version
# No lint, so no lintflags. Default file is stdin/out. (Minix "tar"
# doesn't even take an "f" flag, it assumes argv[2] is the archive name!)
# Minix "make" doesn't expand macros right, so Minix users will have
# to expand CFLAGS, SRCS, O, etc by hand, or fix your make. Not my problem!
# You'll also need to come up with getopt() and ctime(), the directory
# library, and a fixed doprintf() that handles %*s. Put this stuff in
# the "SUBSRC/SUBOBJ" macro below if you didn't put it in your C library.
# Note that Minix "cc" produces ".s" files, not .o's, so O = s has been set.
#
# Pick open3 emulation or nonexistence.
# See open3.h, port.c.
##DEFS = -DV7 -DMINIX -DEMUL_OPEN3
##DEFS = -DV7 -DMINIX -DNO_OPEN3
#LDFLAGS =
#LIBS =
#DEF_AR_FILE = \"-\"
#DEFBLOCKING = 8 /* No good reason for this, change at will */
#O = s

# Xenix version
#DEFS = -DUSG -DXENIX
#LDFLAGS =
#LIBS = -lx
#LINT = lint
#LINTFLAGS = -p
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20
#O = o

CFLAGS = $(COPTS) $(ALLDEFS)
ALLDEFS = $(DEFS) \
	-DDEF_AR_FILE=$(DEF_AR_FILE) \
	-DDEFBLOCKING=$(DEFBLOCKING)
# next line for Debugging
COPTS = -g
# next line for Production
#COPTS = -O

# Add things here like getopt, readdir, etc that aren't in your
# standard libraries. (E.g. MSDOS needs getopt, msd_dir.c, msd_dir.obj)
SUBSRC=
SUBOBJ=

# Destination directory and installation program for make install
DESTDIR = /usr/pd
INSTALL = cp
RM = rm -f

SRC1 = tar.c create.c extract.c buffer.c getoldopt.c
SRC2 = list.c names.c diffarch.c port.c wildmat.c $(SUBSRC)
SRCS = $(SRC1) $(SRC2)
OBJ1 = tar.$O create.$O extract.$O buffer.$O getoldopt.$O list.$O
OBJ2 = names.$O diffarch.$O port.$O wildmat.$O $(SUBOBJ)
OBJS = $(OBJ1) $(OBJ2)
AUX = README PORTING Makefile TODO tar.1 tar.5 tar.h port.h open3.h \
	msd_dir.h msd_dir.c

all: tar

tar: $(OBJS)
	$(CC) $(LDFLAGS) -o tar $(COPTS) $(OBJS) $(LIBS)
# command is too long for Messy-Dos (128 char line length limit) so
# this kludge is used...
#	@echo $(OBJ1) + > command
#	@echo $(OBJ2) >> command
#	link @command, $@,,$(LIBS) /NOI;
#	@$(RM) command

install: all
	$(RM) $(DESTDIR)/tar $(DESTDIR)/.man/tar.[15]
	$(INSTALL) tar $(DESTDIR)/tar
	$(INSTALL) tar.1 $(DESTDIR)/.man/tar.1
	$(INSTALL) tar.5 $(DESTDIR)/.man/tar.5

lint: $(SRCS)
	$(LINT) $(LINTFLAGS) $(ALLDEFS) $(SRCS)

clean:
	$(RM) errs $(OBJS) tar

tar.shar: $(SRCS) $(AUX)
	shar >tar.shar1 $(AUX)
	shar >tar.shar2 $(SRC1)
	shar >tar.shar3 $(SRC2)

tar.tar.Z: $(SRCS) $(AUX)
	/bin/tar cf - $(AUX) $(SRCS) | compress -v >tar.tar.Z

$(OBJS): tar.h port.h
@@@ Fin de Makefile
echo TODO
cat >TODO <<'@@@ Fin de TODO'
@(#) TODO 1.15 87/11/06

Test owner/group on extraction better.

creation of links, symlinks, nodes doesn't follow the -k (f_keep)
guidelines; if the file already exists, it is not replaced, even
though no -k.

Check stderr and stdout for errors after writing, and quit if so.

Preliminary design of Multifile option to handle EOFs on input and
output. Multifile can just close the archive when it hits end of
archive, and ask for archive to be changed. It has no choice on some
media, e.g. floppies and cartridge tapes, where there is no room for
an EOF block there. Start off 2nd archive medium with odd header
block, duplicating original, but with offset to start of data spec'd.
Reading such a header causes tar non-'M' to complain while extracting
(but to seek there and do it anyway!) Big win -- this works on
cartridge tapes, should work on floppies, might work on magtape. It
would encourage the *&%#$ systems programmers to fix their drivers,
too!

Profile it and see where the time, call counts, etc are going.

Fix directory timestamps after inserting files into them. Wait til
next file that's not in the directory. Need a stack of them.

Option to seek the input file (in skip_file) rather than reading and
tossing it? (Could just jump in buffer if stuff is in core.) Could
misalign archive reads versus filesys and slow it down, who knows?

Add -C option for creating from odd directories a la 4.2BSD?

Break out odd bits of code into separate support modules.

Add the r, u, X, l, F, C, and digit options of Unix tar.

V8 tar does something that is quite handy when reading tapes written
on 4.2 system into non-4.2 systems: it reduces file name components to
14 bytes or less and ensures that they are unique (I think it
truncates to 10 bytes and appends "..aa" where aa are two unique
letters) and puts out a file containing the mapping between long names
on tape and short names on disk.

Clean up 'd' (diff) option. Currently it works for regular files and
symlinks, needs work for dirs and links. Ideally, output should look
like "diff -r" or -rl after an extract of the tape and a real diff.
Right now it's very messy. To do the above, we'd need to read the
directories that we touch and check all the file names against what's
on the tape. All we do now is check the file contents and stats.

Check "int" variables to see if they really need to be long (file
sizes, record counts, etc). Sizes of in-core buffers should be int;
since malloc() takes an int argument we can never allocate one any
bigger. Maybe unsigned int would be better, though. Little system
people, help me out here! (E.g. run lint on it on your system and
send me the result if it shows anything fixable.)
@@@ Fin de TODO
echo tar.1
cat >tar.1 <<'@@@ Fin de tar.1'
.TH TAR 1 "5 November 1987"
.\" @(#)tar.1 1.12 11/6/87 Public Domain - gnu
.SH NAME
tar \- tape (or other media) file archiver
.SH SYNOPSIS
\fBtar\fP
\-[\fBBcdDhiklmopRstvxzZ\fP]
[\fB\-b\fP \fIN\fP]
[\fB\-f\fP \fIF\fP]
[\fB\-T\fP \fIF\fP]
[ \fIfilename or regexp\fP\| .\|.\|. ]
.SH DESCRIPTION
\fItar\fP provides a way to store many files into a single archive,
which can be kept in another Unix file, stored on an I/O device
such as tape, floppy, cartridge, or disk, sent over a network, or piped to
another program.
It is useful for making backup copies, or for packaging up a set of
files to move them to another system.
.LP
\fItar\fP has existed since Version 7 Unix with very little change.
It has been proposed as the standard format for interchange of files
among systems that conform to the IEEE P1003 ``Portable Operating System''
standard.
.LP
This version of \fItar\fP supports some of the extensions which were
proposed in the P1003 draft standards, including owner and group names,
and support for named pipes, fifos, contiguous files,
and block and character devices.
.LP
When reading an archive, this version of \fItar\fP continues after
finding an error. Previous versions required the `i' option to ignore
checksum errors.
.SH OPTIONS
\fItar\fP options can be specified in either of two ways. The usual
Unix conventions can be used: each option is preceded by `\-'; arguments
directly follow each option; multiple options can be combined behind one `\-'
as long as they take no arguments. For compatibility with the Unix
\fItar\fP program, the options may also be specified as ``keyletters,''
wherein all the option letters occur in the first argument to \fItar\fP,
with no `\-', and their arguments, if any, occur in the second, third, ...
arguments.
Examples:
.LP
Normal: tar -f arcname -cv file1 file2
.LP
Old: tar fcv arcname file1 file2
.LP
At least one of the \fB\-c\fP, \fB\-t\fP, \fB-d\fP, or \fB\-x\fP options
must be included. The rest are optional.
.LP
Files to be operated upon are specified by a list of file names, which
follows the option specifications (or can be read from a file by the
\fB\-T\fP option). Specifying a directory name causes that directory
and all the files it contains to be (recursively) processed. If a
full path name is specified when creating an archive, it will be written
to the archive without the initial "/", to allow the files to be later
read into a different place than where they were
dumped from, and a warning will be printed.
If files are extracted from an archive which contains
full path names, they will be extracted relative to the current directory
and a warning message printed.
.LP
When extracting or listing files, the ``file names'' are treated as
regular expressions, using mostly the same syntax as the shell. The
shell actually matches each substring between ``/''s separately, while
\fItar\fP matches the entire string at once, so some anomalies will
occur; e.g. ``*'' or ``?'' can match a ``/''. To specify a regular
expression as an argument to \fItar\fP, quote it so the shell will not
expand it.
.IP "\fB\-b\fP \fIN\fP"
Specify a blocking factor for the archive. The block size will be
\fIN\fP x 512 bytes. Larger blocks typically run faster and let you
fit more data on a tape. The default blocking factor is set when
\fItar\fP is compiled, and is typically 20. There is no limit to the
maximum block size, as long as enough memory can be allocated for it,
and as long as the device containing the archive can read or write
that block size.
.IP \fB\-B\fP
When reading an archive, reblock it as we read it.
Normally, \fItar\fP reads each
block with a single \fIread(2)\fP system call. This does not work
when reading from a pipe or network socket under Berkeley Unix;
\fIread(2)\fP only gives as much data as has arrived at the moment.
With this option, it
will do multiple \fIread(2)\fPs to fill out to a record boundary,
rather than reporting an error.
This option is default when reading an archive from standard input,
or over a network.
.IP \fB\-c\fP
Create an archive from a list of files.
.IP \fB\-d\fP
Diff an archive against the files in the file system. Reports
differences in file size, mode, uid, gid, and contents. If a file
exists on the tape, but not in the file system, that is reported.
This option needs further work to be really useful.
.IP \fB\-D\fP
When creating an archive, only dump each directory itself; don't dump
all the files inside the directory.
In conjunction with \fIfind\fP(1),
this is useful in creating incremental dumps for archival backups,
similar to those produced by \fIdump\fP(8).
.IP "\fB\-f\fP \fIF\fP"
Specify the filename of the archive. If the specified filename is ``\-'',
the archive is read from the standard input or written to the standard output.
If the \fB-f\fP option is not used, and the environment variable \fBTAPE\fP
exists, its value will be used; otherwise,
a default archive name (which was picked when tar was compiled) is used.
The default is normally set to the ``first'' tape drive or other transportable
I/O medium on the system.
.IP
If the filename contains a colon before a slash, it is interpreted
as a ``hostname:/file/name'' pair. \fItar\fP will invoke the commands
\fIrsh\fP and \fIdd\fP to access the specified file or device on the
system \fIhostname\fP. If you need to do something unusual like rsh with
a different user name, use ``\fB\-f \-\fP'' and pipe it to rsh manually.
.IP \fB\-h\fP
When creating an archive, if a symbolic link is encountered, dump
the file or directory to which it points, rather than
dumping it as a symbolic link.
.IP \fB\-i\fP
When reading an archive, ignore blocks of zeros in the archive. Normally
a block of zeros indicates the end of the archive,
but in a damaged archive, or one which was
created by appending several archives, this option allows \fItar\fP to
continue. It is not on by default because there is garbage written after the
zeroed blocks by the Unix \fItar\fP program. Note that with this option
set, \fItar\fP will read all the way to the end of the file, eliminating
problems with multi-file tapes.
.IP \fB\-k\fP
When extracting files from an archive, keep existing files, rather than
overwriting them with the version from the archive.
.IP \fB\-l\fP
When dumping the contents of a directory to an archive, stay within the
local file system of that directory.
This option only affects the files dumped
because they are in a dumped directory; files named on the command line
are always dumped, and they can be from various file systems.
This is useful for making ``full dump'' archival backups of a file system,
as with the \fIdump\fP(8) command. Files which are skipped due to this
option are mentioned on the standard error.
.IP \fB\-m\fP
When extracting files from an archive, set each file's modified timestamp
to the current time, rather than extracting each file's modified
timestamp from the archive.
.IP \fB\-o\fP
When creating an archive, write an old format archive, which does not
include information about directories, pipes, fifos,
contiguous files, or device files, and
specifies file ownership by uid's and gid's rather than by
user names and group names. In most cases, a ``new'' format archive
can be read by an ``old'' tar program without serious trouble, so this
option should seldom be needed.
.IP \fB\-p\fP
When extracting files from an archive, restore them to the same permissions
that they had in the archive. If \fB\-p\fP is not specified, the current
umask limits the permissions of the extracted files. See \fIumask(2)\fP.
.IP \fB\-R\fP
With each message that \fItar\fP produces, print the record number
within the archive where the message occurred. This option is especially
useful when reading damaged archives, since it helps to pinpoint the damaged
section.
.IP \fB\-s\fP
When specifying a list of filenames to be listed or extracted from an
archive, the \fB\-s\fP flag specifies that the list is sorted into the
same order as the tape. This allows a large list to be used, even on
small machines, because the entire list need not be read into memory
at once. Such a sorted list can easily be created by running
``tar \-t'' on the archive and editing its output.
.IP \fB\-t\fP
List a table of contents of an existing archive. If file names are
specified, just list files matching the specified names.
The listing appears on the standard output.
.IP "\fB\-T\fP \fIF\fP"
Rather than specifying file names or regular expressions as arguments to
the \fItar\fP command, this option specifies that they should
be read from the file \fIF\fP, one per line.
If the file name specified is ``\-'',
the list is read from the standard input.
This option, in conjunction with the \fB\-s\fP option,
allows an arbitrarily large list of files to be processed,
and allows the list to be piped to \fItar\fP.
.IP \fB\-v\fP
Be verbose about the files that are being processed or listed. Normally,
archive creation, file extraction, and differencing are silent,
and archive listing just
gives file names. The \fB\-v\fP option causes an ``ls \-l''\-like listing
to be produced. The output from -v appears on the standard output except
when creating an archive (since the new archive might be on standard output),
where it goes to the standard error output.
.IP \fB\-x\fP
Extract files from an existing archive. If file names are
specified, just extract files matching the specified names, otherwise extract
all the files in the archive.
.IP "\fB\-z\fP or \fB\-Z\fP"
The archive should be compressed as it is written, or decompressed as it
is read, using the \fIcompress(1)\fP program. This option works on I/O
devices and over the network, as well as on disk files; data to or from
such devices is reblocked using a ``dd'' command
to enforce the specified (or default) block size. The default compression
parameters are used; if you need to override them, avoid the ``z'' option
and compress it yourself.
.SH "SEE ALSO"
shar(1), tar(5), compress(1), ar(1), arc(1), cpio(1), dump(8), restore(8),
restor(8), rsh(1), dd(1), find(1)
.SH BUGS
The \fBr, u, w, X, l, F, C\fP, and \fIdigit\fP options of Unix \fItar\fP
are not supported.
.LP
Multiple-tape (or floppy) archives should be supported, but so far no
clean way has been implemented.
.LP
A bug in the Bourne Shell usually causes an extra newline to be written
to the standard error when using compressed or remote archives.
.LP
A bug in ``dd'' prevents turning off the ``x+y records in/out'' messages
on the standard error when ``dd'' is used to reblock or transport an archive.
@@@ Fin de tar.1
echo tar.5
cat >tar.5 <<'@@@ Fin de tar.5'
.TH TAR 5 "15 October 1987"
.\" @(#)tar.5 1.4 11/6/87 Public Domain - gnu
.SH NAME
tar \- tape (or other media) archive file format
.SH DESCRIPTION
A ``tar tape'' or file contains a series of records. Each record contains
TRECORDSIZE bytes (see below). Although this format may be thought of as
being on magnetic tape, other media are often used.

Each file archived is represented by a header record
which describes the file, followed by zero or more records which give the
contents of the file. At the end of the archive file there may be a record
filled with binary zeros as an end-of-file indicator. A reasonable
system should write a record of zeros at the end, but must not assume that
an end-of-file record exists when reading an archive.

The records may be blocked for physical I/O operations. Each block of
\fIN\fP records (where \fIN\fP is set by the \fB\-b\fP option to \fItar\fP)
is written with a single write() operation. On
magnetic tapes, the result of such a write is a single tape record.
When writing an archive, the last block of records should be written
at the full size, with records after the zero record containing
all zeroes. When reading an archive, a reasonable system should
properly handle an archive whose last block is shorter than the rest, or
which contains garbage records after a zero record.
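.LP
The record and block arithmetic described above can be sketched in C
as follows. This is only an illustration under stated assumptions, not
code from this distribution: the helper names block_bytes, data_records
and is_zero_record are made up for the example, and rounding a file's
contents up to a whole number of records is assumed from the statement
that contents are stored as records.
.nf
```c
#include <assert.h>
#include <stddef.h>

#define RECORDSIZE 512  /* record size used by the header definition */

/* Bytes moved by one read() or write(): N records of RECORDSIZE bytes. */
long block_bytes(int blocking)
{
    assert(blocking > 0);
    return (long)blocking * RECORDSIZE;
}

/* Records needed for `size` bytes of file contents, rounded up (assumed). */
long data_records(long size)
{
    assert(size >= 0);
    return (size + RECORDSIZE - 1) / RECORDSIZE;
}

/* Nonzero if the record is all binary zeros, i.e. an end-of-archive mark. */
int is_zero_record(const char *rec)
{
    size_t i;

    for (i = 0; i < RECORDSIZE; i++)
        if (rec[i] != 0)
            return 0;
    return 1;
}
```
.fi
.LP
With a blocking factor of 20, block_bytes(20) comes to 10240 bytes per
write() call. A reader that follows the advice above would stop at the
first record for which is_zero_record() is true, and would also accept
plain end of file in its place.
.LP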

The header record is defined in the header file as follows:
.nf
.sp .5v
.DT

/*
 * Standard Archive Format - Standard TAR - USTAR
 */
#define RECORDSIZE 512
#define NAMSIZ 100
#define TUNMLEN 32
#define TGNMLEN 32

union record {
    char charptr[RECORDSIZE];
    struct header {
        char name[NAMSIZ];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char chksum[8];
        char linkflag;
        char linkname[NAMSIZ];
        char magic[8];
        char uname[TUNMLEN];
        char gname[TGNMLEN];
        char devmajor[8];
        char devminor[8];
    } header;
};

/* The checksum field is filled with this while the checksum is computed. */
#define CHKBLANKS "        " /* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid. */
#define TMAGIC "ustar  " /* 7 chars and a null */

/* The linkflag defines the type of file */
#define LF_OLDNORMAL '\\0' /* Normal disk file, Unix compatible */
#define LF_NORMAL '0' /* Normal disk file */
#define LF_LINK '1' /* Link to previously dumped file */
#define LF_SYMLINK '2' /* Symbolic link */
#define LF_CHR '3' /* Character special file */
#define LF_BLK '4' /* Block special file */
#define LF_DIR '5' /* Directory */
#define LF_FIFO '6' /* FIFO special file */
#define LF_CONTIG '7' /* Contiguous file */
/* Further link types may be defined later.
*/

/* Bits used in the mode field - values in octal */
#define TSUID 04000 /* Set UID on execution */
#define TSGID 02000 /* Set GID on execution */
#define TSVTX 01000 /* Save text (sticky bit) */

/* File permissions */
#define TUREAD 00400 /* read by owner */
#define TUWRITE 00200 /* write by owner */
#define TUEXEC 00100 /* execute/search by owner */
#define TGREAD 00040 /* read by group */
#define TGWRITE 00020 /* write by group */
#define TGEXEC 00010 /* execute/search by group */
#define TOREAD 00004 /* read by other */
#define TOWRITE 00002 /* write by other */
#define TOEXEC 00001 /* execute/search by other */
.fi
.LP
All characters in header records
are represented using 8-bit characters in the local
variant of ASCII.
Each field within the structure is contiguous; that is, there is
no padding used within the structure. Each character on the archive medium
is stored contiguously.

Bytes representing the contents of files (after the header record
of each file) are not translated in any way and
are not constrained to represent characters or to be in any character set.
The \fItar\fP(5) format does not distinguish text files from binary
files, and no translation of file contents should be performed.

The fields \fIname, linkname, magic, uname\fP, and \fIgname\fP are
null-terminated
character strings. All other fields are zero-filled octal numbers in
ASCII. Each numeric field (of width \fIw\fP) contains \fIw\fP-2 digits,
a space, and a null, except \fIsize\fP and \fImtime\fP,
which do not contain the trailing null.

The \fIname\fP field is the pathname of the file, with directory names
(if any) preceding the file name, separated by slashes.

The \fImode\fP field provides nine bits specifying file permissions and three
bits to specify the Set UID, Set GID and Save Text (TSVTX) modes. Values
for these bits are defined above.
When special permissions are required
to create a file with a given mode, and the user restoring files from the
archive does not hold such permissions, the mode bit(s) specifying those
special permissions are ignored. Modes which are not supported by the
operating system restoring files from the archive will be ignored.
Unsupported modes should be faked up when creating an archive; e.g.
the group permission could be copied from the `other' permission.

The \fIuid\fP and \fIgid\fP fields are the user and group ID of the file
owners, respectively.

The \fIsize\fP field is the size of the file in bytes; linked files are
archived with this field specified as zero.

The \fImtime\fP field is the modification time of the file at the time it
was archived. It is the ASCII representation of the octal value of the
last time the file was modified, represented as an integer number of
seconds since January 1, 1970, 00:00 Coordinated Universal Time.

The \fIchksum\fP field is the ASCII representation of the octal value of the
simple sum of all bytes in the header record. Each 8-bit byte in the
header is treated as an unsigned value. These values are added to an
unsigned integer, initialized to zero, the precision of which shall be no
less than seventeen bits. When calculating the checksum, the \fIchksum\fP
field is treated as if it were all blanks.

The \fItypeflag\fP field specifies the type of file archived. If a particular
implementation does not recognize or permit the specified type, the file
will be extracted as if it were a regular file. As this action occurs,
\fItar\fP issues a warning to the standard error.
.IP "LF_NORMAL or LF_OLDNORMAL"
represents a regular file.
For backward compatibility, a \fItypeflag\fP value of LF_OLDNORMAL
should be silently recognized as a regular file. New archives should
be created using LF_NORMAL.
Also, for backward
compatibility, \fItar\fP treats a regular file whose name ends
with a slash as a directory.
.IP LF_LINK
represents a file linked to another file, of any type,
previously archived. Such files are identified in Unix by each file
having the same device and inode number. The linked-to
name is specified in the \fIlinkname\fP field with a trailing null.
.IP LF_SYMLINK
represents a symbolic link to another file. The linked-to
name is specified in the \fIlinkname\fP field with a trailing null.
.IP "LF_CHR or LF_BLK"
represent character special files and block
special files respectively.
In this case the \fIdevmajor\fP and \fIdevminor\fP
fields will contain the
major and minor device numbers respectively. Operating
systems may map the device specifications to their own local
specification, or may ignore the entry.
.IP LF_DIR
specifies a directory or sub-directory. The directory name in the
\fIname\fP field should end with a slash.
On systems where
disk allocation is performed on a directory basis the \fIsize\fP
field will contain the maximum number of bytes (which may be
rounded to the nearest disk block allocation unit) which the
directory may hold. A \fIsize\fP field of zero indicates no such
limiting. Systems which do not support limiting in this
manner should ignore the \fIsize\fP field.
.IP LF_FIFO
specifies a FIFO special file. Note that the archiving of a
FIFO file archives the existence of this file and not its contents.
.IP LF_CONTIG
specifies a contiguous file, which is the same as a normal
file except that, in operating systems which support it,
all its space is allocated contiguously on the disk. Operating
systems which do not allow contiguous allocation should silently treat
this type as a normal file.
.IP "`A' \- `Z'"
are reserved for custom implementations. None are used by this
version of the \fItar\fP program.
.IP \fIother\fP
values are reserved for specification in future revisions of the
P1003 standard, and should not be used by any \fItar\fP program.
.LP
The \fImagic\fP field indicates that this archive was output in the P1003
archive format.
If this field contains TMAGIC, then the
\fIuname\fP and \fIgname\fP
fields will contain the ASCII representation of the owner and group of the
file respectively. If found, the user and group ID represented by these
names will be used rather than the values contained
within the \fIuid\fP and \fIgid\fP fields.
User names longer than TUNMLEN-1 or group
names longer than TGNMLEN-1 characters will be truncated.
.SH "SEE ALSO"
tar(1), ar(5), cpio(5), dump(8), restor(8), restore(8)
.SH BUGS
Names or link names longer than NAMSIZ-1 characters cannot be archived.

This format does not yet address multi-volume archives.
.SH NOTES
This manual page was adapted by John Gilmore
from Draft 6 of the P1003 specification
@@@ Fin de tar.5
echo tar.h
cat >tar.h <<'@@@ Fin de tar.h'
/*
 * Header file for public domain tar (tape archive) program.
 *
 * @(#)tar.h 1.24 87/11/06 Public Domain.
 *
 * Created 25 August 1985 by John Gilmore, ihnp4!hoptoad!gnu.
 */

/*
 * Kludge for handling systems that can't cope with multiple
 * external definitions of a variable. In ONE routine (tar.c),
 * we #define TAR_EXTERN to null; here, we set it to "extern" if
 * it is not already set.
 */
#ifndef TAR_EXTERN
#define TAR_EXTERN extern
#endif

/*
 * Header block on tape.
 *
 * I'm going to use traditional DP naming conventions here.
 * A "block" is a big chunk of stuff that we do I/O on.
 * A "record" is a piece of info that we care about.
 * Typically many "record"s fit into a "block".
 */
#define RECORDSIZE 512
#define NAMSIZ 100
#define TUNMLEN 32
#define TGNMLEN 32

union record {
    char charptr[RECORDSIZE];
    struct header {
        char name[NAMSIZ];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char chksum[8];
        char linkflag;
        char linkname[NAMSIZ];
        char magic[8];
        char uname[TUNMLEN];
        char gname[TGNMLEN];
        char devmajor[8];
        char devminor[8];
    } header;
};

/* The checksum field is filled with this while the checksum is computed.
*/
#define	CHKBLANKS	"        "	/* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid. */
#define	TMAGIC		"ustar  "	/* 7 chars and a null */

/* The linkflag defines the type of file */
#define	LF_OLDNORMAL	'\0'		/* Normal disk file, Unix compat */
#define	LF_NORMAL	'0'		/* Normal disk file */
#define	LF_LINK		'1'		/* Link to previously dumped file */
#define	LF_SYMLINK	'2'		/* Symbolic link */
#define	LF_CHR		'3'		/* Character special file */
#define	LF_BLK		'4'		/* Block special file */
#define	LF_DIR		'5'		/* Directory */
#define	LF_FIFO		'6'		/* FIFO special file */
#define	LF_CONTIG	'7'		/* Contiguous file */
/* Further link types may be defined later. */

/*
 * Exit codes from the "tar" program
 */
#define	EX_SUCCESS	0		/* success! */
#define	EX_ARGSBAD	1		/* invalid args */
#define	EX_BADFILE	2		/* invalid filename */
#define	EX_BADARCH	3		/* bad archive */
#define	EX_SYSTEM	4		/* system gave unexpected error */

/*
 * Global variables
 */
TAR_EXTERN union record	*ar_block;	/* Start of block of archive */
TAR_EXTERN union record	*ar_record;	/* Current record of archive */
TAR_EXTERN union record	*ar_last;	/* Last+1 record of archive block */
TAR_EXTERN char		ar_reading;	/* 0 writing, !0 reading archive */
TAR_EXTERN int		blocking;	/* Size of each block, in records */
TAR_EXTERN int		blocksize;	/* Size of each block, in bytes */
TAR_EXTERN char		*ar_file;	/* File containing archive */
TAR_EXTERN char		*name_file;	/* File containing names to work on */
TAR_EXTERN char		*tar;		/* Name of this program */

/*
 * Flags from the command line
 */
TAR_EXTERN char	f_reblock;		/* -B */
TAR_EXTERN char	f_create;		/* -c */
TAR_EXTERN char	f_diff;			/* -d */
TAR_EXTERN char	f_dironly;		/* -D */
TAR_EXTERN char	f_follow_links;		/* -h */
TAR_EXTERN char	f_ignorez;		/* -i */
TAR_EXTERN char	f_keep;			/* -k */
TAR_EXTERN char	f_local_filesys;	/* -l */
TAR_EXTERN char	f_modified;		/* -m */
TAR_EXTERN char	f_oldarch;		/* -o */
TAR_EXTERN char	f_use_protection;	/* -p */
TAR_EXTERN char
f_sayblock;		/* -R */
TAR_EXTERN char	f_sorted_names;		/* -s */
TAR_EXTERN char	f_list;			/* -t */
TAR_EXTERN char	f_namefile;		/* -T */
TAR_EXTERN char	f_verbose;		/* -v */
TAR_EXTERN char	f_extract;		/* -x */
TAR_EXTERN char	f_compress;		/* -z */

/*
 * We now default to Unix Standard format rather than 4.2BSD tar format.
 * The code can actually produce all three:
 *	f_standard	ANSI standard
 *	f_oldarch	V7
 *	neither		4.2BSD
 * but we don't bother, since 4.2BSD can read ANSI standard format anyway.
 * The only advantage to the "neither" option is that we can cmp(1) our
 * output to the output of 4.2BSD tar, for debugging.
 */
#define	f_standard	(!f_oldarch)

/*
 * Structure for keeping track of filenames and lists thereof.
 */
struct name {
	struct name	*next;
	short		length;		/* cached strlen(name) */
	char		found;		/* A matching file has been found */
	char		firstch;	/* First char is literally matched */
	char		regexp;		/* This name is a regexp, not literal */
	char		name[NAMSIZ+1];
};

TAR_EXTERN struct name	*namelist;	/* Points to first name in list */
TAR_EXTERN struct name	*namelast;	/* Points to last name in list */

TAR_EXTERN int		archive;	/* File descriptor for archive file */
TAR_EXTERN int		errors;		/* # of files in error */

/*
 *
 * Due to the next struct declaration, each routine that includes
 * "tar.h" must also include <sys/types.h>.  I tried to make it automatic,
 * but System V has no defines in <sys/types.h>, so there is no way of
 * knowing when it has been included.  In addition, it cannot be included
 * twice, but must be included exactly once.  Argghh!
 *
 * Thanks, typedef.  Thanks, USG.
 */
struct link {
	struct link	*next;
	dev_t		dev;
	ino_t		ino;
	short		linkcount;
	char		name[NAMSIZ+1];
};

TAR_EXTERN struct link	*linklist;	/* Points to first link in list */

/*
 * Error recovery stuff
 */
TAR_EXTERN char		read_error_flag;

/*
 * Declarations of functions available to the world.
 */
union record *findrec();
void userec();
union record *endofrecs();
void anno();
#define	annorec(stream, msg)	anno(stream, msg, 0)	/* Cur rec */
#define	annofile(stream, msg)	anno(stream, msg, 1)	/* Saved rec */
@@@ Fin de tar.h
echo port.h
cat >port.h <<'@@@ Fin de port.h'
/*
 * Portability declarations for public domain tar.
 *
 * @(#)port.h 1.3	87/11/11	Public Domain by John Gilmore, 1986
 */

/*
 * Everybody does wait() differently.  There seem to be no definitions
 * for this in V7 (e.g. you are supposed to shift and mask things out
 * using constant shifts and masks.)  So fuck 'em all -- my own non
 * standard but portable macros.  Don't change to a "union wait"
 * based approach -- the ordering of the elements of the struct
 * depends on the byte-sex of the machine.  Foo!
 */
#define	TERM_SIGNAL(status)	((status) & 0x7F)
#define	TERM_COREDUMP(status)	(((status) & 0x80) != 0)
#define	TERM_VALUE(status)	((status) >> 8)

#ifdef	MSDOS
/* missing things from sys/stat.h */
#define	S_ISUID		0
#define	S_ISGID		0
#define	S_ISVTX		0

/* device stuff */
#define	makedev(ma, mi)	((ma << 8) | mi)
#define	major(dev)	(dev)
#define	minor(dev)	(dev)
#endif	/* MSDOS */
@@@ Fin de port.h
echo open3.h
cat >open3.h <<'@@@ Fin de open3.h'
/*
 * @(#)open3.h 1.4 87/11/11	Public Domain.
 *
 * open3.h -- #defines for the various flags for the Sys V style 3-argument
 * open() call.  On BSD or System 5, the system already has this in an
 * include file.  This file is needed for V7 and MINIX systems for the
 * benefit of open3() in port.c, a routine that emulates the 3-argument
 * call using system calls available on V7/MINIX.
 *
 * This file is needed by PD tar even if we aren't using the
 * emulator, since the #defines for O_WRONLY, etc. are used in
 * a couple of places besides the open() calls, (e.g. in the assignment
 * to openflag in extract.c).  We just #include this rather than
 * #ifdef them out.
 *
 * Written 6/10/87 by rmtodd@uokmax (Richard Todd).
 *
 * The names have been changed by John Gilmore, 31 July 1987, since
 * Richard called it "bsdopen", and really this change was introduced in
 * AT&T Unix systems before BSD picked it up.
 */

/* Only one of the next three should be specified */
#define	O_RDONLY	0	/* only allow read */
#define	O_WRONLY	1	/* only allow write */
#define	O_RDWR		2	/* both are allowed */

/* The rest of these can be OR-ed in to the above. */
/*
 * O_NDELAY isn't implemented by the emulator.  It's only useful (to tar) on
 * systems that have named pipes anyway; it prevents tar's hanging by
 * opening a named pipe.  We #ifndef it because some systems already have
 * it defined.
 */
#ifndef O_NDELAY
#define	O_NDELAY	4	/* don't block on opening devices that would
				 * block on open -- ignored by emulator. */
#endif
#define	O_CREAT		8	/* create file if needed */
#define	O_EXCL		16	/* file cannot already exist */
#define	O_TRUNC		32	/* truncate file on open */
#define	O_APPEND	64	/* always write at end of file -- ignored by emul */

#ifdef EMUL_OPEN3
/*
 * make emulation transparent to rest of file -- redirect all open() calls
 * to our routine
 */
#define	open	open3
#endif
@@@ Fin de open3.h
echo msd_dir.h
cat >msd_dir.h <<'@@@ Fin de msd_dir.h'
/*
 * @(#)msd_dir.h 1.4 87/11/06	Public Domain.
 *
 * A public domain implementation of BSD directory routines for
 * MS-DOS.
Written by Michael Rendell ({uunet,utai}michael@garfield),
 * August 1987
 */

#define	rewinddir(dirp)	seekdir(dirp, 0L)

#define	MAXNAMLEN	12

struct direct {
	ino_t	d_ino;			/* a bit of a farce */
	int	d_reclen;		/* more farce */
	int	d_namlen;		/* length of d_name */
	char	d_name[MAXNAMLEN + 1];	/* guarantee null termination */
};

struct _dircontents {
	char	*_d_entry;
	struct _dircontents	*_d_next;
};

typedef struct _dirdesc {
	int		dd_id;	/* uniquely identify each open directory */
	long		dd_loc;	/* index of the current entry in the directory */
	struct _dircontents	*dd_contents;	/* pointer to contents of dir */
	struct _dircontents	*dd_cp;	/* pointer to current position */
} DIR;

extern	DIR		*opendir();
extern	struct direct	*readdir();
extern	void		seekdir();
extern	long		telldir();
extern	void		closedir();
@@@ Fin de msd_dir.h
echo msd_dir.c
cat >msd_dir.c <<'@@@ Fin de msd_dir.c'
/*
 * @(#)msd_dir.c 1.4 87/11/06	Public Domain.
 *
 * A public domain implementation of BSD directory routines for
 * MS-DOS.  Written by Michael Rendell ({uunet,utai}michael@garfield),
 * August 1987
 */

/* Header names below are inferred from the calls this file makes. */
#include	<sys/types.h>
#include	<sys/stat.h>
#include	<sys/dir.h>
#include	<malloc.h>
#include	<string.h>
#include	<dos.h>

#ifndef	NULL
# define	NULL	0
#endif	/* NULL */

#ifndef	MAXPATHLEN
# define	MAXPATHLEN	255
#endif	/* MAXPATHLEN */

/* attribute stuff */
#define	A_RONLY		0x01
#define	A_HIDDEN	0x02
#define	A_SYSTEM	0x04
#define	A_LABEL		0x08
#define	A_DIR		0x10
#define	A_ARCHIVE	0x20

/* dos call values */
#define	DOSI_FINDF	0x4e
#define	DOSI_FINDN	0x4f
#define	DOSI_SDTA	0x1a

#define	Newisnull(a, t)	((a = (t *) malloc(sizeof(t))) == (t *) NULL)
#define	ATTRIBUTES	(A_DIR | A_HIDDEN | A_SYSTEM)

/* what the find first/next calls use */
typedef struct {
	char		d_buf[21];
	char		d_attribute;
	unsigned short	d_time;
	unsigned short	d_date;
	long		d_size;
	char		d_name[13];
} Dta_buf;

static	char	*getdirent();
static	void	setdta();
static	void	free_dircontents();

static	Dta_buf		dtabuf;
static	Dta_buf		*dtapnt = &dtabuf;
static	union REGS	reg, nreg;

#if	defined(M_I86LM)
static	struct SREGS	sreg;
#endif

DIR	*
opendir(name)
	char	*name;
{
	struct	stat		statb;
	DIR			*dirp;
	char			c;
	char			*s;
	struct _dircontents	*dp;
	char			nbuf[MAXPATHLEN + 1];

	if (stat(name, &statb) < 0 || (statb.st_mode & S_IFMT) != S_IFDIR)
		return (DIR *) NULL;
	if (Newisnull(dirp, DIR))
		return (DIR *) NULL;
	if (*name && (c = name[strlen(name) - 1]) != '\\' && c != '/')
		(void) strcat(strcpy(nbuf, name), "\\*.*");
	else
		(void) strcat(strcpy(nbuf, name), "*.*");
	dirp->dd_loc = 0;
	setdta();
	dirp->dd_contents = dirp->dd_cp = (struct _dircontents *) NULL;
	if ((s = getdirent(nbuf)) == (char *) NULL)
		return dirp;
	do {
		if (Newisnull(dp, struct _dircontents) || (dp->_d_entry =
			malloc((unsigned) (strlen(s) + 1))) == (char *) NULL)
		{
			if (dp)
				free((char *) dp);
			free_dircontents(dirp->dd_contents);
			return (DIR *) NULL;
		}
		if (dirp->dd_contents)
			dirp->dd_cp = dirp->dd_cp->_d_next = dp;
		else
			dirp->dd_contents = dirp->dd_cp = dp;
		(void) strcpy(dp->_d_entry, s);
		dp->_d_next = (struct _dircontents *) NULL;
	} while ((s = getdirent((char *) NULL)) != (char *) NULL);
	dirp->dd_cp = dirp->dd_contents;

	return dirp;
}

void
closedir(dirp)
	DIR	*dirp;
{
	free_dircontents(dirp->dd_contents);
	free((char *) dirp);
}

struct direct	*
readdir(dirp)
	DIR	*dirp;
{
	static	struct direct	dp;

	if (dirp->dd_cp == (struct _dircontents *) NULL)
		return (struct direct *) NULL;
	dp.d_namlen = dp.d_reclen =
		strlen(strcpy(dp.d_name, dirp->dd_cp->_d_entry));
	dp.d_ino = 0;
	dirp->dd_cp = dirp->dd_cp->_d_next;
	dirp->dd_loc++;

	return &dp;
}

void
seekdir(dirp, off)
	DIR	*dirp;
	long	off;
{
	long			i = off;
	struct _dircontents	*dp;

	if (off < 0)
		return;
	for (dp = dirp->dd_contents ; --i >= 0 && dp ; dp = dp->_d_next)
		;
	dirp->dd_loc = off - (i + 1);
	dirp->dd_cp = dp;
}

long
telldir(dirp)
	DIR	*dirp;
{
	return dirp->dd_loc;
}

static	void
free_dircontents(dp)
	struct	_dircontents	*dp;
{
	struct _dircontents	*odp;

	while (dp) {
		if (dp->_d_entry)
			free(dp->_d_entry);
		dp = (odp = dp)->_d_next;
		free((char *) odp);
	}
}

static	char	*
getdirent(dir)
	char	*dir;
{
	if (dir != (char *) NULL)
	{		/* get first entry */
		reg.h.ah = DOSI_FINDF;
		reg.h.cl = ATTRIBUTES;
#if	defined(M_I86LM)
		reg.x.dx = FP_OFF(dir);
		sreg.ds = FP_SEG(dir);
#else
		reg.x.dx = (unsigned) dir;
#endif
	} else {	/* get next entry */
		reg.h.ah = DOSI_FINDN;
#if	defined(M_I86LM)
		reg.x.dx = FP_OFF(dtapnt);
		sreg.ds = FP_SEG(dtapnt);
#else
		reg.x.dx = (unsigned) dtapnt;
#endif
	}
#if	defined(M_I86LM)
	intdosx(&reg, &nreg, &sreg);
#else
	intdos(&reg, &nreg);
#endif
	if (nreg.x.cflag)
		return (char *) NULL;

	return dtabuf.d_name;
}

static	void
setdta()
{
	reg.h.ah = DOSI_SDTA;
#if	defined(M_I86LM)
	reg.x.dx = FP_OFF(dtapnt);
	sreg.ds = FP_SEG(dtapnt);
	intdosx(&reg, &nreg, &sreg);
#else
	reg.x.dx = (int) dtapnt;
	intdos(&reg, &nreg);
#endif
}
@@@ Fin de msd_dir.c
exit 0
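
The tar.h comment on CHKBLANKS says the chksum field is filled with blanks while the checksum is computed. The sketch below shows one way that could look. It is not taken from the posted source: header_checksum() and CHKSUM_OFFSET are hypothetical names, the offset 148 is derived by adding up the field sizes in struct header (100+8+8+8+12+12), and summing the bytes as unsigned values is an assumption, since some older tar programs summed signed chars. It is written with ANSI prototypes rather than the K&R style of the files above.

```c
#include <assert.h>
#include <string.h>

#define RECORDSIZE     512
#define CHKBLANKS      "        "   /* 8 blanks, no null (as in tar.h) */
#define CHKSUM_OFFSET  148          /* assumed: name+mode+uid+gid+size+mtime */
#define CHKSUM_LEN     8

/*
 * Hypothetical helper: sum every byte of a 512-byte header record,
 * treating the chksum field as if it held 8 blanks.  The caller's
 * record is not modified; the blanking is done on a local copy.
 */
long header_checksum(const unsigned char *rec)
{
    unsigned char copy[RECORDSIZE];
    long sum = 0;
    int i;

    assert(rec != NULL);
    memcpy(copy, rec, RECORDSIZE);
    memcpy(copy + CHKSUM_OFFSET, CHKBLANKS, CHKSUM_LEN);

    for (i = 0; i < RECORDSIZE; i++)
        sum += copy[i];             /* unsigned bytes: an assumption */
    return sum;
}
```

A reader would compare the returned value with the number stored in the chksum field of the record; a writer would store it there once the other fields are filled in. Working on a copy keeps the helper usable on a record that is still sitting in the archive buffer.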