Copyright (C) 1992, 1994, 1995 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.

Introduction

Welcome to the GNU tar manual. GNU tar is used to create and manipulate files (archives) which are actually collections of many other files. GNU tar provides users with an organized and systematic method for controlling a large amount of data.

This chapter is meant to be skimmed by every reader. The first sections introduce terms that recur throughout the book. They also tell you who has worked on GNU tar and its documentation, and where to send bug reports or comments. The remaining sections hold maintainer comments that are somewhat extraneous to pure technical information and do not belong to the tutorial either, but are not worth a chapter of their own.

The tutorial (see section Tutorial Introduction to tar) should ease the way for people who need an introduction to tar. That chapter is meant to be self-contained, so that it makes sense without any reading from subsequent chapters. It uses a more discursive tone, to help newcomers digest the information it contains. Expert readers may safely skip the tutorial, as all the information it contains is also formally presented elsewhere. Other chapters must never refer to the tutorial.

All the other chapters are meant to be a reference, that is, to present for each topic everything that needs to be said about it. Complexity is not an issue, in the sense that things which should be said have to be said under the proper topic, even if all the details sometimes accumulate into something complex. However, concision should not destroy clarity: let us take as many words as necessary to describe whatever needs describing. But again, of course, when clarity is not at stake, concision and simplicity are always better.

A special chapter (see section Date input formats) is wholly repeated in other GNU manuals, so it is somewhat self-contained. One section of this manual (see section The Standard Format) contains a big quote which is mechanically extracted from tar sources.

A previous version of this manual pushed heavily for mnemonic option names to be used everywhere, while common practice is to use the old style for options. If the tutorial, in particular, overstresses unusual ways, people will discover after the fact, by comparing what they know with what others do, that the tutorial was not that good after all. So we tried hard to correct that particular mistake systematically, at least by always giving both the long and short names of an option each time it is referred to in the manual. We even changed tar itself a few times so that the manual could be better.

What tar Does

The tar program is used to create and manipulate tar archives. An archive is a single file which contains the contents of many files, while still identifying the names of the files, their owner(s), and so forth. (In addition, archives record access permissions, user and group, size in bytes, and last modification time. Some archives also record the file names in each archived directory, as well as other file and directory information.)

The files inside an archive are called members. Within this manual, we use the term file to refer only to files accessible in the normal ways (by ls, cat, and so forth), and the term members to refer only to the members of an archive. Similarly, a file name is the name of a file, as it resides in the filesystem, and a member name is the name of an archive member within the archive.
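
The distinction between files and members can be seen in a short shell session. This is only a sketch: the names `demo', `notes.txt', `todo.txt' and `demo.tar' are invented for illustration, and the session assumes a tar that accepts the long option names shown.

```shell
# Work in a scratch directory so that nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two ordinary files, accessible in the normal ways (ls, cat).
mkdir demo
echo "first"  > demo/notes.txt
echo "second" > demo/todo.txt

# Store them as members of one archive (old style: tar cf demo.tar demo).
tar --create --file=demo.tar demo

# Print the member names recorded in the archive.
members=$(tar --list --file=demo.tar)
echo "$members"

# The verbose listing adds permissions, owner and group, size, and date.
tar --list --verbose --file=demo.tar
```

After this runs, `demo/notes.txt' is both a file name in the filesystem and a member name inside `demo.tar'; deleting the file leaves the member untouched, and the reverse holds too.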

Initially, tar archives were used to store files conveniently on magnetic tape. The name `tar' comes from this use; it stands for tape archiver. Despite the utility's name, tar can direct its output to any available device, as well as store it in a file or direct it to another program via a pipe. tar may even access remote devices or files (as archives).

You can use tar archives in many ways. We want to stress a few of them: storage, backup or transportation.

Storage
Often, tar archives are used to store related files for convenient file transfer over a network. For example, the GNU Project distributes its software bundled into tar archives, so that all the files relating to a particular program (or set of related programs) can be transferred as a single unit. A magnetic tape can store several files in sequence, but has no names for them, just relative position on the tape. A tar archive or something like it is one way to store several files on one tape and retain their names. Even when the basic transfer mechanism can keep track of names, as FTP can, the nuisance of handling multiple files, directories, and multiple links makes tar archives an attractive method. Archive files are also used for long-term storage. You can think of this as transportation from one time to another. (It is a science-fiction idiom that you can move through time as well as through space; the idea here is that tar can be used to move archives in all dimensions, even time!)
Backup
Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks. A backup puts a collection of files (possibly pertaining to many users and projects) together on a disk or a tape. This guards against accidental destruction of the information in those files. GNU tar has special features that allow it to be used to make incremental and full dumps of all the files in a filesystem.
Transportation
Archive files can be used for transporting a group of files from one system to another. To do this, put all relevant files into an archive on one computer system, transfer the archive to another system, and extract the contents there. The basic transfer medium can be magnetic tape, FTP, or even electronic mail; however, in order to transport an archive safely by electronic mail, you must encode the archive with uuencode or some functional equivalent. As long as both machines support the tar program, they do not have to use the same operating system. Piping one tar to another is an easy way to copy a directory's contents from one disk to another, while preserving the dates, modes, owners, and link structure of all the files therein. tar is also ideal for transferring directories over networks. Occasionally, people use tar to archive many files on one machine and then pipe the archive over the network to another machine where the files in the archive are unpacked using tar.
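
Piping one tar to another, as described under Transportation, might look like the following sketch. The scratch directories `src' and `dst' and the files inside them are invented for illustration; only the short options `-c', `-x', `-p' and `-f -' (archive on standard output or standard input) are used.

```shell
# Scratch source and destination directories (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub" "$workdir/dst"
echo "hello" > "$workdir/src/sub/file.txt"
ln -s sub/file.txt "$workdir/src/link.txt"

# The first tar writes an archive of src to standard output (-f -);
# the second reads that archive from standard input and unpacks it in dst.
# -p asks the extracting tar to restore the recorded permissions.
(cd "$workdir/src" && tar -cf - .) | (cd "$workdir/dst" && tar -xpf -)

# Compare the two trees; diff prints nothing when they match.
diff -r "$workdir/src" "$workdir/dst" && echo "copy matches"
```

No archive file is ever written to disk here. The same pattern extends to a second machine when the extracting tar is started through a remote shell instead of a local subshell.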

The tar program provides the ability to create tar archives, as well as various other kinds of manipulation. For example, you can use tar on previously created archives to extract files, to store additional files, or to update or list files which were already stored. The term extraction is used to refer to the process of copying an archive member into a file in the filesystem. One might speak of extracting a single member. Extracting all the members of an archive is often called extracting the archive. The term unpack can also be used to refer to the extraction of many or all the members of an archive.
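
As a rough illustration of these operations, the session below creates an archive, stores an additional file in it, lists it, and extracts a single member. The file names (`a.txt', `b.txt', `collection.tar', `out') are invented for the example, and the long option names are assumed to be accepted by the installed tar.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "alpha" > a.txt
echo "beta"  > b.txt

# Create an archive holding one file (old style: tar cf collection.tar a.txt).
tar --create --file=collection.tar a.txt

# Store an additional file in the existing archive (-r).
tar --append --file=collection.tar b.txt

# List the members now stored in the archive (-t).
listing=$(tar --list --file=collection.tar)
echo "$listing"

# Extract a single member into a separate directory (-x, -C).
mkdir out
tar --extract --file=collection.tar --directory=out b.txt
cat out/b.txt
```

The listing shows `a.txt' followed by `b.txt', since appended members go at the end of the archive. Only `b.txt' appears under `out', because it was the only member named on the extraction command line.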

Conventionally, tar archives are given names ending with `.tar'. This is not necessary for tar to operate properly, but this manual follows the convention in order to get the reader used to seeing it.

Occasionally, tar archives are referred to as tar files, and archive members are referred to as files, or entries. For people familiar with the operation of tar, this causes no difficulty. However, this manual consistently uses the terminology above in referring to archives and archive members to make it easier to learn how to use tar.

GNU tar Authors

GNU tar was originally written by John Gilmore, and modified by many people. The GNU enhancements were written by Jay Fenlason, then Joy Kendall, and the whole package has been further maintained by Michael Bushnell, and finally François Pinard, with the help of numerous and kind users. I wish to stress that tar is something of a collective work, and owes much to all those people who reported problems, offered solutions and other insights, or shared their thoughts and suggestions. Even if we lost track of many of those contributors, a partial list can be found in the `THANKS' file from the GNU tar distribution.

Jay Fenlason put together a draft of a GNU tar manual, also borrowing notes from the original man page by John Gilmore. This draft was distributed in tar versions 1.04 (or even before?) through 1.10, then withdrawn in version 1.11. Michael Bushnell and Amy Gorin worked on a tutorial and manual for GNU tar, and left a few unpublished versions of each. For version 1.11.8, François Pinard put together a new manual by taking information from all these sources and merging it into a single manual. Melissa Weisshaus edited the whole book and redesigned some parts.

I heard that there is another manual in the works, by another team, which should say everything about archives and related utilities, and which will surely be nicer than this one. In the meantime, please consider this manual as a placeholder for the tar option list and a few random notes the maintainer wants to save somewhere, so users can read them. I hope GNU tar users will be happier with this imperfect manual than they would be with no documentation at all.

I collected everything I could from the various manuals that existed or were planned, built a structure to hold the material, and poured everything into that structure, not being afraid of duplication. Then I did some intense editing to make the result uniform, added some paragraphs of my own, and corrected many errors. I found it quite shameful for GNU that we published tar without any documentation for it. So I rushed into attempting something.

Reporting bugs or suggestions

Please report problems or suggestions about this program to `bug-gnu-utils@prep.ai.mit.edu'. You may also write directly, and less officially, to `pinard@iro.umontreal.ca'. There is a lot of mail flowing about tar, and some has accumulated over the past years. You might expect a quick acknowledgement of your messages, but the proper handling of your reports may be delayed for a long while.

Many nodes of this document have not been revised much; all of these start with a little comment saying this. I accept documentation bug reports, of course. But please do not torture yourself into systematically reporting all inadequacies for the unrevised nodes of this document, unless you really feel like revising them.

Support considerations

This informal appendix is for the maintainer to share a few words and thoughts, while considering GNU tar support.

Stability of GNU tar

User reports mainly fall into three categories: portability problems, execution bugs, and requests for enhancements. For 1.11.X, the emphasis has been on solving portability problems, then on trying to make GNU tar more solid. Enhancements have fairly low priority, yet I sometimes slip one in, just as a kind of rest :-).

Many bugs have been corrected since 1.11.2. If you are curious, glance through ChangeLog. I have had very few reports of things that might be new bugs not present in 1.11.2. If you are really curious, and have access to the FSF machines, see the `/gd/gnu/tar/rmail/' hierarchy for all reports. Subdirectories `0', `1', `2' and `3' represent decreasing levels of priority. Most problems in there were reported against 1.10, 1.11 or 1.11.2 and still exist. The only thing I have consciously broken between 1.11.2 and 1.11.5 is `--block-number' (`-R'), because I wanted a modification to be made to `gnulib/error.c', which is outside my control. This modification is now done, but I have not revisited this area yet.

Here is my candid opinion. GNU tar has many areas of unreliability. See `BACKLOG' for the horrifying picture of the situation. Yet, for most users and usages, GNU tar looks very dependable. For me as a mere user, GNU tar has not given problems in years. And I think it offers a lot of functionality. Many problems have been solved since 1.11.2, even if it is true that many more remain to be solved. I'm not discouraged myself and feel positive about maintaining it, simply because when I bite, I usually hold on for quite a long time. I might not have all the time I would want, but I surely have good will and am happily surrounded by many collaborating pretesters. So, I still think GNU tar is on the winning side in the long run.

Should we rewrite the thing?

Working in the tar sources is not always pleasurable. The problem is that the tar sources are very fragile. Merely cleaning things up breaks them. The current sequence of prereleases is meant to solidify the code slowly, so that tar becomes more maintainable. I think that the ugliness of the sources could be corrected to a certain extent, too.

A few efforts to replace GNU tar have been made already, and it seems that all have failed so far. A toy program, for me, is another kind of failure. I think people underestimate the number of portability problems such a program can raise. This is not only a matter of programming style; there is really a wide variability in systems out there. GNU tar has a long history, and has met a rich variety of porting problems, machine peculiarities, and system idiosyncrasies, which are unrelated to programming style. My own opinion is that we cannot dismiss all the experience gleaned over the years, and saved (if not hidden) in the GNU tar sources, by pretending to start anew, from scratch.

Even if a new program replacing GNU tar would be marvelous, GNU tar stalled for a few years waiting for such a program, and we are now left with nothing, and with hundreds of user reports to catch up on. We need a working archiver now, and cannot live on promises. Any new program will take hundreds of user reports, and many years, to stabilize enough to become a plausible tar replacement. I plan instead to clean up GNU tar. This alone is a big task for me, because GNU tar coding is not ideal, and I have to find ways to transform it slowly, while keeping it fully working at all times.

Why am I maintaining it?

Someone asked me: "How did you get tar?". Take the following reply as anecdotal, or as a caricature. The people involved might object to the formulation I use here! :-)

I myself submitted maybe 20 reports about various tar bugs that were not corrected, release after release, and which I kept resubmitting. Other people were complaining too. In fact, the previous maintainer had been promised the ultimate and marvelous tar by a teacher and his class as a team, you know, with many GUI's (SGI, Athena's, Xlib, Motif, name it :-), other CUI's, supporting cpio, pax and various tar formats, etc. The class wrote hundreds of pages documenting the project, GNU central was completely seduced, and tar maintenance stopped, just waiting for the wonder for years. Of course, it never came.

One day, Michael wrote to me about a problem with m4 (if I remember correctly) and in my reply, I added a short P.S. saying that he was not giving us the impression of being very happy with tar maintenance. He replied with an unexpectedly long letter, saying that tar had become a monster, that he had neither the time nor any inclination for it, that many other maintainers had declined to take it, and that I would have his aeternal gratitude if I accepted tar on my shoulders, and away from his. Nevertheless, he was very kind in not applying any kind of pressure, and in offering his help for the transition.

I pondered the idea, then told Michael that tar was not a really tempting package, but that on the other hand, I was really tempted by the aeternal gratitude of the Hurd developer ;-), and that I thought I could find the courage to clean tar up to the point where it might become a breeze to maintain.

The accumulated mail backlog was enormous, and it took me several months just to sort it out and establish priorities. So far (in June 1995), I have replied to something like 400 reports, and I have nearly another 1000 waiting, while others are still coming in with regularity. There is a lot of repetition, of course, and many messages address many problems at once. I had to develop new techniques for handling this quantity while giving proper feedback to all users; this experience has proved quite interesting so far, and now even serves the other GNU packages I maintain.

But, to be honest, I confess that I am a little afraid of tar maintenance. It is difficult for many reasons, the first three being more evident than the others:

However, even if it is difficult, I do feel like doing a careful cleanup, so that tar becomes less painful to maintain after a while (and less subject to criticism). Besides, I'm surrounded by a marvelous team of pretesters and by many other collaborating users, whom I should learn to serve better. As I gain more experience with maintenance in GNU, I hope to be careful enough in modifying tar that I do not hurt users too much, being aware that tar is a sensitive product in GNU. Once it is cleaned up, I might be happy to return tar maintenance to someone else...

tar alone requires more work than all my other things together, and I have to resist being swallowed whole by it. This resistance makes tar development somewhat slower. Sorry!

MS-DOS and other systems?

GNU does not necessarily support non-UNIX systems; that is to say, MS-DOS is not supported. It is very true that ports can sometimes be very intrusive in the sources, cluttering them significantly with conditionals and extra code. Extra ports can also distract GNU maintainers from the main development line, and the GNU project needs to stay very directed in order to accomplish its goal.

However, a special argument might be made for tar. Both tar and gzip are required tools for getting something out of the GNU archives. tar should be more open to ports than the GNU rule states. Jean-Loup Gailly did a tremendous job of porting gzip to smaller systems. It would be beneficial if a few other GNU tools were available on MS-DOS and other operating systems; tar should definitely be one of those tools. These ports for tar theoretically have no priority at all. Nevertheless, a port would be interesting, because tar is so central in GNU distributions, and gzip is already ported.

Some porting efforts have been made in the past. There are traces of a few exchanges on this subject in `BACKLOG'. The GNU tar sources have been modified a lot recently at a cosmetic level, and I would certainly have a hard time integrating older diffs provided by someone else. If people want to port tar to MS-DOS or other non-UNIX systems, they should be committed to supporting their ports after the fact, as I cannot do it myself. When such a commitment exists, we could favorably accept unobtrusive porting code and integrate it into the GNU tar development mainstream. Non-integrated ports are not otherwise documented in this manual.

