QuentinC - 2 months ago 14
Linux Question

Tar a directory, but don't store full absolute paths in the archive

I have the following command in a backup shell script:

tar -cjf site1.bz2 /var/www/site1/


When I list the contents of the archive, I get:

tar -tf site1.bz2
var/www/site1/style.css
var/www/site1/index.html
var/www/site1/page2.html
var/www/site1/page3.html
var/www/site1/images/img1.png
var/www/site1/images/img2.png
var/www/site1/subdir/index.html


But I would like to remove the /var/www/site1 prefix from directory and file names within the archive, to simplify extraction and avoid a useless constant directory structure. You never know: I might one day extract the backed-up websites somewhere web data isn't stored under /var/www.

For the example above, I would like to have:

tar -tf site1.bz2
style.css
index.html
page2.html
page3.html
images/img1.png
images/img2.png
subdir/index.html


That way, when I extract, files land in the current directory and I don't need to move them afterwards, and the sub-directory structure is preserved.

There are already many questions about tar and backups on Stack Overflow and elsewhere on the web, but most of them ask about dropping the entire sub-directory structure (flattening), or just about adding or removing the initial / in the names (I don't know exactly what that changes when extracting), and nothing more.

After reading some of the solutions found here and there, as well as the manual, I tried:

tar -cjf site1.bz2 -C . /var/www/site1/
tar -cjf site1.bz2 -C / /var/www/site1/
tar -cjf site1.bz2 -C /var/www/site1/ /var/www/site1/
tar -cjf site1.bz2 --strip-components=3 /var/www/site1/


But none of them worked the way I want. Some do nothing, and others no longer archive sub-directories.

It's inside a backup shell script launched by cron, so I don't really know which user runs it, or what the path and the current directory are. Absolute paths are therefore required everywhere, and I would prefer not to change the current directory, to avoid breaking something later in the script (it doesn't only back up websites, but also databases, and then sends everything over FTP, etc.).

How can I achieve this?

Have I just misunderstood how the option -C works?

Answer
tar -cjf site1.tar.bz2 -C /var/www/site1 .

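In the above example, tar changes to the directory /var/www/site1 before adding anything, because the option -C /var/www/site1 was given. The trailing . then means "everything in that directory", so the names stored in the archive are relative to it (they are listed as ./style.css, ./images/img1.png and so on) and extract straight into the current directory. In the attempts from the question, the file argument was still the absolute path /var/www/site1/, which tar stores as var/www/site1/... regardless of -C, and --strip-components only takes effect when extracting, not when creating.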
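To check this without touching a real site, here is a minimal sketch that builds a throwaway tree under a temporary directory, archives it with -C, and lists the result. The temporary paths and the variable names (work, site, listing) are stand-ins for illustration, not part of the original script, and compression is left out to keep the sketch small; add -j as in the command above to get bzip2.

```shell
#!/bin/sh
# Sketch: show that "-C DIR ." stores names relative to DIR.
# Everything lives under a temporary directory (a stand-in for /var/www/site1).
set -eu

work=$(mktemp -d)
site="$work/var/www/site1"
mkdir -p "$site/images" "$site/subdir"
: > "$site/index.html"
: > "$site/images/img1.png"
: > "$site/subdir/index.html"

# tar changes into $site first; "." is then resolved relative to that directory.
# The archive itself is given as an absolute path, so -C does not affect it.
tar -cf "$work/site1.tar" -C "$site" .

# Capture and print the member names; none should mention var/www.
listing=$(tar -tf "$work/site1.tar")
printf '%s\n' "$listing"

rm -rf "$work"
```

The listing should contain entries such as ./index.html and ./images/img1.png, with no var/www/site1 prefix.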

From man tar:

OTHER OPTIONS

  -C, --directory DIR
       change to directory DIR
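Since the question worries about changing the current directory inside a cron script, the sketch below illustrates that -C only changes directory inside the tar process, for both creation and extraction; the calling script's working directory stays as it was. The directory names (src, restore) and variables are hypothetical, and compression is again omitted.

```shell
#!/bin/sh
# Sketch: -C on create and on extract, without the script itself ever calling cd.
set -eu

work=$(mktemp -d)
src="$work/src"          # hypothetical stand-in for /var/www/site1
dest="$work/restore"     # hypothetical restore location
mkdir -p "$src/images" "$dest"
echo hello > "$src/index.html"
: > "$src/images/img1.png"

before=$(pwd)
tar -cf "$work/site1.tar" -C "$src" .    # store names relative to $src
tar -xf "$work/site1.tar" -C "$dest"     # unpack directly into $dest
after=$(pwd)

# Record what was restored before cleaning up.
restored=$(cat "$dest/index.html")
if [ -f "$dest/images/img1.png" ]; then have_img=yes; else have_img=no; fi

echo "working directory before: $before"
echo "working directory after:  $after"
echo "sub-directory restored:   $have_img"

rm -rf "$work"
```

Because every path handed to tar is absolute, the same pattern works no matter which directory cron starts the script in.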