ATAD #22 – File Archives on Linux
The two main formats for file archiving are tar and cpio. Both are used in tandem with compression utilities such as gzip and bzip2. The archive formats were initially used with tape and other sequential-access devices for backups; they are now commonly used to collect many files into one larger file for distribution or archiving, while preserving file system information such as user and group permissions, dates, and directory structures.
An important point to consider when using these archives for backups: in the unfortunate event that a portion of the archive gets corrupted, tar will skip the corrupted portion and proceed to the next file, whereas cpio will quit with an error.
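Whichever tool you pick, it helps to detect damage before you actually need the backup. The following is a minimal sketch, not a demonstration of the skip-versus-quit behaviour described above: it builds a small gzip-compressed tar archive, makes a truncated copy to simulate corruption, and checks both. The scratch directory, the file names and the check_archive helper are made up for the demo.

```shell
#!/bin/sh
# Sketch: verify a compressed tar archive before trusting it as a backup.
# All paths and the check_archive helper are hypothetical demo names.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/data"
printf 'payload\n' > "$workdir/data/file.txt"

# Create a small gzip-compressed tar archive of the demo directory.
tar -C "$workdir" -czf "$workdir/backup.tar.gz" data

# Simulate damage by keeping only the first 20 bytes of the archive.
head -c 20 "$workdir/backup.tar.gz" > "$workdir/damaged.tar.gz"

check_archive() {
    # gzip -t checks the compressed stream; tar -t then reads every header.
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

for f in "$workdir/backup.tar.gz" "$workdir/damaged.tar.gz"; do
    if check_archive "$f"; then
        echo "OK: $f"
    else
        echo "DAMAGED: $f"
    fi
done
```

Running a check like this right after creating a backup is cheap compared to discovering the damage at restore time.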
Tar typically takes the names of the files to archive as explicit command-line arguments, whereas cpio reads the list of file names from stdin, so it pairs naturally with find and gives better control over which files get archived.
Eg
# find . -type f -name '*.txt' -print | cpio -o | gzip >all_my_txt.cpio.gz
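For comparison, tar can also be driven from a find-generated list. The sketch below assumes GNU tar (bsdtar behaves similarly) for the -T - and --null options and a find that supports -print0. The scratch directory and file names are invented for the demo. NUL-delimited names are used so that file names containing spaces survive the pipe.

```shell
#!/bin/sh
# Sketch: feed tar a NUL-delimited file list from find.
# Assumes GNU tar (--null, -T -) and find with -print0; paths are demo names.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub dir"
printf 'one\n'  > "$workdir/src/a.txt"
printf 'two\n'  > "$workdir/src/sub dir/b.txt"
printf 'skip\n' > "$workdir/src/c.log"

# Run in a subshell so the caller's working directory is left alone.
(
    cd "$workdir/src"
    find . -type f -name '*.txt' -print0 |
        tar -czf "$workdir/all_my_txt.tar.gz" --null -T -
)

# List the archive to confirm that only the .txt files were picked up.
tar -tzf "$workdir/all_my_txt.tar.gz"
```

GNU cpio offers the same protection through its -0 (--null) option in copy-out mode, used together with find's -print0.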
A useful way to transfer multiple files and directories to a target directory or machine is to write the archive to stdout and then, in the target directory or on the target machine, extract it from the piped stdin.
Eg
# using cpio
# find . -type f -name '*.txt' -print | cpio -o | ssh target_machine "cd /target_dir && cpio -idum"
# using a tarpipe
# tar -cf - "source_dir" | ( cd "target_dir" && tar -xvf - )
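The tarpipe can also be written without the subshell by using tar's -C option, which changes directory before archiving or extracting. This is a minimal sketch assuming a tar that supports -C (GNU tar and bsdtar both do). The scratch directory layout is made up so the example can run anywhere.

```shell
#!/bin/sh
# Sketch: copy a directory tree through a tarpipe using -C instead of cd.
# The scratch layout under $workdir is hypothetical demo data.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/source_dir/nested" "$workdir/target_dir"
printf 'hello\n' > "$workdir/source_dir/nested/file.txt"

# Left side writes the archive to stdout; right side extracts from stdin.
# -p asks tar to keep the recorded permissions when extracting.
tar -C "$workdir" -cf - source_dir | tar -C "$workdir/target_dir" -xpf -

# Show what arrived on the target side.
ls -lR "$workdir/target_dir"
```

To copy to another machine, the right-hand tar can be wrapped in ssh in the same way as the cpio example.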
Many software vendors, such as Red Hat and Oracle, ship their products in cpio format. Eg, an RPM package carries its payload as a cpio archive, which can be extracted using the rpm2cpio utility
# rpm2cpio rpm_name | cpio -ivd
–vinaydeep