Context

I often find that, when transferring files to a remote server, it is much more convenient to send them as a single archive than to send each file individually.

There are many ways to accomplish this, but my favorite solution is to use zip. As long as both the sending and receiving sides run some flavor of Unix (or Cygwin / Git Bash under Windows), the zip utility is almost always available. And, particularly relevant to me, zip also comes pre-installed on the Linux VMs underlying the Databricks cluster nodes on MS Azure, whereas tar is not directly available.

This post is intended to serve as a quick reference on the usage of the Unix zip command.

Usage of the zip Command

# Zip up a directory
zip -r archive.zip directory_to_zip

# Unzip archive into the current directory
unzip archive.zip

# Delete some files from an existing archive
zip -d archive.zip unwanted_file.txt

Bonus: Using the Python zipfile module

Python's standard library comes with the zipfile module, which can handle operations on .zip files directly from within Python. For one-off tasks I personally prefer the command line, but when working with many zip files it makes sense to automate the process using Python. Here is a quick pointer on how this can be done:

import zipfile
# Extract every member of the archive into target_directory
with zipfile.ZipFile("archive.zip", "r") as zip_file:
    zip_file.extractall("target_directory")
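
For the many-archives case mentioned above, a minimal sketch might loop over every .zip file in a folder and extract each one into its own subdirectory. The function name `extract_all_zips` and the folder names in the usage note are hypothetical, chosen only for illustration.

```python
import zipfile
from pathlib import Path


def extract_all_zips(source_dir, target_dir):
    """Extract every .zip file in source_dir into target_dir/<archive name>/."""
    extracted = []
    for zip_path in sorted(Path(source_dir).glob("*.zip")):
        # e.g. "reports.zip" is unpacked into target_dir/reports/
        destination = Path(target_dir) / zip_path.stem
        with zipfile.ZipFile(zip_path, "r") as zip_file:
            zip_file.extractall(destination)
        extracted.append(destination)
    return extracted
```

A call such as `extract_all_zips("downloads", "unpacked")` would return the list of directories it wrote to, which is handy for logging or follow-up processing.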

While zipfile is purpose-built for handling .zip files, the shutil module has options for packing and unpacking archives as well. It can be preferable in certain situations because it handles multiple types of archives and, when unpacking, is able to detect the correct type and compression format from the file extension. (See Stack Overflow)

# Make archive
import shutil
# The base name is given without an extension; make_archive appends ".zip" itself
shutil.make_archive('archive', 'zip', 'directory_to_zip')

# Unpack an archive
import shutil
shutil.unpack_archive('archive.zip', 'destination_directory')
# Note: Also accepts `pathlib.Path` objects instead of strings
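
To make the shutil behavior concrete, here is a small round-trip sketch. It runs inside a temporary directory and creates a sample file `hello.txt`, both of which are assumptions made only so the example is self-contained. One detail worth noting is that `make_archive` expects the base name without an extension and returns the full path of the archive it wrote.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    source = workdir / "directory_to_zip"
    source.mkdir()
    (source / "hello.txt").write_text("hello")

    # The base name carries no extension; make_archive appends ".zip" itself.
    archive_path = shutil.make_archive(
        str(workdir / "archive"), "zip", root_dir=str(source)
    )

    # unpack_archive infers the format from the ".zip" extension.
    destination = workdir / "destination_directory"
    shutil.unpack_archive(archive_path, destination)

    restored_text = (destination / "hello.txt").read_text()
```

Because `root_dir` points at the source directory, the files end up at the top level of the archive rather than nested under `directory_to_zip`.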

Bonus: How to use tar instead

On some more exotic systems, however, zip doesn't come pre-installed but tar does. One example is Alpine Linux, which powers iSH, a Linux shell for iOS.

So, for completeness, here is how we can package and unpack a collection of files with tar if zip is not available or if we are faced with a tar file from another source:

# Create a tar archive from a directory
tar -cf archive_name.tar directory_to_archive
# c: create 
# f: filename of the new tar archive

# Unpack a tar archive
tar -xf archive_name.tar
# x: extract

Tar files don't use any compression by default. They are simply a way to package multiple files together. If we want compression, we can add it by running gzip or bzip2 on the tar file:

# Optional: Create a tar file and compress it to save space

# Using gzip (replaces archive.tar with archive.tar.gz)
tar -cf archive.tar directory_to_archive
gzip archive.tar

# Using bzip2 (slower but higher compression; produces archive.tar.bz2)
tar -cf archive.tar directory_to_archive
bzip2 archive.tar

# Uncompress and unpack the archive
gunzip archive.tar.gz    # restores archive.tar
tar -xf archive.tar

bunzip2 archive.tar.bz2  # restores archive.tar
tar -xf archive.tar
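
Since this post already reaches for Python when handling zip files, the same tar workflow can be sketched with the standard library's tarfile module. The post itself does not cover tarfile, so treat this as an assumed alternative rather than part of the original recipe. The temporary directory and the sample file `notes.txt` exist only to keep the example self-contained.

```python
import tarfile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    source = workdir / "directory_to_archive"
    source.mkdir()
    (source / "notes.txt").write_text("some notes")

    archive_path = workdir / "archive.tar.gz"
    # "w:gz" creates the tar archive and gzip-compresses it in one step;
    # "w:bz2" would use bzip2 instead, and plain "w" applies no compression.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # "r:*" lets tarfile detect the compression when reading.
    with tarfile.open(archive_path, "r:*") as tar:
        member_names = tar.getnames()
        tar.extractall(workdir / "unpacked")

    restored_text = (workdir / "unpacked" / source.name / "notes.txt").read_text()
```

As with any extraction tool, this is best used only on archives from a trusted source, since member paths inside a tar file are controlled by whoever created it.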

When I started writing this post, my intention was only to provide a quick reference on how to use the zip/unzip Unix commands.

It turned out to cover a few more approaches, and I am happy to have them all here in one place for reference. That said, for simple transfers zip remains my go-to solution.
