
ZIP (file format)

Introduction, History, and Standardization of ZIP file format
- ZIP is an archive file format that supports lossless data compression.
- A ZIP file may contain one or more files or directories, each of which may be compressed.
- The format was created in 1989 as a replacement for the earlier ARC compression format.
- The ZIP file format was designed by Phil Katz and Gary Conway.
- The format spread widely on the internet during the 1990s.
- ISO/IEC 21320-1, "Document Container File - Part 1: Core", was published in 2015; it states that document container files are conforming ZIP files.
- The standard restricts certain features and prohibits encryption, digital signatures, and "patched data" features.

Version history and features of ZIP file format
- The ZIP File Format Specification has its own version number.
- Later versions added support for Deflate64 compression, AES encryption, Unicode filename storage, and additional compression algorithms.
- The latest version is 6.3.10, released in 2022, which added z/OS attribute values and additional third-party Extra Field mappings.

ZIP file structure and internal layout
- Files in a ZIP archive are compressed individually, so entries can be read or added at random without compressing or decompressing the entire archive.
- A central directory is placed at the end of a ZIP file; it lists the files in the archive and their locations.
- ZIP archives can include extra data unrelated to the archive itself, which makes self-extracting archives possible.
- The order of file entries in the central directory does not have to match the order of the files in the archive.
- A ZIP file is identified by the presence of an end of central directory record.
- Local file headers provide redundancy by repeating much of the information held in the central directory.
- ZIP files can be updated by appending new files followed by an updated central directory.

ZIP file signatures, headers, and timestamps
- Each entry in a ZIP archive is introduced by a local file header containing information about the file and optional extra data fields.
- Extra data fields are used to support the ZIP64 format, encryption, file attributes, and higher-resolution timestamps.
- The ZIP format uses specific 4-byte signatures to denote the various structures in the file.
- ZIP archives can be split across multiple file-system files for storage or transmission.
- In the base format, file timestamps have a resolution of only two seconds, matching the FAT file system of DOS.

Compression methods, encryption, ZIP64 support, and compatibility
- The ZIP format supports various compression methods, with DEFLATE being the most common.
- ZIP supports a password-based symmetric encryption system known as ZipCrypto.
- ZIP64 format extensions were introduced to overcome the 4 GiB limits of the original ZIP format.
- Support for ZIP64 varies across software, libraries, and operating systems: some tools support it, while others do not.
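
The layout described above (an end of central directory record at the end of the file, 4-byte signatures, and two-second DOS timestamps) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: a single-file archive without ZIP64 extensions, UTF-8 or ASCII file names, and no comment or payload that happens to contain the end-of-central-directory signature. The function names `decode_dos_datetime` and `list_entries` are hypothetical helpers, not part of any library; only the standard `io`, `struct`, `zipfile`, and `datetime` modules are used.

```python
import io
import struct
import zipfile
from datetime import datetime

EOCD_SIG = b"PK\x05\x06"  # end of central directory record
CDFH_SIG = b"PK\x01\x02"  # central directory file header
LFH_SIG = b"PK\x03\x04"   # local file header


def decode_dos_datetime(dos_date, dos_time):
    """Decode the 16-bit DOS date and time fields into a datetime."""
    return datetime(
        1980 + (dos_date >> 9),   # years are counted from 1980
        (dos_date >> 5) & 0x0F,   # month
        dos_date & 0x1F,          # day
        dos_time >> 11,           # hour
        (dos_time >> 5) & 0x3F,   # minute
        (dos_time & 0x1F) * 2,    # seconds are stored halved (2 s resolution)
    )


def list_entries(data):
    """Return (name, local_header_offset, modified) per central directory entry."""
    # The record sits at the end of the file, so search backwards for it.
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("no end of central directory record found")
    total = struct.unpack_from("<H", data, eocd + 10)[0]      # total entry count
    cd_offset = struct.unpack_from("<L", data, eocd + 16)[0]  # directory start

    entries = []
    pos = cd_offset
    for _ in range(total):
        if data[pos:pos + 4] != CDFH_SIG:
            raise ValueError("bad central directory file header signature")
        dos_time, dos_date = struct.unpack_from("<2H", data, pos + 12)
        name_len, extra_len, comment_len = struct.unpack_from("<3H", data, pos + 28)
        lfh_offset = struct.unpack_from("<L", data, pos + 42)[0]
        # Assumption: names are UTF-8 (legacy archives may use code page 437).
        name = data[pos + 46:pos + 46 + name_len].decode("utf-8")
        entries.append((name, lfh_offset, decode_dos_datetime(dos_date, dos_time)))
        # The fixed part of the header is 46 bytes, followed by variable fields.
        pos += 46 + name_len + extra_len + comment_len
    return entries


# Build a small archive in memory so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(zipfile.ZipInfo("hello.txt", date_time=(2020, 1, 2, 3, 4, 5)), "hello world")
    zf.writestr(zipfile.ZipInfo("dir/b.txt", date_time=(1999, 12, 31, 23, 59, 58)), "b")
data = buf.getvalue()

entries = list_entries(data)
for name, offset, modified in entries:
    # Each offset from the central directory should point at a local file header.
    print(name, offset, modified, data[offset:offset + 4] == LFH_SIG)
```

In this example the first entry is written with a time of 03:04:05 but reads back as 03:04:04, because the odd second cannot be represented in the two-second DOS time field. A robust reader would also need to handle ZIP64 records, archive comments, and multi-file archives, which this sketch deliberately ignores.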