/programs/fs/unzip60/proginfo/3rdparty.bug |
---|
0,0 → 1,114 |
Known, current PKZIP bugs/limitations: |
------------------------------------- |
- PKZIP 2.04g is reported to corrupt some files when compressing them with |
the -ex option; when tested, the files fail the CRC check, and comparison |
with the original file shows bogus data (6K in one case) embedded in the |
middle. PKWARE apparently characterized this as a "known problem." |
- PKUNZIP 2.04g considers volume labels valid only if originated on a FAT |
file system, but other OSes and file systems (e.g., Amiga and OS/2 HPFS) |
support volume labels, too. |
- PKUNZIP 2.04g can restore volume labels created by Zip 2.x but not by |
PKZIP 2.04g (OS/2 DOS box only??). |
- PKUNZIP 2.04g gives an error message for stored directory entries created |
under other OSes (although it creates the directory anyway), and PKZIP -vt |
does not report the directory attribute bit as being set, even if it is. |
- PKZIP 2.04g mangles unknown extra fields (especially OS/2 extended attri- |
butes) when adding new files to an existing zipfile [example: Walnut Creek |
Hobbes March 1995 CD-ROM, FILE_ID.DIZ additions]. |
- PKUNZIP 2.04g is unable to detect or deal with prepended junk in a zipfile, |
reporting CRC errors in valid compressed data. |
- PKUNZIP 2.04g (registered version) incorrectly updates/freshens the AV extra |
field in authenticated archives. The resultant extra block length and total |
extra field length are inconsistent. |
- [Windows version 2.01] Win95 long filenames (VFAT) are stored OK, but the |
file system is always listed as ordinary DOS FAT. |
- [Windows version 2.50] NT long filenames (NTFS) are stored OK, but the |
file system is always listed as ordinary DOS FAT. |
- PKZIP 2.04 for DOS encrypts using the OEM code page for 8-bit passwords, |
while PKZIP 2.50 for Windows uses Latin-1 (ISO 8859-1). This means an |
archive encrypted with an 8-bit password with one of the two PKZIP versions |
cannot be decrypted with the other version. |
- PKZIP for Windows GUI (v 2.60), PKZIP for Windows command line (v 2.50) and |
PKZIP for Unix (v 2.51) save the host's native file timestamps, but |
only in a local extra field. Thus, timestamp-related selections (update |
or freshen, in both extraction and archiving operations) use the DOS-format |
localtime records in the Zip archives for comparisons. As a result, the |
program may make wrong decisions when updating archives that were |
previously created in a different local time zone. |
- PKZIP releases newer than PKZIP for DOS 2.04g (PKZIP for Windows, both |
GUI v 2.60 and console v 2.50; PKZIP for Unix v 2.51; probably others too) |
use different code pages for storing filenames in central (OEM Codepage) |
and local (ANSI / ISO 8859-1 Codepage) headers. When a stored filename |
contains extended-ASCII characters, the local and central filename fields |
do not match. As a consequence, Info-ZIP's Zip program treats such |
archives as corrupt and refuses to modify them. Beginning |
with release 5.41, Info-ZIP's UnZip contains a workaround to list AND |
extract such archives with the correct filenames. |
Maybe PKWARE has implemented this "feature" to allow extraction of their |
"made-by-PKZIP for Unix/Windows" archives using old (v5.2 and earlier) |
versions of Info-ZIP's UnZip for Unix/WinNT ??? (UnZip versions before |
v 5.3 assumed that all archive entries were encoded in the codepage of |
the UnZip program's host system.) |
- PKUNZIP 2.04g is reported to have problems with archives created on and/or |
copied from Iomega ZIP drives (irony, eh?). |
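
The two code-page items above (8-bit passwords and extended-ASCII filenames) both come down to the same accented character having a different byte value in the DOS OEM code page and in Latin-1. The following is a minimal sketch of that mismatch. It assumes the OEM code page is CP437 (the text only says "OEM code page"), covers just a handful of characters, and uses a table and function name invented for this example; it is not taken from PKZIP or Info-ZIP code.

```c
#include <stddef.h>

/* A few CP437 (assumed OEM code page) bytes and their ISO 8859-1 values. */
struct cp_pair {
    unsigned char oem;
    unsigned char latin1;
};

static const struct cp_pair cp_table[] = {
    { 0x81, 0xFC },  /* u with diaeresis */
    { 0x82, 0xE9 },  /* e with acute     */
    { 0x84, 0xE4 },  /* a with diaeresis */
    { 0x94, 0xF6 },  /* o with diaeresis */
    { 0xE1, 0xDF }   /* sharp s          */
};

/* Translate one OEM byte to Latin-1.  7-bit ASCII passes through
 * unchanged; 8-bit bytes missing from this deliberately tiny table
 * yield 0. */
unsigned char oem_to_latin1(unsigned char c)
{
    size_t i;

    if (c < 0x80)
        return c;
    for (i = 0; i < sizeof(cp_table) / sizeof(cp_table[0]); i++)
        if (cp_table[i].oem == c)
            return cp_table[i].latin1;
    return 0;
}
```

A password or filename containing "e with acute" is therefore the byte 0x82 on one side and 0xE9 on the other, so a byte-wise comparison (or a key derived from the raw bytes) cannot match.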
Known, current WinZip bugs/limitations: |
-------------------------------------- |
- [16-bit version 6.1a] NT short filenames (FAT) are stored OK, but the |
file system is always listed as NTFS. |
- WinZip doesn't allow 8-bit passwords, which means it cannot decrypt an |
archive created with an 8-bit password (by PKZIP or Info-ZIP's Zip). |
- WinZip (at least Versions 6.3 PL1, 7.0 SR1) fails to remove old extra |
fields when freshening existing archive entries. When updating archives |
created by Info-ZIP's Zip that contain UT time stamp extra field blocks, |
UnZip cannot display or restore the updated (DOS) time stamps of the |
freshened archive members. |
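
Both the PKZIP timestamp item and the WinZip item above hinge on the DOS-format time stamp stored with every Zip entry: two 16-bit fields holding local wall-clock time, with 2-second resolution and no time-zone information. A minimal sketch of decoding them follows. The struct and function names are invented for this example and are not taken from UnZip.

```c
/* Broken-down DOS time stamp (local time, no time zone). */
struct dos_stamp {
    int year, month, day;
    int hour, minute, second;
};

/* Decode the 16-bit DOS date and time fields of a Zip entry.
 *   date: bits 15-9 = year - 1980, bits 8-5 = month, bits 4-0 = day
 *   time: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = second / 2
 */
void dos_decode(unsigned int dos_date, unsigned int dos_time,
                struct dos_stamp *out)
{
    out->year   = (int)((dos_date >> 9) & 0x7F) + 1980;
    out->month  = (int)((dos_date >> 5) & 0x0F);
    out->day    = (int)(dos_date & 0x1F);
    out->hour   = (int)((dos_time >> 11) & 0x1F);
    out->minute = (int)((dos_time >> 5) & 0x3F);
    out->second = (int)(dos_time & 0x1F) * 2;
}
```

Because nothing in these fields records the zone in which the archive was made, two archives written at the same instant in different zones carry different values, which is why "update" and "freshen" comparisons based on them can go wrong.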
Known, current other third-party Zip utils bugs/limitations: |
------------------------------------------------------------ |
- Asi's PKZip clones for Macintosh (versions 2.3 and 2.10d) are thoroughly |
broken. They create invalid Zip archives! |
a) For the first entry, both compressed size and uncompressed length |
are recorded as 0, despite the fact that compressed data of non-zero |
length has been added. |
b) Their program creates extra fields with an (undocumented) internal |
structure that violates the requirements of PKWARE's Zip format |
specification document "appnote.txt": Their extra field seems to |
contain pure data; the 4-byte block header consisting of block ID |
and data length is missing. |
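
Item b) above, like the AV extra field item in the PKZIP list, concerns the extra-field layout: a sequence of blocks, each introduced by a 4-byte header (block ID and data length, 2 bytes each, little-endian). The sketch below walks such a field and reports whether the block lengths add up to the total extra field length. The function names are made up for illustration; this is not UnZip's own parser.

```c
#include <stddef.h>

/* Read a 16-bit little-endian value. */
static unsigned int le16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Walk an extra field of ef_len bytes.  Returns the number of blocks
 * found, or -1 if a block header is truncated or a block claims more
 * data than the field contains. */
int count_extra_blocks(const unsigned char *ef, size_t ef_len)
{
    size_t pos = 0;
    int blocks = 0;

    while (pos < ef_len) {
        size_t data_len;

        if (ef_len - pos < 4)
            return -1;                 /* no room for ID + length */
        data_len = le16(ef + pos + 2); /* bytes 2..3: data length */
        pos += 4;
        if (data_len > ef_len - pos)
            return -1;                 /* block overruns the field */
        pos += data_len;
        blocks++;
    }
    return blocks;
}
```

Headerless "pure data" of the kind described in b) will usually fail this walk, because whatever happens to sit in bytes 2 and 3 is read as a length.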
Possibly current PKZIP bugs: |
--------------------------- |
- PKZIP (2.04g?) can silently ignore read errors on network drives, storing |
the correct CRC and compressed length but an incorrect and inconsistent |
uncompressed length. |
- PKZIP (2.04g?), when deleting files from within a zipfile on a Novell |
drive, sometimes only zeros out the data while failing to shrink the |
zipfile. |
Other limitations: |
----------------- |
- PKZIP 1.x and 2.x encryption has been cracked (known-plaintext approach; |
see http://www.cryptography.com/ for details). |
[many other bugs in PKZIP 1.0, 1.1, 1.93a, 2.04c and 2.04e] |
/programs/fs/unzip60/proginfo/CONTRIBS |
---|
0,0 → 1,262 |
This is a partial list of contributors to Info-ZIP's UnZip and the code upon |
which it is based. Others have also contributed, and if you are among them, |
please let us know (don't be shy!). Everyone who contributed via the Info- |
ZIP digest *should* now be listed here, but oversights are entirely possible. |
Mark Adler decryption, inflate, explode, funzip code; misc. casts |
Steve Alpert VMS rms.h bugfix |
Jeffrey Altman inflate.c huft_build() bugfix |
Glenn Andrews MS-DOS makefiles; prototyping bugfix; bogus main() fix |
Andrei Arkhipov Solaris 2.x package files |
Joel Aycock descrip.mms bugfix |
Vance Baarda Novell Netware 4.x NLM port |
Eric Baatz Borland version() info; Solaris zipgrep packaging fix |
Bob Babcock DOS volume-label code (FCBs) |
Charles Bailey VMS_SEVERITY fix; VMSWILD () extension |
Audrey Beck "WHERE" file info for AOL OS/2 forum |
Myles Bennet Initial start of Zip64 and Large-File handling |
Mike Bernardi Unix makefile entry; CIX uploads |
James Birdsall extract.c/makefile/NT stuff, etc.; awesome beta tester |
Allan Bjorklund in misc.c |
Denise Blakeley Unix makefile entry |
Wim Bonner original OS/2 port; Unix makefile entry |
Paul Borman BSD/386 (BSDI) fixes; Unix makefile entry |
Carlton Brewster mapname bugfix |
Marcus Brinkmann Unix configuration fix for GNU/Hurd |
Rodney Brown stdin-/dev/null bugfix; VMS error levels; CRC optimiz. |
Stan Brown "zipinfo -M"/isatty(1) bugfix |
Jens von Buelow port to MPE/iX, a Unix variant running on HP 3000 |
John Bush first full Amiga port; FileDate; Amiga fixes; etc. |
Christian Carey Unix makefile bugfix for install target (create dirs) |
Valter Cavecchia Unix makefile entry |
Rudolf Cejka Unix UID/GID extraction bugfix |
Peter Chang optional UNIXBACKUP option (-B) |
Kevin Cheng windll MBCS fix (setlocale initialization) |
Andrey Chernov BSD 4.4 utime fix |
Brad Clarke Win32 XX_flag bugfix; Borland debug code removal |
Mark Clayton LynxOS (unix/Makefile update) |
John Cowan mods to original match.c; other stuff? |
Frank da Cruz xxu.c, on which original mapname.c was based |
Bill Davidsen -q(q); mapname stuff; envargs; Xenix stuff; opts; etc. |
Karl Davis Acorn RISC OS port |
Jim Delahanty NTSD fixes |
Harald Denker major Atari update/fixes |
Matt "Doc" D'Errico AIX stuff, Unix makefile entry |
Kim DeVaughn Unix makefile entry |
Arjan de Vet various things, but I don't remember exactly what... |
Frank Donahoe djgpp v2.x makefile; documentation updates |
Jean-Michel Dubois THEOS port |
James Dugal ZMEM stuff; unshrink bugfix; file perms stuff; etc. |
Jim Dumser -z stuff; umask, opendir/Borland, UID fixes; etc. |
Peter Eckel DOS buffer-overrun fix |
Mark Edwards mapname.c, misc.c fixes; Unix makefile entry |
Paul Eggert man pages update for POSIX compatibility |
Gershon Elber Unix makefile entry |
Patrick Ellis VMS usage fix (`-' vs. `/' options) |
Shane Erstad Borland makefile bugfix |
Thomas Esken Acorn typo fix |
Bruce Evans Unix makefile entry |
Derek Fawcus FlexOS port |
David Feinleib Windows NT port |
David Fenyes Unix makefile entry |
Scott Field Windows NT security-descriptor support; CRC opts |
Greg Flint Unix makefile entry |
Carl Forde VM/CMS port debugging (with Christian Spieler) |
Craig Forbes "UnZipToMem with no ucsize in local header" bugfix |
Joe Foster Unix makefile bugfix |
Gordon Fox Unix makefile bugfix for apollo target |
Jeffrey Foy OS/2 stuff(?); [CP/M] |
Mike Freeman VMS gcc makefiles; VMS bugfixes; etc. |
Kevin Fritz Borland bugfixes; MS-DOS makefile fixes; etc. |
Aaron Gaalswyk OS/2 checkdir() fix |
Jean-loup Gailly decryption code; ReadByte replacement; much nagging :-) |
Forrest Gehrke Unix makefile entry |
Tim Geibelhaus Unix makefile entry |
Henry Gessau flush/Fwrite/outcnt fixes; new NT port |
Christian Ghisler inflate tweaks |
Filip Gieszczykiewicz Unix makefile entry |
Paul Gilmartin work-around for systems with broken catman/makewhatis |
Hunter Goatley VMSCLI interface; VMS help/RUNOFF; list maintainer |
Ian E. Gorman VM/CMS & MVS support |
Bill Gould MVS file-format fixes |
Michael Graff Unix makefile entry |
Juan Manuel Guerrero DOS/WIN32 filename mapping fixes, device name handling |
Giuseppe Guerrini LynxOS variant of Unix port |
Richard H. Gumpertz Unix makefile entry |
Walter Haidinger Amiga SAS/C fixes |
Steve Hanna Macintosh stuff |
Mark Hanning-Lee docs corrections, Unix Makefile fixes, "check" target |
Guy Harris ZipInfo man-page typo fix |
Greg Hartwig finished VM/CMS port |
Robert Heath Windows GUI port (WizUnZip) |
Dave Heiland new usage screen |
Ron Henderson -a bugfix |
Chris Herborth new Atari port; Atari fixes |
Greg Hill docs update |
Lon Hohberger security fix; security advice in man-page |
John Hollow "WHERE" file path corrections |
Jason Hood DOS screen-width support |
Phil Howard Unix makefile entry |
Jonathan Hudson SMS/QDOS port |
Joe Isuzu Unix makefile entry |
Kimio Itoh ZipInfo DIR_END bugfix for MSVC 4.0 |
Aubrey Jaffer pixel, v7 targets |
"jelmer" directory traversal security fix |
Graham Jenkins Sequent Dynix/ptx bugfix |
Peter Jones Unix makefile entry |
Larry Jones ZMEM stuff; unimplod fix; crc_i386.S improvements; etc. |
Warren Jones MKS bugfix |
Kjetil J{\o}rgenson Makefile, OSF/1, NetBSD fixes; djgpp v2 mods; USE_VFAT |
Bruce Kahn DOS floppy detection?; Unix makefile entry |
Bob Kemp NOTINT16 rewrite; Unix makefile entry |
J. Kercheval filmatch.c, on which second match.c was based |
Paul Kienitz continuing general Amiga porting; Aztec C support; ASM |
Raymond L. King WINDLL VB example maintenance |
Mike Kincer AIX "ps2" bugfix |
David Kirschbaum mapname port; general-purpose meddling; Python jokes |
Paul Klahr Regulus port |
Jim Knoble Turbo C++ makefile fix |
Alvin Koh Borland C++ bugfixes |
D. Krumbholz Acorn filetime conversion bug |
Karel Kubat Linux strncasecmp bugfix |
Bo Kullmar -z code; umask, do_string, BSD time, echo fixes; etc. |
Peter Kunath DLL bugfixes, MSVC __asm support |
Russell Lang OS/2 DLL calling-convention bugfix |
Michael Lawler Borland version() info; process.c string fix; DOS fixes |
Rudolf Lechleitner inflate memory leak fix |
Johnny Lee Macintosh port; Win3.1 port; far strings; fixes; etc. |
Alexander Lehmann makefile.tc bugfix; MS-DOS mapname() bugfix |
Marty Leisner Unix perms fix for non-Unix dirs; man pages fonts; etc. |
Fred Lenk docs e-mail bugfix |
Daniel Lewart AIX stuff; compiler warnings |
Jim Lill SCO Unix SYSNDIR bugfix |
John Limpert Unix makefile entry |
Hogan Long Borland preprocessor bugfix |
Mike Long Unix Makefile installation bugfix |
Warner Losh in misc.c |
Dave Lovelace Data General AOS/VS port |
Stew Loving-Gibbard original Windows 16-bit DLL port (non-WizUnZip version) |
Dale Lutz \-to-/ conversion argv/argc bugfix |
Tony Luu NT timezone bugfix |
Igor Mandrichenko vms.c; many improvements and VMS modifications |
Javier Manero fileio.c bugfix; MS-DOS version() bugfix; Watcom fix |
Paul Manno makefile.tc fixes |
Claude Marinier Unix makefile recursive fix |
Fulvio Marino revised UnZip and ZipInfo man pages; Makefile entry |
Carl Mascott original Unix port |
Rafal Maszkowski Convex unzip.h fixes; Unix makefile entry |
Jim Mathies signal handler installing bugfix |
Eberhard Mattes handler() bugfix; docs update |
Adrian Maull .NET C# example projects for Zip and UnZip dll |
Peter Mauzey Unix makefile entry |
Scott Maxwell version.h; massive reentrancy fixes; OS/2 DLL port |
Bob Maynard 16-bit OS/2 pathname bugfix |
Randy McCaskile Unix makefile entry |
John McDonald OS/2 zip2exe script |
Gene McManus -o code |
Joe Meadows file.c, on which VMSmunch.c (timestamps) was based |
Jason Merrill Sequent patches |
Tom Metro corrupted-zipfile handler bugfix |
Ian Miller VMS makefile portability bugfix (non-standard "edit") |
Steve Miller Windows CE GUI port; memory leak bugfix; etc. |
Ricky Mobley Unix makefile entry |
Navin Modi Unix makefile entry |
Sergio Monesi Acorn RISC OS port |
Paul Motsuk Borland _rtl_chmod() fix |
Anthony Naggs MS-DOS error handling stuff |
Jim Neeland unused-variables fix; Unix makefile entry |
Harry Nyberg Macintosh INSTALL info |
Mauricio Ponzo UNIXBACKUP fix |
NIIMI Satoshi Human68k port |
Mike O'Carroll early OS/2 stuff |
Michael D. O'Connor DOS ifdef/elif mismatch fix; makefile.tc fixes |
"Moby" Dick O'Connor Unix makefile entry |
Thomas Opheys Watcom C stat() bugfix |
Humberto Ortiz-Zuazaga Linux port; permissions bugfix; missing declarations |
Keith Owens MVS support and extensions |
Fernando Papa inflate memory leaks |
Rafael Pappalardo Convex CRYPT bugfix; Convex Makefile entry, useful info |
Trevor Paquette Unix makefile entry |
Keith Petersen Pyramid fixes; former Info-ZIP list maintainer |
George Petrov initial MVS, VM/CMS ports (!) |
Alan Phillips Unix makefile entry |
Art Pina C Set/2 crypt.c optimization bug |
Piet W. Plomp Unix chmod()/chown() fix; msc_dos fixes; much testing |
Norbert Pueschel Amiga timelib |
Clint Pulley Unix makefile entry |
Antonio Querubin, Jr. descrip.mms (VMS makefile) |
Alistair Rae Encore preprocessor bugfix |
Eric S. Raymond manpage tweaks for DocBook compatibility |
Wally Reiher timezone bugfix |
Stephen Ritcey vms/README installation correction |
Phil Ritzenthaler ANSIfication bugfix |
Simon Roberts Windows CE 2.1x/3.0 cmdline port |
David Robinson MSC 6.0 stat() bugfix |
Jochen Roderburg floating-point BSD4_4 fix, Borland _timezone fix; etc. |
Greg Roelofs maintainer/principal author; ZipInfo; unshrink; etc. |
Kai Uwe Rommel "real" OS/2 port; many new compilers; bugfixes; etc. |
Paul Roub first self-extracting code |
Shimazaki Ryo human68k port updates |
Steve Salisbury Win32 fixes; dual-mode SFX instruct.; variable INBUFSIZ |
Darren Salt Acorn filetype <-> "Acorn NFS filetype" translation |
Georg Sassen Amiga DICE compiler support |
Jon Saxton date formats, OS/2 fixes |
Tom Schmidt Unix makefile entry; Xenix and SunOS 3 target bugfixes |
Hugh Schmidt VMS stuff |
Doug Schuessler Tandem/NSK port fixes |
Steven M. Schweda VMS: adapt new 7+ features, I/O performance enhanced |
Martin Schulz original Atari port, symlinks bugfix |
Charles Scripter various bug reports and bugfixes |
Chris Seaman Unix time stuff |
Richard Seay MS-DOS Quick C makefile |
Peter Seebach fUnZip int main() bugfix |
Matthew Seitz keep inherited SGID attrib for created dirs on Unix |
Gisbert Selke Unix makefile entry |
Alex Sergejew fileio.c, stat(), Makefile fixes; Down Under jokes :-) |
Jim Seymour Borland OS/2 fixes |
Mark Shadley Unix -X, FGETCH, DESTROYGLOBALS & Unix makefile fixes |
Miki Shapiro DLL: zipfilehandle leak (unhandled lseek errors) |
Timur Shaporev inflate optimizations |
Eric Siegerman bugfix for Unix' port attribute mapper |
Paul Slootman partial fix >2G handling on 64bit file offset systems |
Dave Smith Tandem/NSK port |
Fred Smith Coherent 4.0 fixes |
Nick Smith return code for user abort (control-C) |
Samuel H. Smith original unzip code (Pascal and C) for MS-DOS |
Tuomo Soini file-not-matched bugfix |
Jim Spath zipinfo -T century bugfix |
Christian Spieler VMS, DOS, WIN32, VM/CMS, portability & performance |
Cliff Stanford fileio.c umask bug |
Jack Stansbury DEC Alpha NT makefile fix |
Dan Statkus OS/2, MS-DOS mapname() ASCII 255 bugfix |
Jochen Stein Unix makefile entry |
Jim Steiner Unix makefile entry |
Richard Stephen Unix makefile entry |
Wayne Stewart "WHERE" file MS-DOS correction |
Mike Strock Win32 MSVC 5.0 "build file"; typo fixes |
E-Yen Tan djgpp1/GNUmake 3.71 bug work-around; DOS makefile.emx |
Brian Tillman "WHERE" file VMS fix; make_unz.com portability bugfix |
Cosmin Truta Cygwin support; various C & ASM fixes |
Onno van der Linden many fixes, esp. Intel Unix and 386 DOS |
Jim Van Zandt one of original man pages |
Geraldo Veiga Pyramid strrchr/rindex |
Erik-Jan Vens Unix makefile entry |
Antoine Verheijen new Mac port; Mac bugfixes; MTS/EBCDIC stuff; etc. |
Santiago Vila -t stderr/stdout fix |
Rich Wales former Info-ZIP moderator and zip guy; MKS stuff |
Frank Wancho original TOPS-20 port |
Douglas Wegscheid djgpp 2.x USE_LFN compatibility fix |
Yohanan Weininger docs update |
Paul Weiss unzipsfx bugfix |
Paul Wells original Amiga port for SAS/C and Lattice C (?) |
Mike White Windows GUI port version 3; 16- and 32-bit Windows DLLs |
Rainer Wilcke HP/UX termios bugfix; man-page fixes |
Charles Wilson Cygwin support |
Greg Woods man-pages bugfixes |
Mark Wright original Netware 3.11 NLM port |
Randy Wright Unix makefile entry |
Meiwei Wu open() return bugfix |
Steve Youngs win32 timestamp conversion bugfix |
Clay Zahrobsky .zip/wildcard bugfix |
Eli Zaretskii improvements to DOS-mode VFAT support; djgpp 2.x fixes |
Martin P.J. Zinser VMS .hlp file for unzipsfx; MAKESFX.COM command file |
/programs/fs/unzip60/proginfo/Contents |
---|
0,0 → 1,13 |
Contents of the "proginfo" subdirectory for UnZip 5.42 and later: |
Contents this file |
CONTRIBS list of contributors to UnZip |
ZipPorts Info-ZIP rules and guidelines on contributions to the cause |
3rdparty.bug known bugs in PK[UN]ZIP, WinZip, etc. |
defer.in info about the NEXTBYTE macro and defer/undefer_input functions |
extrafld.txt info about all known "extra field" types |
fileinfo.cms info about the VM/CMS file system, including record formats |
nt.sd info about support for Windows NT's Security Descriptors (ACLs) |
perform.dos relative performance of Zip and UnZip with various DOS compilers |
timezone.txt explanation of the TZ environment variable for timezones |
ziplimit.txt limits of the Zip archive format and Info-ZIP's implementation |
/programs/fs/unzip60/proginfo/ZipPorts |
---|
0,0 → 1,285 |
__________________________________________________________________________ |
This is the Info-ZIP file ZipPorts, last updated on 17 February 1996. |
__________________________________________________________________________ |
This document defines a set of rules and guidelines for those who wish to |
contribute patches to Zip and UnZip (or even entire ports to new operating |
systems). The list below is something between a style sheet and a "Miss |
Manners" etiquette guide. While Info-ZIP encourages contributions and |
fixes from anyone who finds something worth changing, we are also aware |
of the fact that no two programmers have the same programming style and that |
unrestrained changes by a few dozen contributors would result in hideously |
ugly (and unmaintainable) Frankenstein code. So consider the following an |
attempt by the maintainers to maintain sanity as well as useful code. |
(The first version of this document was called either "ZipRules" or the |
"No Feelthy ..." file and was compiled by David Kirschbaum in consulta- |
tion with Mark Adler, Cave McNewt and others. The current incarnation |
expands upon the original with insights gained from a few more years of |
happy hacking...) |
Summary: |
(0) The Platinum Rule: DON'T BREAK EXISTING PORTS |
(0.1) The Golden Rule: DO UNTO THE CODE AS OTHERS HAVE DONE BEFORE |
(0.2) The Silver Rule: DO UNTO THE LATEST BETA CODE |
(0.3) The Bronze Rule: NO FEELTHY PIGGYBACKS |
(1) NO FEELTHY TABS |
(2) NO FEELTHY CARRIAGE RETURNS |
(3) NO FEELTHY 8-BIT CHARS |
(4) NO FEELTHY LEFT-JUSTIFIED DASHES |
(5) NO FEELTHY FANCY_FILENAMES |
(6) NO FEELTHY NON-ZIPFILES AND NO FEELTHY E-MAIL BETAS |
(7) NO FEELTHY E-MAIL BINARIES |
Explanations: |
(0) The Platinum Rule: DON'T BREAK EXISTING PORTS |
No doubt about it, this is the one which really pisses us off and |
pretty much guarantees that your port or patch will be ignored and/ |
or laughed at. Examples range from the *really* severe cases which |
"port" by ripping out all of the existing multi-OS code, to more |
subtle oopers like relying on a local capability which doesn't exist |
on other OSes or in older compilers (e.g., the use of ANSI "#elif" |
or "#pragma" or "##" constructs, C++ comments, GNU extensions, etc.). |
As to the former, use #ifdefs for your new code (see rule 0.3). And |
as to the latter, trust us--there are few things we'd like better |
than to be able to use some of the elegant "new" features out there |
(many of which have been around for a decade or more). But our code |
still compiles on machines dating back even longer, at least in spirit |
--e.g., the AT&T 3B1 family and Dynix/ptx. Until we say otherwise, |
dinosaurs are supported. |
(0.1) The Golden Rule: DO UNTO THE CODE AS OTHERS HAVE DONE BEFORE |
In other words, try to fit into the local style of programming--no |
matter how painful it may be. This includes cosmetic aspects like |
indenting the same amount (both in the main C code and in the in- |
clude files), using braces and comments similarly, NO TABS (see rule |
#1), etc.; but also more substantive things like (for UnZip) putting |
character strings into static (far) variables and using the LoadFar- |
String macros to avoid overflowing limited MS-DOS data segments, and |
using the ugly Info() macro instead of the more usual *printf() |
functions so that dynamic-link-library ports are simpler. NEVER put |
single-OS code (e.g., OS/2) of more than two or three lines into the |
main (generic) modules; those are shared by everybody, and nobody else |
cares about it or wants to see it. |
Note that not only do Zip and UnZip differ in these respects, so do |
individual parts of each program. While it would be nice to have |
global consistency, cosmetic changes are not a high priority; for |
now we'll settle for local consistency--i.e., don't make things any |
worse than they already are. |
Exception (BIG exception): single-letter variable names. Despite |
the prevailing practice in much of Zip and parts of UnZip, and de- |
spite the fact that one-letter variables allow you to pack really |
cool, compact and complicated expressions onto one line, they also |
make the code very difficult to maintain and are therefore *strongly* |
discouraged. Don't ask us who is responsible in the first place; |
while this sort of brain damage is not uncommon among former BASIC |
programmers, it is nevertheless a lifelong embarrassment, and we do |
try to pity the poor sod (that is, when we're not chasing bugs and |
cursing him). :-) |
(0.2) The Silver Rule: DO UNTO THE LATEST BETA CODE |
Few things are as annoying as receiving a large patch which obviously |
represents a lot of time and careful work but which is relative to |
an old version of Info-ZIP code. As wonderful as Larry Wall's patch |
program is at applying context diffs to modified code, we regularly |
make near-global changes and/or reorganize big chunks of the sources |
(particularly in UnZip), and "patch" can't work miracles--big changes |
invariably break any patch which is relative to an old version of the |
code. |
Bottom line: contact the Info-ZIP core team FIRST (via the zip-bugs |
e-mail address) and get up to date with the latest code before begin- |
ning a big new port. And try to *stay* up to date while working on |
your port--at least, as much as possible. |
(0.3) The Bronze Rule: NO FEELTHY PIGGYBACKS |
UnZip is currently ported to something like 12 operating systems |
(a few more or less depending on how one counts), and each of these, |
with the possible exception of VM/CMS, has a unique macro identifying |
it: AMIGA, ATARI_ST, __human68k__, MACOS, MSDOS, MVS, OS2, TOPS20, |
UNIX, VMS, WIN32. Zip is moving in the same direction. New ports |
should NOT piggyback one of the existing ports unless they are sub- |
stantially similar--for example, Minix and Coherent are basically Unix |
and therefore are included in the UNIX macro, but DOS djgpp ports and |
OS/2 emx ports (both of which use the Unix-originated GNU C compiler |
and often have "unix" defined by default) are obviously *not* Unix. |
[The existing MTS port is a special exception; basically only one per- |
son knows what MTS really is, and he's not telling. Presumably it's |
not very close to Unix, but it's not worth arguing about it now.] |
Along the same lines, neither OS/2 nor Human68K is the same as (or |
even close to) MS-DOS. MVS and VM/CMS, on the other hand, are quite |
similar to each other and are therefore combined in most places. |
Bottom line: when adding a new port (e.g., QDOS), create a new macro |
for it ("QDOS"), a new subdirectory ("qdos") and a new source file for |
OS-specific code ("qdos/qdos.c"). Use #ifdefs to fit any OS-specific |
changes into the existing code (e.g., unzpriv.h). If it's close enough |
to an existing port that piggybacking is a temptation, define a new |
"combination macro" (e.g., "CMS_MVS") and replace the old macros as |
required. (This last applies to UnZip, at least; the old preference |
in Zip was fewer macros and long #ifdef lines, so talk to Onno or Jean- |
loup about that.) See also rule 0.1. |
(Note that, for UnZip, new ports need not attempt to deal with all |
features. Among other things, the wildcard-zipfile code in do_wild() |
may be replaced with a supplied dummy version, since opendir/readdir/ |
closedir() or the equivalent can be difficult to implement.) |
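
As a rough illustration of the bottom line of rule 0.3, the sketch below selects port-specific code with one macro per port. "QDOS" and "CMS_MVS" are the example macro names from the text; the PORT_DIR macro, the function, and the strings are invented for this example. Nested #ifdef/#else is used instead of #elif because rule 0 asks contributors to avoid #elif.

```c
/* One identifying macro per port; "generic" is a made-up fallback
 * used when no port macro is defined. */
#ifdef QDOS
#  define PORT_DIR "qdos"
#else
#  ifdef CMS_MVS
#    define PORT_DIR "cmsmvs"
#  else
#    define PORT_DIR "generic"
#  endif
#endif

/* Return the name of the subdirectory holding the OS-specific code
 * for the port this file was compiled for. */
const char *port_dir(void)
{
    return PORT_DIR;
}
```

Compiling with -DQDOS (or the equivalent for the compiler in use) switches the result without touching the generic code path.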
(1) NO FEELTHY TABS |
Some editors and e-mail systems either have no capability to use |
and/or display tab characters (ASCII 9) correctly, or they use non- |
standard or variable-width tab columns, or other horrors. Some edi- |
tors auto-convert spaces to tabs, after which the blind use of "diff |
-c" results in a huge and mostly useless patch. Yes, *we* know about |
diff's "-b" option, but not everyone does. And yes, we also know this |
makes the source files bigger, even after compression; so be it. If |
we *really* cared that much about the size of the sources, we'd still |
be writing Unix-only utilities. |
Bottom line: use spaces, not tabs. |
Exception: some of the makefiles (the Unix one in particular) require |
tabs as part of the syntax. |
Related utility programs: |
Unix, OS/2 and MS-DOS: expand, unexpand. |
MS-DOS: Buerg's TABS; Toad Hall's TOADSOFT. |
And some editors have the conversion built-in. |
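
For those without one of the utilities above, the conversion itself is small. The following is a minimal sketch of tab expansion in the spirit of "expand"; the function name and interface are assumptions for this example and do not come from any of the listed tools.

```c
#include <stddef.h>

/* Copy src to dst, replacing each tab with spaces up to the next
 * multiple of tab_width columns.  Returns the number of characters
 * written (not counting the terminating NUL), or (size_t)-1 if dst is
 * too small or tab_width is zero. */
size_t expand_tabs(const char *src, char *dst, size_t dst_size,
                   size_t tab_width)
{
    size_t out = 0, col = 0;

    if (dst_size == 0 || tab_width == 0)
        return (size_t)-1;
    for (; *src != '\0'; src++) {
        if (*src == '\t') {
            size_t pad = tab_width - (col % tab_width);

            if (out + pad >= dst_size)
                return (size_t)-1;      /* keep room for the NUL */
            while (pad-- > 0) {
                dst[out++] = ' ';
                col++;
            }
        } else {
            if (out + 1 >= dst_size)
                return (size_t)-1;
            dst[out++] = *src;
            col = (*src == '\n') ? 0 : col + 1;  /* new line, new column */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Do not run this over the makefiles mentioned in the exception; their tabs are syntax.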
(2) NO FEELTHY CARRIAGE RETURNS |
All source, documentation and other text files shall have Unix style |
line endings (LF only, a.k.a. ctrl-J), not the DOS/OS2/NT CR+LF or Mac |
CR-only line endings. |
Reason: "real programmers" in any environment can convert back and |
forth between Unix and DOS/Mac style. All PC compilers but a few old |
Borland versions can use either Unix or MS-DOS end-of-lines. Buerg's |
LIST (file-display utility) for MS-DOS can use Unix or MS-DOS EOLs. |
Both Zip and UnZip can convert line-endings as appropriate. But Unix |
utilities like diff and patch die a horrible death (or produce horrible |
output) if the target files have CRs. |
Related utilities: flip for Unix, OS/2 and MS-DOS; Unix "tr". |
Exceptions: documentation in pre-compiled binary distributions should |
be in the local (target) format. |
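
The conversion to Unix line endings can be sketched in a few lines of C. This is an illustrative in-place filter, not the code of "flip" or of Zip/UnZip; the function name is an assumption, and it treats a lone CR as a Mac-style line ending.

```c
#include <stddef.h>

/* Convert line endings in buf[0..len-1] to LF only, in place.
 * CR+LF becomes LF; a CR not followed by LF also becomes LF.
 * Returns the new length, which is never larger than len. */
size_t to_unix_eol(char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r') {
            if (in + 1 < len && buf[in + 1] == '\n')
                continue;        /* drop CR; the LF is copied next */
            buf[out++] = '\n';   /* lone CR */
        } else {
            buf[out++] = buf[in];
        }
    }
    return out;
}
```

Apply it only to text files; run over a binary file it will corrupt any 0x0D bytes it meets.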
(3) NO FEELTHY 8-BIT CHARS |
Do all your editing in a plain-text ASCII editor. No WordPerfect, MS |
Word, WordStar document mode, or other word processor files, thenkyew. |
No desktop publishing. *Especially* no EBCDIC. No TIFFs, no GIFs, no |
embedded pictures or dancing ladies (too bad, Cave Newt). [Sigh... -CN] |
Reason: compatibility with different consoles. My old XT clone is |
the most limited! |
Exceptions: some Macintosh makefiles apparently require some 8-bit |
characters; the Human68k port uses 8-bit characters for Kanji or Kana |
comments (I think); etc. |
Related utilities: vi, emacs, EDLIN, Turbo C editor, other programmers' |
editors, various word processor -> text conversion utilities. |
(4) NO FEELTHY LEFT-JUSTIFIED DASHES |
Always precede repeated dashes (------) with one or more leading non- |
dash characters: spaces, tabs, pound signs (#), comments (/*), what- |
ever. |
Reason: sooner or later your source file will be e-mailed through an |
undigestifier utility, most of which treat leading dashes as end-of- |
message separators. We'd rather not have your code broken up into a |
dozen separate untitled messages, thank you. |
(5) NO FEELTHY FANCY_FILENAMES |
Assume the worst: that someone on a brain-damaged DOS system has to |
work with everything your magic fingers produced. Keep the filenames |
unimaginative and within MS-DOS limits (i.e., ordinary A..Z, 0..9, |
"-$_!"-type characters, in the 8.3 "filename.ext" format). Mac and |
Unix users, giggle all you want, but no spaces or multiple dots. |
Reason: compatibility with different file systems. MS-DOS FAT is the |
most limited, with the exception of CompuServe (6.3, argh). |
Exceptions: slightly longer names are occasionally acceptable within |
OS-specific subdirectories, but don't do that unless there's a good |
reason for it. |
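
A candidate filename can be checked against this rule mechanically. The sketch below is one possible reading of the rule, with assumptions stated plainly: letters of either case and all ten digits are accepted, the only punctuation allowed is the "-$_!" set quoted above, and a trailing dot with no extension is rejected. The function names are invented for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Letters, digits and the "-$_!" punctuation named in the rule. */
static int ok_char(int c)
{
    return isalnum(c) || (c != '\0' && strchr("-$_!", c) != NULL);
}

/* Return 1 if name fits the 8.3 "filename.ext" pattern: 1-8 allowed
 * characters, optionally followed by one dot and 1-3 allowed
 * characters.  Spaces and multiple dots are rejected. */
int is_dos_83(const char *name)
{
    size_t base = 0, ext = 0;
    int seen_dot = 0;

    for (; *name != '\0'; name++) {
        unsigned char c = (unsigned char)*name;

        if (c == '.') {
            if (seen_dot)
                return 0;        /* multiple dots */
            seen_dot = 1;
        } else if (!ok_char(c)) {
            return 0;            /* space, 8-bit char, etc. */
        } else if (seen_dot) {
            ext++;
        } else {
            base++;
        }
    }
    if (base < 1 || base > 8)
        return 0;
    if (seen_dot && (ext < 1 || ext > 3))
        return 0;
    return 1;
}
```

For instance, "unzpriv.h" and "ziplimit.txt" pass, while "a.tar.Z" and "my file.c" do not.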
(6) NO FEELTHY NON-ZIPFILES AND NO FEELTHY E-MAIL BETAS |
Beta testers and developers are in general expected to have both |
ftp capability and the ability to deal with zipfiles. Those without |
should either find a friend who does or else learn about ftp-mailers. |
Reason: the core development team barely has time to work on the |
code, much less prepare oddball formats and/or mail betas out (and |
the situation is getting worse, sigh). |
Exceptions: anyone seriously proposing to do a new port will be |
given special treatment, particularly with respect to UnZip; we |
obviously realize that bootstrapping a completely new port can be |
quite difficult and have no desire to make it even harder due to |
lack of access to the latest code (rule 0.2). |
Public releases of UnZip, on the other hand, will be available in |
two formats: .tar.Z (16-bit compress'd tar) and .zip (either "plain" |
or self-extracting). Zip sources and executables will generally only |
be distributed in .zip format, since Zip is pretty much useless without |
UnZip. |
(7) NO FEELTHY E-MAIL BINARIES |
Binary files (e.g., executables, test zipfiles, etc.) should NEVER |
be mailed raw. Where possible, they should be uploaded via ftp in |
BINARY mode; if that's impossible, Mark's "ship" ASCII-encoder should |
be used; and if that's unavailable, uuencode or xxencode should be |
used. Weirdo NeXTmail, mailtool and MIME formats are also Right Out. |
Files larger than 50KB may need to be broken into pieces for mailing |
(be sure to label them in order!), unless "ship" is used (it can |
auto-split, label and mail files if told to do so). If Down Under |
is involved, files must be broken into under-20KB chunks. |
Reasons: to prevent sounds of gagging mailers from resounding through- |
out the land. To be relatively efficient in the binary->ASCII conver- |
sion. (Yeah, yeah, I know, there's better conversions out there. But |
not as widely known, and they often break on BITNET gateways.) |
Related utilities: ship, uuencode, uudecode, uuxfer20, quux, others. |
Just make sure they don't leave embedded or trailing spaces (that is, |
they should use the "`" character in place of ASCII 32). Otherwise |
mailers are prone to truncate or whatever. |
Greg Roelofs (a.k.a. Cave Newt) |
Info-ZIP UnZip maintainer |
David Kirschbaum |
former Info-ZIP Coordinator |
/programs/fs/unzip60/proginfo/defer.in |
[Regarding an optimization to the bounds-checking code in the core |
NEXTBYTE macro, which code is absolutely vital to the proper processing |
of corrupt zipfiles (lack of checking can result in an infinite loop) |
but which also slows processing.] |
The key to the solution is a pair of small functions called |
defer_leftover_input() and undefer_input(). The idea is, whenever |
you are going to be processing input using NEXTBYTE, you call |
defer_leftover_input(), and whenever you are going to process input by |
any other means, such as readbuf(), ZLSEEK, or directly reading stuff |
into G.inbuf, you call undefer_input(). What defer_leftover_input() |
does is adjust G.incnt so that any data beyond the current end of file |
is not visible. undefer_input() restores it to visibility. So when |
you're calling NEXTBYTE (or NEEDBITS or READBITS), an end-of-data |
condition only occurs at the same time as an end-of-buffer condition, |
and can be handled inside readbyte() instead of needing a check in the |
NEXTBYTE macro. Note: none of this applies to fUnZip. |
In order for this to work, certain conditions have to be met: |
1) NEXTBYTE input must not be mixed with other forms of input involving |
G.inptr and G.incnt. They must be separated by defer/undefer. |
I believe this condition is fully met by simply bracketing the central |
part of extract_or_test_member with defer/undefer, around the part |
where the actual decompression is done, and another defer/undefer pair |
in decrypt() around the reading of the RAND_HEADER_LEN password bytes. |
When USE_ZLIB is defined, I think that calls of fillinbuf() must be |
bracketed by defer/undefer. |
2) G.csize must not be assumed to contain the number of bytes left to |
process, when decompressing with NEXTBYTE. Instead, it contains |
the number of bytes left after the current buffer is exhausted. To |
check the number of bytes remaining, use (G.csize + G.incnt). |
I believe the only places this change was needed were in explode.c, |
mostly in the check at the end of each explode function that tests |
whether the correct number of bytes has been read. G.incnt will |
normally be zero at that time anyway. The other place is the line |
that says "bd = G.csize > 200000L ? 8 : 7;" but that's just a rough |
heuristic anyway. |
[Paul Kienitz] |
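The counter bookkeeping described in this note can be modelled in a few lines. What follows is a toy Python sketch, not UnZip's C code: the class `InputState`, its `remaining()` helper and the `incnt_leftover` attribute are names assumed here for illustration, and the buffer itself is left out so that only the two counters are shown.

```python
class InputState:
    """Toy model of the G.incnt / G.csize counters (illustrative only)."""

    def __init__(self, incnt, csize):
        self.incnt = incnt          # bytes currently available in the buffer
        self.csize = csize          # compressed bytes left in this member
        self.incnt_leftover = 0     # buffered bytes hidden by defer (assumed name)

    def defer_leftover_input(self):
        if self.incnt > self.csize:
            # The buffer extends past the end of the member: hide the excess.
            self.incnt_leftover = self.incnt - self.csize
            self.incnt = self.csize
        else:
            self.incnt_leftover = 0
        # csize now counts only the bytes that follow the current buffer.
        self.csize -= self.incnt

    def undefer_input(self):
        # Fold the unread visible bytes back into csize, then unhide the rest.
        self.csize += self.incnt
        self.incnt += self.incnt_leftover
        self.incnt_leftover = 0

    def remaining(self):
        # Condition 2: bytes left to process while input is deferred.
        return self.csize + self.incnt


state = InputState(incnt=100, csize=30)   # buffer also holds 70 later bytes
state.defer_leftover_input()
state.incnt -= 10                         # pretend NEXTBYTE consumed 10 bytes
left_while_deferred = state.remaining()
state.undefer_input()
```

While deferred, the visible count can never run past the end of the member, which is why the end-of-data check can move out of the per-byte path in this model.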
/programs/fs/unzip60/proginfo/extrafld.txt |
The following are the known types of zipfile extra fields as of this |
writing. Extra fields are documented in PKWARE's appnote.txt and are |
intended to allow for backward- and forward-compatible extensions to |
the zipfile format. Multiple extra-field types may be chained together, |
provided that the total length of all extra-field data is less than 64KB. |
(In fact, PKWARE requires that the total length of the entire file header, |
including timestamp, file attributes, filename, comment, extra field, etc., |
be no more than 64KB.) |
Each extra-field type (or subblock) must contain a four-byte header con- |
sisting of a two-byte header ID and a two-byte length (little-endian) for |
the remaining data in the subblock. If there are additional subblocks |
within the extra field, the header for each one will appear immediately |
following the data for the previous subblock (i.e., with no padding for |
alignment). |
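The chaining rule above can be sketched as a small Python generator. This is a minimal illustration, not Info-ZIP's code: the function name `iter_extra_subblocks` and the sample ID 0x9999 are assumptions made for the example.

```python
import struct


def iter_extra_subblocks(extra: bytes):
    """Yield (header_id, data) for each subblock of a raw extra field."""
    offset = 0
    # Fewer than 4 trailing bytes cannot hold a header and are ignored here.
    while offset + 4 <= len(extra):
        # Two-byte header ID and two-byte data length, both little-endian.
        header_id, size = struct.unpack_from("<HH", extra, offset)
        offset += 4
        if offset + size > len(extra):
            raise ValueError("subblock 0x%04x overruns the extra field" % header_id)
        yield header_id, extra[offset:offset + size]
        offset += size  # the next header follows immediately, no padding


# A 5-byte extended-timestamp block (0x5455) followed by an empty block
# with the made-up ID 0x9999.
sample = (struct.pack("<HH", 0x5455, 5) + b"\x01" + struct.pack("<L", 0)
          + struct.pack("<HH", 0x9999, 0))
blocks = list(iter_extra_subblocks(sample))
```

Unknown header IDs are simply yielded like any other, so a caller can skip the types it does not understand.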
All integer fields in the descriptions below are in little-endian (Intel) |
format unless otherwise specified. Note that "Short" means two bytes, |
"Long" means four bytes, and "Long-Long" means eight bytes, regardless |
of their native sizes. Unless specifically noted, all integer fields should |
be interpreted as unsigned (non-negative) numbers. |
Christian Spieler, Ed Gordon, 20080717 |
------------------------- |
Header ID's of 0 thru 31 are reserved for use by PKWARE. |
The remaining ID's can be used by third party vendors for |
proprietary usage. |
The current Header ID mappings defined by PKWARE are: |
0x0001 Zip64 extended information extra field |
0x0007 AV Info |
0x0008 Reserved for extended language encoding data (PFS) |
0x0009 OS/2 extended attributes (also Info-ZIP) |
0x000a NTFS (Win9x/WinNT FileTimes) |
0x000c OpenVMS (also Info-ZIP) |
0x000d UNIX |
0x000e Reserved for file stream and fork descriptors |
0x000f Patch Descriptor |
0x0014 PKCS#7 Store for X.509 Certificates |
0x0015 X.509 Certificate ID and Signature for |
individual file |
0x0016 X.509 Certificate ID for Central Directory |
0x0017 Strong Encryption Header |
0x0018 Record Management Controls |
0x0019 PKCS#7 Encryption Recipient Certificate List |
0x0065 IBM S/390 (Z390), AS/400 (I400) attributes |
- uncompressed |
0x0066 Reserved for IBM S/390 (Z390), AS/400 (I400) |
attributes - compressed |
0x4690 POSZIP 4690 (reserved) |
The Header ID mappings defined by Info-ZIP and third parties are: |
0x07c8 Info-ZIP Macintosh (old, J. Lee) |
0x2605 ZipIt Macintosh (first version) |
0x2705 ZipIt Macintosh v 1.3.5 and newer (w/o full filename) |
0x2805 ZipIt Macintosh 1.3.5+ |
0x334d Info-ZIP Macintosh (new, D. Haase's 'Mac3' field) |
0x4154 Tandem NSK |
0x4341 Acorn/SparkFS (David Pilling) |
0x4453 Windows NT security descriptor (binary ACL) |
0x4704 VM/CMS |
0x470f MVS |
0x4854 Theos, old unofficial port |
0x4b46 FWKCS MD5 (see below) |
0x4c41 OS/2 access control list (text ACL) |
0x4d49 Info-ZIP OpenVMS (obsolete) |
0x4d63 Macintosh SmartZIP, by Marco Bambini |
0x4f4c Xceed original location extra field |
0x5356 AOS/VS (binary ACL) |
0x5455 extended timestamp |
0x554e Xceed unicode extra field |
0x5855 Info-ZIP UNIX (original; also OS/2, NT, etc.) |
0x6375 Info-ZIP UTF-8 comment field |
0x6542 BeOS (BeBox, PowerMac, etc.) |
0x6854 Theos |
0x7075 Info-ZIP UTF-8 name field |
0x7441 AtheOS (AtheOS/Syllable attributes) |
0x756e ASi UNIX |
0x7855 Info-ZIP UNIX (16-bit UID/GID info) |
0x7875 Info-ZIP UNIX 3rd generation (generic UID/GID, ...) |
0xa220 Microsoft Open Packaging Growth Hint |
0xfb4a SMS/QDOS |
The following are detailed descriptions of the known extra-field block types: |
-Zip64 Extended Information Extra Field (0x0001): |
=============================================== |
The following is the layout of the zip64 extended |
information "extra" block. If one of the size or |
offset fields in the Local or Central directory |
record is too small to hold the required data, |
a zip64 extended information record is created. |
The order of the fields in the zip64 extended |
information record is fixed, but the fields will |
only appear if the corresponding Local or Central |
directory record field is set to 0xFFFF or 0xFFFFFFFF. |
Note: all fields stored in Intel low-byte/high-byte order. |
        Value                   Size     Description |
        -----                   ----     ----------- |
(ZIP64) 0x0001                  2 bytes  Tag for this "extra" block type |
        Size                    2 bytes  Size of this "extra" block |
        Original Size           8 bytes  Original uncompressed file size |
        Compressed Size         8 bytes  Size of compressed data |
        Relative Header Offset  8 bytes  Offset of local header record |
        Disk Start Number       4 bytes  Number of the disk on which |
                                         this file starts |
This entry in the Local header must include BOTH original |
and compressed file size fields. If encrypting the |
central directory and bit 13 of the general purpose bit |
flag is set indicating masking, the value stored in the |
Local Header for the original file size will be zero. |
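Because the fields are present only when the matching directory-record field holds its all-ones marker, a reader has to consult the record while walking the block. A minimal Python sketch follows; the function name `parse_zip64_extra` and its calling convention are assumptions for illustration, and `data` is the block payload after the four-byte tag/size header.

```python
import struct


def parse_zip64_extra(data: bytes, usize: int, csize: int,
                      offset: int = 0, disk: int = 0):
    """Replace all-ones markers with the 64-bit values stored in `data`.

    usize, csize, offset and disk are the values read from the Local or
    Central directory record. For a Local header pass only usize and csize.
    """
    pos = 0

    def take(fmt):
        nonlocal pos
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return value

    # Fixed order: original size, compressed size, header offset, disk number.
    if usize == 0xFFFFFFFF:
        usize = take("<Q")
    if csize == 0xFFFFFFFF:
        csize = take("<Q")
    if offset == 0xFFFFFFFF:
        offset = take("<Q")
    if disk == 0xFFFF:
        disk = take("<L")
    return usize, csize, offset, disk


payload = struct.pack("<QQ", 5_000_000_000, 4_999_999_999)
fields = parse_zip64_extra(payload, 0xFFFFFFFF, 0xFFFFFFFF, 1234, 0)
```

`struct.unpack_from` raises `struct.error` if the block is shorter than the markers imply, which is a reasonable way to reject a malformed entry in a sketch like this.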
-OS/2 Extended Attributes Extra Field (0x0009): |
============================================= |
The following is the layout of the OS/2 extended attributes "extra" |
block. (Last Revision 19960922) |
Note: all fields stored in Intel low-byte/high-byte order. |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(OS/2) 0x0009 Short tag for this extra block type |
TSize Short total data size for this block |
BSize Long uncompressed EA data size |
CType Short compression type |
EACRC Long CRC value for uncompressed EA data |
(var.) variable compressed EA data |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(OS/2) 0x0009 Short tag for this extra block type |
TSize Short total data size for this block (4) |
BSize Long size of uncompressed local EA data |
The value of CType is interpreted according to the "compression |
method" section above; i.e., 0 for stored, 8 for deflated, etc. |
The OS/2 extended attribute structure (FEA2LIST) is |
compressed and then stored in its entirety within this |
structure. There will only ever be one "block" of data in |
the variable-length field. |
-OS/2 Access Control List Extra Field: |
==================================== |
The following is the layout of the OS/2 ACL extra block. |
(Last Revision 19960922) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(ACL) 0x4c41 Short tag for this extra block type ("AL") |
TSize Short total data size for this block |
BSize Long uncompressed ACL data size |
CType Short compression type |
EACRC Long CRC value for uncompressed ACL data |
(var.) variable compressed ACL data |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(ACL) 0x4c41 Short tag for this extra block type ("AL") |
TSize Short total data size for this block (4) |
BSize Long size of uncompressed local ACL data |
The value of CType is interpreted according to the "compression |
method" section above; i.e., 0 for stored, 8 for deflated, etc. |
The uncompressed ACL data consist of a text header of the form |
"ACL1:%hX,%hd\n", where the first field is the OS/2 ACCINFO acc_attr |
member and the second is acc_count, followed by acc_count strings |
of the form "%s,%hx\n", where the first field is acl_ugname (user |
group name) and the second acl_access. This block type will be |
extended for other operating systems as needed. |
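The text layout of the uncompressed ACL data can be read back with ordinary string handling. The sketch below is illustrative only: `parse_os2_acl` is a hypothetical name, the sample user names are made up, and the input is assumed to be already decompressed and decoded to a string.

```python
def parse_os2_acl(text: str):
    """Return (acc_attr, [(acl_ugname, acl_access), ...])."""
    lines = text.split("\n")
    header = lines[0]
    if not header.startswith("ACL1:"):
        raise ValueError("missing ACL1 header")
    attr_hex, count_dec = header[len("ACL1:"):].split(",")
    acc_attr = int(attr_hex, 16)      # %hX: hexadecimal
    acc_count = int(count_dec, 10)    # %hd: decimal
    entries = []
    for line in lines[1:1 + acc_count]:
        if not line:
            break
        # Split at the last comma so the name part is kept whole.
        name, access_hex = line.rsplit(",", 1)
        entries.append((name, int(access_hex, 16)))   # %hx: hexadecimal
    if len(entries) != acc_count:
        raise ValueError("expected %d ACL entries" % acc_count)
    return acc_attr, entries


acl = parse_os2_acl("ACL1:1F,2\nADMIN,3f\nGUEST,1\n")
```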
-Windows NT Security Descriptor Extra Field (0x4453): |
=================================================== |
The following is the layout of the NT Security Descriptor (another |
type of ACL) extra block. (Last Revision 19960922) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(SD) 0x4453 Short tag for this extra block type ("SD") |
TSize Short total data size for this block |
BSize Long uncompressed SD data size |
Version Byte version of uncompressed SD data format |
CType Short compression type |
EACRC Long CRC value for uncompressed SD data |
(var.) variable compressed SD data |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(SD) 0x4453 Short tag for this extra block type ("SD") |
TSize Short total data size for this block (4) |
BSize Long size of uncompressed local SD data |
The value of CType is interpreted according to the "compression |
method" section above; i.e., 0 for stored, 8 for deflated, etc. |
Version specifies how the compressed data are to be interpreted |
and allows for future expansion of this extra field type. Currently |
only version 0 is defined. |
For version 0, the compressed data are to be interpreted as a single |
valid Windows NT SECURITY_DESCRIPTOR data structure, in self-relative |
format. |
-PKWARE Win95/WinNT Extra Field (0x000a): |
======================================= |
The following description covers PKWARE's "NTFS" attributes |
"extra" block, introduced with the release of PKZIP 2.50 for |
Windows. (Last Revision 20001118) |
(Note: At this time the Mtime, Atime and Ctime values may |
be used on any WIN32 system.) |
[Info-ZIP note: In the current implementations, this field has |
a fixed total data size of 32 bytes and is only stored as a local |
extra field.] |
       Value     Size      Description |
       -----     ----      ----------- |
(NTFS) 0x000a    Short     Tag for this "extra" block type |
       TSize     Short     Total Data Size for this block |
       Reserved  Long      for future use |
       Tag1      Short     NTFS attribute tag value #1 |
       Size1     Short     Size of attribute #1, in bytes |
       (var.)    SubSize1  Attribute #1 data |
       . |
       . |
       . |
       TagN      Short     NTFS attribute tag value #N |
       SizeN     Short     Size of attribute #N, in bytes |
       (var.)    SubSizeN  Attribute #N data |
For NTFS, values for Tag1 through TagN are as follows: |
(currently only one set of attributes is defined for NTFS) |
Tag     Size     Description |
-----   ----     ----------- |
0x0001  2 bytes  Tag for attribute #1 |
Size1   2 bytes  Size of attribute #1, in bytes (24) |
Mtime   8 bytes  64-bit NTFS file last modification time |
Atime   8 bytes  64-bit NTFS file last access time |
Ctime   8 bytes  64-bit NTFS file creation time |
The total length for this attribute is 28 bytes (the 4-byte tag/size |
header plus 24 bytes of times). Together with the 4-byte Reserved |
field, this results in a fixed value of 32 for the TSize field of |
the NTFS block. |
The NTFS filetimes are 64-bit unsigned integers, stored in Intel |
(least significant byte first) byte order. They determine the |
number of 1.0E-07 seconds (1/10th microseconds!) past WinNT "epoch", |
which is "01-Jan-1601 00:00:00 UTC". |
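Converting these values to calendar time is a matter of counting 100-nanosecond ticks from the 1601 epoch. The following Python sketch shows one way to do it; the function names are assumptions for illustration, and `data` is the block payload after the four-byte tag/size header.

```python
import struct
from datetime import datetime, timedelta, timezone

NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    # Ten ticks per microsecond; the sub-microsecond remainder is dropped.
    return NT_EPOCH + timedelta(microseconds=filetime // 10)


def parse_ntfs_times(data: bytes):
    """Return (mtime, atime, ctime) from a 0x000a payload, or None."""
    pos = 4                                   # skip the Reserved long
    while pos + 4 <= len(data):
        tag, size = struct.unpack_from("<HH", data, pos)
        pos += 4
        if tag == 0x0001 and size >= 24:
            ticks = struct.unpack_from("<QQQ", data, pos)
            return tuple(filetime_to_datetime(t) for t in ticks)
        pos += size                           # unknown attribute: skip it
    return None


UNIX_EPOCH_TICKS = 116444736000000000         # 1970-01-01 in NT ticks
sample = struct.pack("<LHHQQQ", 0, 0x0001, 24,
                     UNIX_EPOCH_TICKS, UNIX_EPOCH_TICKS, UNIX_EPOCH_TICKS)
times = parse_ntfs_times(sample)
```

The sample payload is 32 bytes long, matching the fixed TSize mentioned above.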
-PKWARE OpenVMS Extra Field (0x000c): |
=================================== |
The following is the layout of PKWARE's OpenVMS attributes |
"extra" block. (Last Revision 12/17/91) |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(VMS) 0x000c Short Tag for this "extra" block type |
TSize Short Total Data Size for this block |
CRC Long 32-bit CRC for remainder of the block |
Tag1 Short OpenVMS attribute tag value #1 |
Size1 Short Size of attribute #1, in bytes |
(var.) Size1 Attribute #1 data |
. |
. |
. |
TagN Short OpenVMS attribute tag value #N |
SizeN Short Size of attribute #N, in bytes |
(var.) SizeN Attribute #N data |
Rules: |
1. There will be one or more attributes present, which |
will each be preceded by the above TagX & SizeX values. |
These values are identical to the ATR$C_XXXX and |
ATR$S_XXXX constants which are defined in ATR.H under |
OpenVMS C. Neither of these values will ever be zero. |
2. No word alignment or padding is performed. |
3. A well-behaved PKZIP/OpenVMS program should never produce |
more than one sub-block with the same TagX value. Also, |
there will never be more than one "extra" block of type |
0x000c in a particular directory record. |
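A reader for this block might check the CRC and then walk the attribute sub-blocks, as in the Python sketch below. Two things are assumed rather than taken from the text: that the CRC is the ordinary ZIP CRC-32 (as computed by `zlib.crc32`), and the function name `parse_pkware_vms`. The tag value 4 in the sample is arbitrary.

```python
import struct
import zlib


def parse_pkware_vms(data: bytes) -> dict:
    """Map attribute tag -> raw attribute data for a 0x000c payload."""
    (stored_crc,) = struct.unpack_from("<L", data, 0)
    rest = data[4:]
    # Assumption: standard CRC-32 over everything after the CRC field.
    if zlib.crc32(rest) & 0xFFFFFFFF != stored_crc:
        raise ValueError("CRC mismatch in OpenVMS extra block")
    attrs = {}
    pos = 0
    while pos + 4 <= len(rest):
        tag, size = struct.unpack_from("<HH", rest, pos)
        pos += 4                       # no alignment or padding (rule 2)
        if tag in attrs:
            raise ValueError("duplicate attribute tag %d" % tag)  # rule 3
        attrs[tag] = rest[pos:pos + size]
        pos += size
    return attrs


body = struct.pack("<HH", 4, 2) + b"\x01\x02"
sample = struct.pack("<L", zlib.crc32(body) & 0xFFFFFFFF) + body
attrs = parse_pkware_vms(sample)
```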
-Info-ZIP VMS Extra Field: |
======================== |
The following is the layout of Info-ZIP's VMS attributes extra |
block for VAX or Alpha AXP. The local-header and central-header |
versions are identical. (Last Revision 19960922) |
Value Size Description |
----- ---- ----------- |
(VMS2) 0x4d49 Short tag for this extra block type ("JM") |
TSize Short total data size for this block |
ID Long block ID |
Flags Short info bytes |
BSize Short uncompressed block size |
Reserved Long (reserved) |
(var.) variable compressed VMS file-attributes block |
The block ID is one of the following unterminated strings: |
"VFAB" struct FAB |
"VALL" struct XABALL |
"VFHC" struct XABFHC |
"VDAT" struct XABDAT |
"VRDT" struct XABRDT |
"VPRO" struct XABPRO |
"VKEY" struct XABKEY |
"VMSV" version (e.g., "V6.1"; truncated at hyphen) |
"VNAM" reserved |
The lower three bits of Flags indicate the compression method. The |
currently defined methods are: |
0 stored (not compressed) |
1 simple "RLE" |
2 deflated |
The "RLE" method simply replaces zero-valued bytes with zero-valued |
bits and non-zero-valued bytes with a "1" bit followed by the byte |
value. |
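The "RLE" scheme can be sketched in Python as follows. The description above does not say how bits are packed into bytes, so this sketch assumes least-significant-bit-first packing; that choice, and the function names, are assumptions for illustration and should be checked against the VMS source before being relied on.

```python
def rle_bits_encode(raw: bytes) -> bytes:
    """Zero byte -> single 0 bit; other byte -> 1 bit, then its 8 bits."""
    bitbuf = 0
    nbits = 0
    for byte in raw:
        if byte:
            bitbuf |= (1 | (byte << 1)) << nbits
            nbits += 9
        else:
            nbits += 1          # a zero bit: nothing to set
    return bitbuf.to_bytes((nbits + 7) // 8, "little")


def rle_bits_decode(packed: bytes, out_len: int) -> bytes:
    """Inverse of rle_bits_encode; out_len is the uncompressed size."""
    bitbuf = int.from_bytes(packed, "little")   # whole stream as one integer
    out = bytearray()
    for _ in range(out_len):
        flag = bitbuf & 1
        bitbuf >>= 1
        if flag:
            out.append(bitbuf & 0xFF)
            bitbuf >>= 8
        else:
            out.append(0)
    return bytes(out)


original = b"\x00\x00\x41\x00"
packed = rle_bits_encode(original)
restored = rle_bits_decode(packed, len(original))
```

The decoder needs the uncompressed length from outside the bit stream, which is what the BSize field provides.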
The variable-length compressed data contains only the data corre- |
sponding to the indicated structure or string. Typically multiple |
VMS2 extra fields are present (each with a unique block type). |
-Info-ZIP Macintosh Extra Field: |
============================== |
The following is the layout of the (old) Info-ZIP resource-fork extra |
block for Macintosh. The local-header and central-header versions |
are identical. (Last Revision 19960922) |
Value Size Description |
----- ---- ----------- |
(Mac) 0x07c8 Short tag for this extra block type |
TSize Short total data size for this block |
"JLEE" beLong extra-field signature |
FInfo 16 bytes Macintosh FInfo structure |
CrDat beLong HParamBlockRec fileParam.ioFlCrDat |
MdDat beLong HParamBlockRec fileParam.ioFlMdDat |
Flags beLong info bits |
DirID beLong HParamBlockRec fileParam.ioDirID |
VolName 28 bytes volume name (optional) |
All fields but the first two are in native Macintosh format |
(big-endian Motorola order, not little-endian Intel). The least |
significant bit of Flags is 1 if the file is a data fork, 0 other- |
wise. In addition, if this extra field is present, the filename |
has an extra 'd' or 'r' appended to indicate data fork or resource |
fork. The 28-byte VolName field may be omitted. |
-ZipIt Macintosh Extra Field (long): |
================================== |
The following is the layout of the ZipIt extra block for Macintosh. |
The local-header and central-header versions are identical. |
(Last Revision 19970130) |
Value Size Description |
----- ---- ----------- |
(Mac2) 0x2605 Short tag for this extra block type |
TSize Short total data size for this block |
"ZPIT" beLong extra-field signature |
FnLen Byte length of FileName |
FileName variable full Macintosh filename |
FileType Byte[4] four-byte Mac file type string |
Creator Byte[4] four-byte Mac creator string |
-ZipIt Macintosh Extra Field (short, for files): |
============================================== |
The following is the layout of a shortened variant of the |
ZipIt extra block for Macintosh (without "full name" entry). |
This variant is used by ZipIt 1.3.5 and newer for entries of |
files (not directories) that do not have a MacBinary encoded |
file. The local-header and central-header versions are identical. |
(Last Revision 20030602) |
        Value     Size     Description |
        -----     ----     ----------- |
(Mac2b) 0x2705    Short    tag for this extra block type |
        TSize     Short    total data size for this block (min. 12) |
        "ZPIT"    beLong   extra-field signature |
        FileType  Byte[4]  four-byte Mac file type string |
        Creator   Byte[4]  four-byte Mac creator string |
        fdFlags   beShort  attributes from FInfo.frFlags, |
                           may be omitted |
        0x0000    beShort  reserved, may be omitted |
-ZipIt Macintosh Extra Field (short, for directories): |
==================================================== |
The following is the layout of a shortened variant of the |
ZipIt extra block for Macintosh used only for directory |
entries. This variant is used by ZipIt 1.3.5 and newer to |
save some optional Mac-specific information about directories. |
The local-header and central-header versions are identical. |
        Value    Size     Description |
        -----    ----     ----------- |
(Mac2c) 0x2805   Short    tag for this extra block type |
        TSize    Short    total data size for this block (12) |
        "ZPIT"   beLong   extra-field signature |
        frFlags  beShort  attributes from DInfo.frFlags, may |
                          be omitted |
        View     beShort  ZipIt view flag, may be omitted |
The View field specifies ZipIt-internal settings as follows: |
Bits of the Flags: |
bit 0 if set, the folder is shown expanded (open) |
when the archive contents are viewed in ZipIt. |
bits 1-15 reserved, zero; |
-Info-ZIP Macintosh Extra Field (new): |
==================================== |
The following is the layout of the (new) Info-ZIP extra |
block for Macintosh, designed by Dirk Haase. |
All values are in little-endian. |
(Last Revision 19981005) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(Mac3) 0x334d Short tag for this extra block type ("M3") |
TSize Short total data size for this block |
BSize Long uncompressed finder attribute data size |
Flags Short info bits |
fdType Byte[4] Type of the File (4-byte string) |
fdCreator Byte[4] Creator of the File (4-byte string) |
(CType) Short compression type |
(CRC) Long CRC value for uncompressed MacOS data |
Attribs variable finder attribute data (see below) |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(Mac3) 0x334d Short tag for this extra block type ("M3") |
TSize Short total data size for this block |
BSize Long uncompressed finder attribute data size |
Flags Short info bits |
fdType Byte[4] Type of the File (4-byte string) |
fdCreator Byte[4] Creator of the File (4-byte string) |
The third bit of Flags in both headers indicates whether |
the LOCAL extra field is uncompressed (and therefore whether CType |
and CRC are omitted): |
Bits of the Flags: |
bit 0      if set, file is a data fork; otherwise unset |
bit 1      if set, filename will not be changed |
bit 2      if set, Attribs is uncompressed (no CType, CRC) |
bit 3      if set, dates and times are 64-bit; |
           if unset, dates and times are 32-bit |
bit 4      if set, timezone offset fields for the native |
           Mac times are omitted (UTC support deactivated) |
bits 5-15  reserved |
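The flag bits decide both how the block is read and where Attribs starts in the local-header version. A small Python sketch, with bit 0 taken as the least significant bit and with function and key names that are assumptions for illustration:

```python
def decode_mac3_flags(flags: int) -> dict:
    return {
        "data_fork": bool(flags & 0x0001),       # bit 0
        "keep_filename": bool(flags & 0x0002),   # bit 1
        "uncompressed": bool(flags & 0x0004),    # bit 2
        "times_64bit": bool(flags & 0x0008),     # bit 3
        "no_tz_offsets": bool(flags & 0x0010),   # bit 4
    }


def local_attribs_offset(flags: int) -> int:
    """Offset of Attribs within the local block data (after tag and TSize)."""
    fixed = 4 + 2 + 4 + 4        # BSize, Flags, fdType, fdCreator
    if flags & 0x0004:
        return fixed             # uncompressed: CType and CRC are omitted
    return fixed + 2 + 4         # CType (Short) and CRC (Long) come first


decoded = decode_mac3_flags(0x0015)
```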
Attributes: |
Attribs is a Mac-specific block of data in little-endian format with |
the following structure (if compressed, uncompress it first): |
Value Size Description |
----- ---- ----------- |
fdFlags Short Finder Flags |
fdLocation.v Short Finder Icon Location |
fdLocation.h Short Finder Icon Location |
fdFldr Short Folder containing file |
FXInfo 16 bytes Macintosh FXInfo structure |
FXInfo-Structure: |
fdIconID Short |
fdUnused[3] Short unused but reserved 6 bytes |
fdScript Byte Script flag and number |
fdXFlags Byte More flag bits |
fdComment Short Comment ID |
fdPutAway Long Home Dir ID |
FVersNum Byte file version number |
may not be used by MacOS |
ACUser Byte directory access rights |
FlCrDat ULong date and time of creation |
FlMdDat ULong date and time of last modification |
FlBkDat ULong date and time of last backup |
These time numbers are original Mac FileTime values (local time!). |
Currently, date-time width is 32-bit, but future versions may |
support 64-bit times (see flags). |
CrGMTOffs Long(signed!) difference "local Creat. time - UTC" |
MdGMTOffs Long(signed!) difference "local Modif. time - UTC" |
BkGMTOffs Long(signed!) difference "local Backup time - UTC" |
These "local time - UTC" differences (stored in seconds) may be |
used to support timestamp adjustment after inter-timezone transfer. |
These fields are optional; bit 4 of the flags word controls their |
presence. |
Charset Short TextEncodingBase (Charset) |
valid for the following two fields |
FullPath variable Path of the current file. |
Zero terminated string (C-String) |
Currently coded in the native Charset. |
Comment variable Finder Comment of the current file. |
Zero terminated string (C-String) |
Currently coded in the native Charset. |
-SmartZIP Macintosh Extra Field: |
============================== |
The following is the layout of the SmartZIP extra |
block for Macintosh, designed by Marco Bambini. |
Local-header version: |
Value Size Description |
----- ---- ----------- |
0x4d63 Short tag for this extra block type ("cM") |
TSize Short total data size for this block (64) |
"dZip" beLong extra-field signature |
fdType Byte[4] Type of the File (4-byte string) |
fdCreator Byte[4] Creator of the File (4-byte string) |
fdFlags beShort Finder Flags |
fdLocation.v beShort Finder Icon Location |
fdLocation.h beShort Finder Icon Location |
fdFldr beShort Folder containing file |
CrDat beLong HParamBlockRec fileParam.ioFlCrDat |
MdDat beLong HParamBlockRec fileParam.ioFlMdDat |
frScroll.v Byte vertical pos. of folder's scroll bar |
fdScript Byte Script flag and number |
frScroll.h Byte horizontal pos. of folder's scroll bar |
fdXFlags Byte More flag bits |
FileName Byte[32] full Macintosh filename (pascal string) |
All fields but the first two are in native Macintosh format |
(big-endian Motorola order, not little-endian Intel). |
The extra field size is fixed to 64 bytes. |
The local-header and central-header versions are identical. |
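The fixed 64-byte size can be checked by writing the table as a big-endian `struct` format. This Python sketch is illustrative: the names `SMARTZIP` and `parse_smartzip` are assumptions, only some fields are returned, and the filename is left as raw bytes because its character set is not stated here.

```python
import struct

# ">" selects big-endian with no padding; field order follows the table.
SMARTZIP = struct.Struct(">4s4s4sHHHHLLBBBB32s")


def parse_smartzip(data: bytes) -> dict:
    if len(data) != SMARTZIP.size:
        raise ValueError("SmartZIP data must be %d bytes" % SMARTZIP.size)
    (sig, fd_type, fd_creator, fd_flags, loc_v, loc_h, fd_fldr,
     cr_dat, md_dat, scroll_v, fd_script, scroll_h, fd_xflags,
     name) = SMARTZIP.unpack(data)
    if sig != b"dZip":
        raise ValueError("missing dZip signature")
    name_len = min(name[0], 31)          # Pascal string: length byte first
    return {"type": fd_type, "creator": fd_creator, "flags": fd_flags,
            "created": cr_dat, "modified": md_dat,
            "filename": name[1:1 + name_len]}


sample = SMARTZIP.pack(b"dZip", b"TEXT", b"ttxt", 0, 0, 0, 0,
                       0, 0, 0, 0, 0, 0, b"\x06ReadMe")
info = parse_smartzip(sample)
```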
-Acorn SparkFS Extra Field: |
========================= |
The following is the layout of David Pilling's SparkFS extra block |
for Acorn RISC OS. The local-header and central-header versions are |
identical. (Last Revision 19960922) |
Value Size Description |
----- ---- ----------- |
(Acorn) 0x4341 Short tag for this extra block type ("AC") |
TSize Short total data size for this block (20) |
"ARC0" Long extra-field signature |
LoadAddr Long load address or file type |
ExecAddr Long exec address |
Attr Long file permissions |
Zero Long reserved; always zero |
The following bits of Attr are associated with the given file |
permissions: |
bit 0 user-writable ('W') |
bit 1 user-readable ('R') |
bit 2 reserved |
bit 3 locked ('L') |
bit 4 publicly writable ('w') |
bit 5 publicly readable ('r') |
bit 6 reserved |
bit 7 reserved |
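Decoding the block and its permission bits might look like the Python sketch below. It assumes that the signature appears in the file as the four bytes "ARC0" in that order and that bit 0 is the least significant bit of Attr; `parse_sparkfs` is a hypothetical name.

```python
import struct

# (bit number, permission letter) pairs from the table above.
ATTR_BITS = ((0, "W"), (1, "R"), (3, "L"), (4, "w"), (5, "r"))


def parse_sparkfs(data: bytes):
    """Return (load_addr, exec_addr, permission_letters) for a 0x4341 payload."""
    if len(data) != 20:
        raise ValueError("SparkFS data must be 20 bytes")
    sig, load_addr, exec_addr, attr, _zero = struct.unpack("<4sLLLL", data)
    if sig != b"ARC0":                   # assumed on-disk byte order
        raise ValueError("missing ARC0 signature")
    perms = "".join(letter for bit, letter in ATTR_BITS if attr & (1 << bit))
    return load_addr, exec_addr, perms


sample = struct.pack("<4sLLLL", b"ARC0", 0xFFFFFF00, 0, 0x13, 0)
parsed = parse_sparkfs(sample)
```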
-VM/CMS Extra Field: |
================== |
The following is the layout of the file-attributes extra block for |
VM/CMS. The local-header and central-header versions are |
identical. (Last Revision 19960922) |
Value Size Description |
----- ---- ----------- |
(VM/CMS) 0x4704 Short tag for this extra block type |
TSize Short total data size for this block |
flData variable file attributes data |
flData is an uncompressed fldata_t struct. |
-MVS Extra Field: |
=============== |
The following is the layout of the file-attributes extra block for |
MVS. The local-header and central-header versions are identical. |
(Last Revision 19960922) |
Value Size Description |
----- ---- ----------- |
(MVS) 0x470f Short tag for this extra block type |
TSize Short total data size for this block |
flData variable file attributes data |
flData is an uncompressed fldata_t struct. |
-PKWARE Unix Extra Field (0x000d): |
================================ |
The following is the layout of PKWARE's Unix "extra" block. |
It was introduced with the release of PKZIP for Unix 2.50. |
Note: all fields are stored in Intel low-byte/high-byte order. |
(Last Revision 19980901) |
This field has a minimum data size of 12 bytes and is only stored |
as a local extra field. |
Value Size Description |
----- ---- ----------- |
(Unix0) 0x000d Short Tag for this "extra" block type |
TSize Short Total Data Size for this block |
AcTime Long time of last access (UTC/GMT) |
ModTime Long time of last modification (UTC/GMT) |
UID Short Unix user ID |
GID Short Unix group ID |
(var) variable Variable length data field |
The variable length data field will contain file type |
specific data. Currently the only values allowed are |
the original "linked to" file names for hard or symbolic |
links, and the major and minor device node numbers for |
character and block device nodes. Since device nodes |
cannot be either symbolic or hard links, only one set of |
variable length data is stored. Link files will have the |
name of the original file stored. This name is NOT NULL |
terminated. Its size can be determined by checking TSize - |
12. Device entries will have eight bytes stored as two 4 |
byte entries (in little-endian format). The first entry |
will be the major device number, and the second the minor |
device number. |
[Info-ZIP note: The fixed part of this field has the same layout as |
Info-ZIP's abandoned "Unix1 timestamps & owner ID info" extra field; |
only the two tag bytes are different.] |
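The fixed 12 bytes and the variable tail can be separated as in this Python sketch. The function names are assumptions for illustration; deciding whether the tail is a link target or a pair of device numbers depends on the file type, which is not stored in this block and is left to the caller.

```python
import struct


def parse_pkware_unix(data: bytes) -> dict:
    """Split a 0x000d payload into its fixed part and variable data."""
    if len(data) < 12:
        raise ValueError("PKWARE Unix data must be at least 12 bytes")
    atime, mtime, uid, gid = struct.unpack_from("<LLHH", data, 0)
    return {"atime": atime, "mtime": mtime, "uid": uid, "gid": gid,
            "var": data[12:]}            # TSize - 12 bytes, not NUL-terminated


def device_numbers(var: bytes):
    """Interpret the variable data of a device node as (major, minor)."""
    if len(var) != 8:
        raise ValueError("device entries carry exactly eight bytes")
    return struct.unpack("<LL", var)


link = parse_pkware_unix(
    struct.pack("<LLHH", 900000000, 900000001, 1000, 100) + b"target.txt")
dev = device_numbers(struct.pack("<LL", 8, 1))
```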
-PATCH Descriptor Extra Field (0x000f): |
===================================== |
The following is the layout of the Patch Descriptor "extra" |
block. |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(Patch) 0x000f Short Tag for this "extra" block type |
TSize Short Size of the total "extra" block |
Version Short Version of the descriptor |
Flags Long Actions and reactions (see below) |
OldSize Long Size of the file about to be patched |
OldCRC Long 32-bit CRC of the file about to be patched |
NewSize Long Size of the resulting file |
NewCRC Long 32-bit CRC of the resulting file |
Actions and reactions |
Bits   Description |
----   ---------------- |
0      Use for auto detection |
1      Treat as a self-patch |
2-3    RESERVED |
4-5    Action (see below) |
6-7    RESERVED |
8-9    Reaction (see below) to absent file |
10-11  Reaction (see below) to newer file |
12-13  Reaction (see below) to unknown file |
14-15  RESERVED |
16-31  RESERVED |
Actions |
Action  Value |
------  ----- |
none    0 |
add     1 |
delete  2 |
patch   3 |
Reactions |
Reaction  Value |
--------  ----- |
ask       0 |
skip      1 |
ignore    2 |
fail      3 |
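The Flags long can be unpacked with shifts and masks. The Python sketch below assumes that bit 0 is the least significant bit and that, within each two-bit field, the lower-numbered bit is the less significant one; the function and key names are hypothetical.

```python
ACTIONS = ("none", "add", "delete", "patch")
REACTIONS = ("ask", "skip", "ignore", "fail")


def decode_patch_flags(flags: int) -> dict:
    return {
        "auto_detect": bool(flags & 0x1),             # bit 0
        "self_patch": bool(flags & 0x2),              # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],        # bits 4-5
        "on_absent": REACTIONS[(flags >> 8) & 0x3],   # bits 8-9
        "on_newer": REACTIONS[(flags >> 10) & 0x3],   # bits 10-11
        "on_unknown": REACTIONS[(flags >> 12) & 0x3], # bits 12-13
    }


# auto detection, action "patch", skip absent, ignore newer, fail on unknown
example = decode_patch_flags(0x1 | (3 << 4) | (1 << 8) | (2 << 10) | (3 << 12))
```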
Patch support is provided by PKPatchMaker(tm) technology and is |
covered under U.S. Patents and Patents Pending. The use or |
implementation in a product of certain technological aspects set |
forth in the current APPNOTE, including those with regard to |
strong encryption, patching, or extended tape operations requires |
a license from PKWARE. Please contact PKWARE with regard to |
acquiring a license. |
-PKCS#7 Store for X.509 Certificates (0x0014): |
============================================ |
This field contains information about each of the certificates |
that files may be signed with. When the Central Directory Encryption |
feature is enabled for a ZIP file, this record will appear in |
the Archive Extra Data Record, otherwise it will appear in the |
first central directory record and will be ignored in any |
other record. |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(Store) 0x0014 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size of the store data |
SData TSize Data about the store |
SData |
Value Size Description |
----- ---- ----------- |
Version 2 bytes Version number, 0x0001 for now |
StoreD (variable) Actual store data |
The StoreD member is suitable for passing as the pbData |
member of a CRYPT_DATA_BLOB to the CertOpenStore() function |
in Microsoft's CryptoAPI. The SSize member above will be |
cbData + 6, where cbData is the cbData member of the same |
CRYPT_DATA_BLOB. The encoding type to pass to |
CertOpenStore() should be |
PKCS_7_ASN_ENCODING | X509_ASN_ENCODING. |
-X.509 Certificate ID and Signature for individual file (0x0015): |
=============================================================== |
This field contains the information about which certificate in |
the PKCS#7 store was used to sign a particular file. It also |
contains the signature data. This field can appear multiple |
times, but can only appear once per certificate. |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(CID) 0x0015 2 bytes Tag for this "extra" block type |
CSize 2 bytes Size of Method |
Method (variable) |
Method |
Value Size Description |
----- ---- ----------- |
Version 2 bytes Version number, for now 0x0001 |
AlgID 2 bytes Algorithm ID used for signing |
IDSize 2 bytes Size of Certificate ID data |
CertID (variable) Certificate ID data |
SigSize 2 bytes Size of Signature data |
Sig (variable) Signature data |
CertID |
Value Size Description |
----- ---- ----------- |
Size1 4 bytes Size of CertID, should be (IDSize - 4) |
Size1 4 bytes A bug in version one causes this value |
to appear twice. |
IssSize 4 bytes Issuer data size |
Issuer (variable) Issuer data |
SerSize 4 bytes Serial Number size |
Serial (variable) Serial Number data |
The Issuer and IssSize members are suitable for creating a |
CRYPT_DATA_BLOB to be the Issuer member of a CERT_INFO |
struct. The Serial and SerSize members would be the |
SerialNumber member of the same CERT_INFO struct. This |
struct would be used to find the certificate in the store |
the file was signed with. Those structures are from the MS |
CryptoAPI. |
Sig and SigSize are the actual signature data and size |
generated by signing the file with the MS CryptoAPI using a |
hash created with the given AlgID. |
-X.509 Certificate ID and Signature for central directory (0x0016): |
================================================================= |
This field contains the information about which certificate in |
the PKCS#7 store was used to sign the central directory structure. |
When the Central Directory Encryption feature is enabled for a |
ZIP file, this record will appear in the Archive Extra Data Record, |
otherwise it will appear in the first central directory record, |
along with the store. The data structure is the |
same as the CID, except that SigSize will be 0, and there |
will be no Sig member. |
This field is also kept after the last central directory |
record, as the signature data (ID 0x05054b50, it looks like |
a central directory record of a different type). This |
second copy of the data is the Signature Data member of the |
record, and will have a SigSize that is non-zero, and will |
have Sig data. |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(CDID) 0x0016 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size of data that follows |
TData TSize Data |
-Strong Encryption Header (0x0017): |
================================= |
Value Size Description |
----- ---- ----------- |
0x0017 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size of data that follows |
Format 2 bytes Format definition for this record |
AlgID 2 bytes Encryption algorithm identifier |
Bitlen 2 bytes Bit length of encryption key |
Flags 2 bytes Processing flags |
CertData TSize-8 Certificate decryption extra field data |
(refer to the explanation for CertData |
in the section describing the |
Certificate Processing Method under |
the Strong Encryption Specification) |
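As an illustration of the table above, the sketch below splits the
TSize bytes into the four fixed fields and the trailing CertData. It
assumes little-endian storage (this section does not restate the byte
order, so that is an assumption carried over from the neighbouring
fields), and the function name is hypothetical.

```python
import struct


def parse_strong_encryption_header(payload: bytes) -> dict:
    """Parse the TSize bytes that follow the 0x0017 tag and size."""
    if len(payload) < 8:
        raise ValueError("payload shorter than the four fixed fields")
    fmt, alg_id, bitlen, flags = struct.unpack_from("<HHHH", payload, 0)
    return {
        "format": fmt,
        "alg_id": alg_id,
        "bitlen": bitlen,
        "flags": flags,
        # CertData occupies the remaining TSize-8 bytes.
        "cert_data": payload[8:],
    }
```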
-Record Management Controls (0x0018): |
=================================== |
Value Size Description |
----- ---- ----------- |
(Rec-CTL) 0x0018 2 bytes Tag for this "extra" block type |
CSize 2 bytes Size of total extra block data |
Tag1 2 bytes Record control attribute 1 |
Size1 2 bytes Size of attribute 1, in bytes |
Data1 Size1 Attribute 1 data |
. |
. |
. |
TagN 2 bytes Record control attribute N |
SizeN 2 bytes Size of attribute N, in bytes |
DataN SizeN Attribute N data |
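The Tag/Size/Data triples can be read with a simple loop. This is a
sketch only: the generator name is made up, "payload" is assumed to
be the CSize bytes after the block header, and little-endian order is
assumed because the section does not state it.

```python
import struct


def iter_record_controls(payload: bytes):
    """Yield (tag, data) for each record control attribute in turn."""
    offset = 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated attribute header")
        tag, size = struct.unpack_from("<HH", payload, offset)
        offset += 4
        if offset + size > len(payload):
            raise ValueError("attribute data runs past the block")
        yield tag, payload[offset:offset + size]
        offset += size
```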
-PKCS#7 Encryption Recipient Certificate List (0x0019): |
===================================================== |
This field contains information about each of the certificates |
used in encryption processing and it can be used to identify who is |
allowed to decrypt encrypted files. This field should only appear |
in the archive extra data record. This field is not required and |
serves only to aid archive modifications by preserving public |
encryption key data. Individual security requirements may dictate |
that this data be omitted to deter information exposure. |
Note: all fields stored in Intel low-byte/high-byte order. |
Value Size Description |
----- ---- ----------- |
(CStore) 0x0019 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size of the store data |
TData TSize Data about the store |
TData: |
Value Size Description |
----- ---- ----------- |
Version 2 bytes Format version number - must be 0x0001 at this time |
CStore (var) PKCS#7 data blob |
-MVS Extra Field (PKWARE, 0x0065): |
================================ |
The following is the layout of the MVS "extra" block. |
Note: Some fields are stored in Big Endian format. |
All text is in EBCDIC format unless otherwise specified. |
Value Size Description |
----- ---- ----------- |
(MVS) 0x0065 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size for the following data block |
ID 4 bytes EBCDIC "Z390" 0xE9F3F9F0 or |
"T4MV" for TargetFour |
(var) TSize-4 Attribute data |
-OS/400 Extra Field (0x0065): |
=========================== |
The following is the layout of the OS/400 "extra" block. |
Note: Some fields are stored in Big Endian format. |
All text is in EBCDIC format unless otherwise specified. |
Value Size Description |
----- ---- ----------- |
(OS400) 0x0065 2 bytes Tag for this "extra" block type |
TSize 2 bytes Size for the following data block |
ID 4 bytes EBCDIC "I400" 0xC9F4F0F0 or |
"T4MV" for TargetFour |
(var) TSize-4 Attribute data |
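Because the MVS and OS/400 blocks share tag 0x0065, a reader has to
look at the 4-byte ID to tell them apart. The sketch below decodes the
ID with Python's cp037 EBCDIC codec, under which the byte values quoted
above (0xE9F3F9F0 and 0xC9F4F0F0) read as "Z390" and "I400". It assumes
the ID bytes appear in the order written and that "T4MV" is also stored
in EBCDIC; the function and table names are invented for illustration.

```python
KNOWN_IDS = {
    "Z390": "MVS",
    "I400": "OS/400",
    "T4MV": "TargetFour",  # assumed to be EBCDIC like the other IDs
}


def classify_0x0065(payload: bytes):
    """Return (platform, attribute_data) for a 0x0065 block payload."""
    if len(payload) < 4:
        raise ValueError("payload too short to hold the ID")
    ident = payload[:4].decode("cp037")
    # The remaining TSize-4 bytes are platform-specific attribute data.
    return KNOWN_IDS.get(ident, "unknown"), payload[4:]
```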
-Info-ZIP Unicode Path Extra Field: |
================================= |
Stores the UTF-8 version of the entry path as stored in the |
local header and central directory header. |
(Last Revision 20070912) |
Value Size Description |
----- ---- ----------- |
(UPath) 0x7075 Short tag for this extra block type ("up") |
TSize Short total data size for this block |
Version Byte version of this extra field, currently 1 |
NameCRC32 Long CRC-32 checksum of standard name field |
UnicodeName variable UTF-8 version of the entry file name |
Currently Version is set to the number 1. If there is a need |
to change this field, the version will be incremented. Changes |
may not be backward compatible so this extra field should not be |
used if the version is not recognized. |
The NameCRC32 is the standard zip CRC32 checksum of the File Name |
field in the header. This is used to verify that the header |
File Name field has not changed since the Unicode Path extra field |
was created. This can happen if a utility renames the entry but |
does not update the UTF-8 path extra field. If the CRC check fails, |
this UTF-8 Path Extra Field should be ignored and the File Name field |
in the header should be used instead. |
The UnicodeName is the UTF-8 version of the contents of the File |
Name field in the header, without any trailing NUL. The standard |
name field in the Zip entry header remains filled with the entry |
name coded in the local machine's extended ASCII system charset. |
As UnicodeName is defined to be UTF-8, no UTF-8 byte order mark |
(BOM) is used. The length of this field is determined by |
subtracting the size of the previous fields from TSize. |
If both the File Name and Comment fields are UTF-8, the new General |
Purpose Bit Flag, bit 11 (Language encoding flag (EFS)), should be |
used to indicate that both the header File Name and Comment fields |
are UTF-8 and, in this case, the Unicode Path and Unicode Comment |
extra fields are not needed and should not be created. Note that, |
for backward compatibility, bit 11 should only be used if the native |
character set of the paths and comments being zipped up is already |
UTF-8. The same method, either general purpose bit 11 or extra |
fields, should be used in both the Local and Central Directory Header |
for a file. |
Utilisation rules: |
1. This field shall never be created for names consisting solely of |
7-bit ASCII characters. |
2. On a system that already uses UTF-8 as system charset, this field |
shall not repeat the string pattern already stored in the Zip |
entry's standard name field. Instead, a field of exactly 9 bytes |
(70 75 05 00 01 and 4 bytes CRC) should be created. |
In this form with 5 data bytes, the field serves as indicator |
for the UTF-8 encoding of the standard Zip header's name field. |
3. This field shall not be used whenever the calculated CRC-32 of |
the entry's standard name field does not match the provided |
CRC checksum value. A mismatch of the CRC check indicates that |
the standard name field was changed by some non-"up"-aware |
utility without synchronizing this UTF-8 name extra field block. |
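The version check, the CRC rule, and the 5-data-byte indicator form
described above can be combined in one routine. This is a minimal
sketch rather than the Info-ZIP implementation: the function name is
hypothetical, "up_payload" is assumed to be the TSize bytes after the
tag and size, and cp437 is only a stand-in for "the local machine's
extended ASCII system charset".

```python
import struct
import zlib


def resolve_entry_name(header_name: bytes, up_payload: bytes,
                       fallback_encoding: str = "cp437") -> str:
    """Pick the entry name, honouring a valid Unicode Path field."""
    if len(up_payload) >= 5:
        version, name_crc = struct.unpack_from("<BI", up_payload, 0)
        crc_ok = name_crc == (zlib.crc32(header_name) & 0xFFFFFFFF)
        if version == 1 and crc_ok:
            unicode_name = up_payload[5:]
            # Rule 2: an empty UnicodeName marks the header name itself
            # as UTF-8.
            return (unicode_name or header_name).decode("utf-8")
    # Unknown version or CRC mismatch: ignore the extra field.
    return header_name.decode(fallback_encoding, errors="replace")
```

If a utility renames the entry without updating the extra field, the
CRC no longer matches and the routine falls back to the header name.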
-Info-ZIP Unicode Comment Extra Field: |
==================================== |
Stores the UTF-8 version of the entry comment as stored in the |
central directory header. |
(Last Revision 20070912) |
Value Size Description |
----- ---- ----------- |
(UCom) 0x6375 Short tag for this extra block type ("uc") |
TSize Short total data size for this block |
Version 1 byte version of this extra field, currently 1 |
ComCRC32 4 bytes Comment Field CRC32 Checksum |
UnicodeCom Variable UTF-8 version of the entry comment |
Currently Version is set to the number 1. If there is a need |
to change this field, the version will be incremented. Changes |
may not be backward compatible so this extra field should not be |
used if the version is not recognized. |
The ComCRC32 is the standard zip CRC32 checksum of the Comment |
field in the central directory header. This is used to verify that |
the comment field has not changed since the Unicode Comment extra |
field was created. This can happen if a utility changes the Comment |
field but does not update the UTF-8 Comment extra field. If the CRC |
check fails, this Unicode Comment extra field should be ignored and |
the Comment field in the header used. |
The UnicodeCom field is the UTF-8 version of the entry comment field |
in the header. As UnicodeCom is defined to be UTF-8, no UTF-8 byte |
order mark (BOM) is used. The length of this field is determined by |
subtracting the size of the previous fields from TSize. If both the |
File Name and Comment fields are UTF-8, the new General Purpose Bit |
Flag, bit 11 (Language encoding flag (EFS)), can be used to indicate |
both the header File Name and Comment fields are UTF-8 and, in this |
case, the Unicode Path and Unicode Comment extra fields are not |
needed and should not be created. Note that, for backward |
compatibility, bit 11 should only be used if the native character set |
of the paths and comments being zipped up is already UTF-8. The |
same method, either bit 11 or extra fields, should be used in both |
the local and central directory headers. |
-Extended Timestamp Extra Field: |
============================== |
The following is the layout of the extended-timestamp extra block. |
(Last Revision 19970118) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(time) 0x5455 Short tag for this extra block type ("UT") |
TSize Short total data size for this block |
Flags Byte info bits |
(ModTime) Long time of last modification (UTC/GMT) |
(AcTime) Long time of last access (UTC/GMT) |
(CrTime) Long time of original creation (UTC/GMT) |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(time) 0x5455 Short tag for this extra block type ("UT") |
TSize Short total data size for this block |
Flags Byte info bits (refers to local header!) |
(ModTime) Long time of last modification (UTC/GMT) |
The central-header extra field contains the modification time only, |
or no timestamp at all. TSize is used to flag its presence or |
absence. But note: |
If "Flags" indicates that Modtime is present in the local header |
field, it MUST be present in the central header field, too! |
This correspondence is required because the modification time |
value may be used to support trans-timezone freshening and |
updating operations with zip archives. |
The time values are in standard Unix signed-long format, indicating |
the number of seconds since 1 January 1970 00:00:00. The times |
are relative to Coordinated Universal Time (UTC), also sometimes |
referred to as Greenwich Mean Time (GMT). To convert to local time, |
the software must know the local timezone offset from UTC/GMT. |
The lower three bits of Flags in both headers indicate which time- |
stamps are present in the LOCAL extra field: |
bit 0 if set, modification time is present |
bit 1 if set, access time is present |
bit 2 if set, creation time is present |
bits 3-7 reserved for additional timestamps; not set |
Those times that are present will appear in the order indicated, but |
any combination of times may be omitted. (Creation time may be |
present without access time, for example.) TSize should equal |
(1 + 4*(number of set bits in Flags)), as the block is currently |
defined. Other timestamps may be added in the future. |
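The flag handling above can be sketched as follows. The code assumes
"payload" is the TSize bytes after the tag and size and that the
32-bit time values are stored little-endian; the names are made up for
this example. It stops reading when the data runs out, which covers
the central-header case where Flags still describes the local field.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FLAG_NAMES = ("mtime", "atime", "ctime")  # bits 0, 1 and 2 of Flags


def parse_extended_timestamp(payload: bytes) -> dict:
    """Decode a "UT" block; works for local and central versions."""
    if not payload:
        raise ValueError("missing Flags byte")
    result = {"flags": payload[0]}
    offset = 1
    for bit, name in enumerate(FLAG_NAMES):
        if not payload[0] & (1 << bit):
            continue
        if offset + 4 > len(payload):
            break  # central header: Flags refers to the LOCAL field
        (seconds,) = struct.unpack_from("<i", payload, offset)
        result[name] = EPOCH + timedelta(seconds=seconds)
        offset += 4
    return result
```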
-Info-ZIP Unix Extra Field (type 1): |
================================== |
The following is the layout of the old Info-ZIP extra block for |
Unix. It has been replaced by the extended-timestamp extra block |
(0x5455) and the Unix type 2 extra block (0x7855). |
(Last Revision 19970118) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(Unix1) 0x5855 Short tag for this extra block type ("UX") |
TSize Short total data size for this block |
AcTime Long time of last access (UTC/GMT) |
ModTime Long time of last modification (UTC/GMT) |
UID Short Unix user ID (optional) |
GID Short Unix group ID (optional) |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(Unix1) 0x5855 Short tag for this extra block type ("UX") |
TSize Short total data size for this block |
AcTime Long time of last access (GMT/UTC) |
ModTime Long time of last modification (GMT/UTC) |
The file access and modification times are in standard Unix signed- |
long format, indicating the number of seconds since 1 January 1970 |
00:00:00. The times are relative to Coordinated Universal Time |
(UTC), also sometimes referred to as Greenwich Mean Time (GMT). To |
convert to local time, the software must know the local timezone |
offset from UTC/GMT. The modification time may be used by non-Unix |
systems to support inter-timezone freshening and updating of zip |
archives. |
The local-header extra block may optionally contain UID and GID |
info for the file. The local-header TSize value is the only |
indication of this. Note that Unix UIDs and GIDs are usually |
specific to a particular machine, and they generally require root |
access to restore. |
This extra field type is obsolete, but it has been in use since |
mid-1994. Therefore future archiving software should continue to |
support it. Some guidelines: |
An archive member should either contain the old "Unix1" |
extra field block or the new extra field types "time" and/or |
"Unix2". |
If both the old "Unix1" block type and one or both of the new |
block types "time" and "Unix2" are found, the "Unix1" block |
should be considered invalid and ignored. |
Unarchiving software should recognize both old and new extra |
field block types, but the info from new types overrides the |
old "Unix1" field. |
Archiving software should recognize "Unix1" extra fields for |
timestamp comparison but never create it for updated, freshened |
or new archive members. When copying existing members to a new |
archive, any "Unix1" extra field blocks should be converted to |
the new "time" and/or "Unix2" types. |
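A reader that follows these guidelines might look like the sketch
below. It assumes each archive member's extra fields have already been
split into a mapping from tag to payload (that mapping, and both
function names, are assumptions for illustration) and that values are
little-endian. TSize alone decides whether UID and GID are present.

```python
import struct

UNIX1, TIME, UNIX2 = 0x5855, 0x5455, 0x7855


def parse_unix1(payload: bytes) -> dict:
    """Decode an old "UX" block; UID/GID are optional."""
    if len(payload) < 8:
        raise ValueError("Unix1 block needs at least two time values")
    atime, mtime = struct.unpack_from("<ii", payload, 0)
    result = {"atime": atime, "mtime": mtime, "uid": None, "gid": None}
    if len(payload) >= 12:
        result["uid"], result["gid"] = struct.unpack_from("<HH", payload, 8)
    return result


def drop_obsolete_unix1(blocks: dict) -> dict:
    """Ignore "Unix1" when "time" and/or "Unix2" is also present."""
    if UNIX1 in blocks and (TIME in blocks or UNIX2 in blocks):
        return {tag: data for tag, data in blocks.items() if tag != UNIX1}
    return dict(blocks)
```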
-Info-ZIP UNIX Extra Field (type 2): |
================================== |
The following is the layout of the new Info-ZIP extra block for |
Unix. (Last Revision 19960922) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(Unix2) 0x7855 Short tag for this extra block type ("Ux") |
TSize Short total data size for this block (4) |
UID Short Unix user ID |
GID Short Unix group ID |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(Unix2) 0x7855 Short tag for this extra block type ("Ux") |
TSize Short total data size for this block (0) |
The data size of the central-header version is zero; it is used |
solely as a flag that UID/GID info is present in the local-header |
extra field. If additional fields are ever added to the local |
version, the central version may be extended to indicate this. |
Note that Unix UIDs and GIDs are usually specific to a particular |
machine, and they generally require root access to restore. |
-Info-ZIP New Unix Extra Field: |
==================================== |
Currently stores Unix UIDs/GIDs up to 32 bits. |
(Last Revision 20080509) |
Value Size Description |
----- ---- ----------- |
(UnixN) 0x7875 Short tag for this extra block type ("ux") |
TSize Short total data size for this block |
Version 1 byte version of this extra field, currently 1 |
UIDSize 1 byte Size of UID field |
UID Variable UID for this entry |
GIDSize 1 byte Size of GID field |
GID Variable GID for this entry |
Currently Version is set to the number 1. If there is a need |
to change this field, the version will be incremented. Changes |
may not be backward compatible so this extra field should not be |
used if the version is not recognized. |
UIDSize is the size of the UID field in bytes. This size should |
match the size of the UID field on the target OS. |
UID is the UID for this entry in standard little endian format. |
GIDSize is the size of the GID field in bytes. This size should |
match the size of the GID field on the target OS. |
GID is the GID for this entry in standard little endian format. |
If both the old 16-bit Unix extra field (tag 0x7855, Info-ZIP Unix2) |
and this extra field are present, the values in this extra field |
supersede the values in that extra field. |
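Since UIDSize and GIDSize are carried in the block, a reader does not
need to know the field widths in advance. A minimal sketch, with a
hypothetical function name and "payload" taken to be the TSize bytes
after the tag and size:

```python
def parse_new_unix(payload: bytes):
    """Return (uid, gid) from a "ux" block, or None if unusable."""
    if len(payload) < 3 or payload[0] != 1:
        return None  # unrecognized version: do not use this field
    uid_size = payload[1]
    uid_end = 2 + uid_size
    if uid_end >= len(payload):
        raise ValueError("UID runs past the end of the block")
    gid_size = payload[uid_end]
    gid_end = uid_end + 1 + gid_size
    if gid_end > len(payload):
        raise ValueError("GID runs past the end of the block")
    # Both values are stored in little-endian order.
    uid = int.from_bytes(payload[2:uid_end], "little")
    gid = int.from_bytes(payload[uid_end + 1:gid_end], "little")
    return uid, gid
```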
-ASi UNIX Extra Field: |
==================== |
The following is the layout of the ASi extra block for Unix. The |
local-header and central-header versions are identical. |
(Last Revision 19960916) |
Value Size Description |
----- ---- ----------- |
(Unix3) 0x756e Short tag for this extra block type ("nu") |
TSize Short total data size for this block |
CRC Long CRC-32 of the remaining data |
Mode Short file permissions |
SizDev Long symlink'd size OR major/minor dev num |
UID Short user ID |
GID Short group ID |
(var.) variable symbolic link filename |
Mode is the standard Unix st_mode field from struct stat, containing |
user/group/other permissions, setuid/setgid and symlink info, etc. |
If Mode indicates that this file is a symbolic link, SizDev is the |
size of the file to which the link points. Otherwise, if the file |
is a device, SizDev contains the standard Unix st_rdev field from |
struct stat (includes the major and minor numbers of the device). |
SizDev is undefined in other cases. |
If Mode indicates that the file is a symbolic link, the final field |
will be the name of the file to which the link points. The file- |
name length can be inferred from TSize. |
[Note that TSize may incorrectly refer to the data size not counting |
the CRC; i.e., it may be four bytes too small.] |
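The sketch below checks the CRC, unpacks the fixed fields, and takes
the link target from the tail of the block when Mode marks a symbolic
link. It is an illustration under stated assumptions: the function
name is invented, values are read little-endian, and "payload" is
assumed to hold the complete block data including the CRC, so the
TSize discrepancy mentioned in the note is not handled.

```python
import stat
import struct
import zlib


def parse_asi_unix(payload: bytes) -> dict:
    """Decode an ASi "nu" block after verifying its CRC-32."""
    if len(payload) < 14:
        raise ValueError("ASi block shorter than its fixed fields")
    (crc,) = struct.unpack_from("<I", payload, 0)
    rest = payload[4:]
    if (zlib.crc32(rest) & 0xFFFFFFFF) != crc:
        raise ValueError("CRC mismatch in ASi block")
    mode, sizdev, uid, gid = struct.unpack_from("<HIHH", rest, 0)
    # The symlink target, if any, fills the remainder of the block.
    link = rest[10:] if stat.S_ISLNK(mode) else b""
    return {"mode": mode, "sizdev": sizdev, "uid": uid, "gid": gid,
            "link_target": link}
```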
-BeOS Extra Field: |
================ |
The following is the layout of the file-attributes extra block for |
BeOS. (Last Revision 19970531) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(BeOS) 0x6542 Short tag for this extra block type ("Be") |
TSize Short total data size for this block |
BSize Long uncompressed file attribute data size |
Flags Byte info bits |
(CType) Short compression type |
(CRC) Long CRC value for uncompressed file attribs |
Attribs variable file attribute data |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(BeOS) 0x6542 Short tag for this extra block type ("Be") |
TSize Short total data size for this block (5) |
BSize Long size of uncompr. local EF block data |
Flags Byte info bits |
The least significant bit of Flags in both headers indicates whether |
the LOCAL extra field is uncompressed (and therefore whether CType |
and CRC are omitted): |
bit 0 if set, Attribs is uncompressed (no CType, CRC) |
bits 1-7 reserved; if set, assume error or unknown data |
Currently the only supported compression types are deflated (type 8) |
and stored (type 0); the latter is not used by Info-ZIP's Zip but is |
supported by UnZip. |
Attribs is a BeOS-specific block of data in big-endian format with |
the following structure (if compressed, uncompress it first): |
Value Size Description |
----- ---- ----------- |
Name variable attribute name (null-terminated string) |
Type Long attribute type (32-bit unsigned integer) |
Size Long Long data size for this sub-block (64 bits) |
Data variable attribute data |
The attribute structure is repeated for every attribute. The Data |
field may contain anything--text, flags, bitmaps, etc. |
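Once Attribs has been uncompressed, the repeated Name/Type/Size/Data
structure can be walked as in the sketch below. The function name and
the byteorder parameter are assumptions made for this example; ">"
selects the big-endian layout described here. Attribute names are
decoded as UTF-8, which is also an assumption.

```python
import struct


def parse_attribs(blob: bytes, byteorder: str = ">") -> list:
    """Return a list of (name, type, data) tuples from Attribs."""
    attrs = []
    pos = 0
    while pos < len(blob):
        end = blob.index(b"\0", pos)  # Name is null-terminated
        name = blob[pos:end].decode("utf-8", errors="replace")
        pos = end + 1
        # Type is 32 bits, Size is 64 bits.
        attr_type, size = struct.unpack_from(byteorder + "IQ", blob, pos)
        pos += 12
        data = blob[pos:pos + size]
        if len(data) != size:
            raise ValueError("attribute data is truncated")
        attrs.append((name, attr_type, data))
        pos += size
    return attrs
```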
-AtheOS Extra Field: |
================== |
The following is the layout of the file-attributes extra block for |
AtheOS. This field is a very close spin-off from the BeOS extra field. |
The only differences are: |
- a new extra field signature |
- numeric fields in the attribute data are stored in little-endian |
format ("i386" was the initial hardware for AtheOS) |
(Last Revision 20040908) |
Local-header version: |
Value Size Description |
----- ---- ----------- |
(AtheOS) 0x7441 Short tag for this extra block type ("At") |
TSize Short total data size for this block |
BSize Long uncompressed file attribute data size |
Flags Byte info bits |
(CType) Short compression type |
(CRC) Long CRC value for uncompressed file attribs |
Attribs variable file attribute data |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(AtheOS) 0x7441 Short tag for this extra block type ("At") |
TSize Short total data size for this block (5) |
BSize Long size of uncompr. local EF block data |
Flags Byte info bits |
The least significant bit of Flags in both headers indicates whether |
the LOCAL extra field is uncompressed (and therefore whether CType |
and CRC are omitted): |
bit 0 if set, Attribs is uncompressed (no CType, CRC) |
bits 1-7 reserved; if set, assume error or unknown data |
Currently the only supported compression types are deflated (type 8) |
and stored (type 0); the latter is not used by Info-ZIP's Zip but is |
supported by UnZip. |
Attribs is an AtheOS-specific block of data in little-endian format |
with the following structure (if compressed, uncompress it first): |
Value Size Description |
----- ---- ----------- |
Name variable attribute name (null-terminated string) |
Type Long attribute type (32-bit unsigned integer) |
Size Long Long data size for this sub-block (64 bits) |
Data variable attribute data |
The attribute structure is repeated for every attribute. The Data |
field may contain anything--text, flags, bitmaps, etc. |
-SMS/QDOS Extra Field: |
==================== |
The following is the layout of the file-attributes extra block for |
SMS/QDOS. The local-header and central-header versions are identical. |
(Last Revision 19960929) |
Value Size Description |
----- ---- ----------- |
(QDOS) 0xfb4a Short tag for this extra block type |
TSize Short total data size for this block |
LongID Long extra-field signature |
(ExtraID) Long additional signature/flag bytes |
QDirect 64 bytes qdirect structure |
LongID may be "QZHD" or "QDOS". In the latter case, ExtraID will |
be present. Its first three bytes are "02\0"; the last byte is |
currently undefined. |
QDirect contains the file's uncompressed directory info (qdirect |
struct). Its elements are in native (big-endian) format: |
d_length beLong file length |
d_access byte file access type |
d_type byte file type |
d_datalen beLong data length |
d_reserved beLong unused |
d_szname beShort size of filename |
d_name 36 bytes filename |
d_update beLong time of last update |
d_refdate beLong file version number |
d_backup beLong time of last backup (archive date) |
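The ten elements listed above add up to exactly 64 bytes, which a
fixed big-endian struct layout can capture. The sketch below is an
illustration only: the helper names are invented, the beLong values
are read as unsigned, and "payload" is assumed to be the TSize bytes
after the tag and size.

```python
import struct
from collections import namedtuple

QDirect = namedtuple(
    "QDirect",
    "d_length d_access d_type d_datalen d_reserved d_szname "
    "d_name d_update d_refdate d_backup",
)
QDIRECT = struct.Struct(">IBBIIH36sIII")  # 64 bytes, big-endian


def parse_qdirect(raw: bytes) -> QDirect:
    """Unpack a 64-byte qdirect structure."""
    if len(raw) != QDIRECT.size:
        raise ValueError("qdirect must be exactly 64 bytes")
    q = QDirect._make(QDIRECT.unpack(raw))
    # Only the first d_szname bytes of d_name are meaningful.
    return q._replace(d_name=q.d_name[:q.d_szname])


def parse_qdos(payload: bytes):
    """Return (LongID, QDirect); ExtraID is skipped for "QDOS"."""
    long_id = payload[:4]
    offset = 8 if long_id == b"QDOS" else 4
    return long_id, parse_qdirect(payload[offset:offset + 64])
```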
-AOS/VS Extra Field: |
================== |
The following is the layout of the extra block for Data General |
AOS/VS. The local-header and central-header versions are identical. |
(Last Revision 19961125) |
Value Size Description |
----- ---- ----------- |
(AOSVS) 0x5356 Short tag for this extra block type ("VS") |
TSize Short total data size for this block |
"FCI\0" Long extra-field signature |
Version Byte version of AOS/VS extra block (10 = 1.0) |
Fstat variable fstat packet |
AclBuf variable raw ACL data ($MXACL bytes) |
Fstat contains the file's uncompressed fstat packet, which is one of |
the following: |
normal fstat packet (P_FSTAT struct) |
DIR/CPD fstat packet (P_FSTAT_DIR struct) |
unit (device) fstat packet (P_FSTAT_UNIT struct) |
IPC file fstat packet (P_FSTAT_IPC struct) |
AclBuf contains the raw ACL data; its length is $MXACL. |
-Tandem NSK Extra Field: |
====================== |
The following is the layout of the file-attributes extra block for |
Tandem NSK. The local-header and central-header versions are |
identical. (Last Revision 19981221) |
Value Size Description |
----- ---- ----------- |
(TA) 0x4154 Short tag for this extra block type ("TA") |
TSize Short total data size for this block (20) |
NSKattrs 20 Bytes NSK attributes |
-THEOS Extra Field: |
================= |
The following is the layout of the file-attributes extra block for |
Theos. The local-header and central-header versions are identical. |
(Last Revision 19990206) |
Value Size Description |
----- ---- ----------- |
(Theos) 0x6854 Short 'Th' signature |
size Short size of extra block |
flags Byte reserved for future use |
filesize Long file size |
fileorg Byte type of file (see below) |
keylen Short key length for indexed and keyed files, |
data segment size for 16 bits programs |
reclen Short record length for indexed,keyed and direct, |
text segment size for 16 bits programs |
filegrow Byte growing factor for indexed,keyed and direct |
protect Byte protections (see below) |
reserved Short reserved for future use |
File types |
========== |
0x80 library (keyed access list of files) |
0x40 directory |
0x10 stream file |
0x08 direct file |
0x04 keyed file |
0x02 indexed file |
0x0e reserved |
0x01 16 bits real mode program (obsolete) |
0x21 16 bits protected mode program |
0x41 32 bits protected mode program |
Protection codes |
================ |
User protection |
--------------- |
0x01 non readable |
0x02 non writable |
0x04 non executable |
0x08 non erasable |
Other protection |
---------------- |
0x10 non readable |
0x20 non writable |
0x40 non executable Theos before 4.0 |
0x40 modified Theos 4.x |
0x80 not hidden |
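The protection byte is a plain bit mask, so decoding it is a matter of
testing each bit from the two tables above. In this sketch the
function name, the label strings, and the theos4 switch (used to pick
the meaning of 0x40) are made up for illustration.

```python
USER_BITS = {
    0x01: "user: non readable",
    0x02: "user: non writable",
    0x04: "user: non executable",
    0x08: "user: non erasable",
}
OTHER_BITS = {
    0x10: "other: non readable",
    0x20: "other: non writable",
    0x80: "not hidden",
}


def decode_protect(protect: int, theos4: bool = True) -> list:
    """List the protection codes set in a Theos "protect" byte."""
    table = {**USER_BITS, **OTHER_BITS}
    names = [label for bit, label in table.items() if protect & bit]
    if protect & 0x40:
        # 0x40 changed meaning between Theos releases.
        names.append("modified" if theos4 else "other: non executable")
    return names
```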
-THEOS old unofficial Extra Field: |
================================ |
The following is the layout of an unofficial former version of the |
Theos file-attributes extra block. This layout was never published |
and is no longer created. However, UnZip can optionally support it |
when compiling with the option flag OLD_THEOS_EXTRA defined. |
Both the local-header and central-header versions are identical. |
(Last Revision 19990206) |
Value Size Description |
----- ---- ----------- |
(THS0) 0x4854 Short 'TH' signature |
size Short size of extra block |
flags Short reserved for future use |
filesize Long file size |
reclen Short record length for indexed,keyed and direct, |
text segment size for 16 bits programs |
keylen Short key length for indexed and keyed files, |
data segment size for 16 bits programs |
filegrow Byte growing factor for indexed,keyed and direct |
reserved 3 Bytes reserved for future use |
-FWKCS MD5 Extra Field (0x4b46): |
============================== |
The FWKCS Contents_Signature System, used in automatically |
identifying files independent of filename, optionally adds |
and uses an extra field to support the rapid creation of |
an enhanced contents_signature. |
There is no local-header version; the following applies |
only to the central header. (Last Revision 19961207) |
Central-header version: |
Value Size Description |
----- ---- ----------- |
(MD5) 0x4b46 Short tag for this extra block type ("FK") |
TSize Short total data size for this block (19) |
"MD5" 3 bytes extra-field signature |
MD5hash 16 bytes 128-bit MD5 hash of uncompressed data |
(low byte first) |
When FWKCS revises a .ZIP file central directory to add |
this extra field for a file, it also replaces the |
central directory entry for that file's uncompressed |
file length with a measured value. |
FWKCS provides an option to strip this extra field, if |
present, from a .ZIP file central directory. In adding |
this extra field, FWKCS preserves .ZIP file Authenticity |
Verification; if stripping this extra field, FWKCS |
preserves all versions of AV through PKZIP version 2.04g. |
FWKCS, and FWKCS Contents_Signature System, are |
trademarks of Frederick W. Kantor. |
(1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer |
Science and RSA Data Security, Inc., April 1992. |
ll.76-77: "The MD5 algorithm is being placed in the |
public domain for review and possible adoption as a |
standard." |
-Microsoft Open Packaging Growth Hint (0xa220): |
============================================= |
Value Size Description |
----- ---- ----------- |
0xa220 Short tag for this extra block type |
TSize Short size of Sig + PadVal + Padding |
Sig Short verification signature (A028) |
PadVal Short Initial padding value |
Padding variable filled with NULL characters |
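A writer could assemble this block as in the sketch below. The
function name and its parameters are hypothetical, little-endian
storage is assumed, and the meaning of PadVal is not defined here, so
the caller must supply it.

```python
import struct


def build_growth_hint(pad_val: int, padding_len: int) -> bytes:
    """Build a complete 0xa220 block, including tag and TSize."""
    # TSize covers Sig + PadVal + Padding.
    body = struct.pack("<HH", 0xA028, pad_val) + b"\0" * padding_len
    return struct.pack("<HH", 0xA220, len(body)) + body
```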
/programs/fs/unzip60/proginfo/fileinfo.cms |
---|
0,0 → 1,231 |
[Quoting from a C/370 manual, courtesy of Carl Forde.] |
C/370 supports three types of input and output: text streams, binary |
streams, and record I/O. Text and binary streams are both ANSI |
standards; record I/O is a C/370 extension. |
[...] |
Record I/O is a C/370 extension to the ANSI standard. For files |
opened in record format, C/370 reads and writes one record at a |
time. If you try to write more data to a record than the record |
can hold, the data is truncated. For record I/O, C/370 only allows |
the use of fread() and fwrite() to read and write to the files. Any |
other functions (such as fprintf(), fscanf(), getc(), and putc()) |
fail. For record-orientated files, records do not change size when |
you update them. If the new data has fewer characters than the |
original record, the new data fills the first n characters, where |
n is the number of characters of the new data. The record will |
remain the same size, and the old characters (those after n) are |
left unchanged. A subsequent update begins at the next boundary. |
For example, if you have the string "abcdefgh": |
abcdefgh |
and you overwrite it with the string "1234", the record will look |
like this: |
1234efgh |
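The same behaviour can be modelled in a few lines of Python. This is
only a simulation of the update rule quoted above, with a made-up
function name; it does not call any C/370 interface.

```python
def update_record(record: bytes, new_data: bytes) -> bytes:
    """Overwrite the start of a fixed-size record with new_data."""
    # Data beyond the record length is truncated.
    new_data = new_data[:len(record)]
    # The record keeps its size; trailing old bytes are left unchanged.
    return new_data + record[len(new_data):]
```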
C/370 record I/O is binary. That is, it does not interpret any of |
the data in a record file and therefore does not recognize control |
characters. |
The record model consists of: |
* A record, which is the unit of data transmitted to and from a |
program |
* A block, which is the unit of data transmitted to and from a |
device. Each block may contain one or more records. |
In the record model of I/O, records and blocks have the following |
attributes: |
RECFM Specifies the format of the data or how the data is organized |
on the physical device. |
LRECL Specifies the length of logical records (as opposed to |
physical ones). |
BLKSIZE Specifies the length of physical records (blocks on the |
physical device). |
Opening a File by Filename |
The filename that you specify on the call to fopen() or freopen() |
must be in the following format: |
   >> ----filename----     ----filetype----------------------- |
                     |     |            |                   | |
                      --.--              --     --filemode-- |
                                          |     | |
                                           --.-- |
where |
filename is a 1- to 8-character string of any of the characters, |
A-Z, a-z, 0-9, and +, -, $, #, @, :, and _. You can separate it |
from the filetype with one or more spaces, or with a period. |
[Further note: filenames are fully case-sensitive, as in Unix.] |
filetype is a 1- to 8-character string of any of the characters, |
A-Z, a-z, 0-9, and +, -, $, #, @, :, and _. You can separate it |
from the filemode with one or more spaces, or with a period. The |
separator between filetype and filemode must be the same as the |
one between filename and filetype. |
filemode is a 1- to 2-character string. The first must be any of |
the characters A-Z, a-z, or *. If you use the asis parameter on |
the fopen() or freopen() call, the first character of the filemode |
must be a capital letter or an asterisk. Otherwise, the function |
call fails. The second character of filemode is optional; if you |
specify it, it must be any of the digits 0-6. You cannot specify |
the second character if you have specified * for the first one. |
If you do not use periods as separators, there is no limit to how |
much whitespace you can have before and after the filename, the |
filetype, and filemode. |
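The naming rules above can be expressed as two patterns, one for
period separators and one for whitespace separators, which also
enforces that both separators are of the same kind. This is a sketch
with invented names; it does not check the extra capital-letter
requirement that applies when the asis parameter is used.

```python
import re

NAME = r"[A-Za-z0-9+\-$#@:_]{1,8}"   # filename and filetype
MODE = r"(?:\*|[A-Za-z][0-6]?)"      # no digit may follow "*"
DOTTED = re.compile(rf"({NAME})\.({NAME})(?:\.({MODE}))?")
SPACED = re.compile(rf"\s*({NAME})\s+({NAME})(?:\s+({MODE}))?\s*")


def split_cms_name(spec: str):
    """Return (filename, filetype, filemode) or None if invalid."""
    for pattern in (DOTTED, SPACED):
        match = pattern.fullmatch(spec)
        if match:
            return match.groups()
    return None
```

An omitted filemode comes back as None in the third position.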
Opening a File without a File Mode Specified |
If you omit the file mode or specify * for it, C/370 does one |
of the following when you call fopen() or freopen(): |
* If you have specified a read mode, C/370 looks for the named file |
on all the accessed readable disks, in order. If it does not find |
the file, the fopen() or freopen() call fails. |
* If you have specified any of the write modes, C/370 writes the file |
on the first writable disk you have accessed. Specifying a write |
mode on an fopen() or freopen() call that contains the filename of |
an existing file destroys that file. If you do not have any |
writable disks accessed, the call fails. |
fopen() and freopen() parameters |
recfm |
CMS supports only two RECFMs, V and F. [note that MVS supports |
27(!) different RECFMs.] If you do not specify the RECFM for a |
file, C/370 determines whether it is in fixed or variable format. |
lrecl and blksize |
For files in fixed format, CMS allows records to be read and |
written in blocks. To have a fixed format CMS file treated as a |
fixed blocked CMS file, you can open the file with recfm=fb and |
specify the lrecl and blksize. If you do not specify a recfm on |
the open, the blksize can be a multiple of the lrecl, and the |
file is treated as if it were blocked. |
For files in variable format, the CMS LRECL is different from the |
LRECL for the record model. In the record model, the LRECL is |
equal to the data length plus 4 bytes (for the record descriptor |
word), and the BLKSIZE is equal to the LRECL plus 4 bytes (for |
the block descriptor word). In CMS, BDWs and RDWs do not exist, |
but because CMS follows the record model, you must still account |
for them. When you specify V, you must still allocate the record |
descriptor word and block descriptor word. That is, if you want |
a maximum of n bytes per record, you must specify a minimum LRECL |
of n+4 and a minimum BLKSIZE of n+8. |
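The n+4 / n+8 arithmetic can be written down as a short sketch. The helper names are hypothetical, and the layout of the mode string (comma-separated keyword=value pairs after the open mode) is an assumption modeled on the recfm=fb keyword mentioned above; check the compiler documentation before relying on it.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define RDW_BYTES 4UL   /* record descriptor word */
#define BDW_BYTES 4UL   /* block descriptor word */

/* Minimum LRECL for variable-format records holding up to n data bytes. */
unsigned long v_min_lrecl(unsigned long n)
{
    assert(n <= ULONG_MAX - RDW_BYTES - BDW_BYTES);   /* avoid wrap-around */
    return n + RDW_BYTES;
}

/* Minimum BLKSIZE: one maximum-length record plus the BDW. */
unsigned long v_min_blksize(unsigned long n)
{
    return v_min_lrecl(n) + BDW_BYTES;
}

/* Build a mode string for fopen(); the exact keyword spelling is assumed.
 * Returns the snprintf() result (length of the formatted string). */
int v_open_mode(char *buf, size_t len, unsigned long n)
{
    assert(buf != NULL);
    return snprintf(buf, len, "wb, recfm=v, lrecl=%lu, blksize=%lu",
                    v_min_lrecl(n), v_min_blksize(n));
}
```

For records of at most 80 data bytes this yields lrecl=84 and blksize=88.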
When you are appending to V files, you can enlarge the record size |
dynamically, but only if you have not specified LRECL or BLKSIZE |
on the fopen() or freopen() command that opened the file. |
type |
If you specify this parameter, the only valid value for CMS disk |
files is type=record. This opens a file for record I/O. |
asis |
If you use this parameter, you can open files with mixed-case |
filenames such as JaMeS dAtA or pErCy.FILE. If you specify this |
parameter, the file mode that you specify must be a capital letter |
(if it is not an asterisk); otherwise, the function call fails and |
the value returned is NULL. |
Reading from Record I/O Files |
fread() is the only interface allowed for reading record I/O files. |
Each time you call fread() for a record I/O file, fread() reads |
one record from the system. If you call fread() with a request for |
less than a complete record, the requested bytes are copied to your |
buffer, and the file position is set to the start of the next |
record. If the request is for more bytes than are in the record, |
one record is read and the position is set to the start of the next |
record. C/370 does not strip any blank characters or interpret any |
data. |
fread() returns the number of items read successfully, so if you |
pass a size argument equal to 1 and a count argument equal to the |
maximum expected length of the record, fread() returns the length, |
in bytes, of the record read. If you pass a size argument equal |
to the maximum expected length of the record, and a count argument |
equal to 1, fread() returns either 0 or 1, indicating whether a |
record of length size was read. If a record is read successfully but |
is less than size bytes long, fread() returns 0. |
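The return-value rule can be modeled without any CMS file at hand. The function below is a hypothetical model of what fread() would report for a single record under the rules described above; it performs no I/O and is not part of C/370.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the item count fread(buf, size, count, fp) reports when the
 * next record in a record I/O file is record_len bytes long. */
size_t record_fread_items(size_t size, size_t count, size_t record_len)
{
    size_t request, copied;

    if (size == 0 || count == 0)
        return 0;
    assert(count <= (size_t)-1 / size);    /* size * count must not wrap */
    request = size * count;
    /* At most one record is read, and never more than was requested. */
    copied = record_len < request ? record_len : request;
    return copied / size;                  /* only complete items count */
}
```

For an 80-byte record, a call with size 1 and count 133 reports 80, while a call with size 133 and count 1 reports 0 even though the record was read.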
Writing to Record I/O Files |
fwrite() is the only interface allowed for writing to a file |
opened for record I/O. Only one record is written at a time. If |
you attempt to write more new data than a full record can hold or |
try to update a record with more data than it currently has, C/370 |
truncates your output at the record boundary. When C/370 performs |
a truncation, it sets errno and raises SIGIOERR, if SIGIOERR is not |
set to SIG_IGN. |
When you are writing new records to a fixed-record I/O file, if you |
try to write a short record, C/370 pads the record with nulls out |
to LRECL. |
At the completion of an fwrite(), the file position is at the start |
of the next record. For new data, the block is flushed out to the |
system as soon as it is full. |
fldata() Behavior |
When you call the fldata() function for an open CMS minidisk file, |
it returns a data structure that looks like this: |
struct __filedata { |
unsigned int __recfmF : 1, /* fixed length records */ |
__recfmV : 1, /* variable length records */ |
__recfmU : 1, /* n/a */ |
__recfmS : 1, /* n/a */ |
__recfmBlk : 1, /* n/a */ |
__recfmASA : 1, /* text mode and ASA */ |
__recfmM : 1, /* n/a */ |
__dsorgPO : 1, /* n/a */ |
__dsorgPDSmem : 1, /* n/a */ |
__dsorgPDSdir : 1, /* n/a */ |
__dsorgPS : 1, /* sequential data set */ |
__dsorgConcat : 1, /* n/a */ |
__dsorgMem : 1, /* n/a */ |
__dsorgHiper : 1, /* n/a */ |
__dsorgTemp : 1, /* created with tmpfile() */ |
__dsorgVSAM : 1, /* n/a */ |
__reserve1 : 1, /* n/a */ |
__openmode : 2, /* see below 1 */ |
__modeflag : 4, /* see below 2 */ |
__reserve2 : 9; /* n/a */ |
char __device; /* __DISK */ |
unsigned long __blksize, /* see below 3 */ |
__maxreclen; /* see below 4 */ |
unsigned short __vsamtype; /* n/a */ |
unsigned long __vsamkeylen; /* n/a */ |
unsigned long __vsamRKP; /* n/a */ |
char * __dsname; /* fname ftype fmode */ |
unsigned int __reserve4; /* n/a */ |
/* note 1: values are: __TEXT, __BINARY, __RECORD |
note 2: values are: __READ, __WRITE, __APPEND, __UPDATE |
these values can be added together to determine |
the return value; for example, a file opened with |
a+ will have the value __READ + __APPEND. |
note 3: total block size of the file, including ASA |
characters as well as RDW information |
note 4: maximum record length of the data only (includes |
ASA characters but excludes RDW information). |
*/ |
}; |
/programs/fs/unzip60/proginfo/nt.sd |
---|
0,0 → 1,111 |
Info-ZIP portable Zip/UnZip Windows NT security descriptor support |
================================================================== |
Scott Field (sfield@microsoft.com), 8 October 1996 |
This version of Info-ZIP's Win32 code allows for processing of Windows |
NT security descriptors if they were saved in the .zip file using the |
appropriate Win32 Zip running under Windows NT. This also requires |
that the file system that Zip/UnZip operates on supports persistent |
Acl storage. When the operating system is not Windows NT or the |
target file system does not support persistent Acl storage, no security |
descriptor processing takes place. |
A Windows NT security descriptor consists of any combination of the |
following components: |
an owner (Sid) |
a primary group (Sid) |
a discretionary ACL (Dacl) |
a system ACL (Sacl) |
qualifiers for the preceding items |
By default, Zip will save all aspects of the security descriptor except |
for the Sacl. The Sacl contains information pertaining to auditing of |
the file, and requires a security privilege be granted to the calling |
user in addition to being enabled by the calling application. In order |
to save the Sacl during Zip, the user must specify the -! switch on the |
Zip commandline. The user must also be granted either the SeBackupPrivilege |
"Backup files and directories" or the SeSystemSecurityPrivilege "Manage |
auditing and security log". |
By default, UnZip will not restore any aspects of the security descriptor. |
If the -X option is specified to UnZip, the Dacl is restored to the file. |
The other items in the security descriptor on the new file will receive |
default values. If the -XX option is specified to UnZip, as many aspects |
of the security descriptor as possible will be restored. If the calling |
user is granted the SeRestorePrivilege "Restore files and directories", |
all aspects of the security descriptor will be restored. If the calling |
user is only granted the SeSystemSecurityPrivilege "Manage auditing and |
security log", only the Dacl and Sacl will be restored to the new file. |
Note that when operating on files that reside on remote volumes, the |
privileges specified above must be granted to the calling user on that |
remote machine. Currently, there is no way to directly test what privileges |
are present on a remote machine, so Zip and UnZip make a remote privilege |
determination based on an indirect method. |
UnZip considerations |
-------------------- |
In order for file security to be processed correctly, any directory entries |
that have a security descriptor will be processed at the end of the unzip |
cycle. This allows for unzip to process files within the newly created |
directory regardless of the security descriptor associated with the directory |
entry. This also prevents security inheritance problems that can occur as |
a result of creating a new directory and then creating files in that directory |
that will inherit parent directory permissions; such inherited permissions may |
prevent the security descriptor taken from the zip file from being applied |
to the new file. |
If directories exist which match directory/extract paths in the .zip file, |
file security is not updated on the target directory. It is assumed that if |
the target directory already exists, then appropriate security has already |
been applied to that directory. |
"unzip -t" will test the integrity of stored security descriptors when |
present and the operating system is Windows NT. |
ZipInfo (unzip -Z) will display information on stored security descriptors |
when "unzip -Zv" is specified. |
Potential uses |
============== |
The obvious use for this new support is to better support backup and restore |
operations in a Windows NT environment where NTFS file security is utilized. |
This allows individuals and organizations to archive files in a portable |
fashion and transport these files across the organization. |
Another potential use of this support is setup and installation. This |
allows for distribution of Windows NT based applications that have preset |
security on files and directories. For example, prior to creation of the |
.zip file, the user can set file security via File Manager or Explorer on |
the files to be contained in the .zip file. In many cases, it is appropriate |
to only grant Everyone Read access to .exe and .dll files, while granting |
Administrators Full control. Using this support in conjunction with the |
unzipsfx.exe self-extractor stub can yield a useful and powerful way to |
install software with preset security (note that -X or -XX should be |
specified on the self-extractor commandline). |
When creating .zip files with security which are intended for transport |
across systems, it is important to take into account the relevance of |
access control entries and the associated Sid of each entry. For example, |
if a .zip file is created on a Windows NT workstation, and file security |
references local workstation user accounts (like an account named Fred), |
this access entry will not be relevant if the .zip file is transported to |
another machine. Where possible, take advantage of the built-in well-known |
groups, like Administrators, Everyone, Network, Guests, etc. These groups |
have the same meaning on any Windows NT machine. Note that the names of |
these groups may differ depending on the language of the installed Windows |
NT, but this isn't a problem since each name has a well-known ID that, upon |
restore, translates to the correct group name regardless of locale. |
When access control entries contain Sid entries that reference Domain |
accounts, these entries will only be relevant on systems that recognize |
the referenced domain. Generally speaking, the only side effects of |
irrelevant access control entries are wasted space in the stored security |
descriptor and loss of complete intended access control. Such irrelevant |
access control entries will show up as "Account Unknown" when viewing file |
security with File Manager or Explorer. |
/programs/fs/unzip60/proginfo/perform.dos |
---|
0,0 → 1,183 |
Date: Wed, 27 Mar 1996 01:31:50 CET +0100 |
From: Christian Spieler (IKDA, THD, D-64289 Darmstadt) |
Subject: More detailed comparison of MSDOS Info-ZIP programs' performance |
Hello all, |
In response to some additional questions and requests concerning |
my previous message about DOS performance of 16/32-bit Info-ZIP programs, |
I have produced a more detailed comparison: |
System: |
Cx486DX-40, VL-bus, 8MB; IDE hard disk; |
DOS 6.2, HIMEM, EMM386 NOEMS NOVCPI, SMARTDRV 3MB, write back. |
I have used the main directory of UnZip 5.20p as source, including the |
objects and executable of an EMX compile for unzip.exe (to supply some |
binary test files). |
Tested programs were (my current updated sources!) Zip 2.0w and UnZip 5.20p |
- 16-bit MSC 5.1, compressed with LZEXE 0.91e |
- 32-bit Watcom C 10.5, as supplied by Kai Uwe Rommel (PMODE 1.22) |
- 32-bit EMX 0.9b |
- 32-bit DJGPP v2 |
- 32-bit DJGPP v1.12m4 |
The EMX and DJ1 (GO32) executables were bound with the full extender, to |
create standalone executables. |
A) Tests of Zip |
Command : "<system>\zip.exe -q<#> tes.zip unz/*" (unz/*.* for Watcom!!) |
where <#> was: 0, 1, 6, 9. |
The test archive "tes.zip" was never deleted, this test |
measured "time to update archive". |
The following table contains average execution seconds (averaged over |
at least 3 runs, with the first run discarded to fill disk cache); |
numbers in parentheses specify the standard deviation of the last |
digits. |
cmpr level| 0 | 1 | 6 | 9 |
=============================================================== |
EMX win95 | 7.77 | 7.97 | 12.82 | 22.31 |
--------------------------------------------------------------- |
EMX | 7.15(40) | 8.00(6) | 12.52(25) | 20.93 |
DJ2 | 13.50(32) | 14.20(7) | 19.05 | 28.48(9) |
DJ1 | 13.56(30) | 14.48(3) | 18.70 | 27.43(13) |
WAT | 6.94(22) | 8.93 | 15.73(34) | 30.25(6) |
MSC | 5.99(82) | 9.40(4) | 13.59(9) | 20.77(4) |
=============================================================== |
The "EMX win95" line was created for comparison, to check the performance |
of emx 0.9 with the RSX extender in a DPMI environment. (This line was |
produced by applying the "stubbed" EMX executable in a full screen DOS box.) |
B) Tests of UnZip |
Commands : <system>\unzip.exe -qt tes.zip (testing performance) |
<system>\unzip.exe -qo tes.zip -dtm (extracting performance) |
The tes.zip archive created by maximum compression with the Zip test |
was used as example archive. Contents (archive size was 347783 bytes): |
1028492 bytes uncompressed, 337235 bytes compressed, 67%, 85 files |
The extraction directory tm was not deleted between the individual runs, |
thus this measurement checks the "overwrite all" time. |
| testing | extracting |
=================================================================== |
EMX | 1.98 | 6.43(8) |
DJ2 | 2.09 | 11.85(39) |
DJ1 | 2.09 | 7.46(9) |
WAT | 2.42 | 7.10(27) |
MSC | 4.94 | 9.57(31) |
Remarks: |
The executables compiled by me were generated with all "performance" |
options enabled (ASM_CRC, and ASMV for Zip), and with full crypt support. |
For DJ1 and DJ2, the GCC options were "-O2 -m486", for EMX "-O -m486". |
The Watcom UnZip was compiled with ASM_CRC code enabled as well, |
but the Watcom Zip example was made without any optional assembler code! |
Discussion of the results: |
In overall performance, the EMX executables clearly win. |
For UnZip, emx is by far the fastest program, and the Zip performance is |
comparable to the 16-bit "reference". |
Whenever "real" work including I/O is requested, the DJGPP versions |
lose badly because of poor I/O performance; this is the case especially |
for the "newer" DJGPP v2 !!! |
(I tried to tweak with the transfer buffer size, but without any success.) |
An interesting result is that DJ v1 UnZip works remarkably better than |
DJ v2 (in contrast to Zip, where both executables' performance is |
approximately equal). |
The Watcom C programs show a clear performance deficit in the "computational |
part" (Watcom C compiler produces code that is far from optimal), but |
the extender (which is mostly responsible for the I/O throughput) seems |
to be quite fast. |
The "natural" performance deficit of the 16-bit MSC code, which can be |
clearly seen in the "testing task" comparison for UnZip, is (mostly, |
for Zip more than) compensated by the better I/O throughput (due to the |
"direct interface" between "C RTL" and "DOS services", without any mode |
switching). |
But performance is only one aspect when choosing which compiler should |
be used for official distribution: |
Sizes of the executables: |
| Zip || UnZip |
| standalone stub || standalone | stub |
====================================================================== |
EMX | 143,364 (1) | 94,212 || 159,748 (1) | 110,596 |
DJ2 | 118,272 (2) | -- || 124,928 (2) | -- |
DJ1 | 159,744 | 88,064 || 177,152 | 105,472 |
WAT | 140,073 | -- || 116,231 | -- |
MSC | 49,212 (3) | -- || 45,510 (3) | -- |
(1) does not run in "DPMI only" environment (Windows DOS box) |
(2) requires externally supplied DPMI server |
(3) compressed with LZexe 0.91 |
Caveats/Bugs/Problems of the different extenders: |
EMX: |
- requires two different extenders to run in all DOS-compatible environments, |
EMX for "raw/himem/vcpi" and RSX for "dpmi" (Windows). |
- does not properly support time zones (no daylight savings time) |
DJv2: |
- requires an external (freely available) DPMI extender when run on plain |
DOS; this extender cannot (currently ??) be bound into the executable. |
DJv1: |
- uses up large amount of "low" dos memory (below 1M) when spawning |
another program, each instance of a DJv1 program requires its private |
GO32 extender copy in low dos memory (may be problem for the zip |
"-T" feature) |
Watcom/PMODE: |
- extended memory is allocated statically (default: ALL available memory) |
This means that a spawned program does not get any extended memory. |
You can work around this problem by setting a hard limit on the amount |
of extended memory available to the PMODE program, but this limit is |
"hard" and restricts the allocatable memory for the program itself. |
In detail: |
The Watcom zip.exe as distributed did not allow the "zip -T" feature; |
there was no extended memory left to spawn unzip. |
I could work around this problem by applying PMSETUP to change the |
amount of allocated extended memory to 2.0 MByte (I had 4MB free extended |
memory on my test system). But, this limit cannot be enlarged at |
runtime, when zip needs more memory to store "header info" while |
zipping up a huge drive, and on a system with less free memory, this |
method is not applicable, either. |
Summary: |
For Zip: |
Use the 16-bit executable whenever possible (unless you need the |
larger memory capabilities when zipping up a huge amount of files) |
As 32-bit executable, we may distribute Watcom C (after we have confirmed |
that enabling ASMV and ASM_CRC gives us some better computational |
performance.) |
The alternative for 32-bit remains DJGPP v1, which shows the least problems |
(to my knowledge); v2 and EMX cannot be used because of their lack of |
"universality". |
For UnZip: |
Here, the Watcom C 32-bit executable is probably the best compromise, |
but DJ v1 could be used as well. |
And, after all, the 16-bit version does not lose badly when doing |
"real" extraction! For the SFX stub, the 16-bit version remains first |
choice because of its much smaller size! |
Best regards |
Christian Spieler |
/programs/fs/unzip60/proginfo/timezone.txt |
---|
0,0 → 1,85 |
Timezone strings: |
----------------- |
This is a description of valid timezone strings for ENV[ARC]:TZ: |
"XPG3TZ - time zone information" |
The form of the time zone information is based on the XPG3 specification of |
the TZ environment variable. Spaces are allowed only in timezone |
designations, where they are significant. The following description |
closely follows the XPG3 specification, except for the paragraphs starting |
**CLARIFICATION**. |
<std><offset>[<dst>[<offset>],<start>[/<time>],<end>[/<time>]] |
Where: |
<std> and <dst> |
Are each three or more bytes that are the designation for the |
standard (<std>) and daylight savings time (<dst>) timezones. |
Only <std> is required - if <dst> is missing, then daylight |
savings time does not apply in this locale. Upper- and |
lower-case letters are allowed. Any characters except a |
leading colon (:), digits, a comma (,), a minus (-) or a plus |
(+) are allowed. |
**CLARIFICATION** The two-byte designation `UT' is permitted. |
<offset> |
Indicates the value one must add to the local time to arrive |
at Coordinated Universal Time. The offset has the form: |
<hh>[:<mm>[:<ss>]] |
The minutes <mm> and seconds <ss> are optional. The hour <hh> |
is required and may be a single digit. The offset following |
<std> is required. If no offset follows <dst>, daylight savings |
time is assumed to be one hour ahead of standard time. One or |
more digits may be used; the value is always interpreted as a |
decimal number. The hour must be between 0 and 24, and the |
minutes (and seconds) if present between 0 and 59. Out of |
range values may cause unpredictable behavior. If preceded by |
a `-', the timezone is east of the Prime Meridian; otherwise |
it is west (which may be indicated by an optional preceding |
`+' sign). |
**CLARIFICATION** No more than two digits are allowed in any |
of <hh>, <mm> or <ss>. Leading zeros are permitted. |
<start>/<time> and <end>/<time> |
Indicates when to change to and back from daylight savings |
time, where <start>/<time> describes when the change from |
standard time to daylight savings time occurs, and |
<end>/<time> describes when the change back happens. Each |
<time> field describes when, in current local time, the change |
is made. |
**CLARIFICATION** It is recognized that in the Southern |
hemisphere <start> will specify a date later than <end>. |
The formats of <start> and <end> are one of the following: |
J<n> The Julian day <n> (1 <= <n> <= 365). Leap days are not |
counted. That is, in all years, February 28 is day 59 |
and March 1 is day 60. It is impossible to refer to |
the occasional February 29. |
<n> The zero-based Julian day (0 <= <n> <= 365). Leap days |
are counted, and it is possible to refer to February |
29. |
M<m>.<n>.<d> |
The <d>th day, (0 <= <d> <= 6) of week <n> of month <m> |
of the year (1 <= <n> <= 5, 1 <= <m> <= 12), where week |
5 means `the last <d>-day in month <m>' (which may |
occur in either the fourth or the fifth week). Week 1 |
is the first week in which the <d>th day occurs. Day |
zero is Sunday. |
**CLARIFICATION** Neither <n> nor <m> may have a |
leading zero. <d> must be a single digit. |
**CLARIFICATION** The default <start> and <end> values |
are from the first Sunday in April until the last Sunday |
in October. This allows United States users to leave out |
the <start> and <end> parts, as most are accustomed to |
doing. |
<time> has the same format as <offset> except that no leading |
sign (`-' or `+') is allowed. The default, if <time> is not |
given is 02:00:00. |
**CLARIFICATION** The number of hours in <time> may be up |
to 167, to allow encoding of rules such as `00:00hrs on the |
Sunday after the second Friday in September' |
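The M<m>.<n>.<d> form can be resolved to a calendar day with a little date arithmetic. The sketch below is one possible way to do it and is not taken from the Info-ZIP sources; the function names are hypothetical, and the weekday helper uses Sakamoto's well-known method for the Gregorian calendar.

```c
#include <assert.h>

/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday. */
static int day_of_week(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3)
        y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

static int days_in_month(int y, int m)
{
    static const int len[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    return len[m - 1] + (m == 2 && leap);
}

/* Day of the month selected by the rule M<m>.<n>.<d> in the given year. */
int m_rule_day(int year, int m, int n, int d)
{
    int day;

    assert(m >= 1 && m <= 12 && n >= 1 && n <= 5 && d >= 0 && d <= 6);
    /* First occurrence of weekday d in the month. */
    day = 1 + (d - day_of_week(year, m, 1) + 7) % 7;
    day += 7 * (n - 1);
    /* Week 5 means "last": step back if the fifth occurrence does not exist. */
    if (day > days_in_month(year, m))
        day -= 7;
    return day;
}
```

For 2024, M3.5.0 resolves to March 31 and M10.5.0 to October 27, the last Sundays of those months.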
Example (for Central Europe): |
----------------------------- |
MET-1MEST,M3.5.0,M10.5.0/03 |
Another example, for the US East Coast: |
--------------------------------------- |
EST5EDT4,M4.1.0/02,M10.5.0/02 |
This string describes the default values when no time zone is set. |
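The <offset> field used in both examples can be parsed as sketched below. This is an illustrative reading of the rules stated above (at most two digits per field, hour 0 to 24, minutes and seconds 0 to 59, a leading `-' meaning east of the Prime Meridian); the function names are hypothetical and the code is not taken from the Info-ZIP sources.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Read one or two decimal digits; fail on a third digit or a value > max. */
static int scan_field(const char **p, int max, int *out)
{
    int v = 0, n = 0;

    while (n < 2 && isdigit((unsigned char)**p)) {
        v = v * 10 + (**p - '0');
        (*p)++;
        n++;
    }
    if (n == 0 || isdigit((unsigned char)**p) || v > max)
        return 0;
    *out = v;
    return 1;
}

/* Parse [+|-]<hh>[:<mm>[:<ss>]].  On success, store the number of seconds
 * to add to local time to reach UTC, set *end past the offset, return 1. */
int tz_parse_offset(const char *s, long *seconds, const char **end)
{
    int sign = 1, hh = 0, mm = 0, ss = 0;

    assert(s != NULL && seconds != NULL && end != NULL);
    if (*s == '-') { sign = -1; s++; }
    else if (*s == '+') s++;
    if (!scan_field(&s, 24, &hh)) return 0;
    if (*s == ':') {
        s++;
        if (!scan_field(&s, 59, &mm)) return 0;
        if (*s == ':') {
            s++;
            if (!scan_field(&s, 59, &ss)) return 0;
        }
    }
    *seconds = sign * (hh * 3600L + mm * 60L + ss);
    *end = s;
    return 1;
}
```

Applied to the text following "MET" in the Central Europe example, the offset is -3600 seconds; applied to the text following "EST", it is 18000 seconds.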
/programs/fs/unzip60/proginfo/ziplimit.txt |
---|
0,0 → 1,256 |
ziplimit.txt |
A1) Hard limits of the Zip archive format (without Zip64 extensions): |
Number of entries in Zip archive: 64 Ki (2^16 - 1 entries) |
Compressed size of archive entry: 4 GiByte (2^32 - 1 Bytes) |
Uncompressed size of entry: 4 GiByte (2^32 - 1 Bytes) |
Size of single-volume Zip archive: 4 GiByte (2^32 - 1 Bytes) |
Per-volume size of multi-volume archives: 4 GiByte (2^32 - 1 Bytes) |
Number of parts for multi-volume archives: 64 Ki (2^16 - 1 parts) |
Total size of multi-volume archive: 256 TiByte (4G * 64k) |
The number of archive entries and of multivolume parts are limited by |
the structure of the "end-of-central-directory" record, where these |
numbers are stored in 2-Byte fields. |
Some Zip and/or UnZip implementations (for example Info-ZIP's) allow |
handling of archives with more than 64k entries. (The information |
from the "number of entries" field in the "end-of-central-directory" record |
is not really necessary to retrieve the contents of a Zip archive; |
it should rather be used for consistency checks.) |
Length of an archive entry name: 64 KiByte (2^16 - 1) |
Length of archive member comment: 64 KiByte (2^16 - 1) |
Total length of "extra field": 64 KiByte (2^16 - 1) |
Length of a single e.f. block: 64 KiByte (2^16 - 1) |
Length of archive comment: 64 KiByte (2^16 - 1) |
Additional limitation claimed by PKWARE: |
Size of local-header structure (fixed fields of 30 Bytes + filename + |
local extra field): < 64 KiByte |
Size of central-directory structure (46 Bytes + filename + |
central extra field + member comment): < 64 KiByte |
A2) Hard limits of the Zip archive format with Zip64 extensions: |
In 2001, PKWARE published version 4.5 of the Zip format specification |
(together with the release of PKZIP for Windows 4.5). This specification |
defines new extra field blocks that allow breaking the size limits of the |
standard zipfile structures. This extended "Zip64" format enlarges the |
theoretical limits to the following values: |
Number of entries in Zip archive: 16 Ei (2^64 - 1 entries) |
Compressed size of archive entry: 16 EiByte (2^64 - 1 Bytes) |
Uncompressed size of entry: 16 EiByte (2^64 - 1 Bytes) |
Size of single-volume Zip archive: 16 EiByte (2^64 - 1 Bytes) |
Per-volume size of multi-volume archives: 16 EiByte (2^64 - 1 Bytes) |
Number of parts for multi-volume archives: 4 Gi (2^32 - 1 parts) |
Total size of multi-volume archive: 2^96 Byte (16 Ei * 4Gi) |
The Info-ZIP software releases (beginning with Zip 3.0 and UnZip 6.0) |
support Zip64 archives on selected environments (where the underlying |
operating system capabilities are sufficient, e.g. Unix, VMS and Win32). |
B) Implementation limits of UnZip: |
1. Size limits caused by file I/O and decompression handling: |
a) Without "Zip64" and "LargeFile" extensions: |
Size of Zip archive: 2 GiByte (2^31 - 1 Bytes) |
Compressed size of archive entry: 2 GiByte (2^31 - 1 Bytes) |
b) With "Zip64" enabled and "LargeFile" supported: |
Size of Zip archive: 8 EiByte (2^63 - 1 Bytes) |
Compressed size of archive entry: 8 EiByte (2^63 - 1 Bytes) |
Uncompressed size of entry: 8 EiByte (2^63 - 1 Bytes) |
Note: On some systems, even UnZip without "LargeFile" extensions enabled |
may support archive sizes up to 4 GiByte. To get this support, the |
target environment has to meet the following requirements: |
a) The compiler's intrinsic "long" data types must be able to hold |
integer numbers of 2^32. In other words - the standard intrinsic |
integer types "long" and "unsigned long" have to be wider than |
32 bit. |
b) The system has to supply a C runtime library that is compatible |
with the more-than-32-bit-wide "long int" type of condition a) |
c) The standard file positioning functions fseek(), ftell() (and/or |
the Unix style lseek() and tell() functions) have to be capable |
of moving to absolute file offsets of up to 4 GiByte from the file |
start. |
On 32-bit CPU hardware, you generally cannot expect that a C compiler |
provides a "long int" type that is wider than 32-bit. So, many of the |
most popular systems (i386, PowerPC, 680x0, et al.) are out of luck. |
You may find environments that meet all requirements on systems |
with 64-bit CPU hardware. Examples might be Cray number crunchers, |
Compaq (former DEC) Alpha AXP machines, or Intel/AMD x64 computers. |
The number of Zip archive entries is unlimited. The "number-of-entries" |
field of the "end-of-central-dir" record is checked against the "number |
of entries found in the central directory" modulo 64k (2^16) (without |
Zip64 extension) or modulo 2^64 (with Zip64 extensions enabled for |
Zip64 archives). |
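The modulo comparison described above might look like the following sketch. It illustrates the check only and is not UnZip's actual code; the function name and the zip64 flag are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compare the number of entries found while scanning the central directory
 * with the count recorded in the end-of-central-directory record. */
int entry_count_consistent(uint64_t found, uint64_t recorded, int zip64)
{
    if (zip64)
        return found == recorded;    /* uint64_t is already modulo 2^64 */
    assert(recorded <= 0xFFFFu);     /* the classic field is 2 bytes wide */
    return (found & 0xFFFFu) == recorded;
}
```

An archive with 70000 entries and a recorded count of 4464 (70000 - 65536) passes the non-Zip64 check, which is how more than 64k entries can be handled without Zip64.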
Multi-volume archive extraction is not (yet) supported. |
Memory requirements are mostly independent of the archive size |
and archive contents. |
In general, UnZip needs a fixed amount of internal buffer space |
plus the size to hold the complete information of the currently |
processed entry's local header. Here, a large extra field |
(could be up to 64 kByte) may exceed the available memory |
for MSDOS 16-bit executables (when they were compiled in small |
or medium memory model, with a fixed 64 KiByte limit on data space). |
The other exception where memory requirements scale with "larger" |
archives is the "restore directory attributes" feature. Here, the |
directory attributes info for each restored directory has to be held |
in memory until the whole archive has been processed. So, the amount |
of memory needed to keep this info scales with the number of restored |
directories and may cause memory problems when a lot of directories |
are restored in a single run. |
C) Implementation limits of the Zip executables: |
1. Size limits caused by file I/O and compression handling: |
a) Without "Zip64" and "LargeFile" extensions: |
Size of Zip archive: 2 GiByte (2^31 - 1 Bytes) |
Compressed size of archive entry: 2 GiByte (2^31 - 1 Bytes) |
Uncompressed size of entry: 2 GiByte (2^31 - 1 Bytes), |
(could/should be 4 GiBytes...) |
b) With "Zip64" enabled and "LargeFile" supported: |
Size of Zip archive: 8 EiByte (2^63 - 1 Bytes) |
Compressed size of archive entry: 8 EiByte (2^63 - 1 Bytes) |
Uncompressed size of entry: 8 EiByte (2^63 - 1 Bytes) |
Multi-volume archive creation is now supported in the form of split |
archives. Currently up to 99,999 splits are supported. |
2. Limits caused by handling of archive contents lists |
2.1. Number of archive entries (freshen, update, delete) |
a) 16-bit executable: 64k (2^16 -1) or 32k (2^15 - 1), |
(unsigned vs. signed type of size_t) |
a1) 16-bit executable: <16k ((2^16)/4) |
(The smaller limit a1) results from the array size limit of |
the "qsort()" function.) |
32-bit executable: <1G ((2^32)/4) |
(usual system limit of the "qsort()" function on 32-bit systems) |
64-bit executable: <2Ei ((2^64)/8) |
(theoretical limit of 64-bit flat memory model, the actual limit of |
currently available OS implementations is several orders of magnitude |
lower) |
b) stack space needed by qsort to sort list of archive entries |
NOTE: In the current executables, overflows of limits a) and b) are NOT |
checked! |
c) amount of free memory to hold "central directory information" of |
all archive entries; one entry needs: |
128 bytes (Zip64), 96 bytes (32-bit) resp. 80 bytes (16-bit) |
+ 3 * length of entry name |
+ length of zip entry comment (when present) |
+ length of extra field(s) (when present, e.g.: UT needs 9 bytes) |
+ some bytes for book-keeping of memory allocation |
Conclusion: |
For systems with limited memory space (MSDOS, small AMIGAs, other |
environments without virtual memory), the number of archive entries |
is most often limited by condition c). |
For example, with approx. 100 kBytes of free memory after loading and |
initializing the program, a 16-bit DOS Zip cannot process more than 600 |
to 1000 (+) archive entries. (For the 16-bit Windows DLL or the 16-bit |
OS/2 port, limit c) is less important because Windows or OS/2 executables |
are not restricted to the 1024k area of real mode memory. These 16-bit |
ports are limited by conditions a1) and b), say: at maximum approx. |
16000 entries!) |
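The estimate behind condition c) can be reproduced with simple arithmetic. In the sketch below the base sizes come from the list above; the function names, the 12-character entry name, and the 8 bytes of allocation book-keeping are assumptions chosen only for illustration.

```c
#include <assert.h>

/* Per-entry base sizes quoted in the text. */
enum { BASE_ZIP64 = 128, BASE_32BIT = 96, BASE_16BIT = 80 };

/* Memory needed to hold the central directory info of one entry. */
unsigned long entry_mem(unsigned long base, unsigned long name_len,
                        unsigned long comment_len, unsigned long extra_len,
                        unsigned long bookkeeping)
{
    return base + 3UL * name_len + comment_len + extra_len + bookkeeping;
}

/* Number of entries that fit into free_bytes of memory. */
unsigned long max_entries(unsigned long free_bytes, unsigned long per_entry)
{
    assert(per_entry > 0);
    return free_bytes / per_entry;
}
```

For a 16-bit executable, a 12-character name, a 9-byte UT extra field and 8 bytes of book-keeping give 133 bytes per entry, so 100 kBytes (taken as 102400 bytes) hold 769 entries, which falls inside the 600 to 1000 range quoted above.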
2.2. Number of "new" entries (add operation) |
In addition to the restrictions above (2.1.), the following limits |
caused by the handling of the "new files" list apply: |
a) 16-bit executable: <16k ((2^16)/4) |
b) stack size required for "qsort" operation on "new entries" list. |
NOTE: In the current executables, the overflow checks for these limits |
are missing! |
c) amount of free memory to hold the directory info list for new entries; |
one entry needs: |
32 bytes (Zip64), 24 bytes (32-bit) resp. 22 bytes (16-bit) |
+ 3 * length of filename |
NOTE: For larger systems, the actual usability limits may be more |
performance issues (how long you want to wait) rather than available |
memory and other resources. |
D) Some technical remarks: |
1. For executables without support for "Zip64" archives and "LargeFile" |
I/O extensions, the 2GiByte size limit on archive files is a consequence |
of the portable C implementation used for the Info-ZIP programs. |
Zip archive processing requires random access to the archive file for |
jumping between different parts of the archive's structure. |
In standard C, this is done via stdio functions fseek()/ftell() resp. |
unix-io functions lseek()/tell(). In many (most?) C implementations, |
these functions use "signed long" variables to hold offset pointers |
into sequential files. In most cases, this is a signed 32-bit number, |
which is limited to ca. 2E+09. There may be specific C runtime library |
implementations that interpret the offset numbers as unsigned, but for |
us, this is not reliable in the context of portable programming. |
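The dependence on "signed long" can be made concrete with a small check. This is a minimal sketch, not Info-ZIP code: the helper names are hypothetical, and it only relies on the standard fact that fseek() takes its offset as a long.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Returns 1 if an archive offset can be passed to fseek(), whose offset
   parameter is a signed long; returns 0 otherwise. */
int offset_fits_fseek(unsigned long long offset)
{
    return offset <= (unsigned long long)LONG_MAX;
}

/* Hypothetical helper: seek to an absolute offset, refusing any offset
   that a signed long cannot represent. Returns 0 on success, -1 on
   failure. */
int seek_abs(FILE *fp, unsigned long long offset)
{
    assert(fp != NULL);
    if (!offset_fits_fseek(offset))
        return -1;
    return fseek(fp, (long)offset, SEEK_SET) == 0 ? 0 : -1;
}
```

On a platform where long is 32 bits wide, LONG_MAX is 2147483647, so any offset at or beyond 2 GiB is rejected. Where long is 64 bits wide, the same code accepts much larger offsets without change.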
2. Similarly, for executables without "Zip64" and "LargeFile" support, |
the 2GiByte limit on the size of a single compressed archive member |
is again a consequence of the implementation in C. |
The variables used internally to count the size of the compressed |
data stream are of type "long", which is guaranteed to be at least |
32-bit wide on all supported environments. |
But, why do we use "signed" long and not "unsigned long"? |
Throughout the I/O handling of the compressed data stream, the sign bit |
of the "long" numbers is (mis-)used as a kind of overflow detection. |
In the end, this is caused by the fact that standard C lacks any |
overflow checking on integer arithmetic and does not support access |
to the underlying hardware's overflow detection (the status bits, |
especially "carry" and "overflow" of the CPU's flags-register) in a |
system-independent manner. |
So, we "misuse" the most-significant bit of the compressed data size |
counters as carry bit for efficient overflow/underflow detection. We |
could change the code to a different method of overflow detection, by |
using a bunch of "sanity" comparisons (kind of "is the calculated result |
plausible when compared with the operands"). But, this would "blow up" |
the code of the "inner loop", with a noticeable loss of processing speed. |
Or, we could reduce the amount of consistency checks of the compressed |
data (e.g. detection of premature end of stream) to an absolute minimum, |
at the cost of the programs' stability when processing corrupted data. |
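The trade-off between the two detection methods can be sketched as follows. This is an illustration only and does not reproduce the actual Zip or UnZip counters: the function names are hypothetical, and the signed variant assumes both values stay within 0..LONG_MAX, which is exactly the restriction that produces the 2GiByte limit on 32-bit "long".

```c
#include <assert.h>

/* Signed variant: subtract first, then test the sign bit.
   Returns 1 if more bytes were requested than remain. */
int take_signed(long *remaining, long n)
{
    assert(n >= 0 && *remaining >= 0);
    *remaining -= n;          /* cannot overflow for operands >= 0 */
    return *remaining < 0;    /* negative result signals underflow */
}

/* Unsigned variant: the full range is usable, but a plausibility check
   is needed before subtracting, since the result would otherwise wrap
   around silently. Returns 1 on underflow and leaves the counter
   unchanged. */
int take_unsigned(unsigned long *remaining, unsigned long n)
{
    if (n > *remaining)
        return 1;
    *remaining -= n;
    return 0;
}
```

The signed variant needs a single sign test after the arithmetic, while the unsigned variant adds a comparison against the operand to every update. That extra comparison is the kind of "sanity" check the text describes as a cost in the inner loop.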
3. The argumentation above is somewhat outdated. Beginning with the |
releases of Zip 3 and UnZip 6, Info-ZIP programs support archive |
sizes larger than 4GiB on systems where the required underlying |
support for 64-bit file offsets and file sizes is available from |
the OS (and the C runtime environment). |
For executables with support for "Zip64" archive format and "LargeFile" |
extension, the I/O limits are lifted by applying extended 64-bit off_t |
file offsets. All limits discussed above are then based on integer |
sizes of 64 bits instead of 32, which should allow handling file and |
archive sizes up to the limits of manufacturable hardware for the |
foreseeable future. The reduction of the theoretical limits from |
(2^64 - 1) to (2^63 - 1) because of the pervasive use of signed |
numbers can be neglected with the currently imaginable hardware. |
However, this new support partially breaks compatibility with older |
"legacy" systems. And it should be noted that the portability and |
readability of the UnZip and Zip code have suffered somewhat from |
the extensive use of the non-standard language extensions needed for |
64-bit support on the major target systems. |
Please report any problems to: Zip-Bugs at www.info-zip.org |
Last updated: 25 May 2008, Ed Gordon |
02 January 2009, Christian Spieler |