RFC 1952: GZIP file format specification version 4.3


zlib 1.2.5 Manual

Contents

  I. Prologue
  II. Version
  III. Introduction
  IV. Stream Data Structures
  V. Structures Usage
  VI. Constants
  VII. Basic Functions
  VIII. Advanced Functions
  IX. Utility Functions
  X. gzip File Access Functions
  XI. Checksum Functions
  XII. Undocumented Functions

Prologue

zlib general purpose compression library
version 1.2.5, April 19th, 2010

Copyright (C) 1995-2010 Jean-loup Gailly and Mark Adler

This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.

Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.

Jean-loup Gailly, Mark Adler

The data format used by the zlib library is described by RFCs (Request for Comments) 1950 to 1952 in the files rfc1950.txt (zlib format), rfc1951.txt (deflate format) and rfc1952.txt (gzip format).

Version

    #define ZLIB_VERSION "1.2.5"
    #define ZLIB_VERNUM 0x1250

Introduction

The zlib compression library provides in-memory compression and decompression functions, including integrity checks of the uncompressed data. This version of the library supports only one compression method (deflation) but other algorithms will be added later and will have the same stream interface.

Compression can be done in a single step if the buffers are large enough (for example if an input file is mmap'ed), or can be done by repeated calls of the compression function.
In the latter case, the application must provide more input and/or consume the output (providing more output space) before each call.

The compressed data format used by default by the in-memory functions is the zlib format, which is a zlib wrapper documented in RFC 1950, wrapped around a deflate stream, which is itself documented in RFC 1951.

The library also supports reading and writing files in gzip (.gz) format with an interface similar to that of stdio using the functions that start with "gz". The gzip format is different from the zlib format. gzip is a gzip wrapper, documented in RFC 1952, wrapped around a deflate stream.

This library can optionally read and write gzip streams in memory as well.

The zlib format was designed to be compact and fast for use in memory and on communications channels. The gzip format was designed for single-file compression on file systems, has a larger header than zlib to maintain directory information, and uses a different, slower check method than zlib.

The library does not install any signal handler.
The decoder checks the consistency of the compressed data, so the library should never crash even in case of corrupted input.

Stream Data Structures

    typedef voidpf (*alloc_func) OF((voidpf opaque, uInt items, uInt size));
    typedef void (*free_func) OF((voidpf opaque, voidpf address));

    struct internal_state;

    typedef struct z_stream_s {
        Bytef *next_in;   /* next input byte */
        uInt avail_in;    /* number of bytes available at next_in */
        uLong total_in;   /* total nb of input bytes read so far */

        Bytef *next_out;  /* next output byte should be put there */
        uInt avail_out;   /* remaining free space at next_out */
        uLong total_out;  /* total nb of bytes output so far */

        char *msg;        /* last error message, NULL if no error */
        struct internal_state FAR *state; /* not visible by applications */

        alloc_func zalloc; /* used to allocate the internal state */
        free_func zfree;   /* used to free the internal state */
        voidpf opaque;     /* private data object passed to zalloc and zfree */

        int data_type;    /* best guess about the data type: binary or text */
        uLong adler;      /* adler32 value of the uncompressed data */
        uLong reserved;   /* reserved for future use */
    } z_stream;

    typedef z_stream FAR *z_streamp;

gzip header information passed to and from zlib routines.
See RFC 1952 for more details on the meanings of these fields.

    typedef struct gz_header_s {
        int text;        /* true if compressed data believed to be text */
        uLong time;      /* modification time */
        int xflags;      /* extra flags (not used when writing a gzip file) */
        int os;          /* operating system */
        Bytef *extra;    /* pointer to extra field or Z_NULL if none */
        uInt extra_len;  /* extra field length (valid if extra != Z_NULL) */
        uInt extra_max;  /* space at extra (only when reading header) */
        Bytef *name;     /* pointer to zero-terminated file name or Z_NULL */
        uInt name_max;   /* space at name (only when reading header) */
        Bytef *comment;  /* pointer to zero-terminated comment or Z_NULL */
        uInt comm_max;   /* space at comment (only when reading header) */
        int hcrc;        /* true if there was or will be a header crc */
        int done;        /* true when done reading gzip header (not used
                            when writing a gzip file) */
    } gz_header;

    typedef gz_header FAR *gz_headerp;

Structures Usage

The application must update next_in and avail_in when avail_in has dropped to zero. It must update next_out and avail_out when avail_out has dropped to zero. The application must initialize zalloc, zfree and opaque before calling the init function. All other fields are set by the compression library and must not be updated by the application.

The opaque value provided by the application will be passed as the first parameter for calls of zalloc and zfree. This can be useful for custom memory management. The compression library attaches no meaning to the opaque value.

zalloc must return Z_NULL if there is not enough memory for the object. If zlib is used in a multi-threaded application, zalloc and zfree must be thread safe.

On 16-bit systems, the functions zalloc and zfree must be able to allocate exactly 65536 bytes, but will not be required to allocate more than this if the symbol MAXSEG_64K is defined (see zconf.h). WARNING: On MSDOS, pointers returned by zalloc for objects of exactly 65536 bytes must have their offset normalized to zero.
The default allocation function provided by this library ensures this (see zutil.c). To reduce memory requirements and avoid any allocation of 64K objects, at the expense of compression ratio, compile the library with -DMAX_WBITS=14 (see zconf.h).

The fields total_in and total_out can be used for statistics or progress reports. After compression, total_in holds the total size of the uncompressed data and may be saved for use in the decompressor (particularly if the decompressor wants to decompress everything in a single step).

Constants

Allowed flush values; see deflate() and inflate() below for details.

    #define Z_NO_FLUSH 0
    #define Z_PARTIAL_FLUSH 1
    #define Z_SYNC_FLUSH 2
    #define Z_FULL_FLUSH 3
    #define Z_FINISH 4
    #define Z_BLOCK 5
    #define Z_TREES 6

Return codes for the compression/decompression functions. Negative values are errors, positive values are used for special but normal events.

    #define Z_OK 0
    #define Z_STREAM_END 1
    #define Z_NEED_DICT 2
    #define Z_ERRNO (-1)
    #define Z_STREAM_ERROR (-2)
    #define Z_DATA_ERROR (-3)
    #define Z_MEM_ERROR (-4)
    #define Z_BUF_ERROR (-5)
    #define Z_VERSION_ERROR (-6)

Compression levels.

    #define Z_NO_COMPRESSION 0
    #define Z_BEST_SPEED 1
    #define Z_BEST_COMPRESSION 9
    #define Z_DEFAULT_COMPRESSION (-1)

Compression strategy; see deflateInit2() below for details.

    #define Z_FILTERED 1
    #define Z_HUFFMAN_ONLY 2
    #define Z_RLE 3
    #define Z_FIXED 4
    #define Z_DEFAULT_STRATEGY 0

Possible values of the data_type field (though see inflate()).

    #define Z_BINARY 0
    #define Z_TEXT 1
    #define Z_ASCII Z_TEXT /* for compatibility with 1.2.2 and earlier */
    #define Z_UNKNOWN 2

The deflate compression method (the only one supported in this version).

    #define Z_DEFLATED 8

For initializing zalloc, zfree, opaque.

    #define Z_NULL 0

For compatibility with versions < 1.0.2.

    #define zlib_version zlibVersion()

Basic Functions

    ZEXTERN const char * ZEXPORT zlibVersion OF((void));

The application can compare zlibVersion and ZLIB_VERSION for consistency.
If the first character differs, the library code actually used is not compatible with the zlib.h header file used by the application. This check is automatically made by deflateInit and inflateInit.

    ZEXTERN int ZEXPORT deflateInit OF((z_streamp strm, int level));

Initializes the internal stream state for compression. The fields zalloc, zfree and opaque must be initialized before by the caller. If zalloc and zfree are set to Z_NULL, deflateInit updates them to use default allocation functions.

The compression level must be Z_DEFAULT_COMPRESSION, or between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all (the input data is simply copied a block at a time). Z_DEFAULT_COMPRESSION requests a default compromise between speed and compression (currently equivalent to level 6).

deflateInit returns Z_OK if success, Z_MEM_ERROR if there was not enough memory, Z_STREAM_ERROR if level is not a valid compression level, Z_VERSION_ERROR if the zlib library version (zlib_version) is incompatible with the version assumed by the caller (ZLIB_VERSION). msg is set to null if there is no error message. deflateInit does not perform any compression: this will be done by deflate().

    ZEXTERN int ZEXPORT deflate OF((z_streamp strm, int flush));

deflate compresses as much data as possible, and stops when the input buffer becomes empty or the output buffer becomes full. It may introduce some output latency (reading input without producing any output) except when forced to flush.

The detailed semantics are as follows. deflate performs one or both of the following actions:

● Compress more input starting at next_in and update next_in and avail_in accordingly.
  If not all input can be processed (because there is not enough room in the output buffer), next_in and avail_in are updated and processing will resume at this point for the next call of deflate().
● Provide more output starting at next_out and update next_out and avail_out accordingly. This action is forced if the parameter flush is non zero. Forcing flush frequently degrades the compression ratio, so this parameter should be set only when necessary (in interactive applications). Some output may be provided even if flush is not set.

Before the call of deflate(), the application should ensure that at least one of the actions is possible, by providing more input and/or consuming more output, and updating avail_in or avail_out accordingly; avail_out should never be zero before the call. The application can consume the compressed output when it wants, for example when the output buffer is full (avail_out == 0), or after each call of deflate(). If deflate returns Z_OK and with zero avail_out, it must be called again after making room in the output buffer because there might be more output pending.

Normally the parameter flush is set to Z_NO_FLUSH, which allows deflate to decide how much data to accumulate before producing output, in order to maximize compression.

If the parameter flush is set to Z_SYNC_FLUSH, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far. (In particular avail_in is zero after the call if enough output space has been provided before the call.) Flushing may degrade compression for some compression algorithms and so it should be used only when necessary.
This completes the current deflate block and follows it with an empty stored block that is three bits plus filler bits to the next byte, followed by four bytes (00 00 ff ff).

If flush is set to Z_PARTIAL_FLUSH, all pending output is flushed to the output buffer, but the output is not aligned to a byte boundary. All of the input data so far will be available to the decompressor, as for Z_SYNC_FLUSH. This completes the current deflate block and follows it with an empty fixed codes block that is 10 bits long. This assures that enough bytes are output in order for the decompressor to finish the block before the empty fixed code block.

If flush is set to Z_BLOCK, a deflate block is completed and emitted, as for Z_SYNC_FLUSH, but the output is not aligned on a byte boundary, and up to seven bits of the current block are held to be written as the next byte after the next deflate block is completed. In this case, the decompressor may not be provided enough bits at this point in order to complete decompression of the data provided so far to the compressor. It may need to wait for the next block to be emitted. This is for advanced applications that need to control the emission of deflate blocks.

If flush is set to Z_FULL_FLUSH, all output is flushed as with Z_SYNC_FLUSH, and the compression state is reset so that decompression can restart from this point if previous compressed data has been damaged or if random access is desired. Using Z_FULL_FLUSH too often can seriously degrade compression.

If deflate returns with avail_out == 0, this function must be called again with the same value of the flush parameter and more output space (updated avail_out), until the flush is complete (deflate returns with non-zero avail_out).
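
The Z_SYNC_FLUSH behaviour described above can be observed from Python, whose standard `zlib` module wraps the same library. The following is a minimal sketch, not part of the zlib manual: the helper name `sync_flush_demo` is hypothetical, and the Python binding manages the output buffers itself, so the avail_out handling discussed here is hidden from the caller.

```python
import zlib


def sync_flush_demo(parts):
    """Compress each part, forcing a Z_SYNC_FLUSH after it.

    Returns one (ends_with_marker, recovered_bytes) pair per part.
    """
    comp = zlib.compressobj()
    decomp = zlib.decompressobj()
    results = []
    for part in parts:
        # compress() may hold data back; flush(Z_SYNC_FLUSH) forces it out
        # and aligns the output on a byte boundary.
        chunk = comp.compress(part) + comp.flush(zlib.Z_SYNC_FLUSH)
        # The flush ends with an empty stored block: bytes 00 00 ff ff.
        ends_with_marker = chunk.endswith(b"\x00\x00\xff\xff")
        # The receiver can decode everything that was sent so far.
        recovered = decomp.decompress(chunk)
        results.append((ends_with_marker, recovered))
    return results


if __name__ == "__main__":
    for marker, text in sync_flush_demo([b"hello ", b"world"]):
        print(marker, text)
```

Each flushed chunk is decodable on its own arrival, which is why this mode suits interactive protocols, at the cost of some compression ratio.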
In the case of a Z_FULL_FLUSH or Z_SYNC_FLUSH, make sure that avail_out is greater than six to avoid repeated flush markers due to avail_out == 0 on return.

If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and deflate returns with Z_STREAM_END if there was enough output space; if deflate returns with Z_OK, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error. After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd.

Z_FINISH can be used immediately after deflateInit if all the compression is to be done in a single step. In this case, avail_out must be at least the value returned by deflateBound (see below). If deflate does not return Z_STREAM_END, then it must be called again as described above.

deflate() sets strm->adler to the adler32 checksum of all input read so far (that is, total_in bytes).

deflate() may update strm->data_type if it can make a good guess about the input data type (Z_BINARY or Z_TEXT). In doubt, the data is considered binary. This field is only for information purposes and does not affect the compression algorithm in any manner.

deflate() returns Z_OK if some progress has been made (more input processed or more output produced), Z_STREAM_END if all input has been consumed and all output has been produced (only when flush is set to Z_FINISH), Z_STREAM_ERROR if the stream state was inconsistent (for example if next_in or next_out was NULL), Z_BUF_ERROR if no progress is possible (for example avail_in or avail_out was zero). Note that Z_BUF_ERROR is not fatal, and deflate() can be called again with more input and more output space to continue compressing.

    ZEXTERN int ZEXPORT deflateEnd OF((z_streamp strm));

All dynamically allocated data structures for this stream are freed.
This function discards any unprocessed input and does not flush any pending output.

deflateEnd returns Z_OK if success, Z_STREAM_ERROR if the stream state was inconsistent, Z_DATA_ERROR if the stream was freed prematurely (some input or output was discarded). In the error case, msg may be set but then points to a static string (which must not be deallocated).

    ZEXTERN int ZEXPORT inflateInit OF((z_streamp strm));

Initializes the internal stream state for decompression. The fields next_in, avail_in, zalloc, zfree and opaque must be initialized before by the caller. If next_in is not Z_NULL and avail_in is large enough (the exact value depends on the compression method), inflateInit determines the compression method from the zlib header and allocates all data structures accordingly; otherwise the allocation will be deferred to the first call of inflate. If zalloc and zfree are set to Z_NULL, inflateInit updates them to use default allocation functions.

inflateInit returns Z_OK if success, Z_MEM_ERROR if there was not enough memory, Z_VERSION_ERROR if the zlib library version is incompatible with the version assumed by the caller, or Z_STREAM_ERROR if the parameters are invalid, such as a null pointer to the structure. msg is set to null if there is no error message. inflateInit does not perform any decompression apart from possibly reading the zlib header if present: actual decompression will be done by inflate(). (So next_in and avail_in may be modified, but next_out and avail_out are unused and unchanged.) The current implementation of inflateInit() does not process any header information -- that is deferred until inflate() is called.

    ZEXTERN int ZEXPORT inflate OF((z_streamp strm, int flush));

inflate decompresses as much data as possible, and stops when the input buffer becomes empty or the output buffer becomes full. It may introduce some output latency (reading input without producing any output) except when forced to flush.

The detailed semantics are as follows.
inflate performs one or both of the following actions:

● Decompress more input starting at next_in and update next_in and avail_in accordingly. If not all input can be processed (because there is not enough room in the output buffer), next_in is updated and processing will resume at this point for the next call of inflate().
● Provide more output starting at next_out and update next_out and avail_out accordingly. inflate() provides as much output as possible, until there is no more input data or no more space in the output buffer (see below about the flush parameter).

Before the call of inflate(), the application should ensure that at least one of the actions is possible, by providing more input and/or consuming more output, and updating the next_* and avail_* values accordingly. The application can consume the uncompressed output when it wants, for example when the output buffer is full (avail_out == 0), or after each call of inflate(). If inflate returns Z_OK and with zero avail_out, it must be called again after making room in the output buffer because there might be more output pending.

The flush parameter of inflate() can be Z_NO_FLUSH, Z_SYNC_FLUSH, Z_FINISH, Z_BLOCK, or Z_TREES. Z_SYNC_FLUSH requests that inflate() flush as much output as possible to the output buffer. Z_BLOCK requests that inflate() stop if and when it gets to the next deflate block boundary. When decoding the zlib or gzip format, this will cause inflate() to return immediately after the header and before the first block. When doing a raw inflate, inflate() will go ahead and process the first block, and will return when it gets to the end of that block, or when it runs out of data.

The Z_BLOCK option assists in appending to or combining deflate streams.
Also to assist in this, on return inflate() will set strm->data_type to the number of unused bits in the last byte taken from strm->next_in, plus 64 if inflate() is currently decoding the last block in the deflate stream, plus 128 if inflate() returned immediately after decoding an end-of-block code or decoding the complete header up to just before the first byte of the deflate stream. The end-of-block will not be indicated until all of the uncompressed data from that block has been written to strm->next_out. The number of unused bits may in general be greater than seven, except when bit 7 of data_type is set, in which case the number of unused bits will be less than eight. data_type is set as noted here every time inflate() returns for all flush options, and so can be used to determine the amount of currently consumed input in bits.

The Z_TREES option behaves as Z_BLOCK does, but it also returns when the end of each deflate block header is reached, before any actual data in that block is decoded. This allows the caller to determine the length of the deflate block header for later use in random access within a deflate block. 256 is added to the value of strm->data_type when inflate() returns immediately after reaching the end of the deflate block header.

inflate() should normally be called until it returns Z_STREAM_END or an error. However if all decompression is to be performed in a single step (a single call of inflate), the parameter flush should be set to Z_FINISH. In this case all pending input is processed and all pending output is flushed; avail_out must be large enough to hold all the uncompressed data. (The size of the uncompressed data may have been saved by the compressor for this purpose.) The next operation on this stream must be inflateEnd to deallocate the decompression state.
The use of Z_FINISH is never required, but can be used to inform inflate that a faster approach may be used for the single inflate() call.

In this implementation, inflate() always flushes as much output as possible to the output buffer, and always uses the faster approach on the first call. So the only effect of the flush parameter in this implementation is on the return value of inflate(), as noted below, or when it returns early because Z_BLOCK or Z_TREES is used.

If a preset dictionary is needed after this call (see inflateSetDictionary below), inflate sets strm->adler to the adler32 checksum of the dictionary chosen by the compressor and returns Z_NEED_DICT; otherwise it sets strm->adler to the adler32 checksum of all output produced so far (that is, total_out bytes) and returns Z_OK, Z_STREAM_END or an error code as described below. At the end of the stream, inflate() checks that its computed adler32 checksum is equal to that saved by the compressor and returns Z_STREAM_END only if the checksum is correct.

inflate() will decompress and check either zlib-wrapped or gzip-wrapped deflate data. The header type is detected automatically, if requested when initializing with inflateInit2().
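
As an illustration of automatic header detection, here is a small sketch using Python's standard `zlib` module, which passes its `wbits` argument through as windowBits. It relies on two conventions from zlib's deflateInit2/inflateInit2 documentation: adding 16 to windowBits selects the gzip wrapper, and adding 32 on the inflate side enables zlib/gzip auto-detection. The helper names `wrap` and `inflate_auto` are hypothetical.

```python
import zlib


def wrap(data, wbits):
    """Deflate data with the wrapper selected by wbits.

    zlib.MAX_WBITS (15)  -> zlib wrapper (RFC 1950)
    zlib.MAX_WBITS | 16  -> gzip wrapper (RFC 1952)
    """
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


def inflate_auto(blob):
    """Inflate a zlib- or gzip-wrapped stream, detecting the header type."""
    return zlib.decompress(blob, zlib.MAX_WBITS | 32)


if __name__ == "__main__":
    payload = b"zlib and gzip wrap the same deflate stream " * 4
    as_zlib = wrap(payload, zlib.MAX_WBITS)
    as_gzip = wrap(payload, zlib.MAX_WBITS | 16)
    print(as_zlib[:2].hex(), as_gzip[:3].hex())
    print(inflate_auto(as_zlib) == inflate_auto(as_gzip) == payload)
```

Raw deflate data (negative windowBits) carries no header, so it cannot be detected this way and must be requested explicitly.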
Any information contained in the gzip header is not retained, so applications that need that information should instead use raw inflate, see inflateInit2() below, or inflateBack() and perform their own processing of the gzip header and trailer.

inflate() returns Z_OK if some progress has been made (more input processed or more output produced), Z_STREAM_END if the end of the compressed data has been reached and all uncompressed output has been produced, Z_NEED_DICT if a preset dictionary is needed at this point, Z_DATA_ERROR if the input data was corrupted (input stream not conforming to the zlib format or incorrect check value), Z_STREAM_ERROR if the stream structure was inconsistent (for example if next_in or next_out was NULL), Z_MEM_ERROR if there was not enough memory, Z_BUF_ERROR if no progress is possible or if there was not enough room in the output buffer when Z_FINISH is used. Note that Z_BUF_ERROR is not fatal, and inflate() can be called again with more input and more output space to continue decompressing. If Z_DATA_ERROR is returned, the application may then call inflateSync() to look for a good compression block if a partial recovery of the data is desired.

    ZEXTERN int ZEXPORT inflateEnd OF((z_streamp strm));

All dynamically allocated data structures for this stream are freed. This function discards any unprocessed input and does not flush any pending output.

inflateEnd returns Z_OK if success, Z_STREAM_ERROR if the stream state was inconsistent. In the error case, msg may be set but then points to a static string (which must not be deallocated).

Advanced Functions

The following functions are needed only in some special applications.

    ZEXTERN int ZEXPORT deflateInit2 OF((z_streamp strm,
                                         int level,
                                         int method,
                                         int windowBits,
                                         int memLevel,
                                         int strategy));

This is another version of deflateInit with more compression options. The fields next_in, zalloc, zfree and opaque must be initialized before by the caller.

The method parameter is the compression method.
It must be Z_DEFLATED in this version of the library.

The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit is used instead.

windowBits can also be -8..-15 for raw deflate. In this case, -windowBits determines the window size. deflate() will then generate raw deflate data with no zlib header or trailer, and will not compute an adler32 check value.

windowBits can also be greater than 15 for optional gzip encoding. Add 16 to windowBits to write a simple gzip header and trailer around the compressed data instead of a zlib wrapper. The gzip header will have no file name, no extra data, no comment, no modification time (set to zero), no header crc, and the operating system will be set to 255 (unknown). If a gzip stream is being written, strm->adler is a crc32 instead of an adler32.

The memLevel parameter specifies how much memory should be allocated for the internal compression state. memLevel=1 uses minimum memory but is slow and reduces compression ratio; memLevel=9 uses maximum memory for optimal speed. The default value is 8. See zconf.h for total memory usage as a function of windowBits and memLevel.

The strategy parameter is used to tune the compression algorithm. Use the value Z_DEFAULT_STRATEGY for normal data, Z_FILTERED for data produced by a filter (or predictor), Z_HUFFMAN_ONLY to force Huffman encoding only (no string match), or Z_RLE to limit match distances to one (run-length encoding). Filtered data consists mostly of small values with a somewhat random distribution. In this case, the compression algorithm is tuned to compress them better. The effect of Z_FILTERED is to force more Huffman coding and less string matching; it is somewhat intermediate between Z_DEFAULT_STRATEGY and Z_HUFFMAN_ONLY.
Z_RLE is designed to be almost as fast as Z_HUFFMAN_ONLY, but give better compression for PNG image data. The strategy parameter only affects the compression ratio but not the correctness of the compressed output even if it is not set appropriately. Z_FIXED prevents the use of dynamic Huffman codes, allowing for a simpler decoder for special applications.

deflateInit2 returns Z_OK if success, Z_MEM_ERROR if there was not enough memory, Z_STREAM_ERROR if a parameter is invalid (such as an invalid method), or Z_VERSION_ERROR if the zlib library version (zlib_version) is incompatible with the version assumed by the caller (ZLIB_VERSION). msg is set to null if there is no error message. deflateInit2 does not perform any compression: this will be done by deflate().

    ZEXTERN int ZEXPORT deflateSetDictionary OF((z_streamp strm,
                                                 const Bytef *dictionary,
                                                 uInt dictLength));

Initializes the compression dictionary from the given byte sequence without producing any compressed output. This function must be called immediately after deflateInit, deflateInit2 or deflateReset, before any call of deflate. The compressor and decompressor must use exactly the same dictionary (see inflateSetDictionary).

The dictionary should consist of strings (byte sequences) that are likely to be encountered later in the data to be compressed, with the most commonly used strings preferably put towards the end of the dictionary. Using a dictionary is most useful when the data to be compressed is short and can be predicted with good accuracy; the data can then be compressed better than with the default empty dictionary.

Depending on the size of the compression data structures selected by deflateInit or deflateInit2, a part of the dictionary may in effect be discarded, for example if the dictionary is larger than the window size in deflate or deflate2. Thus the strings most likely to be useful should be put at the end of the dictionary, not at the front.
In addition, the current implementation of deflate will use at most the window size minus 262 bytes of the provided dictionary.

Upon return of this function, strm->adler is set to the adler32 value of the dictionary; the decompressor may later use this value to determine which dictionary has been used by the compressor. (The adler32 value applies to the whole dictionary even if only a subset of the dictionary is actually used by the compressor.) If a raw deflate was requested, then the adler32 value is not computed and strm->adler is not set.

deflateSetDictionary returns Z_OK if success, or Z_STREAM_ERROR if a parameter is invalid (such as NULL dictionary) or the stream state is inconsistent (for example if deflate has already been called for this stream or if the compression method is bsort). deflateSetDictionary does not perform any compression: this will be done by deflate().
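
To make the preset-dictionary mechanism concrete, the sketch below uses the `zdict` argument of Python's standard `zlib` module, which supplies the dictionary to the underlying library on both sides. The helper names and the sample JSON strings are hypothetical. The `dictionary_id` helper assumes the RFC 1950 layout, where a zlib stream with a preset dictionary stores the dictionary's Adler-32 in the four bytes after the two-byte header.

```python
import zlib


def compress_with_dict(data, zdict):
    """Compress data using a preset dictionary (zlib wrapper)."""
    comp = zlib.compressobj(zlib.Z_BEST_COMPRESSION, zlib.DEFLATED,
                            zlib.MAX_WBITS, zlib.DEF_MEM_LEVEL,
                            zlib.Z_DEFAULT_STRATEGY, zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob, zdict):
    """Decompress data that was compressed with the same dictionary."""
    decomp = zlib.decompressobj(zlib.MAX_WBITS, zdict)
    return decomp.decompress(blob) + decomp.flush()


def dictionary_id(blob):
    """Read the dictionary identifier stored after the 2-byte zlib header."""
    return int.from_bytes(blob[2:6], "big")


if __name__ == "__main__":
    # Hypothetical dictionary: text expected to recur in short messages.
    zdict = b'{"status": "ok", "message": "request completed"}'
    data = b'{"status": "ok", "message": "request completed", "id": 7}'
    packed = compress_with_dict(data, zdict)
    print(len(zlib.compress(data, 9)), len(packed))
    print(dictionary_id(packed) == zlib.adler32(zdict))
    print(decompress_with_dict(packed, zdict) == data)
```

The stored identifier is what lets a decompressor pick the right dictionary when several are in use; decoding without any dictionary fails rather than producing wrong output.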

[Protocol Analysis] An Introduction to the Two Encodings in HTTP Response Headers

Example message:

    Server: Apache-Coyote/1.1
    Cache-Control: no-store
    Pragma: no-cache
    Expires: Thu, 01 Jan 1970 00:00:00 GMT
    Content-Type: text/html;charset=GBK
    Transfer-Encoding: chunked
    Content-Encoding: gzip
    Vary: Accept-Encoding
    Date: Mon, 01 Jul 2013 02:37:55 GMT

    [Body: gzip-compressed data sent in chunks. The capture renders it as non-printable bytes, which are omitted here.]

1. What Transfer-Encoding means

Sometimes a web server cannot determine the size of the message at the time it generates the headers of an HTTP response. In that case the server generally does not send a Content-Length header, and instead uses chunked encoding to supply the length of the body content dynamically.
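
The two encodings in the example message stack: the body is first gzip-compressed (Content-Encoding), then split into chunks (Transfer-Encoding). A receiver therefore undoes them in the opposite order. The following is a simplified sketch using Python's standard library; the function names are hypothetical, and it assumes the HTTP/1.1 chunk layout of a hexadecimal size line, CRLF, the data, CRLF, ending with a zero-size chunk. Trailer headers are not handled.

```python
import zlib


def encode_chunked(data, chunk_size):
    """Split data into HTTP/1.1 chunks of at most chunk_size bytes."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        piece = data[start:start + chunk_size]
        out += ("%x" % len(piece)).encode("ascii") + b"\r\n" + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk terminates the body
    return bytes(out)


def decode_chunked(raw):
    """Reassemble the body from a chunked byte string."""
    body = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        # The size is hexadecimal; drop any ";extension" after it.
        size_token = raw[pos:eol].split(b";", 1)[0].strip()
        size = int(size_token.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


def gunzip(blob):
    """Undo Content-Encoding: gzip (windowBits 15 + 16 selects gzip)."""
    return zlib.decompress(blob, zlib.MAX_WBITS | 16)


if __name__ == "__main__":
    comp = zlib.compressobj(6, zlib.DEFLATED, zlib.MAX_WBITS | 16)
    gz = comp.compress(b"<html>hello</html>") + comp.flush()
    wire = encode_chunked(gz, 10)
    print(gunzip(decode_chunked(wire)))
```

Because each chunk announces its own length, the server can start sending before it knows the total size, which is the situation the paragraph above describes.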

Mobile Terminal Specification: WAP 2.0 Functions Volume

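China Mobile Communications Enterprise Standard QB-E-006-2004
Issued by China Mobile Communications Corporation

Contents

1. Scope
2. Referenced standards
3. Terms and abbreviations
4. Terminal protocol stack
5. Terminal bearer capability requirements
6. Transport protocol
7. Session protocol
  7.1. Session protocol
  7.2. State management
  7.3. Optimized implementation over the bearer network
8. Security layer
  8.1. Security layer protocol
  8.2. WPKI support
  8.3. WIM support
9. Wireless Application Environment (WAE)
  9.1. Browser
    9.1.1. Markup language requirements
    9.1.2. Cache and Bookmark
    9.1.3. User Agent
    9.1.4. Image size and page size
    9.1.5. Graphical user interface requirements
    9.1.6. Status indicators
    9.1.7. Handling of interruptions by external events
    9.1.8. Language
    9.1.9. Pictograms
    9.1.10. Save page (Snapshot) (optional)
    9.1.11. Save Items
    9.1.12. Browser settings
  9.2. WMLScript requirements
  9.3. Content requirements
    9.3.1. Image formats
    9.3.2. Background sound
    9.3.3. Content encoding
10. Application requirements
  10.1. Multimedia Messaging (MMS)
  10.2. Cross-service invocation (Cross triggering)
  10.3. Download
    10.3.1. Provisioning
    10.3.2. Authentication
    10.3.3. Content delivery
    10.3.4. Content handling
  10.4. Wireless Telephony Application (WTA)
  10.5. Push
  10.6. UAProf requirements
  10.7. Pre-configuration requirements (Provisioning)
11. Basic parameter configuration
  11.1. Access parameters
  11.2. Proxy server (gateway) list
  11.3. Default home page address
  11.4. Keys and security
12. Revision history
Annex 1: WIM requirements
Annex 2: WMLScript requirements
  Terminal options
  Predicates
  Character sets and encoding
Annex 3: WTA requirements
  WTA interface
  User agent
  Security requirements
  State model
  Repository
  Channel loading
  Event handling (reference from M-services - need to check if there's conflict with OMA Wap specification)

Foreword

This specification defines the requirements that China Mobile Communications Corporation's WAP 2.0 services place on terminals. Working through the terminal protocol stack from the bottom up, it specifies the functions and processing capabilities required at each layer and describes the requirements that apply to specific applications. It is one of the bases on which the corporation operates its WAP services.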

IMAP RFC standard

Internet Message Access Protocol (IMAP) is an email retrieval protocol. It stores email messages on a mail server and enables the recipient to view and manipulate them as though they were stored locally on their device. IMAP was developed in the late 1980s and has since become one of the most widely used email retrieval protocols.

The IMAP standard is defined in RFC 3501, which was published in 2003. This document provides a detailed description of the protocol's functionality, including its data formats, commands, and responses. The standard specifies how IMAP clients and servers should communicate with each other to enable the retrieval and manipulation of email messages.

One of the key features of IMAP is its support for multiple clients accessing the same mailbox simultaneously. This is achieved through the use of a "shared" storage model, where all clients see the same set of messages and folders stored on the server. This allows users to access their email from different devices without having to worry about synchronizing their messages manually.

Another important aspect of IMAP is its support for message organization and management. Clients can create, delete, and rename folders, as well as move messages between folders. They can also search for specific messages based on various criteria, such as sender, subject, or date.

IMAP also provides a range of features for managing individual messages. Clients can mark messages as read or unread, flag them for follow-up, and even move them to a specific folder. They can also reply to messages, forward them to others, and generate replies or forwards with attachments.

Overall, the IMAP standard provides a powerful and flexible framework for managing email messages. Its support for shared storage, message organization, and advanced message management features make it a popular choice for both personal and business email users.

DHCP RFC document: RFC 2131

Network Working Group                                           R. Droms
Request for Comments: 2131                           Bucknell University
Obsoletes: 1541                                               March 1997
Category: Standards Track

Dynamic Host Configuration Protocol

Status of this memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Abstract

The Dynamic Host Configuration Protocol (DHCP) provides a framework for passing configuration information to hosts on a TCP/IP network. DHCP is based on the Bootstrap Protocol (BOOTP) [7], adding the capability of automatic allocation of reusable network addresses and additional configuration options [19]. DHCP captures the behavior of BOOTP relay agents [7, 21], and DHCP participants can interoperate with BOOTP participants [9].

Table of Contents

1. Introduction
   1.1 Changes to RFC1541
   1.2 Related Work
   1.3 Problem definition and issues
   1.4 Requirements
   1.5 Terminology
   1.6 Design goals
2. Protocol Summary
   2.1 Configuration parameters repository
   2.2 Dynamic allocation of network addresses
3. The Client-Server Protocol
   3.1 Client-server interaction - allocating a network address
   3.2 Client-server interaction - reusing a previously allocated network address
   3.3 Interpretation and representation of time values
   3.4 Obtaining parameters with externally configured network address
   3.5 Client parameters in DHCP
   3.6 Use of DHCP in clients with multiple interfaces
   3.7 When clients should use DHCP
4. Specification of the DHCP client-server protocol
   4.1 Constructing and sending DHCP messages
   4.2 DHCP server administrative controls
   4.3 DHCP server behavior
   4.4 DHCP client behavior
5. Acknowledgments
6. References
7. Security Considerations
8. Author's Address
A. Host Configuration Parameters

List of Figures

1. Format of a DHCP message
2. Format of the 'flags' field
3. Timeline diagram of messages exchanged between DHCP client and servers when allocating a new network address
4. Timeline diagram of messages exchanged between DHCP client and servers when reusing a previously allocated network address
5. State-transition diagram for DHCP clients

List of Tables

1. Description of fields in a DHCP message
2. DHCP messages
3. Fields and options used by DHCP servers
4. Client messages from various states
5. Fields and options used by DHCP clients

1. Introduction

The Dynamic Host Configuration Protocol (DHCP) provides configuration parameters to Internet hosts. DHCP consists of two components: a protocol for delivering host-specific configuration parameters from a DHCP server to a host and a mechanism for allocation of network addresses to hosts.

DHCP is built on a client-server model, where designated DHCP server hosts allocate network addresses and deliver configuration parameters to dynamically configured hosts. Throughout the remainder of this document, the term "server" refers to a host providing initialization parameters through DHCP, and the term "client" refers to a host requesting initialization parameters from a DHCP server.

A host should not act as a DHCP server unless explicitly configured to do so by a system administrator. The diversity of hardware and protocol implementations in the Internet would preclude reliable operation if random hosts were allowed to respond to DHCP requests. For example, IP requires the setting of many parameters within the protocol implementation software. Because IP can be used on many dissimilar kinds of network hardware, values for those parameters cannot be guessed or assumed to have correct defaults. Also, distributed address allocation schemes depend on a polling/defense mechanism for discovery of addresses that are already in use. IP hosts may not always be able to defend their network addresses, so that such a distributed address allocation scheme cannot be guaranteed to avoid allocation of duplicate network addresses.

DHCP supports three mechanisms for IP address allocation. In "automatic allocation", DHCP assigns a permanent IP address to a client. In "dynamic allocation", DHCP assigns an IP address to a client for a limited period of time (or until the client explicitly relinquishes the address). In "manual allocation", a client's IP address is assigned by the network administrator, and DHCP is used simply to convey the assigned address to the client. A particular network will use one or more of these mechanisms, depending on the policies of the network administrator.

Dynamic allocation is the only one of the three mechanisms that allows automatic reuse of an address that is no longer needed by the client to which it was assigned. Thus, dynamic allocation is particularly useful for assigning an address to a client that will be connected to the network only temporarily or for sharing a limited pool of IP addresses among a group of clients that do not need permanent IP addresses. Dynamic allocation may also be a good choice for assigning an IP address to a new client being permanently connected to a network where IP addresses are sufficiently scarce that it is important to reclaim them when old clients are retired. Manual allocation allows DHCP to be used to eliminate the error-prone process of manually configuring hosts with IP addresses in environments where (for whatever reasons) it is desirable to manage IP address assignment outside of the DHCP mechanisms.

The format of DHCP messages is based on the format of BOOTP messages, to capture the BOOTP relay agent behavior described as part of the BOOTP specification [7, 21] and to allow interoperability of existing BOOTP clients with DHCP servers. Using BOOTP relay agents eliminates the necessity of having a DHCP server on each physical network segment.

1.1 Changes to RFC 1541

This document updates the DHCP protocol specification that appears in RFC1541. A new DHCP message type, DHCPINFORM, has been added; see sections 3.4, 4.3 and 4.4 for details. The classing mechanism for identifying DHCP clients to DHCP servers has been extended to include "vendor" classes as defined in sections 4.2 and 4.3. The minimum lease time restriction has been removed. Finally, many editorial changes have been made to clarify the text as a result of experience gained in DHCP interoperability tests.

1.2 Related Work

There are several Internet protocols and related mechanisms that address some parts of the dynamic host configuration problem. The Reverse Address Resolution Protocol (RARP) [10] (through the extensions defined in the Dynamic RARP (DRARP) [5]) explicitly addresses the problem of network address discovery, and includes an automatic IP address assignment mechanism. The Trivial File Transfer Protocol (TFTP) [20] provides for transport of a boot image from a boot server. The Internet Control Message Protocol (ICMP) [16] provides for informing hosts of additional routers via "ICMP redirect" messages. ICMP also can provide subnet mask information through the "ICMP mask request" message and other information through the (obsolete) "ICMP information request" message. Hosts can locate routers through the ICMP router discovery mechanism [8].

BOOTP is a transport mechanism for a collection of configuration information. BOOTP is also extensible, and official extensions [17] have been defined for several configuration parameters. Morgan has proposed extensions to BOOTP for dynamic IP address assignment [15]. The Network Information Protocol (NIP), used by the Athena project at MIT, is a distributed mechanism for dynamic IP address assignment [19]. The Resource Location Protocol RLP [1] provides for location of higher level services. Sun Microsystems diskless workstations use a boot procedure that employs RARP, TFTP and an RPC mechanism called "bootparams" to deliver configuration information and operating system code to diskless hosts. (Sun Microsystems, Sun Workstation and SunOS are trademarks of Sun Microsystems, Inc.) Some Sun networks also use DRARP and an auto-installation mechanism to automate the configuration of new hosts in an existing network.

In other related work, the path minimum transmission unit (MTU) discovery algorithm can determine the MTU of an arbitrary internet path [14]. The Address Resolution Protocol (ARP) has been proposed as a transport protocol for resource location and selection [6]. Finally, the Host Requirements RFCs [3, 4] mention specific requirements for host reconfiguration and suggest a scenario for

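Common operations interview questions

Reposted from: https:///hanguofei/article/details/102774930?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduend~default-1-102774930.nonecase&utm_term=8%E4%B8%AA300g%E7%9A%84%E7%A1%AC%E7%9B%98%E5%81%9Araid5&spm=1000.2123.3001.4430

1. The difference between NAT and PAT

IP address exhaustion prompted the development of CIDR, whose main purpose is to use the existing Internet address space efficiently. NAT, developed according to RFC 1631 (The IP Network Address Translator), takes a different approach: it allows the same IP addresses to be reused in multiple Internet subnets, reducing the number of registered IP addresses needed.

NAT comes in three forms: static NAT, dynamic NAT, and port-based NAT (PAT).

Static NAT: each host on the internal network is permanently mapped to a legitimate address on the external network. Dynamic NAT: a pool of legitimate external addresses is defined and mapped to internal hosts by dynamic allocation. PAT: the form of translation most people are familiar with.

PAT is widely used in access devices; it can hide a small or medium-sized network behind a single legitimate IP address.

Unlike dynamic NAT, PAT maps internal connections to a single IP address on the external network and attaches to that address a TCP or UDP port number chosen by the NAT device.

In other words, it uses port multiplexing, that is, rewriting the source port of outgoing traffic, to map multiple internal IP addresses to the same external address.

Network Address Translation (NAT) is an Internet Engineering Task Force (IETF) standard that allows multiple PCs on a private network (using private address ranges such as 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) to share a single, globally routable IPv4 address.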
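The port multiplexing described above can be sketched as a small translation table. This is a minimal illustration under stated assumptions, not how any particular router implements PAT: the names `pat_table`, `pat_outbound`, `pat_inbound`, the table size and the starting port 40000 are all hypothetical, and real devices also track the protocol, the remote endpoint and entry timeouts.

```c
#include <stddef.h>
#include <stdint.h>

#define PAT_MAX        256     /* hypothetical table capacity */
#define PAT_FIRST_PORT 40000   /* hypothetical first external port */

/* One mapping: internal (address, port) <-> port on the shared external address. */
struct pat_entry {
    uint32_t in_addr;
    uint16_t in_port;
    uint16_t out_port;
};

struct pat_table {
    struct pat_entry e[PAT_MAX];
    size_t n;                  /* number of entries in use */
};

static void pat_init(struct pat_table *t)
{
    t->n = 0;
}

/* Outbound packet: return the external source port for this internal
 * endpoint, creating a mapping on first use. Returns 0 if the table is full. */
static uint16_t pat_outbound(struct pat_table *t, uint32_t in_addr, uint16_t in_port)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].in_addr == in_addr && t->e[i].in_port == in_port)
            return t->e[i].out_port;
    }
    if (t->n == PAT_MAX)
        return 0;
    t->e[t->n].in_addr = in_addr;
    t->e[t->n].in_port = in_port;
    t->e[t->n].out_port = (uint16_t)(PAT_FIRST_PORT + t->n);
    t->n++;
    return t->e[t->n - 1].out_port;
}

/* Inbound packet: find the internal endpoint that owns an external port,
 * or NULL if no mapping exists (the packet would be dropped). */
static const struct pat_entry *pat_inbound(const struct pat_table *t, uint16_t out_port)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].out_port == out_port)
            return &t->e[i];
    }
    return NULL;
}
```

Two internal hosts that both use source port 5000 receive distinct external ports, which is what lets replies arriving at the single public address be routed back to the right host.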

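E2E Profile 1 CRC-8 calculation method

The Cyclic Redundancy Check (CRC) is a widely used checksum algorithm for detecting errors introduced during data transmission. A CRC detects errors; by itself it does not correct them.

CRC-8 is the variant that produces an 8-bit check value.

CRC-8 is computed as follows:

1. Choose a generator polynomial. A common choice is x^8 + x^2 + x^1 + 1, i.e. 0b100000111. The degree of the polynomial determines the width of the CRC, and its coefficients determine which bit positions are XORed during the computation.

2. Starting from an initial CRC register value, XOR the next byte of the message into the CRC register and keep the result in the register.

3. Process the register bit by bit, starting from the most significant bit. The core of the algorithm: if the top bit of the CRC register is 1, shift the register left by one bit and XOR it with the generator polynomial; otherwise just shift the register left by one bit.

4. Repeat step 3 for all eight bits, then return to step 2 for the next byte, until all of the data has been processed.

5. The value left in the CRC register is the CRC code.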
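The steps above can be sketched as a bitwise routine. This is a minimal sketch assuming the polynomial x^8 + x^2 + x + 1 named in the text (0x07 once the implicit x^8 term is dropped), MSB-first processing and no final XOR; the function name `crc8` and its `init` parameter are illustrative. AUTOSAR E2E Profile 1 is usually described with the SAE J1850 polynomial 0x1D, so check the polynomial and start value against the specification you are implementing.

```c
#include <stddef.h>
#include <stdint.h>

/* x^8 + x^2 + x + 1 with the x^8 term implicit (assumption taken from the text). */
#define CRC8_POLY 0x07u

/* Bitwise, MSB-first CRC-8 over data[0..len-1], starting from `init`. */
uint8_t crc8(const uint8_t *data, size_t len, uint8_t init)
{
    uint8_t crc = init;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                       /* step 2: fold in the next byte */
        for (int bit = 0; bit < 8; bit++) {   /* steps 3 and 4: eight shifts */
            if (crc & 0x80u)
                crc = (uint8_t)((crc << 1) ^ CRC8_POLY);
            else
                crc = (uint8_t)(crc << 1);
        }
    }
    return crc;                               /* step 5: register is the CRC */
}
```

With `init = 0`, the ASCII string "123456789" yields 0xF4, the commonly published check value for this parameter set. A table-driven version trades 256 bytes of memory for one lookup per byte instead of eight shifts.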

Based on the method above, the following references relate to CRC computation:

1. RFC 1952, "GZIP File Format Specification Version 4.3": note that this RFC protects the uncompressed data with a CRC-32, not a CRC-8. Its appendix gives sample CRC code, which shows the same shift-and-XOR principle in table-driven form.

2. "The 8051 Microcontroller and Embedded Systems: Using Assembly and C" by Muhammad Ali Mazidi: cited here as covering data checking with CRC-8.

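zlib library analysis (1): implementation overview

This article is compiled from [...] and the zlib 1.2.7 manual page /manual.html.

zlib is a free, general-purpose, legally unencumbered lossless data compression library that can be used on any hardware and operating system.

The zlib data format is portable across platforms.

Unlike the LZW compression method used in Unix compress(1) and in the GIF image format, the compression method currently used in zlib essentially never expands the data (in extreme cases LZW [...]

The zlib license is a free software license, but it is not copyleft.

The license text is at /zlib /zlib_license.html.

The latest version is 1.2.2, released on October 3, 2004.

The copyright holders are Jean-loup Gailly and Mark Adler (1995-2004); the license is similar to the BSD license.

Anyone may use this [...]

1. Current version

   #define ZLIB_VERSION "1.2.7"
   #define ZLIB_VERNUM 0x1270

2. Features

The zlib compression library provides in-memory compression and decompression functions, including integrity checks of the uncompressed data.

This version supports only one compression method (deflation), but other algorithms will be added later and will keep the same stream interface.

Compression can be done in a single step if the buffers are large enough (for example, if an input file is mmap'ed), or by repeated calls of the compression function.

The compressed data format used by default by the library is the zlib format (described in RFC 1950), which is a wrapper around a deflate stream (described in RFC 1951).

The library also supports reading and writing files in gzip (.gz) format, through an interface whose function names begin with "gz" and which resembles stdio.

The gzip format is different from the zlib format [...]

The library does not install any signal handler. The decoder checks the consistency of the compressed data, so the library should not crash even when given corrupted input.

(1) Data headers

zlib can compress data with a gzip header, a zlib header, or no header at all.

Normally data is compressed with the zlib header, because it provides error detection.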
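The difference between the header types can be illustrated by looking at the first two bytes of a stream. The sketch below is an assumption-laden illustration rather than part of zlib: `classify_header` and the `stream_kind` enum are hypothetical names, the gzip test uses the ID1/ID2 values from RFC 1952, and the zlib test uses the RFC 1950 rule that the low nibble of the first byte is 8 (deflate) and the first two bytes, read as a big-endian 16-bit number, are a multiple of 31. A headerless (raw deflate) stream cannot be recognized this way and may occasionally pass the zlib test by coincidence.

```c
#include <stddef.h>
#include <stdint.h>

enum stream_kind { STREAM_GZIP, STREAM_ZLIB, STREAM_UNKNOWN };

/* Guess the wrapper format from the first two bytes of a compressed stream. */
enum stream_kind classify_header(const uint8_t *p, size_t len)
{
    if (len < 2)
        return STREAM_UNKNOWN;

    /* gzip: ID1 = 0x1f, ID2 = 0x8b (RFC 1952). */
    if (p[0] == 0x1f && p[1] == 0x8b)
        return STREAM_GZIP;

    /* zlib: CM (low nibble of CMF) is 8 and CMF*256 + FLG is divisible by 31 (RFC 1950). */
    if ((p[0] & 0x0f) == 8 && ((((unsigned)p[0]) << 8) | p[1]) % 31 == 0)
        return STREAM_ZLIB;

    return STREAM_UNKNOWN;
}
```

In practice a decoder is usually told which wrapper to expect, and a check like this only serves as an early sanity test before handing the data to the library.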

Brotli compression format manual

Package 'brotli'
October 14, 2022

Type: Package
Title: A Compression Format Optimized for the Web
Version: 1.3.0
Description: A lossless compressed data format that uses a combination of the LZ77 algorithm and Huffman coding. Brotli is similar in speed to deflate (gzip) but offers more dense compression.
License: MIT + file LICENSE
URL: https:///rfc/rfc7932 (spec) https:///google/brotli#readme (upstream) https:///jeroen/brotli#read (devel)
BugReports: https:///jeroen/brotli/issues
VignetteBuilder: knitr, R.rsp
Suggests: spelling, knitr, R.rsp, microbenchmark, rmarkdown, ggplot2
RoxygenNote: 6.0.1
Language: en-US
NeedsCompilation: yes
Author: Jeroen Ooms [aut, cre] (<https:///0000-0002-4035-0289>), Google, Inc [aut, cph] (Brotli C++ library)
Maintainer: Jeroen Ooms <*******************>
Repository: CRAN
Date/Publication: 2022-10-14 09:22:33 UTC

R topics documented: brotli

brotli: Brotli Compression

Description

Brotli is a compression algorithm optimized for the web, in particular small text documents.

Usage

   brotli_compress(buf, quality = 11, window = 22)
   brotli_decompress(buf)

Arguments

   buf      raw vector with data to compress/decompress
   quality  value between 0 and 11
   window   log of window size

Details

Brotli decompression is at least as fast as for gzip while significantly improving the compression ratio. The price we pay is that compression is much slower than gzip. Brotli is therefore most effective for serving static content such as fonts and html pages.

For binary (non-text) data, the compression ratio of Brotli usually does not beat bz2 or xz (lzma), however decompression for these algorithms is too slow for browsers in e.g. mobile devices.

References

J. Alakuijala and Z. Szabadka (July 2016). Brotli Compressed Data Format. IETF Internet Draft https:///rfc/rfc7932.

See Also

memCompress

Examples

   # Simple example
   myfile <- file.path(R.home(), "COPYING")
   x <- readBin(myfile, raw(), file.info(myfile)$size)
   y <- brotli_compress(x)
   stopifnot(identical(x, brotli_decompress(y)))

   # Compare to other algorithms
   length(x)
   length(brotli_compress(x))
   length(memCompress(x, "gzip"))
   length(memCompress(x, "bzip2"))
   length(memCompress(x, "xz"))

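SMTP protocol RFC document (translated from the Chinese edition)

RFC 821: Simple Mail Transfer Protocol (SMTP)

Contents

1. Introduction
2. The SMTP model
3. The SMTP procedure
   3.1. Mail
   3.2. Forwarding
   3.3. Verifying and expanding
   3.4. Sending and mailing
   3.5. Opening and closing
   3.6. Relaying
   3.7. Domains
   3.8. Changing roles
4. The SMTP specifications
   4.1. SMTP commands
      4.1.1. Command semantics
      4.1.2. Command syntax
   4.2. SMTP replies
   4.3. Sequencing of commands and replies
   4.4. State diagrams
   4.5. Details
      4.5.1. Minimum implementation
      4.5.2. Transparency
      4.5.3. Sizes
Appendix A: TCP transport service
Appendix B: NCP transport service
Appendix C: NITS
Appendix D: X.25 transport service
Appendix E: Theory of reply codes
Appendix F: Scenarios
References

1. Introduction

The objective of the Simple Mail Transfer Protocol (SMTP) is to transfer mail reliably and efficiently. SMTP is independent of the particular transmission subsystem and requires only a channel that delivers data units reliably and in order.

Appendices A, B, C and D describe the use of SMTP with various transport services.

A glossary defines the terms used in this document.

An important feature of SMTP is its ability to relay mail across transport service environments. A transport service provides an interprocess communication environment (IPCE), which may cover one network, several networks, or a subset of a network.

It is important to realize that transport systems (or IPCEs) do not correspond one-to-one with networks.

A process can communicate directly with another process through any IPCE known to both.

Mail is an application, or use, of interprocess communication.

Mail can be passed between processes in different IPCEs by relaying it through a process connected to two (or more) IPCEs.

More specifically, mail can be relayed between hosts on different transport systems by a host that is on both.

2. The SMTP model

The SMTP design is based on the following model of communication: as the result of a user mail request, the sender-SMTP establishes a two-way transmission channel to a receiver-SMTP.

The receiver-SMTP may be either the ultimate destination or an intermediate.

SMTP commands are generated by the sender-SMTP and sent to the receiver-SMTP, and replies travel in the opposite direction.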


Network Working Group                                         P. Deutsch
Request for Comments: 1952                           Aladdin Enterprises
Category: Informational                                         May 1996

GZIP file format specification version 4.3

Status of This Memo

This memo provides information for the Internet community. This memo does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

IESG Note:

The IESG takes no position on the validity of any Intellectual Property Rights statements contained in this document.

Notices

Copyright (c) 1996 L. Peter Deutsch

Permission is granted to copy and distribute this document for any purpose and without charge, including translations into other languages and incorporation into compilations, provided that the copyright notice and this notice are preserved, and that any substantive changes or deletions from the original are clearly marked.

A pointer to the latest version of this and related documentation in HTML format can be found at the URL <ftp:///graphics/png/documents/zlib/zdoc-index.html>.

Abstract

This specification defines a lossless compressed data format that is compatible with the widely used GZIP utility. The format includes a cyclic redundancy check value for detecting data corruption. The format presently uses the DEFLATE method of compression but can be easily extended to use other compression methods. The format can be implemented readily in a manner not covered by patents.

Table of Contents

1. Introduction
   1.1. Purpose
   1.2. Intended audience
   1.3. Scope
   1.4. Compliance
   1.5. Definitions of terms and conventions used
   1.6. Changes from previous versions
2. Detailed specification
   2.1. Overall conventions
   2.2. File format
   2.3. Member format
      2.3.1. Member header and trailer
         2.3.1.1. Extra field
         2.3.1.2. Compliance
3. References
4. Security Considerations
5. Acknowledgements
6. Author's Address
7. Appendix: Jean-Loup Gailly's gzip utility
8. Appendix: Sample CRC Code

1. Introduction

1.1. Purpose

The purpose of this specification is to define a lossless compressed data format that:

* Is independent of CPU type, operating system, file system, and character set, and hence can be used for interchange;
* Can compress or decompress a data stream (as opposed to a randomly accessible file) to produce another data stream, using only an a priori bounded amount of intermediate storage, and hence can be used in data communications or similar structures such as Unix filters;
* Compresses data with efficiency comparable to the best currently available general-purpose compression methods, and in particular considerably better than the "compress" program;
* Can be implemented readily in a manner not covered by patents, and hence can be practiced freely;
* Is compatible with the file format produced by the current widely used gzip utility, in that conforming decompressors will be able to read data produced by the existing gzip compressor.

The data format defined by this specification does not attempt to:

* Provide random access to compressed data;
* Compress specialized data (e.g., raster graphics) as well as the best currently available specialized algorithms.

1.2. Intended audience

This specification is intended for use by implementors of software to compress data into gzip format and/or decompress data from gzip format.

The text of the specification assumes a basic background in programming at the level of bits and other primitive data representations.

1.3. Scope

The specification specifies a compression method and a file format (the latter assuming only that a file can store a sequence of arbitrary bytes). It does not specify any particular interface to a file system or anything about character sets or encodings (except for file names and comments, which are optional).

1.4. Compliance

Unless otherwise indicated below, a compliant decompressor must be able to accept and decompress any file that conforms to all the specifications presented here; a compliant compressor must produce files that conform to all the specifications presented here. The material in the appendices is not part of the specification per se and is not relevant to compliance.

1.5. Definitions of terms and conventions used

byte: 8 bits stored or transmitted as a unit (same as an octet). (For this specification, a byte is exactly 8 bits, even on machines which store a character on a number of bits different from 8.) See below for the numbering of bits within a byte.

1.6. Changes from previous versions

There have been no technical changes to the gzip format since version 4.1 of this specification. In version 4.2, some terminology was changed, and the sample CRC code was rewritten for clarity and to eliminate the requirement for the caller to do pre- and post-conditioning. Version 4.3 is a conversion of the specification to RFC style.

2. Detailed specification

2.1. Overall conventions

In the diagrams below, a box like this:

   +---+
   |   | <-- the vertical bars might be missing
   +---+

represents one byte; a box like this:

   +==============+
   |              |
   +==============+

represents a variable number of bytes.

Bytes stored within a computer do not have a "bit order", since they are always treated as a unit. However, a byte considered as an integer between 0 and 255 does have a most- and least-significant bit, and since we write numbers with the most-significant digit on the left, we also write bytes with the most-significant bit on the left. In the diagrams below, we number the bits of a byte so that bit 0 is the least-significant bit, i.e., the bits are numbered:

   +--------+
   |76543210|
   +--------+

This document does not address the issue of the order in which bits of a byte are transmitted on a bit-sequential medium, since the data format described here is byte- rather than bit-oriented.

Within a computer, a number may occupy multiple bytes. All multi-byte numbers in the format described here are stored with the least-significant byte first (at the lower memory address). For example, the decimal number 520 is stored as:

       0        1
   +--------+--------+
   |00001000|00000010|
   +--------+--------+
    ^        ^
    |        |
    |        + more significant byte = 2 x 256
    + less significant byte = 8

2.2. File format

A gzip file consists of a series of "members" (compressed data sets). The format of each member is specified in the following section. The members simply appear one after another in the file, with no additional information before, between, or after them.

2.3. Member format

Each member has the following structure:

   +---+---+---+---+---+---+---+---+---+---+
   |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
   +---+---+---+---+---+---+---+---+---+---+

(if FLG.FEXTRA set)

   +---+---+=================================+
   | XLEN  |...XLEN bytes of "extra field"...| (more-->)
   +---+---+=================================+

(if FLG.FNAME set)

   +=========================================+
   |...original file name, zero-terminated...| (more-->)
   +=========================================+

(if FLG.FCOMMENT set)

   +===================================+
   |...file comment, zero-terminated...| (more-->)
   +===================================+

(if FLG.FHCRC set)

   +---+---+
   | CRC16 |
   +---+---+

   +=======================+
   |...compressed blocks...| (more-->)
   +=======================+

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   |     CRC32     |     ISIZE     |
   +---+---+---+---+---+---+---+---+

2.3.1. Member header and trailer

ID1 (IDentification 1)
ID2 (IDentification 2)
   These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139 (0x8b, \213), to identify the file as being in gzip format.

CM (Compression Method)
   This identifies the compression method used in the file. CM = 0-7 are reserved. CM = 8 denotes the "deflate" compression method, which is the one customarily used by gzip and which is documented elsewhere.

FLG (FLaGs)
   This flag byte is divided into individual bits as follows:

      bit 0   FTEXT
      bit 1   FHCRC
      bit 2   FEXTRA
      bit 3   FNAME
      bit 4   FCOMMENT
      bit 5   reserved
      bit 6   reserved
      bit 7   reserved

   If FTEXT is set, the file is probably ASCII text. This is an optional indication, which the compressor may set by checking a small amount of the input data to see whether any non-ASCII characters are present. In case of doubt, FTEXT is cleared, indicating binary data. For systems which have different file formats for ascii text and binary data, the decompressor can use FTEXT to choose the appropriate format. We deliberately do not specify the algorithm used to set this bit, since a compressor always has the option of leaving it cleared and a decompressor always has the option of ignoring it and letting some other program handle issues of data conversion.

   If FHCRC is set, a CRC16 for the gzip header is present, immediately before the compressed data. The CRC16 consists of the two least significant bytes of the CRC32 for all bytes of the gzip header up to and not including the CRC16. [The FHCRC bit was never set by versions of gzip up to 1.2.4, even though it was documented with a different meaning in gzip 1.2.4.]

   If FEXTRA is set, optional extra fields are present, as described in a following section.

   If FNAME is set, an original file name is present, terminated by a zero byte. The name must consist of ISO 8859-1 (LATIN-1) characters; on operating systems using EBCDIC or any other character set for file names, the name must be translated to the ISO LATIN-1 character set. This is the original name of the file being compressed, with any directory components removed, and, if the file being compressed is on a file system with case insensitive names, forced to lower case. There is no original file name if the data was compressed from a source other than a named file; for example, if the source was stdin on a Unix system, there is no file name.

   If FCOMMENT is set, a zero-terminated file comment is present. This comment is not interpreted; it is only intended for human consumption. The comment must consist of ISO 8859-1 (LATIN-1) characters. Line breaks should be denoted by a single line feed character (10 decimal).

   Reserved FLG bits must be zero.

MTIME (Modification TIME)
   This gives the most recent modification time of the original file being compressed. The time is in Unix format, i.e., seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this may cause problems for MS-DOS and other systems that use local rather than Universal time.) If the compressed data did not come from a file, MTIME is set to the time at which compression started. MTIME = 0 means no time stamp is available.

XFL (eXtra FLags)
   These flags are available for use by specific compression methods. The "deflate" method (CM = 8) sets these flags as follows:

      XFL = 2 - compressor used maximum compression, slowest algorithm
      XFL = 4 - compressor used fastest algorithm

OS (Operating System)
   This identifies the type of file system on which compression took place. This may be useful in determining end-of-line convention for text files. The currently defined values are as follows:

        0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
        1 - Amiga
        2 - VMS (or OpenVMS)
        3 - Unix
        4 - VM/CMS
        5 - Atari TOS
        6 - HPFS filesystem (OS/2, NT)
        7 - Macintosh
        8 - Z-System
        9 - CP/M
       10 - TOPS-20
       11 - NTFS filesystem (NT)
       12 - QDOS
       13 - Acorn RISCOS
      255 - unknown

XLEN (eXtra LENgth)
   If FLG.FEXTRA is set, this gives the length of the optional extra field. See below for details.

CRC32 (CRC-32)
   This contains a Cyclic Redundancy Check value of the uncompressed data computed according to CRC-32 algorithm used in the ISO 3309 standard and in section 8.1.1.6.2 of ITU-T recommendation V.42. (See http://www.iso.ch for ordering ISO documents. See gopher://info.itu.ch for an online version of ITU-T V.42.)

ISIZE (Input SIZE)
   This contains the size of the original (uncompressed) input data modulo 2^32.

2.3.1.1. Extra field

If the FLG.FEXTRA bit is set, an "extra field" is present in the header, with total length XLEN bytes. It consists of a series of subfields, each of the form:

   +---+---+---+---+==================================+
   |SI1|SI2|  LEN  |... LEN bytes of subfield data ...|
   +---+---+---+---+==================================+

SI1 and SI2 provide a subfield ID, typically two ASCII letters with some mnemonic value. Jean-Loup Gailly <gzip@> is maintaining a registry of subfield IDs; please send him any subfield ID you wish to use. Subfield IDs with SI2 = 0 are reserved for future use. The following IDs are currently defined:

   SI1         SI2         Data
   ----------  ----------  ----
   0x41 ('A')  0x70 ('P')  Apollo file type information

LEN gives the length of the subfield data, excluding the 4 initial bytes.

2.3.1.2. Compliance

A compliant compressor must produce files with correct ID1, ID2, CM, CRC32, and ISIZE, but may set all the other fields in the fixed-length part of the header to default values (255 for OS, 0 for all others). The compressor must set all reserved bits to zero.

A compliant decompressor must check ID1, ID2, and CM, and provide an error indication if any of these have incorrect values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC at least so it can skip over the optional fields if they are present. It need not examine any other part of the header or trailer; in particular, a decompressor may ignore FTEXT and OS and always produce binary output, and still be compliant. A compliant decompressor must give an error indication if any reserved bit is non-zero, since such a bit could indicate the presence of a new field that would cause subsequent data to be interpreted incorrectly.

3. References

[1] "Information Processing - 8-bit single-byte coded graphic character sets - Part 1: Latin alphabet No.1" (ISO 8859-1:1987). The ISO 8859-1 (Latin-1) character set is a superset of 7-bit ASCII. Files defining this character set are available as iso_8859-1.* in ftp:///graphics/png/documents/

[2] ISO 3309

[3] ITU-T recommendation V.42

[4] Deutsch, L.P., "DEFLATE Compressed Data Format Specification", available in ftp:///pub/archiving/zip/doc/

[5] Gailly, J.-L., GZIP documentation, available as gzip-*.tar in ftp:///pub/gnu/

[6] Sarwate, D.V., "Computation of Cyclic Redundancy Checks via Table Look-Up", Communications of the ACM, 31(8), pp.1008-1013.

[7] Schwaderer, W.D., "CRC Calculation", April 85 PC Tech Journal, pp.118-133.

[8] ftp://.au/pub/rocksoft/papers/crc_v3.txt, describing the CRC concept.

4. Security Considerations

Any data compression method involves the reduction of redundancy in the data. Consequently, any corruption of the data is likely to have severe effects and be difficult to correct. Uncompressed text, on the other hand, will probably still be readable despite the presence of some corrupted bytes.

It is recommended that systems using this data format provide some means of validating the integrity of the compressed data, such as by setting and checking the CRC-32 check value.

5. Acknowledgements

Trademarks cited in this document are the property of their respective owners.

Jean-Loup Gailly designed the gzip format and wrote, with Mark Adler, the related software described in this specification. Glenn Randers-Pehrson converted this document to RFC and HTML format.

6. Author's Address

L. Peter Deutsch
Aladdin Enterprises
203 Santa Margarita Ave.
Menlo Park, CA 94025

Phone: (415) 322-0103 (AM only)
FAX: (415) 322-1734
EMail: <ghost@>

Questions about the technical content of this specification can be sent by email to:

Jean-Loup Gailly <gzip@> and
Mark Adler <madler@>

Editorial comments on this specification can be sent by email to:

L. Peter Deutsch <ghost@> and
Glenn Randers-Pehrson <randeg@>

7. Appendix: Jean-Loup Gailly's gzip utility

The most widely used implementation of gzip compression, and the original documentation on which this specification is based, were created by Jean-Loup Gailly <gzip@>. Since this implementation is a de facto standard, we mention some more of its features here. Again, the material in this section is not part of the specification per se, and implementations need not follow it to be compliant.

When compressing or decompressing a file, gzip preserves the protection, ownership, and modification time attributes on the local file system, since there is no provision for representing protection attributes in the gzip file format itself. Since the file format includes a modification time, the gzip decompressor provides a command line switch that assigns the modification time from the file, rather than the local modification time of the compressed input, to the decompressed output.

8. Appendix: Sample CRC Code

The following sample code represents a practical implementation of the CRC (Cyclic Redundancy Check). (See also ISO 3309 and ITU-T V.42 for a formal specification.)

The sample code is in the ANSI C programming language. Non C users may find it easier to read with these hints:

   &      Bitwise AND operator.
   ^      Bitwise exclusive-OR operator.
   >>     Bitwise right shift operator. When applied to an unsigned quantity, as here, right shift inserts zero bit(s) at the left.
   !      Logical NOT operator.
   ++     "n++" increments the variable n.
   0xNNN  0x introduces a hexadecimal (base 16) constant. Suffix L indicates a long value (at least 32 bits).

   /* Table of CRCs of all 8-bit messages. */
   unsigned long crc_table[256];

   /* Flag: has the table been computed? Initially false. */
   int crc_table_computed = 0;

   /* Make the table for a fast CRC. */
   void make_crc_table(void)
   {
     unsigned long c;
     int n, k;

     for (n = 0; n < 256; n++) {
       c = (unsigned long) n;
       for (k = 0; k < 8; k++) {
         if (c & 1) {
           c = 0xedb88320L ^ (c >> 1);
         } else {
           c = c >> 1;
         }
       }
       crc_table[n] = c;
     }
     crc_table_computed = 1;
   }

   /*
      Update a running crc with the bytes buf[0..len-1] and return
      the updated crc. The crc should be initialized to zero. Pre- and
      post-conditioning (one's complement) is performed within this
      function so it shouldn't be done by the caller. Usage example:

        unsigned long crc = 0L;

        while (read_buffer(buffer, length) != EOF) {
          crc = update_crc(crc, buffer, length);
        }
        if (crc != original_crc) error();
   */
   unsigned long update_crc(unsigned long crc,
                   unsigned char *buf, int len)
   {
     unsigned long c = crc ^ 0xffffffffL;
     int n;

     if (!crc_table_computed)
       make_crc_table();
     for (n = 0; n < len; n++) {
       c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
     }
     return c ^ 0xffffffffL;
   }

   /* Return the CRC of the bytes buf[0..len-1]. */
   unsigned long crc(unsigned char *buf, int len)
   {
     return update_crc(0L, buf, len);
   }
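The member header layout of section 2.3 and the decompressor checks of section 2.3.1.2 can be sketched as a small parser. This is an illustrative sketch, not part of the specification: the names `gzip_header`, `gzip_parse_header` and `data_offset` are hypothetical, the function only locates the start of the compressed blocks, and it does not verify the optional CRC16 or inflate any data.

```c
#include <stddef.h>
#include <stdint.h>

/* FLG bits from section 2.3.1. */
enum { FTEXT = 0x01, FHCRC = 0x02, FEXTRA = 0x04, FNAME = 0x08,
       FCOMMENT = 0x10, FRESERVED = 0xE0 };

struct gzip_header {
    uint8_t  cm, flg, xfl, os;
    uint32_t mtime;        /* seconds since the Unix epoch, 0 = not available */
    size_t   data_offset;  /* offset of the first compressed byte */
};

/* Multi-byte numbers are stored least-significant byte first (section 2.1). */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Skip a zero-terminated field; returns -1 if the terminator is missing. */
static int skip_zstring(const uint8_t *buf, size_t len, size_t *pos)
{
    while (*pos < len && buf[*pos] != 0)
        (*pos)++;
    if (*pos == len)
        return -1;
    (*pos)++;              /* step over the terminating zero byte */
    return 0;
}

/* Parse one member header. Returns 0 on success, -1 if the header is
 * truncated, has wrong ID1/ID2/CM, or has a reserved FLG bit set. */
int gzip_parse_header(const uint8_t *buf, size_t len, struct gzip_header *h)
{
    size_t pos = 10;       /* size of the fixed part of the header */

    if (len < 10 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return -1;
    if (buf[3] & FRESERVED)
        return -1;

    h->cm = buf[2];
    h->flg = buf[3];
    h->mtime = le32(buf + 4);
    h->xfl = buf[8];
    h->os = buf[9];

    if (h->flg & FEXTRA) {
        size_t xlen;
        if (len - pos < 2)
            return -1;
        xlen = le16(buf + pos);
        pos += 2;
        if (len - pos < xlen)
            return -1;
        pos += xlen;
    }
    if ((h->flg & FNAME) && skip_zstring(buf, len, &pos) != 0)
        return -1;
    if ((h->flg & FCOMMENT) && skip_zstring(buf, len, &pos) != 0)
        return -1;
    if (h->flg & FHCRC) {
        if (len - pos < 2)
            return -1;
        pos += 2;          /* CRC16 is skipped here, not verified */
    }
    h->data_offset = pos;
    return 0;
}
```

The trailer can be read the same way: the last eight bytes of a member are CRC32 followed by ISIZE, both little-endian, so `le32` applies to each of them.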