Skip to main content

10. Decoding Algorithm (解码算法)

官方英文原文 (Official English Text)

The decoding algorithm that produces the uncompressed data is as follows:

read window size
do
read ISLAST bit
if ISLAST
read ISLASTEMPTY bit
if ISLASTEMPTY
break from loop
read MNIBBLES
if MNIBBLES is zero
verify reserved bit is zero
read MSKIPLEN
skip any bits up to the next byte boundary
skip MSKIPLEN bytes
continue to the next meta-block
else
read MLEN
if not ISLAST
read ISUNCOMPRESSED bit
if ISUNCOMPRESSED
skip any bits up to the next byte boundary
copy MLEN bytes of compressed data as literals
continue to the next meta-block




loop for each three block categories (i = L, I, D)
read NBLTYPESi
if NBLTYPESi >= 2
read prefix code for block types, HTREE_BTYPE_i
read prefix code for block counts, HTREE_BLEN_i
read block count, BLEN_i
set block type, BTYPE_i to 0
initialize second-to-last and last block types to 0 and 1
else
set block type, BTYPE_i to 0
set block count, BLEN_i to 16777216
read NPOSTFIX and NDIRECT
read array of literal context modes, CMODE[]
read NTREESL
if NTREESL >= 2
read literal context map, CMAPL[]
else
fill CMAPL[] with zeros
read NTREESD
if NTREESD >= 2
read distance context map, CMAPD[]
else
fill CMAPD[] with zeros
read array of literal prefix codes, HTREEL[]
read array of insert-and-copy length prefix codes, HTREEI[]
read array of distance prefix codes, HTREED[]
do
if BLEN_I is zero
read block type using HTREE_BTYPE_I and set BTYPE_I
save previous block type
read block count using HTREE_BLEN_I and set BLEN_I
decrement BLEN_I
read insert-and-copy length symbol using HTREEI[BTYPE_I]
compute insert length, ILEN, and copy length, CLEN
loop for ILEN
if BLEN_L is zero
read block type using HTREE_BTYPE_L and set BTYPE_L
save previous block type
read block count using HTREE_BLEN_L and set BLEN_L
decrement BLEN_L
look up context mode CMODE[BTYPE_L]
compute context ID, CIDL from last two uncompressed bytes
read literal using HTREEL[CMAPL[64*BTYPE_L + CIDL]]
write literal to uncompressed stream
if number of uncompressed bytes produced in the loop for
this meta-block is MLEN, then break from loop (in this
case the copy length is ignored and can have any value)




if distance code is implicit zero from insert-and-copy code
set backward distance to the last distance
else
if BLEN_D is zero
read block type using HTREE_BTYPE_D and set BTYPE_D
save previous block type
read block count using HTREE_BLEN_D and set BLEN_D
decrement BLEN_D
compute context ID, CIDD from CLEN
read distance code using HTREED[CMAPD[4*BTYPE_D + CIDD]]
compute distance by distance short code substitution
if distance code is not zero,
and distance is not a static dictionary reference,
push distance to the ring buffer of last distances
if distance is less than the max allowed distance plus one
move backwards distance bytes in the uncompressed data,
and copy CLEN bytes from this position to
the uncompressed stream
else
look up the static dictionary word, transform the word as
directed, and copy the result to the uncompressed stream
while number of uncompressed bytes for this meta-block < MLEN
while not ISLAST

If the stream ends before the completion of the last meta-block, then the stream should be rejected as invalid.

Note that a duplicated string reference may refer to a string in a previous meta-block, i.e., the backward distance may cross one or more meta-block boundaries. However, a backward copy distance will not refer past the beginning of the uncompressed stream or the window size; any such distance is interpreted as a reference to a static dictionary word. Also, note that the referenced string may overlap the current position, for example, if the last 2 bytes decoded have values X and Y, a string reference with <length = 5, distance = 2> adds X,Y,X,Y,X to the uncompressed stream.


中文翻译 (Chinese Translation)

产生未压缩数据 (Uncompressed Data) 的解码算法 (Decoding Algorithm) 如下:

读取窗口大小 (window size)
do
读取 ISLAST 位
if ISLAST
读取 ISLASTEMPTY 位
if ISLASTEMPTY
跳出循环
读取 MNIBBLES
if MNIBBLES 为零
验证保留位为零
读取 MSKIPLEN
跳过到下一个字节边界的任何位
跳过 MSKIPLEN 字节
继续到下一个元块
else
读取 MLEN
if not ISLAST
读取 ISUNCOMPRESSED 位
if ISUNCOMPRESSED
跳过到下一个字节边界的任何位
将 MLEN 字节的压缩数据作为字面量复制
继续到下一个元块




对三个块类别循环 (i = L, I, D)
读取 NBLTYPESi
if NBLTYPESi >= 2
读取块类型的前缀编码,HTREE_BTYPE_i
读取块计数的前缀编码,HTREE_BLEN_i
读取块计数,BLEN_i
设置块类型,BTYPE_i 为 0
初始化倒数第二个和最后一个块类型为 0 和 1
else
设置块类型,BTYPE_i 为 0
设置块计数,BLEN_i 为 16777216
读取 NPOSTFIX 和 NDIRECT
读取字面上下文模式数组,CMODE[]
读取 NTREESL
if NTREESL >= 2
读取字面上下文映射,CMAPL[]
else
用零填充 CMAPL[]
读取 NTREESD
if NTREESD >= 2
读取距离上下文映射,CMAPD[]
else
用零填充 CMAPD[]
读取字面前缀编码数组,HTREEL[]
读取插入和复制长度前缀编码数组,HTREEI[]
读取距离前缀编码数组,HTREED[]
do
if BLEN_I 为零
使用 HTREE_BTYPE_I 读取块类型并设置 BTYPE_I
保存前一个块类型
使用 HTREE_BLEN_I 读取块计数并设置 BLEN_I
递减 BLEN_I
使用 HTREEI[BTYPE_I] 读取插入和复制长度符号
计算插入长度 ILEN 和复制长度 CLEN
循环 ILEN 次
if BLEN_L 为零
使用 HTREE_BTYPE_L 读取块类型并设置 BTYPE_L
保存前一个块类型
使用 HTREE_BLEN_L 读取块计数并设置 BLEN_L
递减 BLEN_L
查找上下文模式 CMODE[BTYPE_L]
从最后两个未压缩字节计算上下文 ID,CIDL
使用 HTREEL[CMAPL[64*BTYPE_L + CIDL]] 读取字面量
将字面量写入未压缩流
if 在此元块的循环中产生的未压缩字节数为 MLEN,
则跳出循环 (在这种情况下,复制长度被忽略,
可以具有任何值)




if 距离编码是插入和复制编码的隐式零
将后向距离设置为最后一个距离
else
if BLEN_D 为零
使用 HTREE_BTYPE_D 读取块类型并设置 BTYPE_D
保存前一个块类型
使用 HTREE_BLEN_D 读取块计数并设置 BLEN_D
递减 BLEN_D
从 CLEN 计算上下文 ID,CIDD
使用 HTREED[CMAPD[4*BTYPE_D + CIDD]] 读取距离编码
通过距离短编码替换计算距离
if 距离编码不为零,
并且距离不是静态字典引用,
将距离推入最后距离的环形缓冲区
if 距离小于最大允许距离加一
在未压缩数据中向后移动距离字节,
并从此位置复制 CLEN 字节到未压缩流
else
查找静态字典词,按照指示转换该词,
并将结果复制到未压缩流
while 此元块的未压缩字节数 < MLEN
while not ISLAST

如果流在最后一个元块完成之前结束,则应拒绝流为无效 (Invalid).

请注意,重复字符串引用 (Duplicated String Reference) 可能引用前一个元块中的字符串,即后向距离 (Backward Distance) 可能跨越一个或多个元块边界 (Meta-Block Boundary)。但是,后向复制距离不会引用未压缩流的开头或窗口大小之前的位置; 任何此类距离都被解释为对静态字典词的引用。另外,请注意,引用的字符串可能与当前位置重叠 (Overlap),例如,如果解码的最后 2 个字节的值为 X 和 Y,则字符串引用 <length = 5, distance = 2> 将 X,Y,X,Y,X 添加到未压缩流中.

关键算法步骤

主循环结构:

  1. 读取窗口大小
  2. 处理每个元块直到遇到最后一个元块
  3. 对于每个元块,处理命令序列直到生成 MLEN 字节

块类型管理:

  • 三个独立的块类别:字面量 (L)、插入和复制 (I)、距离 (D)
  • 每个类别维护自己的块类型和块计数
  • 当块计数归零时切换到新的块类型

命令处理:

  1. 读取插入和复制长度
  2. 插入字面量字节
  3. 读取距离编码
  4. 执行复制操作

特殊情况:

  • 隐式零距离:使用最后一个距离
  • 静态字典引用:距离超过窗口大小
  • 重叠复制:源和目标位置可能重叠

来源 (Source): RFC 7932, Section 10
官方文本 (Official Text): https://www.rfc-editor.org/rfc/rfc7932.txt