<?xml version="1.0" encoding="utf-8"?>

<!-- updated by Chris 05/06/20 -->

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
  <!ENTITY RFC1952 PUBLIC "" "http://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.1952.xml">
]>

<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902"
     obsoletes="8478"
     docName="draft-kucherawy-rfc8478bis-05" number="8878"
     updates="" submissionType="IETF" category="info" consensus="true"
     xml:lang="en" tocInclude="true" tocDepth="4" symRefs="true"
     sortRefs="true" version="3">

  <!-- xml2rfc v2v3 conversion 2.44.0 -->

<front>

    <title abbrev="application/zstd">
		Zstandard Compression and the 'application/zstd' Media Type
    </title>
    <seriesInfo name="RFC" value="8878"/>
    <author initials="Y." surname="Collet" fullname="Yann Collet">

      <organization>Facebook</organization>
      <address>
        <postal>
          <street>1 Hacker Way</street>
          <city>Menlo Park</city>
          <region>CA</region>
          <code>94025</code>
          <country>United States of America</country>
        </postal>
        <email>cyan@fb.com</email>
      </address>
    </author>
    <author initials="M." surname="Kucherawy" fullname="Murray S. Kucherawy" role="editor">
      <organization>Facebook</organization>
      <address>
        <postal>
          <street>1 Hacker Way</street>
          <city>Menlo Park</city>
          <region>CA</region>
          <code>94025</code>
          <country>United States of America</country>
        </postal>
        <email>msk@fb.com</email>
      </address>
    </author>
    <date year="2020" month="September"/>
    <area>General</area>
    <keyword>compression</keyword>
    <abstract>
      <t>Zstandard, or "zstd" (pronounced "zee standard"), is a lossless data
      compression mechanism.  This document describes the mechanism and
      registers a media type, content encoding, and a structured syntax
      suffix to be used when transporting zstd-compressed content via
      Multipurpose Internet Mail Extensions (MIME).</t>
      <t>Despite use of the word "standard" as part of its name,
      readers are advised that this document is not an
      Internet Standards Track specification; it is being
      published for informational purposes only.</t>
      <t>This document replaces and obsoletes RFC 8478.</t>
    </abstract>
  </front>
  <middle>
    <section anchor="intro" numbered="true" toc="default">
      <name>Introduction</name>
      <t> Zstandard, or "zstd" (pronounced "zee standard"), is a
		    data compression mechanism, akin to gzip
		    <xref target="RFC1952" format="default"/>. </t>
      <t> Despite use of the word "standard" as part of its name,
		    readers are advised that this document is not an
		    Internet Standards Track specification; it is being
		    published for informational purposes only. </t>

      <t>This document describes the Zstandard format.  Also, to enable the
      transport of a data object compressed with Zstandard, this document
      registers a media type, content encoding, and structured syntax suffix
      that can be used to identify such content when it is used in a
      payload.</t>
    </section>
    <section anchor="definitions" numbered="true" toc="default">
      <name>Definitions</name>
      <t>Some terms used elsewhere in this document are defined
      here for clarity. </t>

      <dl newline="false" spacing="normal">
        <dt>uncompressed:</dt>
        <dd>Describes an arbitrary
			set of bytes in their original form, prior to being
			subjected to compression. </dd>
        <dt>compress, compression:</dt>
        <dd>The act of processing
			a set of bytes via the compression mechanism
			described here. </dd>
        <dt>compressed:</dt>
        <dd>Describes the result of passing
			a set of bytes through this mechanism.  The original
			input has thus been compressed. </dd>
        <dt>decompress, decompression:</dt>
        <dd>The act of
			processing a set of bytes through the inverse
			of the compression mechanism described here, in an
			attempt to recover the original set of bytes prior
			to compression. </dd>
        <dt>decompressed:</dt>
        <dd>Describes the result of passing
			a set of bytes through the reverse of this mechanism.
			When this is successful, the decompressed payload and
			the uncompressed payload are indistinguishable. </dd>
        <dt>encode:</dt>
        <dd>The process of translating data
			from one form to another; this may include compression,
			or it may refer to other translations done as part
			of this specification. </dd>
        <dt>decode:</dt>
        <dd>The reverse of "encode"; describes
			a process of reversing a prior encoding to recover
			the original content. </dd>
        <dt>frame:</dt>
        <dd>Content compressed by Zstandard is
			transformed into a Zstandard frame. Multiple frames
			can be appended into a single file or stream. A frame
			is completely independent, has a defined beginning
			and end, and has a set of parameters that tells the
			decoder how to decompress it. </dd>
        <dt>block:</dt>
        <dd>A frame encapsulates one or multiple
			blocks. Each block contains arbitrary content, which
			is described by its header, and has a guaranteed
			maximum content size that depends upon frame
			parameters.  Unlike frames, each block depends
			on previous blocks for proper decoding.  However, each
			block can be decompressed without waiting for its
			successor, allowing streaming operations. </dd>
        <dt>natural order:</dt>
        <dd>A sequence or ordering of
			objects or values that is typical of that type of
			object or value.  A set of unique integers, for
			example, is in "natural order" if, when progressing
			from one element in the set or sequence to the next,
			there is never a decrease in value. </dd>
      </dl>
      <t>The naming convention for identifiers within the
			specification is Mixed_Case_With_Underscores.
			Identifiers inside square brackets indicate that the
			identifier is optional in the presented context. </t>
    </section>
    <section anchor="compression" numbered="true" toc="default">
      <name>Compression Algorithm</name>
      <t> This section describes the Zstandard algorithm. </t>

      <t> The purpose of this document is to define a lossless
		    compressed data format that is a) independent of the CPU
		    type, operating system, file system, and character set and
		    b) suitable for file compression and pipe and streaming
		    compression, using the Zstandard algorithm. The text of
		    the specification assumes a basic background in
		    programming at the level of bits and other primitive data
		    representations. </t>
      <t> The data can be produced or consumed, even for an
		    arbitrarily long sequentially presented input data stream,
		    using only an a priori bounded amount of intermediate
		    storage; hence, it can be used in data communications.
		    The format uses the Zstandard compression method, and
		    an optional xxHash-64 checksum method
		    <xref target="XXHASH" format="default"/>, for detection of data
		    corruption. </t>
      <t> The data format defined by this specification does not
		    attempt to allow random access to compressed data. </t>
      <t> Unless otherwise indicated below, a compliant compressor
		    must produce data sets that conform to the specifications
		    presented here.  However, it does not need to support all
		    options. </t>
      <t> A compliant decompressor must be able to decompress at
		    least one working set of parameters that conforms to the
		    specifications presented here. It may also ignore
		    informative fields, such as the checksum. Whenever it does
		    not support a parameter defined in the compressed stream,
		    it must produce an unambiguous error code and associated
		    error message explaining which parameter is
		    unsupported. </t>
      <t> This specification is intended for use by implementers
		    of software to compress data into Zstandard format and/or
		    decompress data from Zstandard format. The Zstandard
		    format is supported by an open-source reference
		    implementation, written in portable C, and available at
		    <xref target="ZSTD" format="default"/>. </t>
      <section anchor="comp_frames" numbered="true" toc="default">
        <name>Frames</name>
        <t> Zstandard compressed data is made up of one
			    or more frames.  Each frame is independent and can
			    be decompressed independently of other frames.  The
			    decompressed content of multiple concatenated
			    frames is the concatenation of each frame's
			    decompressed content. </t>
        <t> There are two frame formats defined for
			    Zstandard: Zstandard frames and skippable frames.
			    Zstandard frames contain compressed data, while
			    skippable frames contain custom user metadata. </t>
        <section anchor="comp_zstd_frames" numbered="true" toc="default">
          <name>Zstandard Frames</name>
          <t> The structure of a single Zstandard frame
				    is as follows:</t>

<table anchor="single-frame">
  <name>The Structure of a Single Zstandard Frame</name>
  <tbody>
    <tr>
      <td>Magic_Number</td>
      <td>4 bytes</td>
    </tr>
    <tr>
      <td>Frame_Header</td>
      <td>2-14 bytes</td>
    </tr>
    <tr>
      <td>Data_Block</td>
      <td>n bytes</td>
    </tr>
    <tr>
      <td>[More Data_Blocks]</td>
      <td></td>
    </tr>
    <tr>
      <td>[Content_Checksum]</td>
      <td>4 bytes</td>
    </tr>
  </tbody>
</table>
          <dl newline="false" spacing="normal">
            <dt>Magic_Number:</dt>
            <dd>
	      4 bytes, little-endian format.  Value: 0xFD2FB528.
	    </dd>
            <dt>Frame_Header:</dt>
            <dd>
	      2 to 14 bytes, detailed in
	      <xref target="comp_frame_hdr" format="default"/>.
	    </dd>
            <dt>Data_Block:</dt>
            <dd>
	      Detailed in <xref target="blocks" format="default"/>.
	      This is where data appears.
	    </dd>
            <dt>Content_Checksum:</dt>
            <dd>
	      An optional 32-bit checksum,
	      only present if
	      Content_Checksum_Flag is set.
	      The content checksum is the
	      result of the XXH64() hash
	      function
	      <xref target="XXHASH" format="default"/>
	      digesting the original
	      (decoded) data as input, and a
	      seed of zero.  The low 4
	      bytes of the checksum are
	      stored in little-endian format.
	    </dd>
          </dl>
          <t> The magic number was selected so that it is
	  unlikely to appear at the beginning of an
	  arbitrary file.  It avoids trivial
	  patterns (0x00, 0xFF, repeated bytes,
	  increasing bytes, etc.), contains byte
	  values outside of the ASCII range, and doesn't
	  map into UTF-8 space, all of which reduce
	  the likelihood of its appearance at the
	  top of a text file. </t>
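          <t> As an illustration only, the following Python sketch checks
	  whether a buffer starts with Magic_Number. The names ZSTD_MAGIC and
	  has_zstd_magic are hypothetical and not part of this specification;
	  only the 4-byte little-endian value comes from the description
	  above. </t>
          <sourcecode type="python"><![CDATA[
```python
import struct

# Magic_Number value as given in the frame description (assumed constant name).
ZSTD_MAGIC = 0xFD2FB528


def has_zstd_magic(data: bytes) -> bool:
    """Return True if data begins with the little-endian Magic_Number."""
    if len(data) < 4:
        return False
    # "<I" reads one unsigned 32-bit integer in little-endian byte order.
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == ZSTD_MAGIC
```
]]></sourcecode>
          <t> Because the value is stored in little-endian order, it appears
	  in a file as the byte sequence 0x28 0xB5 0x2F 0xFD. </t>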
          <section anchor="comp_frame_hdr" numbered="true" toc="default">
            <name>Frame Header</name>
            <t> The frame header has a variable
	    size, with a minimum of 2 bytes
	    and a maximum of 14 bytes, depending on
	    optional parameters. The structure
	    of Frame_Header is as follows:</t>

<table anchor="frame-header">
  <name>The Structure of Frame_Header</name>
  <tbody>
    <tr>
      <td>Frame_Header_Descriptor </td>
      <td>1 byte</td>
    </tr>
    <tr>
      <td>[Window_Descriptor]</td>
      <td>0-1 byte</td>
    </tr>
    <tr>
      <td>[Dictionary_ID]</td>
      <td>0-4 bytes</td>
    </tr>
    <tr>
      <td>[Frame_Content_Size]</td>
      <td>0-8 bytes</td>
    </tr>
  </tbody>
</table>

            <section anchor="comp_frame_header_desc" numbered="true" toc="default">
              <name>Frame_Header_Descriptor</name>
              <t> The first byte of the header is called the
	      Frame_Header_Descriptor. It describes
	      which other fields are present. Decoding this
	      byte is enough to tell the size of Frame_Header.
              </t>

<table anchor="Frame-Header-Descriptor">
  <name>The Frame_Header_Descriptor</name>
  <thead>
    <tr>
      <th>Bit Number</th>
      <th>Field Name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>7-6</td>
      <td>Frame_Content_Size_Flag </td>
    </tr>
    <tr>
      <td>5</td>
      <td>Single_Segment_Flag  </td>
    </tr>
    <tr>
      <td>4</td>
      <td>(unused)</td>
    </tr>
    <tr>
      <td>3</td>
      <td>(reserved)</td>
    </tr>
    <tr>
      <td>2</td>
      <td>Content_Checksum_Flag</td>
    </tr>
    <tr>
      <td>1-0</td>
      <td>Dictionary_ID_Flag</td>
    </tr>

  </tbody>
</table>
              <t> In <xref target="Frame-Header-Descriptor" format="default"/>, bit 7 is
	      the highest bit, while bit
	      0 is the lowest one. </t>
              <section numbered="true" toc="default">
                <name>Frame_Content_Size_Flag</name>
                <t>
		  This is a 2-bit flag (equivalent to Frame_Header_Descriptor
		  right-shifted 6 bits) specifying whether Frame_Content_Size
		  (the decompressed data size) is provided within the header.
		  Frame_Content_Size_Flag provides FCS_Field_Size, which is the
		  number of bytes used by Frame_Content_Size according to
		  <xref target="frame-content-size-flag" format="default"/>:
                </t>

<table anchor="frame-content-size-flag">
  <name>Frame_Content_Size_Flag Provides FCS_Field_Size</name>
  <tbody>
    <tr>
      <td>Frame_Content_Size_Flag</td>
      <td align="center">0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
    </tr>
    <tr>
      <td>FCS_Field_Size</td>
      <td align="center">0 or 1</td>
      <td>2</td>
      <td>4</td>
      <td>8</td>
    </tr>
  </tbody>
</table>

                <t>
		  When Frame_Content_Size_Flag
		  is 0, FCS_Field_Size
		  depends on
		  Single_Segment_Flag:
		  if Single_Segment_Flag
		  is set, FCS_Field_Size
		  is 1.  Otherwise,
		  FCS_Field_Size is 0;
		  Frame_Content_Size
		  is not provided.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Single_Segment_Flag</name>
                <t>
							If this flag is
							set, data must
							be regenerated
							within a single
							continuous
							memory segment.
                </t>
                <t>
							In this case,
							Window_Descriptor
							byte is skipped, but
							Frame_Content_Size
							is necessarily
							present.  As a
							consequence,
							the decoder
							must allocate a
							memory segment
							of a size equal to
							or larger than
							Frame_Content_Size.
                </t>
                <t>
							In order to protect the
							decoder from
							unreasonable memory
							requirements,
							a decoder is
							allowed to reject a
							compressed frame that
							requests a memory size
							beyond the decoder's
							authorized range.
                </t>
                <t>
							For broader
							compatibility,
							decoders are
							recommended to
							support memory
							sizes of at least 8 MB.
							This is only a
							recommendation;
							each decoder is
							free to support
							higher or lower
							limits, depending on
							local limitations.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Unused Bit</name>
                <t>
							A decoder compliant
							with this specification
							version shall not
							interpret this bit.
							It might
							be used in a future
							version to signal a
							property that is not
							mandatory to properly
							decode the frame.
							An encoder compliant
							with this specification
							must set this bit to
							zero.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Reserved Bit</name>
                <t>
							This bit is reserved
							for some future
							feature. Its value must
							be zero. A decoder
							compliant with this
							specification version
							must ensure it is not
							set. This bit may be
							used in a future
							revision to signal a
							feature that must be
							interpreted to decode
							the frame correctly.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Content_Checksum_Flag</name>
                <t>
							If this flag is set, a
							32-bit
							Content_Checksum will
							be present at the
							frame's end.
							See the description of
							Content_Checksum above.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Dictionary_ID_Flag</name>
                <t>
		  This is a 2-bit flag
		  (= Frame_Header_Descriptor &amp; 0x3) indicating
		  whether a dictionary ID
		  is provided within the
		  header.  It also
		  specifies the size of
		  this field as
		  DID_Field_Size:
                </t>

<table anchor="Dictionary-ID-Flag">
  <name>Dictionary_ID_Flag</name>
  <tbody>
    <tr>
      <td>Dictionary_ID_Flag</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
    </tr>
    <tr>
      <td>DID_Field_Size</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>4</td>
    </tr>
  </tbody>
</table>
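                <t> The flag fields of Frame_Header_Descriptor can be
		  decoded as sketched below. This is a minimal Python
		  illustration, not a reference implementation: the names
		  FrameHeaderDescriptor and parse_frame_header_descriptor are
		  hypothetical, and the choice to raise ValueError when the
		  reserved bit is set is an assumption about how a decoder
		  might report the condition. </t>
                <sourcecode type="python"><![CDATA[
```python
from typing import NamedTuple


class FrameHeaderDescriptor(NamedTuple):
    fcs_flag: int            # Frame_Content_Size_Flag, bits 7-6
    single_segment: bool     # Single_Segment_Flag, bit 5
    content_checksum: bool   # Content_Checksum_Flag, bit 2
    did_flag: int            # Dictionary_ID_Flag, bits 1-0
    fcs_field_size: int      # bytes used by Frame_Content_Size
    did_field_size: int      # bytes used by Dictionary_ID
    header_size: int         # total Frame_Header size in bytes


def parse_frame_header_descriptor(byte: int) -> FrameHeaderDescriptor:
    if not 0 <= byte <= 0xFF:
        raise ValueError("descriptor must be a single byte")
    # Bit 3 is reserved and must be zero; bit 4 is unused and is ignored.
    if byte & 0x08:
        raise ValueError("reserved bit is set")

    fcs_flag = byte >> 6
    single_segment = bool(byte & 0x20)
    content_checksum = bool(byte & 0x04)
    did_flag = byte & 0x03

    # Flag value 0 means 1 byte when Single_Segment_Flag is set, else 0.
    if fcs_flag == 0:
        fcs_field_size = 1 if single_segment else 0
    else:
        fcs_field_size = (0, 2, 4, 8)[fcs_flag]
    did_field_size = (0, 1, 2, 4)[did_flag]

    # Window_Descriptor is skipped when Single_Segment_Flag is set.
    window_descriptor_size = 0 if single_segment else 1
    header_size = 1 + window_descriptor_size + did_field_size + fcs_field_size
    return FrameHeaderDescriptor(fcs_flag, single_segment, content_checksum,
                                 did_flag, fcs_field_size, did_field_size,
                                 header_size)
```
]]></sourcecode>
                <t> With these rules, the computed header size always falls
		  between 2 and 14 bytes, matching the stated bounds of
		  Frame_Header. </t>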
              </section>
            </section>
            <section anchor="comp_window_descr" numbered="true" toc="default">
              <name>Window Descriptor</name>
              <t>
		This provides guarantees about
		the minimum
		memory buffer required to
		decompress a frame.  This
		information is important for
		decoders to allocate enough
		memory.
              </t>
              <t>
		The Window_Descriptor byte is
		optional. When
		Single_Segment_Flag is set,
		Window_Descriptor is not
		present. In this case,
		Window_Size is
		Frame_Content_Size, which can
		be any value from 0 to
		2<sup>64</sup> - 1 bytes (16 ExaBytes).
              </t>

<table anchor="window-descriptor">
  <name>Window_Descriptor</name>
  <tbody>
    <tr>
      <td>Bit Number</td>
      <td align="center">7-3</td>
      <td align="center">2-0</td>
    </tr>
    <tr>
      <td>Field Name</td>
      <td>Exponent</td>
      <td>Mantissa</td>
    </tr>
  </tbody>
</table>

              <t>
		The minimum memory buffer size
		is called Window_Size. It is
		described by the following
		formulas:
              </t>
              <artwork name="" type="" align="left" alt=""><![CDATA[
  windowLog = 10 + Exponent;
  windowBase = 1 << windowLog;
  windowAdd = (windowBase / 8) * Mantissa;
  Window_Size = windowBase + windowAdd;
]]></artwork>
              <t>
						The minimum Window_Size is
						1 KB. The maximum Window_Size
						is (1&lt;&lt;41) +
						7*(1&lt;&lt;38) bytes,
						which is 3.75 TB.
              </t>
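              <t> The formulas above can be checked with a short Python
						sketch. The function name window_size is
						hypothetical; the bit layout (Exponent in
						bits 7-3, Mantissa in bits 2-0) is taken
						from the Window_Descriptor layout. </t>
              <sourcecode type="python"><![CDATA[
```python
def window_size(window_descriptor: int) -> int:
    """Compute Window_Size in bytes from the Window_Descriptor byte."""
    if not 0 <= window_descriptor <= 0xFF:
        raise ValueError("Window_Descriptor must be a single byte")
    exponent = window_descriptor >> 3      # bits 7-3
    mantissa = window_descriptor & 0x07    # bits 2-0
    window_log = 10 + exponent
    window_base = 1 << window_log
    window_add = (window_base // 8) * mantissa
    return window_base + window_add
```
]]></sourcecode>
              <t> A descriptor of 0x00 gives the 1 KB minimum, and 0xFF
						(Exponent 31, Mantissa 7) gives the maximum
						of (1&lt;&lt;41) + 7*(1&lt;&lt;38) bytes. </t>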
              <t> In general, larger
						Window_Size values tend to
						improve the compression ratio, but
						at the cost of increased memory
						usage. </t>
              <t>
						To properly decode compressed
						data, a decoder will need to
						allocate a buffer of at least
						Window_Size bytes.
              </t>
              <t>
						In order to protect
						decoders from unreasonable
						memory requirements, a decoder
						is allowed to reject a
						compressed frame that
						requests a memory size beyond the
						decoder's authorized range.
              </t>
              <t>
						For improved interoperability,
						it's recommended for decoders
						to support values of
						Window_Size up to 8 MB and
						for encoders not to generate
						frames requiring a Window_Size
						larger than 8 MB.

						It's merely a recommendation
						though, and decoders are free
						to support higher or lower
						limits, depending on local
						limitations.
              </t>
            </section>
            <section anchor="comp_dictionary_id" numbered="true" toc="default">
              <name>Dictionary_ID</name>

              <t>
						This is a field of variable size,
						which contains the ID of the
						dictionary required to properly
						decode the frame. This field
						is optional. When it's not
						present, it's up to the decoder
						to know which dictionary
						to use.  </t>
              <t>
						Dictionary_ID field size is
						provided by DID_Field_Size.
						DID_Field_Size is directly
						derived from the value
						of Dictionary_ID_Flag. One byte
						can represent an ID 0-255; 2
						bytes can represent an ID
						0-65535; 4 bytes can
						represent an ID 0-4294967295.
						Format is little-endian.
              </t>
              <t>
						It is permitted to represent a
						small ID (for example, 13) with
						a large 4-byte dictionary
						ID, even if it is less
						efficient.
              </t>
              <t>
		Within private environments,
		any dictionary ID
		can be used.  However, for
		frames and dictionaries
		distributed in public space,
		Dictionary_ID must be
		attributed carefully.
		The following
		ranges are reserved for use
		only with dictionaries that
		have been registered with
		IANA (see <xref target="iana_dict" format="default"/>):
              </t>
              <dl newline="false" spacing="normal">
                <dt>low range:</dt>
                <dd>
								&lt;= 32767
							</dd>
                <dt>high range:</dt>
                <dd>
								&gt;= (1 &lt;&lt; 31)
							</dd>
              </dl>
              <t> Any other value for
						    Dictionary_ID can be used
						    by private arrangement
						    between participants. </t>
              <t> Any payload presented for
						    decompression that
						    references an unregistered
						    reserved dictionary ID
						    results in an error. </t>
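              <t> As a non-normative illustration, the Python sketch below
						    reads a little-endian Dictionary_ID
						    field and tests whether the value
						    falls in one of the ranges reserved
						    for IANA-registered dictionaries.
						    Both function names are hypothetical,
						    and the sketch applies the two ranges
						    exactly as listed above without any
						    further special cases. </t>
              <sourcecode type="python"><![CDATA[
```python
def read_dictionary_id(field: bytes) -> int:
    """Decode a Dictionary_ID field of 1, 2, or 4 bytes (little-endian)."""
    if len(field) not in (1, 2, 4):
        raise ValueError("DID_Field_Size must be 1, 2, or 4 bytes")
    return int.from_bytes(field, "little")


def is_reserved_dictionary_id(dictionary_id: int) -> bool:
    """Return True for IDs in the low (<= 32767) or high (>= 1 << 31) range."""
    return dictionary_id <= 32767 or dictionary_id >= (1 << 31)
```
]]></sourcecode>
              <t> The same small ID, such as 13, decodes identically
						    from a 1-byte or a 4-byte field. </t>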
            </section>
            <section anchor="comp_frame_content_size" numbered="true" toc="default">

              <name>Frame_Content_Size</name>
              <t>
		This is the original
		(uncompressed) size. This
		information is optional.
		Frame_Content_Size uses a
		variable number of bytes,
		provided by FCS_Field_Size.
		FCS_Field_Size is provided by
		the value of
		Frame_Content_Size_Flag.
		FCS_Field_Size can be equal to
		0 (not present), 1, 2, 4, or
		8 bytes.
              </t>

<table anchor="Frame-Content-Size">
  <name>Frame_Content_Size</name>
  <thead>
    <tr>
      <th align="center">FCS Field Size</th>
      <th>Range</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td>unknown</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td>0 - 255</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td>256 - 65791</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td>0 - 2<sup>32</sup> - 1</td>
    </tr>
    <tr>
      <td align="center">8</td>
      <td>0 - 2<sup>64</sup> - 1</td>
    </tr>

  </tbody>
</table>

              <t>
		Frame_Content_Size format is
		little-endian. When
		FCS_Field_Size is 1, 4, or 8
		bytes, the value is read
		directly. When FCS_Field_Size
		is 2, the offset of 256 is
		added. It's allowed to
		represent a small size (for
		example, 18) using any
		compatible variant.
              </t>
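              <t> The decoding rule, including the offset of 256 for the
		2-byte form, can be sketched in Python as follows. The
		function name decode_frame_content_size is hypothetical, and
		returning None for an absent field is an assumption made for
		this illustration. </t>
              <sourcecode type="python"><![CDATA[
```python
from typing import Optional


def decode_frame_content_size(field: bytes) -> Optional[int]:
    """Decode Frame_Content_Size; len(field) plays the role of FCS_Field_Size."""
    size = len(field)
    if size == 0:
        return None  # field not present: content size is unknown
    if size not in (1, 2, 4, 8):
        raise ValueError("FCS_Field_Size must be 0, 1, 2, 4, or 8")
    value = int.from_bytes(field, "little")
    if size == 2:
        value += 256  # the 2-byte form covers 256 through 65791
    return value
```
]]></sourcecode>
              <t> A size of 18, for instance, can be carried as the single
		byte 0x12 or as the 4-byte sequence 0x12 0x00 0x00 0x00, but
		not in the 2-byte form, whose smallest value is 256. </t>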
            </section>
          </section>
          <section anchor="blocks" numbered="true" toc="default">
            <name>Blocks</name>
            <t> After Magic_Number and
	    Frame_Header, there are some number of
	    blocks. Each frame must have at least
	    1 block, but there is no upper
	    limit on the number of blocks per
	    frame. </t>
            <t> The structure of a block is as
	    follows:</t>

<table anchor="block">
  <name>The Structure of a Block</name>
  <thead>
    <tr>
      <th align="center">Block_Header</th>
      <th align="center">Block_Content</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">3 bytes</td>
      <td align="center">n bytes</td>
    </tr>
  </tbody>
</table>
            <t> Block_Header uses 3 bytes,
	    written using little-endian
	    convention. It contains three fields:</t>

<table anchor="block-header">
  <name>Block_Header</name>
  <thead>
    <tr>
      <th align="center">Last_Block</th>
      <th align="center">Block_Type</th>
      <th align="center">Block_Size</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">bit 0</td>
      <td align="center">bits 1-2</td>
      <td align="center">bits 3-23</td>
    </tr>
  </tbody>
</table>
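            <t> The three Block_Header fields can be extracted as in the
	    following Python sketch, which is illustrative only. The names
	    BLOCK_TYPES and parse_block_header are hypothetical; the bit
	    positions and the block type names come from this section. </t>
            <sourcecode type="python"><![CDATA[
```python
from typing import Tuple

# Block_Type values 0-3 as listed in this section.
BLOCK_TYPES = ("Raw_Block", "RLE_Block", "Compressed_Block", "Reserved")


def parse_block_header(header: bytes) -> Tuple[bool, str, int]:
    """Split a 3-byte Block_Header into (Last_Block, Block_Type, Block_Size)."""
    if len(header) != 3:
        raise ValueError("Block_Header is exactly 3 bytes")
    value = int.from_bytes(header, "little")
    last_block = bool(value & 0x1)       # bit 0
    block_type = (value >> 1) & 0x3      # bits 1-2
    block_size = value >> 3              # bits 3-23
    return last_block, BLOCK_TYPES[block_type], block_size
```
]]></sourcecode>
            <t> For example, the bytes 0x29 0x00 0x00 describe a final
	    Raw_Block whose Block_Size is 5. </t>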

            <section numbered="true" toc="default">
              <name>Last_Block</name>
              <t> The lowest bit (Last_Block)
	      signals whether
	      this block is the last one.
	      The frame will end after this
	      last block. It may be followed
	      by an optional Content_Checksum
	      (see
	      <xref target="comp_zstd_frames" format="default"/>).
              </t>
            </section>
            <section numbered="true" toc="default">
              <name>Block_Type</name>
              <t> The next 2 bits represent
	      the Block_Type.  There are four
	      block types:</t>

<table anchor="block-types">
  <name>The Four Block Types</name>
  <thead>
    <tr>
      <th align="center">Value</th>
      <th>Block_Type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">Raw_Block</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">RLE_Block</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">Compressed_Block</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">Reserved</td>
    </tr>
  </tbody>
</table>

              <dl newline="false" spacing="normal">
                <dt>Raw_Block:</dt>
                <dd>
                  This is an
                  uncompressed block.
                  Block_Content contains
                  Block_Size bytes. </dd>
                <dt>RLE_Block:</dt>
                <dd>
                  This is a single byte,
                  repeated Block_Size
                  times. Block_Content
                  consists of a single
                  byte. On the
                  decompression side,
                  this byte must be
                  repeated Block_Size
                  times. </dd>
                <dt>Compressed_Block:</dt>
                <dd>
                  This is a compressed
                  block as described in
                  <xref target="comp_blocks" format="default"/>.
                  Block_Size is the
                  length of
                  Block_Content, namely
                  the compressed data.
                  The decompressed size
                  is not known, but its
                  maximum possible value
                  is guaranteed (see
                  below). </dd>
                <dt>Reserved:</dt>
                <dd>
                  This is not a block.
                  This value cannot be
                  used with the current
                  specification. If such
                  a value is present,
                  it is considered to be
                  corrupt data, and a
                  compliant decoder must
                  reject it. </dd>
              </dl>
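              <t> As an illustration only, the handling of the two simplest
              block types might be sketched in Python as follows. The function
              name regenerate_simple_block and the constant names are
              hypothetical and not part of this specification; decoding of a
              Compressed_Block is deliberately left out. </t>
<sourcecode type="python"><![CDATA[
```python
# Block_Type values from the table above.
RAW_BLOCK, RLE_BLOCK, COMPRESSED_BLOCK, RESERVED = 0, 1, 2, 3


def regenerate_simple_block(block_type, block_size, block_content):
    """Regenerate the data of a Raw_Block or RLE_Block (hypothetical helper)."""
    if block_type == RAW_BLOCK:
        # Block_Content holds exactly Block_Size uncompressed bytes.
        if len(block_content) != block_size:
            raise ValueError("Raw_Block: Block_Content must be Block_Size bytes")
        return bytes(block_content)
    if block_type == RLE_BLOCK:
        # Block_Content is one byte, repeated Block_Size times on output.
        if len(block_content) != 1:
            raise ValueError("RLE_Block: Block_Content must be a single byte")
        return bytes(block_content) * block_size
    if block_type == RESERVED:
        # A compliant decoder must reject the reserved value.
        raise ValueError("Reserved Block_Type: corrupt data")
    if block_type == COMPRESSED_BLOCK:
        raise NotImplementedError("Compressed_Block decoding is out of scope")
    raise ValueError("Block_Type is a 2-bit field")
```
]]></sourcecode>
              <t> For example, an RLE_Block with Block_Size 4 and a
              Block_Content of one byte regenerates 4 copies of that byte. </t>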
            </section>
            <section numbered="true" toc="default">
              <name>Block_Size</name>
              <t> The upper 21 bits of
	      Block_Header represent the
	      Block_Size. </t>
              <t> When Block_Type is
	      Compressed_Block or Raw_Block,
	      Block_Size is the size of
	      Block_Content (hence excluding
	      Block_Header).  </t>
              <t> When Block_Type is
	      RLE_Block, since Block_Content's
	      size is always 1, Block_Size
	      represents the number of times
	      this byte must be repeated. </t>
              <t> Block_Size is limited by
	      Block_Maximum_Size (see below).
              </t>
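              <t> The three fields can be extracted with a few shifts and
              masks. The following Python sketch is illustrative only; the
              function name parse_block_header is an assumption made for this
              example and does not appear in the specification. </t>
<sourcecode type="python"><![CDATA[
```python
def parse_block_header(header):
    """Split a 3-byte Block_Header into (Last_Block, Block_Type, Block_Size)."""
    if len(header) != 3:
        raise ValueError("Block_Header is exactly 3 bytes")
    value = int.from_bytes(header, "little")  # little-endian convention
    last_block = bool(value & 0x1)            # bit 0
    block_type = (value >> 1) & 0x3           # bits 1-2
    block_size = value >> 3                   # bits 3-23 (upper 21 bits)
    return last_block, block_type, block_size
```
]]></sourcecode>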
            </section>
            <section numbered="true" toc="default">
              <name>Block_Content and Block_Maximum_Size</name>
              <t> The size of Block_Content
              is limited by
              Block_Maximum_Size, which is
              the smallest of:
              </t>
              <ul spacing="normal">
                <li> Window_Size </li>
                <li> 128 KB </li>
              </ul>
              <t> Block_Maximum_Size is
						constant for a given frame.
						This maximum is applicable to
						both the decompressed size
						and the compressed size of any
						block in the frame. </t>
              <t> The reasoning for this
						limit is that a decoder can
						read this information at the
						beginning of a frame and use
						it to allocate buffers.
						The guarantees on the size of
						blocks ensure that the buffers
						will be large enough for any
						following block of the valid
						frame. </t>
              <t> If the compressed block
              is larger than the uncompressed data,
              sending the uncompressed
              block (i.e., a Raw_Block) is
              recommended instead. </t>
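              <t> A minimal Python sketch of this limit follows. It assumes
              that 128 KB means 128 * 1024 bytes; the names
              block_maximum_size and check_block_size are hypothetical helpers
              introduced only for this example. </t>
<sourcecode type="python"><![CDATA[
```python
# Assumption: "128 KB" is interpreted as 128 * 1024 bytes.
BLOCK_SIZE_CAP = 128 * 1024


def block_maximum_size(window_size):
    """Block_Maximum_Size is the smallest of Window_Size and 128 KB."""
    return min(window_size, BLOCK_SIZE_CAP)


def check_block_size(block_size, window_size):
    """Raise ValueError if Block_Size exceeds Block_Maximum_Size."""
    limit = block_maximum_size(window_size)
    if block_size > limit:
        raise ValueError(f"Block_Size {block_size} exceeds {limit}")
    return block_size
```
]]></sourcecode>
              <t> Because the limit is constant for a frame, a decoder could
              compute it once after reading Window_Size and size its buffers
              accordingly. </t>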
            </section>
          </section>
          <section anchor="comp_blocks" numbered="true" toc="default">
            <name>Compressed Blocks</name>
            <t> To decompress a compressed block,
					the compressed size must be provided
					from the Block_Size field within
					Block_Header. </t>
            <t> A compressed block consists of two
            sections: a Literals_Section
            (<xref target="comp_literals" format="default"/>) and
            a Sequences_Section
            (<xref target="comp_sequences" format="default"/>).
            The results of the two sections are
            then combined to produce the
            decompressed data in Sequence
            Execution
            (<xref target="comp_sequence_exec" format="default"/>).
            </t>
            <t> To decode a compressed block, the
            following elements are necessary:
            </t>
            <ul spacing="normal">
              <li> Previous decoded data, up
              to a distance of Window_Size,
              or the beginning of the Frame,
              whichever is smaller.
              Single_Segment_Flag
              will be set in the latter
              case. </li>
              <li> List of "recent offsets"
              from the previous
              Compressed_Block. </li>
              <li> The previous Huffman tree,
              required by
              Treeless_Literals_Block type. </li>
              <li> Previous Finite State Entropy (FSE) decoding
              tables, required by
              Repeat_Mode, for each symbol
              type (literals lengths,
              match lengths, offsets). </li>
            </ul>
            <t> Note that decoding tables are not
            always from the previous
            Compressed_Block:
            </t>
            <ul spacing="normal">
              <li> Every decoding table can
              come from a dictionary. </li>
              <li> The Huffman tree comes from
              the previous
              Compressed_Literals_Block. </li>
            </ul>
            <section anchor="comp_literals" numbered="true" toc="default">
              <name>Literals_Section_Header</name>
              <t> All literals are grouped
              in the first part of the
              block. They can be decoded
              first and then copied during
              Sequence Execution (see
              <xref target="comp_sequence_exec" format="default"/>),
              or they can be decoded on the
              fly during Sequence
              Execution. </t>
              <t> Literals can be stored
              uncompressed or compressed
              using Huffman prefix codes.
              When compressed, an optional
              tree description can be
              present, followed by 1 or
              4 streams.
              </t>
<table anchor="compressed-literals">
  <name>Compressed Literals</name>
  <tbody>
    <tr>
      <td align="center">Literals_Section_Header</td>
    </tr>
    <tr>
      <td align="center">[Huffman_Tree_Description]</td>
    </tr>
    <tr>
      <td align="center">[Jump_Table]</td>
    </tr>
    <tr>
      <td align="center">Stream_1</td>
    </tr>
    <tr>
      <td align="center">[Stream_2]</td>
    </tr>
    <tr>
      <td align="center">[Stream_3]</td>
    </tr>
    <tr>
      <td align="center">[Stream_4]</td>
    </tr>
  </tbody>
</table>
              <section numbered="true" toc="default">
                <name>Literals_Section_Header</name>
                <t>
                  This field describes
                  how literals are
                  packed. It's a
                  byte-aligned
                  variable-size bit field,
                  ranging from 1 to
                  5 bytes, using
                  little-endian
                  convention.
                </t>

<table anchor="Literals_Section_Header">
  <name>Literals_Section_Header</name>
  <tbody>
    <tr>
      <td align="center">Literals_Block_Type</td>
      <td align="center">2 bits</td>
    </tr>
    <tr>
      <td align="center">Size_Format</td>
      <td align="center">1-2 bits</td>
    </tr>
    <tr>
      <td align="center">Regenerated_Size</td>
      <td align="center">5-20 bits</td>
    </tr>
    <tr>
      <td align="center">[Compressed_Size] </td>
      <td align="center">0-18 bits</td>
    </tr>
  </tbody>
</table>

                <t> In this
		representation, bits at
		the top are the lowest
		bits. </t>
                <t> The
                Literals_Block_Type
                field uses the two
                lowest bits of the
                first byte, describing
                four different block
                types:
                </t>

<table anchor="Literals_Block_Type">
  <name>Literals_Block_Type</name>
  <thead>
    <tr>
      <th align="center">Literals_Block_Type</th>
      <th align="center">Value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">Raw_Literals_Block</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">RLE_Literals_Block</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">Compressed_Literals_Block</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">Treeless_Literals_Block</td>
      <td align="center">3</td>
    </tr>
  </tbody>
</table>
                <dl newline="false" spacing="normal">
                  <dt>Raw_Literals_Block:</dt>
                  <dd>
                    Literals are
                    stored
                    uncompressed.
                    Literals_Section_Content
                    is
                    Regenerated_Size.
                  </dd>
                  <dt>RLE_Literals_Block:</dt>
                  <dd>
                    Literals consist of a single-byte value repeated
                    Regenerated_Size times. Literals_Section_Content is
                    1. </dd>
                  <dt>Compressed_Literals_Block:</dt>
                  <dd>
                    This is a standard Huffman-compressed
                    block, starting with a Huffman tree description.
                    See details below.  Literals_Section_Content
                    is Compressed_Size.
                  </dd>
                  <dt>Treeless_Literals_Block:</dt>
                  <dd>
                    This is a
                    Huffman-compressed
                    block, using the
                    Huffman tree
                    from the previous
                    Compressed_Literals_Block
                    or a dictionary
                    if there is no previous
                    Huffman-compressed
                    literals block.
                    Huffman_Tree_Description
                    will be
                    skipped. Note that if this mode is triggered without any
                    previous Huffman table in the frame (or dictionary, per
                    <xref target="comp_dict" format="default"/>),
                    it should be treated as data corruption.
                    Literals_Section_Content is Compressed_Size.
                  </dd>
                </dl>
                <t> The Size_Format is divided into two families:
                </t>
                <ul spacing="normal">
                  <li>
                    For Raw_Literals_Block and RLE_Literals_Block,
                    it's only necessary to decode Regenerated_Size.
                    There is no Compressed_Size
                    field.</li>
                  <li>
                    For Compressed_Literals_Block and Treeless_Literals_Block,
                    it's required to decode both Compressed_Size
                    and Regenerated_Size (the decompressed
                    size). It's also necessary to decode the
                    number of streams (1 or 4). </li>
                </ul>
                <t> For values
                spanning several bytes,
                the convention is
                little-endian. </t>
                <t> Size_Format for
                Raw_Literals_Block
                and RLE_Literals_Block
                uses 1 or 2 bits.  Its
                value is (Literals_Section_Header[0]&gt;&gt;2) &amp; 0x3.
                </t>
                <dl newline="false" spacing="normal">
                  <dt>Size_Format == 00 or 10:</dt>
                  <dd>
                    Size_Format
                    uses 1 bit.
                    Regenerated_Size
                    uses 5 bits
                    (values 0-31).
                    Literals_Section_Header
                    uses 1 byte.
                    Regenerated_Size
                    = Literals_Section_Header[0]&gt;&gt;3.
                  </dd>
                  <dt>Size_Format == 01:</dt>
                  <dd>
                    Size_Format
                    uses 2 bits.
                    Regenerated_Size
                    uses 12 bits
                    (values 0-4095).
                    Literals_Section_Header
                    uses 2 bytes.
                    Regenerated_Size
                    =
                    (Literals_Section_Header[0]&gt;&gt;4)
                    +
                    (Literals_Section_Header[1]&lt;&lt;4).
                  </dd>
                  <dt>Size_Format == 11:</dt>
                  <dd>
                    Size_Format
                    uses 2 bits.
                    Regenerated_Size
                    uses 20 bits
                    (values
                    0-1048575).
                    Literals_Section_Header
                    uses 3
                    bytes.
                    Regenerated_Size
                    =
                    (Literals_Section_Header[0]&gt;&gt;4)
                    +
                    (Literals_Section_Header[1]&lt;&lt;4)
                    +
                    (Literals_Section_Header[2]&lt;&lt;12).
                  </dd>
                </dl>
                <t> Only Stream_1 is
		present for these
		cases. Note that it is
		permitted to represent
		a short value (for
		example, 13) using a
		long format, even if
		it's less efficient.
                </t>
                <t> Size_Format for
                Compressed_Literals_Block
                and
                Treeless_Literals_Block
                always uses 2 bits.
                </t>
                <dl newline="false" spacing="normal">
                  <dt>Size_Format == 00:</dt>
                  <dd>
                    A single
                    stream. Both
                    Regenerated_Size
                    and Compressed_Size
                    use 10 bits
                    (values
                    0-1023).
                    Literals_Section_Header
                    uses 3
                    bytes.
                  </dd>
                  <dt>Size_Format == 01:</dt>
                  <dd>
                    4 streams.
                    Both
                    Regenerated_Size
                    and
                    Compressed_Size
                    use 10 bits
                    (values
                    0-1023).
                    Literals_Section_Header
                    uses 3
                    bytes.
                  </dd>
                  <dt>Size_Format == 10:</dt>
                  <dd>
                    4 streams.
                    Both
                    Regenerated_Size
                    and
                    Compressed_Size
                    use 14 bits
                    (values
                    0-16383).
                    Literals_Section_Header
                    uses 4
                    bytes.
                  </dd>
                  <dt>Size_Format == 11:</dt>
                  <dd>
                    4 streams.
                    Both
                    Regenerated_Size
                    and
                    Compressed_Size
                    use 18 bits
                    (values
                    0-262143).
                    Literals_Section_Header
                    uses 5
                    bytes.
                  </dd>
                </dl>
                <t> Both the
							Compressed_Size and
							Regenerated_Size fields
							follow little-endian
							convention. Note that
							Compressed_Size
							includes the size of
							the Huffman_Tree_Description
							when it is
							present. </t>
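                <t> The cases above can be gathered into one parsing routine.
                The Python sketch below is illustrative, not normative. The
                names LiteralsHeader and parse_literals_section_header are
                hypothetical. The bit positions used for the
                Compressed_Literals_Block and Treeless_Literals_Block sizes
                (Regenerated_Size starting at bit 4, Compressed_Size
                immediately above it) are an assumption derived from the field
                order in the table and the little-endian convention. </t>
<sourcecode type="python"><![CDATA[
```python
from typing import NamedTuple, Optional

RAW_LITERALS, RLE_LITERALS, COMPRESSED_LITERALS, TREELESS_LITERALS = 0, 1, 2, 3

# Size_Format -> (bits per size field, header bytes) for Huffman-coded types.
_COMPRESSED_LAYOUT = {
    0b00: (10, 3),
    0b01: (10, 3),
    0b10: (14, 4),
    0b11: (18, 5),
}


class LiteralsHeader(NamedTuple):
    block_type: int
    header_size: int                # bytes used by Literals_Section_Header
    regenerated_size: int
    compressed_size: Optional[int]  # None for Raw/RLE literals
    streams: int                    # 1 or 4


def parse_literals_section_header(data):
    """Decode Literals_Section_Header from the start of `data` (a sketch)."""
    if not data:
        raise ValueError("empty input")
    b0 = data[0]
    block_type = b0 & 0x3
    size_format = (b0 >> 2) & 0x3

    if block_type in (RAW_LITERALS, RLE_LITERALS):
        # 1-bit Size_Format when bit 2 is 0 (values 00 or 10).
        header_size = {0b00: 1, 0b10: 1, 0b01: 2, 0b11: 3}[size_format]
        if len(data) < header_size:
            raise ValueError("truncated Literals_Section_Header")
        if header_size == 1:
            regenerated = b0 >> 3
        elif header_size == 2:
            regenerated = (b0 >> 4) + (data[1] << 4)
        else:
            regenerated = (b0 >> 4) + (data[1] << 4) + (data[2] << 12)
        return LiteralsHeader(block_type, header_size, regenerated, None, 1)

    size_bits, header_size = _COMPRESSED_LAYOUT[size_format]
    if len(data) < header_size:
        raise ValueError("truncated Literals_Section_Header")
    value = int.from_bytes(data[:header_size], "little")
    mask = (1 << size_bits) - 1
    regenerated = (value >> 4) & mask                # assumed bit position
    compressed = (value >> (4 + size_bits)) & mask   # assumed bit position
    streams = 1 if size_format == 0b00 else 4
    return LiteralsHeader(block_type, header_size, regenerated,
                          compressed, streams)
```
]]></sourcecode>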
              </section>
              <section numbered="true" toc="default">
                <name>Raw_Literals_Block</name>
                <t>
                  The data in Stream_1 is
                  Regenerated_Size bytes
                  long.  It contains
                  the raw literals data
                  to be used during
                  Sequence Execution
                  (<xref target="comp_sequences" format="default"/>).
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>RLE_Literals_Block</name>
                <t>
							Stream_1 consists of a
							single byte that
							should be repeated
							Regenerated_Size times
							to generate the
							decoded literals.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Compressed_Literals_Block and Treeless_Literals_Block</name>
                <t>
                  Both of these modes
                  contain Huffman-coded
                  data.
                  For Treeless_Literals_Block,
                  the Huffman table comes from
                  the previously
                  compressed literals
                  block, or from a
                  dictionary;
                  see <xref target="comp_dict" format="default"/>.
                </t>
              </section>
              <section numbered="true" toc="default">
                <name>Huffman_Tree_Description</name>
                <t>
                  This section is
                  only present
                  when the
                  Literals_Block_Type
                  is
                  Compressed_Literals_Block
                  (2). The format
                  of Huffman_Tree_Description
                  can be found in
                  <xref target="huffman_tree_desc" format="default"/>.
                  The size of
                  Huffman_Tree_Description
                  is determined
                  during the
                  decoding process.  It
                  must be used
                  to determine
                  where streams
                  begin.
                </t>
                <artwork name="" type="" align="left" alt=""><![CDATA[
  Total_Streams_Size = Compressed_Size
                       - Huffman_Tree_Description_Size
]]></artwork>
              </section>
              <section numbered="true" toc="default">
                <name>Jump_Table</name>
                <t> The Jump_Table
							    is only present
							    when there are
							    4 Huffman-coded
							    streams. </t>
                <t> (Reminder:
								Huffman-compressed
								data
								consists of
								either 1 or
								4
								Huffman-coded
								streams.)
                </t>
                <t> If only 1
								stream is
								present, it is
								a single
								bitstream
								occupying the
								entire
								remaining
								portion of the
								literals
								block, encoded
								as described
								within
								<xref target="huffman_coded_streams" format="default"/>.
                </t>
                <t> If there are
							4 streams,
							Literals_Section_Header
							only provides
							enough
							information to
							know the
							decompressed
							and compressed
							sizes of all
							4 streams
							combined. The
							decompressed
							size of each
							stream is equal
							to
							(Regenerated_Size+3)/4,
							except for the
							last stream,
							which may be
							up to 3
							bytes smaller,
							to reach a
							total
							decompressed
							size as
							specified in
							Regenerated_Size.  </t>
                <t>
                  The compressed
                  size of each
                  stream is
                  provided
                  explicitly in
                  the Jump_Table.
                  The Jump_Table
                  is 6 bytes long and
                  consists of three
                  2-byte
                  little-endian
                  fields,
                  describing the
                  compressed
                  sizes of the
                  first 3
                  streams.
                  Stream4_Size
                  is computed
                  from
                  Total_Streams_Size
                  minus the sizes of
                  other streams.
                </t>
                <artwork name="" type="" align="left" alt=""><![CDATA[
  Stream4_Size = Total_Streams_Size - 6
                 - Stream1_Size - Stream2_Size
                 - Stream3_Size
]]></artwork>
                <t>
								Note that if
								Stream1_Size +
								Stream2_Size +
								Stream3_Size
								exceeds
								Total_Streams_Size,
								the data are
								considered
								corrupted. </t>

                <t>
                  Each of these
                  4 bitstreams
                  is then decoded
                  independently
                  as a
                  Huffman-coded
                  stream, as
                  described in
                  <xref target="huffman_coded_streams" format="default"/>.
                </t>
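                <t> The size bookkeeping for the 4-stream case can be
                sketched in Python as follows. This is an illustration under
                stated assumptions, not a normative procedure: the function
                name split_four_streams is hypothetical, and treating a
                negative Stream4_Size or a negative last-stream decompressed
                size as corruption is a choice made for this example. </t>
<sourcecode type="python"><![CDATA[
```python
def split_four_streams(regenerated_size, total_streams_size, jump_table):
    """Return (compressed_sizes, decompressed_sizes) for the 4 streams."""
    if len(jump_table) != 6:
        raise ValueError("Jump_Table is exactly 6 bytes")
    # Three 2-byte little-endian fields: sizes of streams 1 to 3.
    sizes = [int.from_bytes(jump_table[i:i + 2], "little") for i in (0, 2, 4)]
    # Stream4_Size = Total_Streams_Size - 6 - Stream1 - Stream2 - Stream3
    stream4_size = total_streams_size - 6 - sum(sizes)
    if stream4_size < 0:
        raise ValueError("stream sizes exceed Total_Streams_Size: corrupt")
    # Streams 1 to 3 each regenerate (Regenerated_Size + 3) / 4 bytes;
    # the last stream regenerates whatever remains.
    per_stream = (regenerated_size + 3) // 4
    last = regenerated_size - 3 * per_stream
    if last < 0:
        raise ValueError("Regenerated_Size too small for 4 streams")
    return sizes + [stream4_size], [per_stream] * 3 + [last]
```
]]></sourcecode>
                <t> With Regenerated_Size 1001, for instance, the first three
                streams each regenerate 251 bytes and the last one regenerates
                248 bytes. </t>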
              </section>
            </section>
            <section anchor="comp_sequences" numbered="true" toc="default">
              <name>Sequences_Section</name>
              <t> A compressed block is a
              succession of sequences.
              A sequence is a literal
              copy command, followed by
              a match copy command.  A
              literal copy command
              specifies a length.  It is
              the number of bytes to be
              copied (or extracted) from
              the Literals_Section.
              A match copy command
              specifies an offset and a
              length. </t>
              <t> When all sequences are
              decoded, if there are
              literals left in the
              Literals_Section, these
              bytes are added at the
              end of the block. </t>
              <t> This is described in more
              detail in
              <xref target="comp_sequence_exec" format="default"/>. </t>
              <t> The Sequences_Section
 						    regroups all symbols
						    required to decode
						    commands.  There are three
						    symbol types: literals
						    lengths, offsets, and match
						    lengths.  They are encoded
						    together, interleaved, in
						    a single "bitstream". </t>
              <t> The Sequences_Section
              starts with a header,
              followed by optional
              probability tables for
              each symbol type, followed
              by the bitstream. </t>

              <artwork name="" type="" align="left" alt=""><![CDATA[
  Sequences_Section_Header
    [Literals_Length_Table]
    [Offset_Table]
    [Match_Length_Table]
    bitStream
]]></artwork>
              <t> To decode the
              Sequences_Section, it's
              necessary to know its
              size. This size is deduced
              from the size of the Literals_Section:
              Sequences_Section_Size = Block_Size - Literals_Section_Header - Literals_Section_Content. </t>
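              <t> As a small illustration, the subtraction can be written in
              Python as below. The function and parameter names are
              hypothetical, both literals terms are taken to be sizes in
              bytes, and rejecting a negative result is an assumption of this
              sketch. </t>
<sourcecode type="python"><![CDATA[
```python
def sequences_section_size(block_size, literals_header_size,
                           literals_content_size):
    """Sequences_Section_Size = Block_Size - header size - content size."""
    size = block_size - literals_header_size - literals_content_size
    if size < 0:
        # Assumption: a Literals_Section larger than the block is corrupt.
        raise ValueError("Literals_Section does not fit in the block")
    return size
```
]]></sourcecode>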
              <section anchor="seq_sec_hdr" numbered="true" toc="default">
                <name>Sequences_Section_Header</name>
                <t> This header
                consists of two
                items:
                </t>
                <ul spacing="normal">
                  <li> Number_of_Sequences </li>
                  <li> Symbol_Compression_Modes </li>
                </ul>
                <t> Number_of_Sequences
                is a variable-size
                field using between
                1 and 3
                bytes.  If the
                first byte is
                "byte0":
                </t>
                <ul spacing="normal">
                  <li> if (byte0 == 0): there are no sequences.
                  The Sequences_Section stops here.
                  Decompressed content is defined entirely as
                  Literals_Section content.
                  The FSE tables used in Repeat_Mode
                  are not updated. </li>
                  <li> if (byte0 &lt; 128):
                  Number_of_Sequences = byte0.
                  Uses 1 byte. </li>
                  <li> if (byte0 &lt; 255):
                  Number_of_Sequences =
                  ((byte0 - 128) &lt;&lt; 8) + byte1.
                  Uses 2 bytes. </li>
                  <li> if (byte0 == 255):
                  Number_of_Sequences =
                  byte1 + (byte2 &lt;&lt; 8) + 0x7F00.
                  Uses 3 bytes. </li>
                </ul>
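                <t> The four cases translate directly into code. The Python
                sketch below is illustrative; the function name
                parse_number_of_sequences is hypothetical, and returning the
                number of bytes consumed alongside the value is a convenience
                chosen for this example. </t>
<sourcecode type="python"><![CDATA[
```python
def parse_number_of_sequences(data):
    """Return (Number_of_Sequences, bytes_used) from the start of `data`."""
    if not data:
        raise ValueError("empty input")
    byte0 = data[0]
    if byte0 < 128:
        # Covers byte0 == 0 (no sequences) and the 1-byte form.
        return byte0, 1
    if byte0 < 255:
        if len(data) < 2:
            raise ValueError("truncated Number_of_Sequences")
        return ((byte0 - 128) << 8) + data[1], 2
    if len(data) < 3:
        raise ValueError("truncated Number_of_Sequences")
    return data[1] + (data[2] << 8) + 0x7F00, 3
```
]]></sourcecode>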
                <t> Symbol_Compression_Modes
                is a single byte,
                defining the
                compression mode of
                each symbol type.
                </t>

<table anchor="Symbol_Compression_Modes">
  <name>Symbol_Compression_Modes</name>
  <thead>
    <tr>
      <th align="center">Bit Number</th>
      <th align="center">Field Name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">7-6</td>
      <td align="center">Literals_Lengths_Mode</td>
    </tr>
    <tr>
      <td align="center">5-4</td>
      <td align="center">Offsets_Mode</td>
    </tr>
    <tr>
      <td align="center">3-2</td>
      <td align="center">Match_Lengths_Mode</td>
    </tr>
    <tr>
      <td align="center">1-0</td>
      <td align="center">Reserved</td>
    </tr>
  </tbody>
</table>

<t> The last field,
Reserved, must be
all zeroes. </t>
<t> Literals_Lengths_Mode,
Offsets_Mode, and
Match_Lengths_Mode
define the
Compression_Mode of
literals lengths,
offsets, and match
lengths symbols,
respectively. They
follow the same
enumeration:
</t>

<table anchor="literals">
  <name>Literals_Lengths_Mode, Offsets_Mode, and Match_Lengths_Mode</name>
  <thead>
    <tr>
      <th align="center">Value</th>
      <th align="center">Compression_Mode</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">Predefined_Mode</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">RLE_Mode</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">FSE_Compressed_Mode</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">Repeat_Mode</td>
    </tr>
  </tbody>
</table>

                <dl newline="false" spacing="normal">
                  <dt>Predefined_Mode:</dt>
                  <dd>
                    A predefined
                    FSE
                    (see <xref target="comp_fse" format="default"/>)
                    distribution
                    table is used, as
                    defined in
                    <xref target="default_dist" format="default"/>.
                    No distribution
                    table will be
                    present. </dd>
                  <dt>RLE_Mode:</dt>
                  <dd>
                    The table
                    description
                    consists of a
                    single byte,
                    which contains
                    the symbol's
                    value.  This
                    symbol will
                    be used for
                    all sequences.
                  </dd>
                  <dt>FSE_Compressed_Mode:</dt>
                  <dd>
                    Standard FSE
                    compression. A
                    distribution
                    table will be
                    present. The
                    format of this
                    distribution
                    table is
                    described in
                    <xref target="comp_fse_table" format="default"/>.
                    Note that the
                    maximum allowed
                    accuracy log
                    for literals
                    length and
                    match length
                    tables is 9,
                    and the
                    maximum
                    accuracy log
                    for the
                    offsets table
                    is 8.
                    This mode must
                    not be used when
                    only one symbol
                    is present;
                    RLE_Mode should
                    be used instead
                    (although any
                    other mode
                    will work). </dd>
                  <dt>Repeat_Mode:</dt>
                  <dd>
                    The table used
                    in the previous
                    Compressed_Block
                    with
                    Number_of_Sequences &gt; 0
                    will be
                    used again, or
                    if this is the
                    first block,
                    the table in
                    the dictionary
                    will be used.
                    Note that this
                    includes
                    RLE_Mode,
                    so if
                    Repeat_Mode
                    follows
                    RLE_Mode, the
                    same symbol
                    will be
                    repeated. It
                    also
                    includes
                    Predefined_Mode,
                    in which case
                    Repeat_Mode
                    will have the
                    same outcome
                    as
                    Predefined_Mode.
                    No distribution
                    table will be
                    present.
                    If this mode is
                    used without
                    any previous
                    sequence table
                    in the frame
                    (or dictionary;
                    see
                    <xref target="comp_dict" format="default"/>)
                    to repeat, this
                    should be
                    treated as
                    corruption. </dd>
                </dl>
                <section anchor="codes_lengths_offsets" numbered="true" toc="default">
                  <name>Sequence Codes for Lengths and Offsets</name>
                  <t> Each symbol is a
							    code in its own
							    context, which
							    specifies Baseline
							    and Number_of_Bits
							    to add. Codes are
							    FSE compressed
							    and interleaved
							    with raw additional
							    bits in the same
							    bitstream. </t>
                  <t> Literals length codes are values ranging from
                      0 to 35, inclusive. They define lengths from
                      0 to 131071 bytes. The literals length is equal
                      to the decoded Baseline plus the result of
                      reading Number_of_Bits bits from the bitstream,
                      as a little-endian value.
                  </t>

<table anchor="length">
  <name>Literals Length Codes</name>
  <thead>
    <tr>
      <th align="center">Literals_Length_Code</th>
      <th align="center">Baseline</th>
      <th align="center">Number_of_Bits</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0-15</td>
      <td align="center">length</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">16</td>
      <td align="center">16</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">17</td>
      <td align="center">18</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">18</td>
      <td align="center">20</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">19</td>
      <td align="center">22</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">20</td>
      <td align="center">24</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">21</td>
      <td align="center">28</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">22</td>
      <td align="center">32</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">23</td>
      <td align="center">40</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">24</td>
      <td align="center">48</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">25</td>
      <td align="center">64</td>
      <td align="center">6</td>
    </tr>
    <tr>
      <td align="center">26</td>
      <td align="center">128</td>
      <td align="center">7</td>
    </tr>
    <tr>
      <td align="center">27</td>
      <td align="center">256</td>
      <td align="center">8</td>
    </tr>
    <tr>
      <td align="center">28</td>
      <td align="center">512</td>
      <td align="center">9</td>
    </tr>
    <tr>
      <td align="center">29</td>
      <td align="center">1024</td>
      <td align="center">10</td>
    </tr>
    <tr>
      <td align="center">30</td>
      <td align="center">2048</td>
      <td align="center">11</td>
    </tr>
    <tr>
      <td align="center">31</td>
      <td align="center">4096</td>
      <td align="center">12</td>
    </tr>
    <tr>
      <td align="center">32</td>
      <td align="center">8192</td>
      <td align="center">13</td>
    </tr>
    <tr>
      <td align="center">33</td>
      <td align="center">16384</td>
      <td align="center">14</td>
    </tr>
    <tr>
      <td align="center">34</td>
      <td align="center">32768</td>
      <td align="center">15</td>
    </tr>
    <tr>
      <td align="center">35</td>
      <td align="center">65536</td>
      <td align="center">16</td>
    </tr>
  </tbody>
</table>
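                  <t> As a non-normative illustration, the table above
                      could be turned into a lookup along the following
                      lines. This is only a sketch: the names
                      ll_baseline, ll_bits, ll_extra_bits, and ll_value
                      are hypothetical, and the extra bits are assumed
                      to have been read from the bitstream by the
                      caller. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stdint.h>

  /* Baseline and Number_of_Bits for Literals_Length_Code 16-35,
     transcribed from the table above. Codes 0-15 are the length
     itself and take no extra bits. */
  static const uint32_t ll_baseline[20] = {
      16, 18, 20, 22, 24, 28, 32, 40, 48, 64, 128, 256, 512,
      1024, 2048, 4096, 8192, 16384, 32768, 65536 };
  static const uint8_t ll_bits[20] = {
      1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 7, 8, 9,
      10, 11, 12, 13, 14, 15, 16 };

  /* Returns the number of extra bits to read for `code`,
     or -1 if the code is out of range. */
  static int ll_extra_bits(unsigned code)
  {
      if (code > 35) return -1;
      return code < 16 ? 0 : ll_bits[code - 16];
  }

  /* Combines a valid code (0-35) with the extra bits value
     that the caller already read from the bitstream. */
  static uint32_t ll_value(unsigned code, uint32_t extra)
  {
      return code < 16 ? code : ll_baseline[code - 16] + extra;
  }
                                                            ]]></artwork>
                  <t> With this sketch, code 35 combined with 16 extra
                      bits that are all set yields 65536 + 65535 =
                      131071, the largest literals length. </t>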

                  <t> Match length codes are values ranging from
                      0 to 52, inclusive. They define lengths from
                      3 to 131074 bytes. The match length is equal
                      to the decoded Baseline plus the result of
                      reading Number_of_Bits bits from the bitstream,
                      as a little-endian value.
                  </t>

<table anchor="Match_Length_Code">
  <name>Match Length Codes</name>
  <thead>
    <tr>
      <th align="center">Match_Length_Code</th>
      <th align="center">Baseline</th>
      <th align="center">Number_of_Bits</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0-31</td>
      <td align="center">Match_Length_Code + 3</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">32</td>
      <td align="center">35</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">33</td>
      <td align="center">37</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">34</td>
      <td align="center">39</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">35</td>
      <td align="center">41</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">36</td>
      <td align="center">43</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">37</td>
      <td align="center">47</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">38</td>
      <td align="center">51</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">39</td>
      <td align="center">59</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">40</td>
      <td align="center">67</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">41</td>
      <td align="center">83</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">42</td>
      <td align="center">99</td>
      <td align="center">5</td>
    </tr>
    <tr>
      <td align="center">43</td>
      <td align="center">131</td>
      <td align="center">7</td>
    </tr>
    <tr>
      <td align="center">44</td>
      <td align="center">259</td>
      <td align="center">8</td>
    </tr>
    <tr>
      <td align="center">45</td>
      <td align="center">515</td>
      <td align="center">9</td>
    </tr>
    <tr>
      <td align="center">46</td>
      <td align="center">1027</td>
      <td align="center">10</td>
    </tr>
    <tr>
      <td align="center">47</td>
      <td align="center">2051</td>
      <td align="center">11</td>
    </tr>
    <tr>
      <td align="center">48</td>
      <td align="center">4099</td>
      <td align="center">12</td>
    </tr>
    <tr>
      <td align="center">49</td>
      <td align="center">8195</td>
      <td align="center">13</td>
    </tr>
    <tr>
      <td align="center">50</td>
      <td align="center">16387</td>
      <td align="center">14</td>
    </tr>
    <tr>
      <td align="center">51</td>
      <td align="center">32771</td>
      <td align="center">15</td>
    </tr>
    <tr>
      <td align="center">52</td>
      <td align="center">65539</td>
      <td align="center">16</td>
    </tr>
  </tbody>
</table>
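                  <t> The match length table can be handled the same
                      way. The following is a sketch under the same
                      assumptions as any table-driven lookup: the
                      names ml_baseline, ml_bits, ml_extra_bits, and
                      ml_value are hypothetical, and the caller
                      supplies the extra bits it has read. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stdint.h>

  /* Baseline and Number_of_Bits for Match_Length_Code 32-52,
     transcribed from the table above. Codes 0-31 map to
     code + 3 and take no extra bits. */
  static const uint32_t ml_baseline[21] = {
      35, 37, 39, 41, 43, 47, 51, 59, 67, 83, 99, 131, 259,
      515, 1027, 2051, 4099, 8195, 16387, 32771, 65539 };
  static const uint8_t ml_bits[21] = {
      1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 8,
      9, 10, 11, 12, 13, 14, 15, 16 };

  /* Returns the number of extra bits to read for `code`,
     or -1 if the code is out of range. */
  static int ml_extra_bits(unsigned code)
  {
      if (code > 52) return -1;
      return code < 32 ? 0 : ml_bits[code - 32];
  }

  /* Combines a valid code (0-52) with the extra bits value
     that the caller already read from the bitstream. */
  static uint32_t ml_value(unsigned code, uint32_t extra)
  {
      return code < 32 ? code + 3 : ml_baseline[code - 32] + extra;
  }
                                                            ]]></artwork>
                  <t> Code 0 gives the minimum match length of 3, and
                      code 52 with all 16 extra bits set gives
                      65539 + 65535 = 131074. </t>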

                  <t> Offset codes are
							    values ranging from
							    0 to N. </t>
                  <t> A decoder is free
							    to limit its
							    maximum supported
							    value for N.
							    Support for values
							    of at least 22 is
							    recommended.
							    At the time of this
							    writing, the
							    reference decoder
							    supports a maximum
							    N value of 31. </t>
                  <t> An offset code is also the number of additional
                      bits to read in little-endian fashion and can
                      be translated into an Offset_Value using the
                      following formulas:
                  </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  Offset_Value = (1 << offsetCode) + readNBits(offsetCode);
  if (Offset_Value > 3) Offset = Offset_Value - 3;
                                                            ]]></artwork>
                  <t> This means that the maximum Offset_Value is
                      (2<sup>N+1</sup>) - 1, supporting back-reference
                      distances up to (2<sup>N+1</sup>) - 4, but it is
                      limited by the maximum back-reference distance
                      (see <xref target="comp_window_descr" format="default"/>). </t>
                  <t> Offset_Values from 1 to 3 are special: they
                      define "repeat codes". This is described in
                      more detail in
                      <xref target="repeat_offsets" format="default"/>. </t>
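                  <t> The arithmetic in this section can be checked
                      with a small non-normative sketch. The function
                      names offset_value, is_repeat_code, and
                      max_distance are hypothetical; the extra bits
                      are assumed to have been read by the caller, and
                      the window limit is not modeled. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stdint.h>

  /* Offset_Value for an offset code and the `extra` bits read
     for it (the code is also the number of extra bits). */
  static uint64_t offset_value(unsigned code, uint64_t extra)
  {
      return ((uint64_t)1 << code) + extra;
  }

  /* Returns 1 if the value is one of the repeat codes (1-3). */
  static int is_repeat_code(uint64_t value)
  {
      return value >= 1 && value <= 3;
  }

  /* Largest back-reference distance expressible when the
     maximum offset code is n: (2^(n+1)) - 4. */
  static uint64_t max_distance(unsigned n)
  {
      return offset_value(n, ((uint64_t)1 << n) - 1) - 3;
  }
                                                            ]]></artwork>
                  <t> For N = 22, the recommended minimum, this gives
                      a distance of 8388604 bytes. </t>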
                </section>
                <section numbered="true" toc="default">
                  <name>Decoding Sequences</name>
                  <t> FSE bitstreams are
							    read in reverse of
							    the direction they
							    are written. In zstd,
							    the compressor
							    writes bits forward
							    into a block, and
							    the decompressor
							    must read the
							    bitstream
							    backwards. </t>
                  <t> To find the start
							    of the bitstream, it
							    is therefore
							    necessary to know
							    the offset of the
							    last byte of the
							    block, which can be
							    found by counting
							    Block_Size bytes
							    after the block
							    header. </t>
                  <t> After writing the
							    last bit containing
							    information, the
							    compressor writes a
							    single 1 bit and
                  then fills the rest
                  of the byte with
                  zero bits. The
							    last byte of the
							    compressed
							    bitstream cannot be
							    zero for that
							    reason. </t>
                  <t> When decompressing, the last byte containing
                      the padding is the first byte to read. The
                      decompressor needs to skip up to 7 bits of
                      0-padding as well as the first 1 bit that
                      occurs. Afterwards, the useful part of the
                      bitstream begins. </t>
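                  <t> One possible way to locate the end of the
                      padding is sketched below. The function name
                      padding_bits is hypothetical, and returning -1
                      for a zero byte is an illustrative choice for
                      signaling corruption. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stdint.h>

  /* Returns how many bits to skip at the end of the stream:
     the zero padding plus the single 1 bit. Returns -1 if the
     last byte is zero, which the format does not allow. */
  static int padding_bits(uint8_t last_byte)
  {
      int skipped = 1;            /* counts the 1 bit itself */
      if (last_byte == 0) return -1;
      while ((last_byte & 0x80) == 0) {
          last_byte = (uint8_t)(last_byte << 1);
          skipped++;              /* one more padding bit */
      }
      return skipped;
  }
                                                            ]]></artwork>
                  <t> A last byte of 0x01 means 7 padding bits plus
                      the 1 bit, so all 8 bits are skipped; a last
                      byte of 0x80 means only the 1 bit is
                      skipped. </t>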

                  <t> FSE decoding
		  requires a 'state'
		  to be carried from
		  symbol to symbol.
		  For more
		  explanation on FSE
		  decoding, see
		  <xref target="comp_fse" format="default"/>. </t>
                  <t> For sequence decoding, a separate state keeps
                      track of each symbol type: literals lengths,
                      offsets, and match lengths. Some FSE primitives
                      are also used. For more details on the
                      operation of these primitives, see
                      <xref target="comp_fse" format="default"/>. </t>
                  <t> The bitstream
							    starts with initial
							    FSE state values,
							    each using the
							    required number of
							    bits in their
							    respective
							    accuracy, decoded
							    previously from
							    their normalized
							    distribution. It
							    starts with
							    Literals_Length_State,
							    followed by
							    Offset_State, and
							    finally
							    Match_Length_State. </t>
                  <t> Note that all
							    values are read
							    backward, so the
							    'start' of the
							    bitstream is at the
							    highest position in
							    memory, immediately
							    before the last
							    1 bit for
							    padding. </t>
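                  <t> The backward reading described above can be
                      sketched as follows. This is a minimal,
                      non-normative illustration: the type back_reader
                      and the functions back_init and back_read are
                      hypothetical names, and the sketch reads one bit
                      at a time for clarity, not speed. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stddef.h>
  #include <stdint.h>

  /* `pos` is the number of unread bits, counted from bit 0 of
     byte 0; it shrinks as bits are consumed. */
  struct back_reader {
      const uint8_t *buf;
      size_t pos;
  };

  /* Positions the reader just below the final 1 bit.
     Returns 0 on success, -1 if the stream is empty or its
     last byte is zero. */
  static int back_init(struct back_reader *r,
                       const uint8_t *buf, size_t len)
  {
      uint8_t last;
      if (len == 0 || buf[len - 1] == 0) return -1;
      last = buf[len - 1];
      r->buf = buf;
      r->pos = len * 8 - 1;       /* drop the 1 bit */
      while ((last & 0x80) == 0) {
          last = (uint8_t)(last << 1);
          r->pos--;               /* drop one padding bit */
      }
      return 0;
  }

  /* Reads n bits (n <= 32) moving toward the start of the
     buffer. The result is little-endian: the bit at the lowest
     position is the least significant. The caller must ensure
     that n <= r->pos. */
  static uint32_t back_read(struct back_reader *r, unsigned n)
  {
      uint32_t v = 0;
      unsigned i;
      r->pos -= n;
      for (i = 0; i < n; i++) {
          size_t bit = r->pos + i;
          uint32_t b = (r->buf[bit >> 3] >> (bit & 7)) & 1u;
          v |= b << i;
      }
      return v;
  }
                                                            ]]></artwork>
                  <t> With such a reader, the three initial states
                      would be obtained by calling back_read once per
                      table, each time with that table's accuracy log,
                      in the order Literals_Length_State,
                      Offset_State, Match_Length_State. A pos of 0
                      after the final read corresponds to a bitstream
                      that has been entirely consumed. </t>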
                  <t> After decoding the
							    starting states, a
							    single sequence is
							    decoded
							    Number_Of_Sequences
							    times. These
							    sequences are
							    decoded in order
							    from first to last.
							    Since the
							    compressor writes
							    the bitstream in
							    the forward
							    direction, this
							    means the
							    compressor must
							    encode the
							    sequences starting
							    with the last one
							    and ending with the
							    first. </t>
                  <t> For each of the
							    symbol types, the
							    FSE state can be
							    used to determine
							    the appropriate
							    code. The code then
							    defines the
							    Baseline and Number_of_Bits
							    to read for
							    each type.  The
							    description of the
							    codes for how to
							    determine these
							    values can be
							    found in
							    <xref target="seq_sec_hdr" format="default"/>. </t>
                  <t> Decoding starts by
							    reading the
							    Number_of_Bits
							    required to decode
							    offset. It
							    does the same for
							    Match_Length and
							    then for
							    Literals_Length.
							    This sequence is
							    then used for
							    Sequence Execution
							    (see
							    <xref target="comp_sequence_exec" format="default"/>). </t>
                  <t> If it is not the last sequence in the block,
                      the next operation is to update states. Using
                      the rules precalculated in the decoding tables,
                      Literals_Length_State is updated, followed by
                      Match_Length_State, and then Offset_State. See
                      <xref target="comp_fse" format="default"/>
                      for details on how to update states from the
                      bitstream. </t>
                  <t> This operation will
							    be repeated
							    Number_Of_Sequences
							    times. At the end,
							    the bitstream shall
							    be entirely
							    consumed; otherwise,
							    the bitstream is
							    considered
							    corrupted. </t>
                </section>
              </section>
              <section anchor="default_dist" numbered="true" toc="default">
                <name>Default Distributions</name>
                <t> If Predefined_Mode
							    is selected for a
							    symbol type, its
							    FSE decoding table
							    is generated from a
							    predefined
							    distribution table
							    defined here. For
							    details on how to
							    convert this
							    distribution into
							    a decoding table,
							    see <xref target="comp_fse" format="default"/>. </t>
                <section numbered="true" toc="default">
                  <name>Literals Length</name>
                  <t> The decoding table uses an accuracy log of
                      6 bits (64 states).
                  </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  short literalsLength_defaultDistribution[36] =
    { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
      2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
      -1,-1,-1,-1
    };
                                                                ]]></artwork>
                </section>
                <section numbered="true" toc="default">
                  <name>Match Length</name>
                  <t> The decoding table uses an accuracy log of
                      6 bits (64 states).
                  </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  short matchLengths_defaultDistribution[53] =
    { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
      1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
      1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,
      -1,-1,-1,-1,-1
    };
                                                                ]]></artwork>
                </section>
                <section numbered="true" toc="default">
                  <name>Offset Codes</name>
                  <t> The decoding table uses an accuracy log of
                      5 bits (32 states) and supports a maximum N
                      value of 28, allowing offset values up to
                      536,870,908. </t>
                  <t> If any sequence in the compressed block
                      requires a larger offset than this, it's not
                      possible to use the default distribution to
                      represent it.
                  </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  short offsetCodes_defaultDistribution[29] =
    { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
      1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1
    };
                                                                ]]></artwork>
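                  <t> A quick consistency check on a default
                      distribution is that its entries account for
                      exactly (1 &lt;&lt; Accuracy_Log) states. The
                      sketch below is non-normative and assumes,
                      without this section stating it, that each -1
                      entry occupies a single state; the function
                      name states_used is hypothetical. </t>
                  <artwork name="" type="" align="left" alt=""><![CDATA[
  /* Same values as the offset code distribution above. */
  static const short offsetCodes_defaultDistribution[29] =
    { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
      1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1
    };

  /* Sums the states claimed by a distribution, counting each
     -1 entry as one state (an assumption of this sketch). */
  static int states_used(const short *dist, int count)
  {
      int i, total = 0;
      for (i = 0; i < count; i++)
          total += dist[i] < 0 ? 1 : dist[i];
      return total;
  }
                                                                ]]></artwork>
                  <t> Under that assumption, the 29 offset code
                      entries add up to 32, which matches the
                      accuracy log of 5 bits. </t>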
                </section>
              </section>
            </section>
          </section>
          <section anchor="comp_sequence_exec" numbered="true" toc="default">
            <name>Sequence Execution</name>
            <t> Once literals and sequences have been decoded,
			    they are combined to produce the decoded content
			    of a block. </t>
            <t> Each sequence consists of a tuple of
			    (literals_length, offset_value, match_length),
			    decoded as described in the Sequences_Section
			    (<xref target="comp_sequences" format="default"/>). To execute a
			    sequence, first copy literals_length bytes from
			    the decoded literals to the output. </t>
            <t> Then, match_length bytes are copied from previously
			    decoded data. The offset to copy from is
			    determined by offset_value:
            </t>
            <ul spacing="normal">
              <li> if Offset_Value &gt; 3, then the offset is
				    Offset_Value - 3; </li>
              <li> if Offset_Value is from 1 to 3, the offset is
				    a special repeat offset value. See
				    <xref target="repeat_offsets" format="default"/> for how
				    the offset is determined in this case. </li>
            </ul>
            <t> The offset is defined as from the current
			    position (after copying the literals), so an
			    offset of 6 and a match length of
			    3 means that 3 bytes should be copied from 6 bytes
			    back. Note that all offsets leading to previously
			    decoded data must be smaller than Window_Size
			    defined in Frame_Header_Descriptor
			    (<xref target="comp_frame_header_desc" format="default"/>). </t>
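            <t> Executing a single sequence might look like the
                following non-normative sketch. The function name
                execute_sequence and its parameters are hypothetical;
                the offset is assumed to be already resolved, and
                neither Window_Size nor dictionary content is
                modeled, so any offset reaching before the start of
                the output buffer is simply rejected. </t>
            <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /* Executes one sequence. `out` holds `*out_len` bytes of
     already decoded data and has room for `cap` bytes in
     total. Returns 0 on success, -1 on invalid input. */
  static int execute_sequence(uint8_t *out, size_t *out_len,
                              size_t cap,
                              const uint8_t *literals,
                              size_t literals_length,
                              size_t offset,
                              size_t match_length)
  {
      size_t pos = *out_len, i;
      if (pos > cap || literals_length > cap - pos) return -1;
      memcpy(out + pos, literals, literals_length);
      pos += literals_length;
      if (offset == 0 || offset > pos) return -1;
      if (match_length > cap - pos) return -1;
      /* Copy byte by byte so that a match may overlap the
         bytes it is producing. */
      for (i = 0; i < match_length; i++)
          out[pos + i] = out[pos + i - offset];
      *out_len = pos + match_length;
      return 0;
  }
                                                            ]]></artwork>
            <t> The byte-by-byte copy matters when the offset is
                smaller than the match length: with "abc" already
                decoded, one literal "d", an offset of 2, and a match
                length of 4, the output becomes "abcdcdcd". </t>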
          </section>
          <section anchor="repeat_offsets" numbered="true" toc="default">
            <name>Repeat Offsets</name>
            <t> As seen above, the first three values
				    define a repeated offset; we will call
				    them Repeated_Offset1, Repeated_Offset2,
				    and Repeated_Offset3.  They are sorted in
				    recency order, with Repeated_Offset1
				    meaning "most recent one". </t>
            <t> If offset_value is 1, then the offset used
				    is Repeated_Offset1, etc. </t>
            <t> There is one exception: when the current
				    sequence's literals_length is 0, repeated
				    offsets are shifted by 1, so an
				    offset_value of 1 means Repeated_Offset2,
				    an offset_value of 2 means Repeated_Offset3,
				    and an offset_value of 3 means
				    Repeated_Offset1 - 1_byte. </t>
            <t> For the first block, the starting offset
				    history is populated with the following
				    values: Repeated_Offset1 (1),
				    Repeated_Offset2 (4), and
				    Repeated_Offset3 (8), unless
				    a dictionary is used, in which case they
				    come from the dictionary. </t>
            <t> Then each block gets its starting offset
				    history from the ending values of the most
				    recent Compressed_Block. Note that blocks
				    that are not Compressed_Block are skipped;
				    they do not contribute to offset
				    history. </t>
            <t> The newest offset takes the lead in offset
				    history, shifting others back (up to its
				    previous place if it was already
				    present).  This means that when
				    Repeated_Offset1 (most recent) is used,
				    history is unmodified. When
				    Repeated_Offset2 is used, it is swapped
				    with Repeated_Offset1. If any other offset
				    is used, it becomes Repeated_Offset1, and
				    the rest are shifted back by 1. </t>
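            <t> The rules in this section can be condensed into the
                following non-normative sketch. The function name
                resolve_offset and the array rep are hypothetical;
                rep[0] stands for Repeated_Offset1. The sketch assumes
                offset_value is at least 1 and treats the
                "Repeated_Offset1 - 1_byte" case like any other new
                offset when updating the history. </t>
            <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stdint.h>

  /* Resolves offset_value to an offset and updates the
     history `rep` in place. */
  static uint32_t resolve_offset(uint32_t offset_value,
                                 uint32_t literals_length,
                                 uint32_t rep[3])
  {
      uint32_t offset, idx;
      if (offset_value > 3) {
          offset = offset_value - 3;
          rep[2] = rep[1];
          rep[1] = rep[0];
          rep[0] = offset;
          return offset;
      }
      idx = offset_value - 1;          /* 0, 1, or 2 */
      if (literals_length == 0) idx++; /* shifted by 1 */
      if (idx == 0) return rep[0];     /* history unmodified */
      offset = idx < 3 ? rep[idx] : rep[0] - 1;
      if (idx > 1) rep[2] = rep[1];    /* idx 1 is a plain swap */
      rep[1] = rep[0];
      rep[0] = offset;
      return offset;
  }
                                                            ]]></artwork>
            <t> Starting from the initial history (1, 4, 8), an
                offset_value of 2 with a nonzero literals_length
                yields offset 4 and leaves the history as
                (4, 1, 8). </t>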
          </section>
        </section>
        <section anchor="comp_skippable" numbered="true" toc="default">
          <name>Skippable Frames</name>

<table anchor="skippable">
  <name>Skippable Frames</name>
  <thead>
    <tr>
      <th align="center">Magic_Number</th>
      <th align="center">Frame_Size</th>
      <th align="center">User_Data</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">4 bytes</td>
      <td align="center">4 bytes</td>
      <td align="center">n bytes</td>
    </tr>
  </tbody>
</table>

          <t> Skippable frames allow the insertion of
			    user-defined metadata into a flow of concatenated
			    frames. </t>
          <t> Skippable frames defined in this specification are
			    compatible with skippable frames in
			    <xref target="LZ4" format="default"/>. </t>
          <t> From a compliant decoder perspective, skippable
			    frames simply need to be skipped, and their
			    content ignored, resuming decoding after the
			    skippable frame. </t>
          <t> It should be noted that a skippable frame can be
			    used to watermark a stream of concatenated frames
			    embedding any kind of tracking information (even
			    just a Universally Unique Identifier (UUID)). Users wary of such a possibility
			    should scan the stream of concatenated frames in
			    an attempt to detect such frames for analysis or
			    removal. </t>
          <t> The fields are:
          </t>
          <dl newline="false" spacing="normal">
            <dt>Magic_Number:</dt>
            <dd>4 bytes, little-endian format. Value:
					0x184D2A5?, which means any value from
					0x184D2A50 to 0x184D2A5F. All 16
					values are valid to identify a
					skippable frame.  This specification
					does not detail any specific tagging
					methods for skippable frames.
            </dd>
            <dt>Frame_Size:</dt>
            <dd>
					This is the size, in bytes, of the
					following User_Data (not including
					the magic number or the size field
					itself). This field is represented
					using 4 bytes, little-endian format,
					unsigned 32 bits. This means User_Data
					can't be bigger than
					(2<sup>32</sup> - 1) bytes.
            </dd>
            <dt>User_Data:</dt>
            <dd>
					This field can be anything. Data will
					just be skipped by the decoder.
            </dd>
          </dl>
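          <t> A decoder could recognize and step over a skippable
              frame along the lines of the non-normative sketch
              below. The names read_le32 and skippable_frame_size are
              hypothetical, and returning 0 for "not a skippable
              frame" is an illustrative convention. </t>
          <artwork name="" type="" align="left" alt=""><![CDATA[
  #include <stddef.h>
  #include <stdint.h>

  /* Reads a 4-byte little-endian unsigned value. */
  static uint32_t read_le32(const uint8_t *p)
  {
      return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
             ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
  }

  /* Returns the total number of bytes occupied by a skippable
     frame starting at `p` (8 header bytes plus User_Data), or
     0 if `p` does not start with a skippable-frame header. */
  static uint64_t skippable_frame_size(const uint8_t *p,
                                       size_t len)
  {
      if (len < 8) return 0;
      /* Any of the 16 values 0x184D2A50-0x184D2A5F matches. */
      if ((read_le32(p) & 0xFFFFFFF0u) != 0x184D2A50u) return 0;
      return 8 + (uint64_t)read_le32(p + 4);
  }
                                                            ]]></artwork>
          <t> The result is 64 bits wide because 8 header bytes plus
              a Frame_Size of up to (2<sup>32</sup> - 1) does not fit
              in 32 bits. The caller still has to check that the
              returned size does not exceed the remaining input. </t>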
        </section>
      </section>
    </section>
    <section anchor="comp_entropy" numbered="true" toc="default">
      <name>Entropy Encoding</name>
      <t> Two types of entropy encoding are used by the
			    Zstandard format: FSE and Huffman coding.
			    Huffman is used to compress literals, while FSE
			    is used for all other symbols
			    (Literals_Length_Code, Match_Length_Code, and offset
			    codes) and to compress Huffman headers.</t>
      <section anchor="comp_fse" numbered="true" toc="default">
        <name>FSE</name>
        <t> FSE, short for Finite State Entropy, is
				    an entropy codec based on
				    <xref target="ANS" format="default"/>.
				    FSE encoding/decoding involves
				    a state that is carried over between
				    symbols, so decoding must be done in the
				    opposite direction as encoding. Therefore,
				    all FSE bitstreams are read from end to
				    beginning. Note that the order of the
				    bits in the stream is not reversed;
				    they are simply read in the reverse
				    order from which they were written. </t>
        <t> For additional details on FSE, see
				    "FiniteStateEntropy" <xref target="FSE" format="default"/>. </t>

        <t> FSE decoding involves a decoding table
				    that has a power-of-2 size and contains
				    three elements: Symbol, Num_Bits, and
				    Baseline. The base 2 logarithm
				    of the table size is its Accuracy_Log.
				    An FSE state value represents an index in
				    this table. </t>
        <t> To obtain the initial state value,
				    consume Accuracy_Log bits from the stream
				    as a little-endian value. The next symbol
				    in the stream is the Symbol indicated in
				    the table for that state. To obtain the
				    next state value, the decoder should
				    consume Num_Bits bits from the stream as a
				    little-endian value and add it to
				    Baseline. </t>
        <section anchor="comp_fse_table" numbered="true" toc="default">
          <name>FSE Table Description</name>
          <t> To decode FSE streams, it is necessary
					to construct the decoding table. The
					Zstandard format encodes FSE table
					descriptions as described here. </t>
          <t> An FSE distribution table describes the
					probabilities of all symbols from 0 to
					the last present one (included) on a
					normalized scale of
					(1&nbsp;&lt;&lt; Accuracy_Log).

					Note that there must be two or
					more symbols with nonzero probability.
          </t>
          <t> A bitstream is read forward, in
					little-endian fashion. It is not
					necessary to know its exact size,
					since the size will be discovered and
					reported by the decoding process.  The
					bitstream starts by reporting on which
					scale it operates.  If low4bits
					designates the lowest 4 bits of
					the first byte, then
					Accuracy_Log = low4bits + 5. </t>
          <t> This is followed by each symbol value,
					from 0 to the last present one. The
					number of bits used by each field is
					variable and depends on:
          </t>
          <dl newline="false" spacing="normal">
            <dt>Remaining probabilities + 1:</dt>
            <dd>
						For example, presuming an
						Accuracy_Log of 8, and
						presuming 100 probability
						points have already been
						distributed, the decoder may
						read any value from 0 to
						(256 - 100 + 1) == 157,
						inclusive. Therefore, it must
						read log2sup(157) == 8
						bits. </dd>
            <dt>Value decoded:</dt>
            <dd>
              <t>
						Small values use 1 fewer bit.
						For example, presuming values
						from 0 to 157, inclusive, are
						possible, 255 - 157 = 98 values
						are remaining in an 8-bit
						field.  The first 98 values
						(hence, from 0 to 97) use only
						7 bits, and values from 98 to
						157 use 8 bits. This is
						achieved through the scheme in
						<xref target="value" />:
              </t>

<table anchor="value">
  <name>Values Decoded</name>
  <thead>
    <tr>
      <th align="center">Value Read</th>
      <th align="center">Value Decoded</th>
      <th align="center">Bits Used</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0 - 97</td>
      <td align="center">0 - 97</td>
      <td align="center">7</td>
    </tr>
    <tr>
      <td align="center">98 - 127</td>
      <td align="center">98 - 127</td>
      <td align="center">8</td>
    </tr>
    <tr>
      <td align="center">128 - 225</td>
      <td align="center">0 - 97</td>
      <td align="center">7</td>
    </tr>
    <tr>
      <td align="center">226 - 255</td>
      <td align="center">128 - 157</td>
      <td align="center">8</td>
    </tr>
  </tbody>
</table>

            </dd>
          </dl>
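          <t> As a non-normative sketch, the variable-width field
					described above might be decoded as follows in
					Python. The name decode_value and its calling
					convention are assumptions of this example,
					not part of the format: the caller peeks at
					the full field width and then consumes only
					the number of bits reported back. </t>
<sourcecode type="python"><![CDATA[
def decode_value(raw, remaining):
    """Decode one variable-width field (hypothetical helper).

    raw       -- the next `bits` bits of the stream, read little-endian,
                 where bits == (remaining + 1).bit_length()
    remaining -- probability points not yet distributed

    Returns (value, bits_used).
    """
    max_value = remaining + 1
    bits = max_value.bit_length()
    low_mask = (1 << (bits - 1)) - 1
    # Number of small values that fit in one fewer bit.
    threshold = (1 << bits) - 1 - max_value
    if (raw & low_mask) < threshold:
        # Small value: only the low (bits - 1) bits are consumed.
        return raw & low_mask, bits - 1
    if raw > low_mask:
        # Large value stored above the second copy of the small range.
        return raw - threshold, bits
    return raw, bits
]]></sourcecode>
          <t> With 156 points remaining (an Accuracy_Log of 8 with
					100 points distributed), this sketch
					reproduces the four rows of the table
					above. </t>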
          <t> Symbol probabilities are read
						one by one, in order.  The
						probability is obtained from
						Value Decoded using the
						formula P = Value - 1.  This
						means the value 0 becomes the
						negative probability -1.  This
						is a special probability that
						means "less than 1".  Its
						effect on the distribution
						table is described below.  For
						the purpose of calculating
						total allocated probability
						points, it counts as 1. </t>
          <t> When a symbol has a
						probability of zero, it is
						followed by a 2-bit repeat
						flag. This repeat flag tells
						how many zero probabilities
						follow the current one.
						It provides a number ranging
						from 0 to 3. If it is a 3,
						another 2-bit repeat flag
						follows, and so on. </t>
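          <t> The chaining of repeat flags can be sketched in Python
					as follows. This is an illustration only: the
					function name and the read_bits callable
					(assumed to return the next n bits of the
					stream as an integer) are hypothetical. </t>
<sourcecode type="python"><![CDATA[
def count_repeated_zeroes(read_bits):
    """Return how many zero probabilities follow the current one.

    read_bits -- hypothetical callable; read_bits(n) returns the next
                 n bits of the bitstream as an integer.
    """
    total = 0
    while True:
        flag = read_bits(2)
        total += flag
        # A flag of 3 means another 2-bit repeat flag follows.
        if flag != 3:
            return total
]]></sourcecode>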
          <t> When the last symbol reaches
						a cumulated total of
						(1&nbsp;&lt;&lt;&nbsp;Accuracy_Log),
						decoding is complete.  If the
						last symbol makes the cumulated
						total go above
						(1 &lt;&lt; Accuracy_Log),
						distribution is considered
						corrupted. </t>
          <t> Finally, the decoder can tell
						how many bytes were used in
						this process and how many
						symbols are present. The
						bitstream consumes a round
						number of bytes. Any remaining
						bit within the last byte is
						simply unused. </t>
          <t> The context in which the table
            is to be used specifies an expected
            number of symbols. That expected
            number of symbols never exceeds 256.
            If the number of symbols decoded
            is not equal to the expected, the
            header should be considered
            corrupt. </t>
          <t> The distribution of normalized
						probabilities is enough to
						create a unique decoding
						table.  The table has a size
						of (1 &lt;&lt; Accuracy_Log).
						Each cell describes the symbol
						decoded and instructions to
						get the next state. </t>
          <t> Symbols are scanned in their
						natural order for "less than 1"
						probabilities as described
						above.  Symbols with this
						probability are each
						attributed a single cell,
						starting from the end of the
						table and retreating. These
						symbols define a
						full state reset, reading
						Accuracy_Log bits. </t>
          <t> All remaining symbols are
						allocated in their natural
						order.  Starting from symbol 0
						and table position 0, each
						symbol gets allocated as many
						cells as its probability. Cell
						allocation is spread, not
						linear; each successor
						position follows this rule:
          </t>
          <artwork name="" type="" align="left" alt=""><![CDATA[
  position += (tableSize >> 1) + (tableSize >> 3) + 3;
  position &= tableSize - 1;
                                                ]]></artwork>
          <t> A position is skipped if it is
						already occupied by a "less
						than 1" probability symbol.
						Position does not reset between
						symbols; it simply iterates
						through each position in the
						table, switching to the next
						symbol when enough states have
						been allocated to the current
						one. </t>
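          <t> The cell allocation described above might be sketched
					in Python as follows. This is a non-normative
					illustration; the function name
					spread_symbols and the use of a plain list
					for the table are assumptions of the
					example. Probabilities are given per symbol,
					with -1 standing for "less than 1". </t>
<sourcecode type="python"><![CDATA[
def spread_symbols(probabilities, accuracy_log):
    """Return a list mapping each table cell to its symbol.

    Hypothetical helper; `probabilities` must sum to the table size
    when each -1 entry is counted as 1.
    """
    table_size = 1 << accuracy_log
    cells = [None] * table_size

    # "Less than 1" symbols take one cell each, starting from the
    # end of the table and retreating.
    high = table_size - 1
    for symbol, prob in enumerate(probabilities):
        if prob == -1:
            cells[high] = symbol
            high -= 1

    # Remaining symbols are spread with the fixed step from the text.
    step = (table_size >> 1) + (table_size >> 3) + 3
    position = 0
    for symbol, prob in enumerate(probabilities):
        for _ in range(max(prob, 0)):
            cells[position] = symbol
            position = (position + step) & (table_size - 1)
            # Skip cells already held by "less than 1" symbols.
            while position > high:
                position = (position + step) & (table_size - 1)
    return cells
]]></sourcecode>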
          <t> The result is a list of state
						values. Each state will decode
						the current symbol. </t>
          <t> To get the Number_of_Bits and
						Baseline required for the next
						state, it is first necessary
						to sort all states in their
						natural order. The lower
						states will need 1 more bit
						than higher ones. The process
						is repeated for each symbol.
          </t>
          <t> For example, presuming a symbol
						has a probability of 5, it
						receives five state values.
						States are sorted in natural
						order.  The next power of
						2 is 8.  The space of
						probabilities is divided into
						8 equal parts.  Presuming the
						Accuracy_Log is 7, this
						defines 128 states, and each
						share (divided by 8) is 16
						in size.  In order to reach
						8, 8 - 5 = 3 lowest states will
						count "double", doubling the
						number of shares (32 in width),
						requiring 1 more bit in the
						process. </t>
          <t> Baseline is assigned starting
						from the higher states using
						fewer bits, and proceeding
						naturally, then resuming at
						the first state, each taking
						its allocated width from
						Baseline. </t>


<table anchor="state">
  <name>Baseline Assignments</name>
  <tbody>
    <tr>
      <td align="center">state order</td>
      <td align="center">0</td>
      <td align="center">1</td>
      <td align="center">2</td>
      <td align="center">3</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">width</td>
      <td align="center">32</td>
      <td align="center">32</td>
      <td align="center">32</td>
      <td align="center">16</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">Number_of_Bits</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">4</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">range number</td>
      <td align="center">2</td>
      <td align="center">4</td>
      <td align="center">6</td>
      <td align="center">0</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">Baseline</td>
      <td align="center">32</td>
      <td align="center">64</td>
      <td align="center">96</td>
      <td align="center">0</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">range</td>
      <td align="center">32-63</td>
      <td align="center">64-95</td>
      <td align="center">96-127</td>
      <td align="center">0-15</td>
      <td align="center">16-31</td>
    </tr>
  </tbody>
</table>

          <t> The next state is determined
						from the current state by
						reading the required
						Number_of_Bits and adding the
						specified Baseline. </t>
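          <t> One way to compute Number_of_Bits and Baseline for the
					states of a single symbol is sketched below
					in Python. It is non-normative: the function
					name is hypothetical, and the counter-based
					arithmetic is one possible formulation
					of the width and range assignment described
					above, not the only one. </t>
<sourcecode type="python"><![CDATA[
def state_transitions(probability, accuracy_log):
    """Return (Number_of_Bits, Baseline) for each state of one symbol.

    States are taken in their natural (sorted) order.  Hypothetical
    helper for illustration.
    """
    table_size = 1 << accuracy_log
    result = []
    for k in range(probability):
        # Counter runs from probability up to 2 * probability - 1.
        counter = probability + k
        # Lower states need 1 more bit than higher ones.
        num_bits = accuracy_log - (counter.bit_length() - 1)
        baseline = (counter << num_bits) - table_size
        result.append((num_bits, baseline))
    return result
]]></sourcecode>
          <t> For a probability of 5 and an Accuracy_Log of 7, this
					yields the pairs (5, 32), (5, 64), (5, 96),
					(4, 0), and (4, 16), matching the example
					above. </t>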
          <t> See <xref target="app_tables" format="default"/>
						for the results of this
						process that are applied to the
						default distributions. </t>
        </section>
      </section>
      <section anchor="comp_huffman" numbered="true" toc="default">
        <name>Huffman Coding</name>
        <t> Zstandard Huffman-coded streams are read
				    backwards, similar to the FSE bitstreams.
				    Therefore, to find the start of the
				    bitstream, it is necessary to know the
				    offset of the last byte of the
				    Huffman-coded stream. </t>
        <t> After writing the last bit containing
				    information, the compressor writes a
            single 1 bit and then fills the rest of
            the byte with 0 bits. The last byte of
				    the compressed bitstream cannot be 0 for
				    that reason. </t>
        <t> When decompressing, the last byte
				    containing the padding is the first byte
				    to read. The decompressor needs to skip
            the up to 7 bits of 0-padding as well as
            the first 1 bit that occurs. Afterwards,
            the useful part of the bitstream
            begins. </t>
        <t> The bitstream contains Huffman-coded
				    symbols in little-endian order, with the
				    codes defined by the method below. </t>
        <section anchor="huffman_tree_desc" numbered="true" toc="default">
          <name>Huffman Tree Description</name>
          <t> Prefix coding represents symbols
					    from an a priori known alphabet by
					    bit sequences (codewords), one
					    codeword for each symbol, in a
					    manner such that different symbols
					    may be represented by bit sequences
					    of different lengths, but a parser
					    can always parse an encoded string
					    unambiguously,
					    symbol by symbol. </t>
          <t> Given an alphabet with known symbol
					    frequencies, the Huffman algorithm
					    allows the construction of an
					    optimal prefix code using the
					    fewest bits of any possible prefix
					    codes for that alphabet. </t>
          <t> The prefix code must not exceed a
					    maximum code length. More bits
					    improve accuracy but yield a larger
					    header size and require more
					    memory or more complex decoding
					    operations. This specification
					    limits the maximum code length to
					    11 bits. </t>
          <t> All literal values from zero
					    (included) to the last present one
					    (excluded) are represented by
					    Weight with values from 0 to
					    Max_Number_of_Bits. Transformation
					    from Weight to Number_of_Bits
					    follows this pseudocode:
          </t>
<sourcecode type="pseudocode">
  if Weight == 0
    Number_of_Bits = 0
  else
    Number_of_Bits = Max_Number_of_Bits + 1 - Weight
</sourcecode>
          <t> The last symbol's Weight is
					    deduced from previously decoded
					    ones, by completing to the nearest
					    power of 2. This power of 2 gives
					    Max_Number_of_Bits, the depth of
					    the current tree. </t>
          <t> For example, presume the following
					    Huffman tree must be described:
          </t>

<table anchor="Huffman">
  <name>Huffman Tree</name>
  <thead>
    <tr>
      <th align="center">Literal Value</th>
      <th align="center">Number_of_Bits</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">4</td>
    </tr>
  </tbody>
</table>

          <t> The tree depth is 4, since its
					    longest element uses 4 bits.
					    (The longest elements are those
					    with the smallest frequencies.)
					    Value 5 will not be listed as it
					    can be determined from the values
					    for 0-4, nor will values above 5
					    as they are all 0. Values from 0
					    to 4 will be listed using Weight
					    instead of Number_of_Bits. The
					    pseudocode to determine Weight is:
          </t>
<sourcecode type="pseudocode">
  if Number_of_Bits == 0
    Weight = 0
  else
    Weight = Max_Number_of_Bits + 1 - Number_of_Bits
</sourcecode>
          <t> It gives the following series of
					    weights:
          </t>

<table anchor="weights">
  <name>Weights</name>
  <thead>
    <tr>
      <th align="center">Literal Value</th>
      <th align="center">Weight</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">1</td>
    </tr>
  </tbody>
</table>

          <t> The decoder will do the inverse
					    operation: having collected weights
					    of literals from 0 to 4, it knows
					    the last literal, 5, is present
					    with a nonzero Weight. The Weight
					    of 5 can be determined by advancing
					    to the next power of 2. The sum of
					    2<sup>(Weight-1)</sup> (excluding 0's) is 15.
					    The nearest power of 2 is 16.
					    Therefore, Max_Number_of_Bits = 4
					    and Weight[5] = 16 - 15 = 1. </t>
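          <t> The deduction of the last Weight can be sketched in
					    Python as follows. This is a
					    non-normative illustration; the
					    function name complete_weights and
					    its error handling are assumptions
					    of the example. </t>
<sourcecode type="python"><![CDATA[
def complete_weights(weights):
    """Append the implicit last Weight to a decoded list of weights.

    Returns (all_weights, max_number_of_bits).  Hypothetical helper.
    """
    # Sum of 2^(Weight-1), ignoring weights of 0.
    total = sum(1 << (w - 1) for w in weights if w > 0)
    if total == 0:
        raise ValueError("at least one nonzero Weight is required")
    # Exponent of the next power of 2 strictly above the sum.
    max_bits = total.bit_length()
    leftover = (1 << max_bits) - total
    # The remainder must itself be a power of 2 to be a valid Weight.
    if leftover & (leftover - 1):
        raise ValueError("weights do not complete to a power of 2")
    last_weight = leftover.bit_length()
    return list(weights) + [last_weight], max_bits
]]></sourcecode>
          <t> For the weights 4, 3, 2, 0, 1 listed above, the sum is
					    15, so this sketch returns a last
					    Weight of 1 and a
					    Max_Number_of_Bits of 4. </t>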
          <section anchor="huffman_tree_header" numbered="true" toc="default">
            <name>Huffman Tree Header</name>
            <t> This is a single byte value
						(0-255), which describes how
						the series of weights is
						encoded.
            </t>
            <dl newline="false" spacing="normal">
              <dt>headerByte &lt; 128:</dt>
              <dd>
							The series of weights
							is compressed using
							FSE (see below).  The
							length of the
							FSE-compressed series
							is equal to headerByte
							(0-127). </dd>
              <dt>headerByte &gt;= 128:</dt>
              <dd>
                <t>
							This is a direct
							representation, where
							each Weight is written
							directly as a 4-bit
							field (0-15). They are
							encoded forward, 2
							weights to a byte with
							the first weight taking
							the top 4 bits and
							the second taking the
							bottom 4; for example, the
							following operations
							could be used to read
							the weights:
                </t>
                <artwork name="" type="" align="left" alt=""><![CDATA[
  Weight[0] = (Byte[0] >> 4)
  Weight[1] = (Byte[0] & 0xf),
  etc.
                                                        ]]></artwork>
                <t>
							The full representation
							occupies
							ceiling(Number_of_Symbols/2)
							bytes, meaning it uses
							only full bytes even
							if Number_of_Symbols is
							odd.  Number_of_Symbols
							= headerByte - 127.
							Note that maximum
							Number_of_Symbols is
							255 - 127 = 128. If any
							literal
							has a value over 128,
							raw header mode is not
							possible, and it is
							necessary to use FSE
							compression. </t>
              </dd>
            </dl>
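            <t> The direct representation might be read as in the
						following Python sketch. It is
						non-normative; the function name
						read_direct_weights and its
						arguments are assumptions of the
						example. </t>
<sourcecode type="python"><![CDATA[
def read_direct_weights(header_byte, payload):
    """Read 4-bit weights stored two to a byte (hypothetical helper).

    header_byte -- the Huffman tree header byte, expected >= 128
    payload     -- the bytes that follow the header byte
    """
    if header_byte < 128:
        raise ValueError("header byte selects FSE-compressed weights")
    number_of_symbols = header_byte - 127
    needed = (number_of_symbols + 1) // 2  # ceiling(n / 2) full bytes
    if len(payload) < needed:
        raise ValueError("not enough bytes for the weights")
    weights = []
    for i in range(number_of_symbols):
        byte = payload[i // 2]
        # First weight of each pair is in the top 4 bits.
        weights.append(byte >> 4 if i % 2 == 0 else byte & 0x0F)
    return weights
]]></sourcecode>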
          </section>
          <section anchor="huffman_tree_fse" numbered="true" toc="default">
            <name>FSE Compression of Huffman Weights</name>
            <t> In this case, the series of
						Huffman weights is compressed
						using FSE compression. It is a
						single bitstream with two
						interleaved states, sharing a
						single distribution table. </t>
            <t> To decode an FSE bitstream, it
						is necessary to know its
						compressed size. Compressed
						size is provided by headerByte.
						It's also necessary to know its
						maximum possible decompressed
						size, which is 255, since
						literal values span from 0 to
						255, and the last symbol's
						Weight is not represented. </t>
            <t> An FSE bitstream starts with
						a header describing the
						probability distribution, from
						which a decoding table is
						created.
						For a list of Huffman weights,
						the maximum accuracy log is 6
						bits. For more details, see
						<xref target="comp_fse_table" format="default"/>.
            </t>
            <t> The Huffman header compression
						uses two states, which share
						the same FSE distribution
						table.

The first state (State1)
						encodes the even-numbered index
						symbols, and the second
						(State2) encodes the odd-numbered
						index symbols. State1 is initialized
						first, and then State2, and
						they take turns decoding a
						single symbol and updating
						their state.

						For more details
						on these FSE operations, see
						<xref target="comp_fse" format="default"/>. </t>
            <t> The number of symbols to be
						decoded is determined by
						tracking the bitstream overflow
						condition: if updating state
						after decoding a symbol would
						require more bits than remain
						in the stream, it is assumed
						that extra bits are zero. Then,
						symbols for each of the
						final states are decoded and
						the process is complete.</t>
          </section>
          <section anchor="huffman_tree_conv" numbered="true" toc="default">
            <name>Conversion from Weights to Huffman Prefix Codes</name>
            <t> All present symbols will now
						have a Weight value. It is
						possible to transform weights
						into Number_of_Bits, using
						this formula:
            </t>

<sourcecode type="pseudocode">
  if Weight > 0
      Number_of_Bits = Max_Number_of_Bits + 1 - Weight
  else
      Number_of_Bits = 0
</sourcecode>
            <t> Symbols are sorted by Weight.
						Within the same Weight, symbols
						keep natural sequential
						order. Symbols
						with a Weight of zero are
						removed. Then, starting from
						the
						lowest Weight, prefix codes
						are distributed in sequential
						order. </t>
            <t> For example, assume the
						following list of weights
						has been decoded:
            </t>

<table anchor="decoded-weights">
  <name>Decoded Weights</name>
  <thead>
    <tr>
      <th align="center">Literal</th>
      <th align="center">Weight</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">4</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">3</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">2</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">1</td>
    </tr>
  </tbody>
</table>
            <t> Sorting by weight and then
						the natural sequential order
						yields the following
						distribution:
            </t>

<table anchor="sorting-by-weight">
  <name>Sorting by Weight</name>
  <thead>
    <tr>
      <th align="center">Literal</th>
      <th align="center">Weight</th>
      <th align="center">Number_of_Bits</th>
      <th align="center">Prefix Codes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">3</td>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">N/A</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">0000</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">0001</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">2</td>
      <td align="center">3</td>
      <td align="center">001</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">3</td>
      <td align="center">2</td>
      <td align="center">01</td>
    </tr>
    <tr>
      <td align="center">0</td>
      <td align="center">4</td>
      <td align="center">1</td>
      <td align="center">1</td>
    </tr>
  </tbody>
</table>
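            <t> The distribution of prefix codes shown above can be
						sketched in Python as follows.
						This is a non-normative
						illustration; the function name
						assign_prefix_codes and the
						representation of codes as bit
						strings are assumptions of the
						example. </t>
<sourcecode type="python"><![CDATA[
def assign_prefix_codes(weights, max_number_of_bits):
    """Map each present symbol to its prefix code as a bit string.

    weights -- Weight per literal value, including the deduced last one
    Hypothetical helper for illustration.
    """
    # Sort by Weight, then by natural order; zero weights are removed.
    order = sorted((w, s) for s, w in enumerate(weights) if w > 0)
    codes = {}
    position = 0  # counts in units of the longest code
    for weight, symbol in order:
        number_of_bits = max_number_of_bits + 1 - weight
        shift = max_number_of_bits - number_of_bits
        codes[symbol] = format(position >> shift,
                               "0{}b".format(number_of_bits))
        # A shorter code covers 2^shift codes of the longest length.
        position += 1 << shift
    return codes
]]></sourcecode>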

          </section>
        </section>
        <section anchor="huffman_coded_streams" numbered="true" toc="default">
          <name>Huffman-Coded Streams</name>
          <t> Given a Huffman decoding table, it is
					possible to decode a Huffman-coded
					stream. </t>
          <t> Each bitstream must be read backward,
					starting from the end and going up to
					the beginning. Therefore, it is
					necessary to know the size of each
					bitstream. </t>
          <t> It is also necessary to know exactly
					which bit is the last. This is
					detected by a final-bit-flag: the
					highest set bit of the last byte.
					Consequently, a last
					byte of 0 is not possible. The
					final-bit-flag itself is not part of
					the useful bitstream. Hence, the last
					byte contains between 0 and 7 useful
					bits. </t>
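          <t> Locating the final-bit-flag might look like the
					following Python sketch. It is
					non-normative; the function name
					useful_bit_count is an assumption of the
					example. </t>
<sourcecode type="python"><![CDATA[
def useful_bit_count(stream):
    """Return the number of useful bits in a backward-read bitstream.

    Hypothetical helper; `stream` is the complete bitstream as bytes.
    """
    if not stream:
        raise ValueError("empty bitstream")
    last = stream[-1]
    if last == 0:
        raise ValueError("last byte of 0 is not possible")
    # Bits below the highest set bit of the last byte are useful;
    # the flag itself and the zero padding above it are not.
    useful_in_last = last.bit_length() - 1
    return 8 * (len(stream) - 1) + useful_in_last
]]></sourcecode>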
          <t> Starting from the end, it is possible
					to read the bitstream in a
					little-endian fashion, keeping track
					of already used bits. Since the
					bitstream is encoded in reverse order,
					starting from the end, read symbols in
					forward order. </t>
          <t> For example, if the literal sequence
					"0145" was encoded using the above prefix
					code, it would be encoded (in reverse
					order) as:
          </t>

<table anchor="coded-example">
  <name>Literal Sequence "0145"</name>
  <thead>
    <tr>
      <th align="center">Symbol</th>
      <th align="center">Encoding</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">5</td>
      <td align="center">0000</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">0001</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">01</td>
    </tr>
    <tr>
      <td align="center">0</td>
      <td align="center">1</td>
    </tr>
    <tr>
      <td align="center">Padding</td>
      <td align="center">00001</td>
    </tr>
  </tbody>
</table>

          <t> This results in the following 2-byte
					bitstream:
          </t>
          <artwork name="" type="" align="left" alt=""><![CDATA[
  00010000 00001101
                                        ]]></artwork>
          <t> Here is an alternative representation
					with the symbol codes separated by
					underscores:
          </t>
          <artwork name="" type="" align="left" alt=""><![CDATA[
  0001_0000 00001_1_01
                                        ]]></artwork>
          <t> Reading the highest Max_Number_of_Bits
					bits, it's possible to compare the
					extracted value to the decoding table,
					determining the symbol to decode and
					number of bits to discard. </t>
          <t> The process continues reading up to
					the required number of symbols per
					stream. If a bitstream is not entirely
					and exactly consumed, hence reaching
					exactly its beginning position with
					all bits consumed, the decoding process
					is considered faulty. </t>
        </section>
      </section>
    </section>
    <section anchor="comp_dict" numbered="true" toc="default">
      <name>Dictionary Format</name>
      <t> Zstandard is compatible with "raw content"
			    dictionaries, free of any format restriction,
			    except that they must be at least 8 bytes.
			    These dictionaries function as if they were just
			    the content part of a formatted dictionary. </t>
      <t> However, dictionaries created by "zstd --train"
			    in the reference implementation follow a specific
			    format, described here. </t>
      <t> Dictionaries are not included in the compressed
			    content but rather are provided out of band.
			    That is, the Dictionary_ID identifies which should
			    be used, but this specification does not describe
			    the mechanism by which the dictionary is obtained
			    prior to use during compression or
			    decompression. </t>
      <t> A dictionary has a size, defined either by a
			    buffer limit or a file size.  The general format
			    is:
      </t>

<table anchor="dictionary">
  <name>Dictionary General Format</name>
  <thead>
    <tr>
      <th>Magic_Number</th>
      <th>Dictionary_ID</th>
      <th>Entropy_Tables</th>
      <th>Content</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
  </tbody>
</table>

      <dl newline="false" spacing="normal">
        <dt>Magic_Number:</dt>
        <dd> 4-byte ID,
					value 0xEC30A437, stored in
					little-endian format. </dd>
        <dt>Dictionary_ID:</dt>
        <dd>
          <t> 4 bytes, stored
					in little-endian format. Dictionary_ID
					can be any value, except 0 (which
					means no Dictionary_ID). It is used by
					decoders to check if they use the
					correct dictionary. If the frame is
					going to be distributed in a private
					environment, any Dictionary_ID can be
					used. However, for public distribution
					of compressed frames, the following
					ranges are reserved and shall not be
					used:
          </t>
          <dl newline="false" spacing="normal">
            <dt/>
            <dd>low range: &lt;= 32767</dd>
            <dt/>
            <dd>high range: &gt;= (2<sup>31</sup>)</dd>
          </dl>
        </dd>
        <dt>Entropy_Tables:</dt>
        <dd> These tables follow the
					same format as the tables in
					compressed blocks. See the relevant
					FSE and Huffman sections for how to
					decode these tables. They are stored
					in the following order: Huffman table for
					literals, FSE table for offsets, FSE
					table for match lengths, and FSE table
					for literals lengths. These tables
					populate the Repeat Stats literals
					mode and Repeat distribution mode for
					sequence decoding. The tables are
					followed by 3 offset values,
					populating repeat offsets (instead of
					using {1,4,8}), stored in order,
					4 bytes little-endian each, for a
					total of 12 bytes. Each repeat offset
					must have a value less than the
					dictionary size. </dd>
        <dt>Content:</dt>
        <dd> The rest of the
					dictionary is its content. The content
					acts as a "past" in front of data to be
					compressed or decompressed, so it can be
					referenced in sequence commands. As
					long as the amount of data decoded
					from this frame is less than or equal
					to Window_Size, sequence commands may
					specify offsets longer than the total
					length of decoded output so far to
					reference back to the dictionary,
					even parts of the dictionary with
					offsets larger than Window_Size.
					After the total output has surpassed
					Window_Size, however, this is no longer
					allowed, and the dictionary is no
					longer accessible. </dd>
      </dl>
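      <t> As an illustration only, the following Python sketch shows one
			    way the first 8 bytes of a dictionary might be inspected.
			    The function names are hypothetical and are not taken from
			    the reference implementation. The sketch assumes that a
			    buffer lacking the magic number is to be treated as a
			    "raw content" dictionary. </t>
      <sourcecode type="python"><![CDATA[
import struct

DICT_MAGIC = 0xEC30A437  # Magic_Number of a formatted dictionary


def read_dictionary_id(buf):
    """Return the Dictionary_ID of a formatted dictionary.

    Returns None when the magic number is absent, in which case this
    sketch assumes the buffer is a "raw content" dictionary.
    """
    if len(buf) < 8:
        raise ValueError("a dictionary must be at least 8 bytes")
    # Both fields are 4 bytes, stored in little-endian format.
    magic, dict_id = struct.unpack_from("<II", buf, 0)
    if magic != DICT_MAGIC:
        return None
    return dict_id


def is_reserved_id(dict_id):
    """True if dict_id falls in a range reserved for public use."""
    return dict_id <= 32767 or dict_id >= 2**31
]]></sourcecode>
      <t> For example, a dictionary whose first 8 bytes are
			    37 A4 30 EC 40 9C 00 00 carries Dictionary_ID 40000,
			    which lies outside both reserved ranges. </t>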
    </section>
    <section anchor="dict_future" numbered="true" toc="default">
      <name>Use of Dictionaries</name>
      <t> Provisioning for use of dictionaries with zstd is being
		    explored.  See, for example, <xref target="I-D.handte-httpbis-dict-sec" format="default"/>.
		    The likely outcome will be a registry of well-tested
		    dictionaries optimized for different use cases and
		    identifiers for each, possibly with a private negotiation
		    mechanism for use of unregistered dictionaries. </t>
      <t> To ensure compatibility with the
		    future specification of use of dictionaries with zstd
		    payloads, especially with MIME, content encoded with the
		    media type registered here should not use a dictionary.
		    The exception to this requirement might be a private
		    dictionary negotiation, suggested above, which is not part
		    of this specification. </t>
    </section>
    <section anchor="iana" numbered="true" toc="default">
      <name>IANA Considerations</name>
      <t> IANA has updated two previously existing registrations and
                    made one new registration as described below. </t>
      <section anchor="iana_media_type" numbered="true" toc="default">
        <name>The 'application/zstd' Media Type</name>
        <t> The 'application/zstd' media type identifies a
			    block of data that is compressed using zstd
			    compression.  The data is a stream of bytes as
			    described in this document.  IANA has
			    added the following to the "Media Types"
			    registry:

        </t>
        <dl>
          <dt>Type name:</dt>
          <dd>application</dd>
          <dt>Subtype name:</dt>
          <dd>zstd</dd>
          <dt>Required parameters:</dt>
          <dd>N/A</dd>
          <dt>Optional parameters:</dt>
          <dd>N/A</dd>
          <dt>Encoding considerations:</dt>
          <dd>
            binary
          </dd>
          <dt>Security considerations:</dt>
          <dd>
            See <xref target="security" format="default"/> of
            RFC 8878.
          </dd>
          <dt>Interoperability considerations:</dt>
          <dd>
            N/A
          </dd>
          <dt>Published specification:</dt>
          <dd>
            RFC 8878
          </dd>
          <dt>Applications which use this media type:</dt>
          <dd>
            anywhere data size is an issue
          </dd>
          <dt>Fragment identifier considerations:</dt>
          <dd>
            No fragment identifiers are defined
            for this type.
          </dd>
          <dt>Additional information:</dt>
          <dd>
            <t><br/></t>
	     <dl spacing="compact">
             <dt>Deprecated alias names for this type:</dt>
              <dd>
		N/A
	      </dd>
              <dt>Magic number(s):</dt>
              <dd>
                4 bytes, little-endian format. Value:&nbsp;0xFD2FB528
              </dd>
<!--note: changed 'zstd' to 'zst' per author request-->
              <dt>File extension(s):</dt>
              <dd>
                zst
              </dd>
              <dt>Macintosh file type code(s):</dt>
              <dd>
                N/A
              </dd>
	     </dl>
	  </dd>

          <dt>Person &amp; email address to contact for further information:</dt>
          <dd>Yann Collet &lt;cyan@fb.com&gt;</dd>
          <dt>Intended usage:</dt>
          <dd>
            common
          </dd>
          <dt>Restrictions on usage:</dt>
          <dd>
            N/A
          </dd>
          <dt>Author:</dt>
          <dd>
            Murray S.&nbsp;Kucherawy
          </dd>
          <dt>Change Controller:</dt>
          <dd>
            IETF
          </dd>
          <!--note: changed 'yes' to 'no' per author request-->
          <dt>Provisional registration:</dt>
          <dd>
            no
          </dd>
          <dt>For further information:</dt>
          <dd>
					See <xref target="ZSTD" format="default"/>
          </dd>
        </dl>
      </section>
      <section anchor="iana_content_encoding" numbered="true" toc="default">
        <name>Content Encoding</name>
        <t> IANA has added the following entry
			    to the "HTTP Content Coding Registry"
			    within the "Hypertext Transfer Protocol (HTTP)
			    Parameters" registry:

        </t>
        <dl newline="false" spacing="normal">
          <dt>Name:</dt>
          <dd> zstd </dd>
          <dt>Description:</dt>
          <dd> A stream of bytes
					compressed using the Zstandard
					protocol </dd>
          <dt>Reference:</dt>
          <dd>
	  RFC 8878</dd>
        </dl>
      </section>
      <section anchor="iana_suffix" numbered="true" toc="default">
        <name>Structured Syntax Suffix</name>
        <t> IANA has registered the following
			    into the "Structured Syntax Suffix
			    Registry":
<!--[rfced] FYI - we will query IANA about "Registry" appearing as
    part of the title for "Structured Syntax Suffix Registry".-->
        </t>
        <dl newline="false" spacing="normal">
          <dt>Name:</dt>
          <dd> Zstandard </dd>
          <dt>+suffix:</dt>
          <dd> +zstd </dd>
          <dt>Encoding Considerations:</dt>
          <dd>
            binary </dd>
          <dt>Interoperability Considerations:</dt>
          <dd>
            N/A </dd>
          <dt>Fragment Identifier Considerations:</dt>
          <dd>
            The syntax and semantics of fragment
            identifiers specified for +zstd should
            be as specified for 'application/zstd'.
          </dd>

	  <!--[rfced] FYI - We will ask IANA to add single quotes
	      around the media type name listed in the "Structured
	      Syntax Suffix Registry". -->

          <dt>Security Considerations:</dt>
          <dd>
          See <xref target="security" format="default"/> of
          RFC 8878. </dd>
          <dt>Contact:</dt>
          <dd>
          Refer to the author for the
          'application/zstd' media type. </dd>
          <dt>Author/Change Controller:</dt>
          <dd>
          IETF </dd>
        </dl>
      </section>
      <section anchor="iana_dict" numbered="true" toc="default">
        <name>Dictionaries</name>
        <t> Work in progress includes
			    development of dictionaries that will optimize
			    compression and decompression of particular
			    types of data.  Specification of such
			    dictionaries for public use will necessitate
			    registration of a code point from the reserved
			    range described in
			    <xref target="comp_dictionary_id" format="default"/> and its
			    association with a specific dictionary. </t>
        <t> At present, there are no such dictionaries
			    published for public use, so this document
			    has made
			    no immediate request of IANA to create such a
			    registry. </t>
      </section>
    </section>
    <section anchor="security" numbered="true" toc="default">
      <name>Security Considerations</name>

      <t> Any data-compression method involves the reduction of
		    redundancy in the data.  Zstandard is no exception,
		    and the usual precautions apply. </t>
      <t> One should never compress a message whose
		    content must remain secret with a message generated by
		    a third party.  Such a compression can be used to guess the
		    content of the secret message through analysis of
		    entropy reduction.

                    This was demonstrated in the Compression Ratio
   Info-leak Made Easy (CRIME) attack <xref target="CRIME" format="default"/>, for example. </t>
      <t> A decoder has to demonstrate capabilities to detect
		    and prevent any kind of data tampering in the compressed
		    frame from triggering system faults, such as reading or
		    writing beyond allowed memory ranges.  This can be
		    guaranteed by either the implementation language
		    or careful bounds checking.  Of particular note is the
		    encoding of Number_of_Sequences values that cause the
		    decoder to read into the block header (and beyond), as
		    well as the indication of a Frame_Content_Size that is
		    smaller than the actual decompressed data, in an attempt
		    to trigger a buffer overflow.  It is highly recommended
		    to fuzz-test (i.e., provide invalid, unexpected, or
		    random input and verify safe operation of) decoder
		    implementations to test and harden their capability to
		    detect bad frames and deal with them without any adverse
		    system side effect. </t>
      <t> An attacker may provide correctly formed compressed frames
		    with unreasonable memory requirements.  A decoder must
		    always control memory requirements and enforce some
		    (system-specific) limits in order to protect memory usage
		    from such scenarios. </t>
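      <t> The two preceding paragraphs can be illustrated with a minimal
		    Python sketch of an output buffer that enforces both a
		    local memory limit and the size announced by the frame.
		    The class name, its parameters, and the choice of exceptions
		    are hypothetical; an actual decoder will apply its own
		    system-specific limits, and this sketch does not describe
		    the reference implementation. </t>
      <sourcecode type="python"><![CDATA[
class BoundedOutput:
    """Output buffer that refuses to grow past the declared size."""

    def __init__(self, declared_size, memory_limit):
        # Reject frames whose declared size exceeds the local limit
        # before any memory is committed to them.
        if declared_size > memory_limit:
            raise MemoryError("declared size exceeds configured limit")
        self.declared_size = declared_size
        self.data = bytearray()

    def write(self, chunk):
        # Refuse output beyond what the frame header announced, e.g.,
        # when Frame_Content_Size understates the decompressed data.
        if len(self.data) + len(chunk) > self.declared_size:
            raise ValueError("output exceeds Frame_Content_Size")
        self.data += chunk
]]></sourcecode>
      <t> In this sketch, a frame that understates its content size fails
		    with an error at the first excess byte rather than writing
		    past the space that was planned for it. </t>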
      <t> Compression can be optimized by training a dictionary
		    on a variety of related content payloads.  This dictionary
		    must then be available at the decoder for decompression
		    of the payload to be possible.  While this document does
		    not specify how to acquire a dictionary for a given
		    compressed payload, it is worth noting that third-party
		    dictionaries may interact unexpectedly with a decoder,
		    leading to possible memory or other resource-exhaustion
		    attacks.  We expect such topics to be discussed in further
		    detail in the Security Considerations section of a
		    forthcoming RFC for dictionary acquisition and
		    transmission, but highlight this issue now out of an
		    abundance of caution. </t>
      <t> As discussed in <xref target="comp_skippable" format="default"/>, it is
		    possible to store arbitrary user metadata in skippable
		    frames.  While such frames are ignored during decompression
		    of the data, they can be used as a watermark to track
		    the path of the compressed payload.  </t>
    </section>

	<section anchor="impl" numbered="true" toc="default">
		<name>Implementation Status</name>
		<t> Source code for a C language implementation of a
		    Zstandard-compliant library is available at
		    <xref target="ZSTD-GITHUB"/>.  This implementation is
		    considered to be the reference implementation and is
		    production ready; it implements the full range of the
		    specification.  It is routinely tested against security
		    hazards and widely deployed within Facebook
		    infrastructure. </t>

		<t> The reference version is optimized for speed and is highly
		    portable.  It has been proven to run safely on multiple
		    architectures (e.g., x86, x64, ARM, MIPS, PowerPC, IA64)
		    featuring 32- or 64-bit addressing schemes, a little- or
		    big-endian storage scheme, a number of different operating
		    systems (e.g., UNIX (including Linux, BSD, OS-X, and
		    Solaris) and Windows), and a number of compilers (e.g.,
		    gcc, clang, visual, and icc). </t>

		<t> A comprehensive and current list of known implementations
		    can be found at <xref target="ZSTD"/>. </t>
	</section>

  </middle>
  <back>

<displayreference target="I-D.handte-httpbis-dict-sec" to="DICT-SEC"/>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>

        <reference anchor="ZSTD" target="http://www.zstd.net">
          <front>
            <title> Zstandard
            </title>
            <author fullname="Yann Collet"/>
            <date year="2017"/>
          </front>
        </reference>
      </references>


      <references>
        <name>Informative References</name>

        <reference anchor="ANS" target="https://arxiv.org/pdf/1311.2540">
          <front>
            <title> Asymmetric numeral systems: entropy
					coding combining speed of Huffman
					coding with compression rate of
					arithmetic coding
            </title>
            <author initials="J" surname="Duda" fullname="Jarek Duda"/>
            <date month="January" year="2014"/>
          </front>
        </reference>


<!-- [rfced] [DICT-SEC] I-D.handte-httpbis-dict-sec; IESG state Expired  -->

<xi:include
    href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.handte-httpbis-dict-sec.xml"/>

<!-- Reference [LZ4] The URL below is correct. Also found
     https://android.googlesource.com/platform/external/lz4/+/HEAD/doc/lz4_Frame_format.md
-->

        <reference anchor="LZ4" target="https://github.com/lz4/lz4/blob/master/doc/lz4_Frame_format.md">
          <front>
            <title>LZ4 Frame Format Description</title>
            <author fullname="Yann Collet"/>
            <date month="January" year="2019"/>
          </front>
      <refcontent>commit ec735ac</refcontent>
        </reference>

        <reference anchor="FSE" target="https://github.com/Cyan4973/FiniteStateEntropy/">
          <front>
            <title> FiniteStateEntropy
            </title>
            <author fullname="Yann Collet"/>
            <date month="July" year="2020"/>
          </front>
         <refcontent>commit 12a533a</refcontent>
        </reference>

        <reference anchor="CRIME" target="https://en.wikipedia.org/w/index.php?title=CRIME&amp;oldid=844538656">
          <front>
            <title>CRIME
            </title>
            <author/>
            <date month="June" year="2018"/>
          </front>
        </reference>

        <reference anchor="ZSTD-GITHUB" target="https://github.com/facebook/zstd">
          <front>
            <title>Zstandard - Fast real-time compression algorithm
            </title>
            <author fullname="Yann Collet"/>
            <date month="September" year="2020"/>
          </front>
           <refcontent>commit bcedab0</refcontent>
        </reference>

        <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.1952.xml"/>

        <reference anchor="XXHASH" target="http://www.xxhash.org">
          <front>
            <title>xxHash
            </title>
            <author fullname="(unknown author)"/>
            <date year="2017"/>
          </front>
        </reference>

<reference anchor="Err5786" target="https://www.rfc-editor.org/errata/eid5786">
<front>
<title>Erratum ID 5786</title>
<author><organization>RFC Errata</organization></author>
</front>
<refcontent>RFC 8478</refcontent>
</reference>

      </references>
    </references>
    <section anchor="app_tables" numbered="true" toc="default">
      <name>Decoding Tables for Predefined Codes</name>
      <t> This appendix contains FSE decoding tables for the
		    predefined literals length, match length, and offset codes.

		    The tables have been constructed using the algorithm as
		    given above in <xref target="comp_fse_table" format="default"/>. The tables here can be used as examples
		    to crosscheck that an implementation has built its decoding
		    tables correctly. </t>
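      <t> One possible crosscheck is sketched below in Python, using
		    hypothetical helper names. The rows are copied from the
		    Literals Length Code table for symbols 0 and 1. The sketch
		    relies on the FSE decoding rule that the next state is Base
		    plus the value of the Number_Of_Bits bits just read, and it
		    verifies that the rows of a given symbol together lead to
		    each of the 64 states exactly once. </t>
      <sourcecode type="python"><![CDATA[
TABLE_SIZE = 64  # states 0..63 in the Literals Length Code table

# (state, symbol, number_of_bits, base) for symbols 0 and 1.
ROWS = [
    (0, 0, 4, 0), (1, 0, 4, 16), (22, 0, 4, 32), (43, 0, 4, 48),
    (2, 1, 5, 32), (23, 1, 4, 0), (44, 1, 4, 16),
]


def next_state(row, bits_value):
    """Next state = Base + value of the Number_Of_Bits bits read."""
    _state, _symbol, number_of_bits, base = row
    if not 0 <= bits_value < (1 << number_of_bits):
        raise ValueError("value does not fit in Number_Of_Bits")
    return base + bits_value


def reaches_every_state(rows, symbol):
    """True if the rows for `symbol` lead to each state exactly once."""
    reached = []
    for row in rows:
        if row[1] == symbol:
            reached.extend(next_state(row, v)
                           for v in range(1 << row[2]))
    return sorted(reached) == list(range(TABLE_SIZE))
]]></sourcecode>
      <t> An implementation could run the same check over every symbol of
		    the tables it has built; a row with a wrong Number_Of_Bits
		    or Base would typically leave a gap or an overlap in the
		    reachable states. </t>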
      <section anchor="app_tables_literal" numbered="true" toc="default">
        <name>Literals Length Code Table</name>

<table anchor="lit-length-code">
  <name>Literals Length Code</name>
  <thead>
    <tr>
      <th>State</th>
      <th>Symbol</th>
      <th>Number_Of_Bits</th>
      <th>Base</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">0</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">1</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">3</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">6</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">6</td>
      <td align="center">7</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">7</td>
      <td align="center">9</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">8</td>
      <td align="center">10</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">9</td>
      <td align="center">12</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">10</td>
      <td align="center">14</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">11</td>
      <td align="center">16</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">12</td>
      <td align="center">18</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">13</td>
      <td align="center">19</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">14</td>
      <td align="center">21</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">15</td>
      <td align="center">22</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">16</td>
      <td align="center">24</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">17</td>
      <td align="center">25</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">18</td>
      <td align="center">26</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">19</td>
      <td align="center">27</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">20</td>
      <td align="center">29</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">21</td>
      <td align="center">31</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">22</td>
      <td align="center">0</td>
      <td align="center">4</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">23</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">24</td>
      <td align="center">2</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">25</td>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">26</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">27</td>
      <td align="center">7</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">28</td>
      <td align="center">8</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">29</td>
      <td align="center">10</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">30</td>
      <td align="center">11</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">31</td>
      <td align="center">13</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">32</td>
      <td align="center">16</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">33</td>
      <td align="center">17</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">34</td>
      <td align="center">19</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">35</td>
      <td align="center">20</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">36</td>
      <td align="center">22</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">37</td>
      <td align="center">23</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">38</td>
      <td align="center">25</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">39</td>
      <td align="center">25</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">40</td>
      <td align="center">26</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">41</td>
      <td align="center">28</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">42</td>
      <td align="center">30</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">43</td>
      <td align="center">0</td>
      <td align="center">4</td>
      <td align="center">48</td>
    </tr>
    <tr>
      <td align="center">44</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">45</td>
      <td align="center">2</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">46</td>
      <td align="center">3</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">47</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">48</td>
      <td align="center">6</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">49</td>
      <td align="center">8</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">50</td>
      <td align="center">9</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">51</td>
      <td align="center">11</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">52</td>
      <td align="center">12</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">53</td>
      <td align="center">15</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">54</td>
      <td align="center">17</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">55</td>
      <td align="center">18</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">56</td>
      <td align="center">20</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">57</td>
      <td align="center">21</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">58</td>
      <td align="center">23</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">59</td>
      <td align="center">24</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">60</td>
      <td align="center">35</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">61</td>
      <td align="center">34</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">62</td>
      <td align="center">33</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">63</td>
      <td align="center">32</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>

  </tbody>
</table>

       </section>
       <section anchor="app_tables_match" numbered="true" toc="default">
	 <name>Match Length Code Table</name>

<table anchor="match-length">
  <name>Match Length Code Table</name>
  <thead>
    <tr>
      <th>State</th>
      <th>Symbol</th>
      <th>Number_Of_Bits</th>
      <th>Base</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">2</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">3</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">6</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">6</td>
      <td align="center">8</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">7</td>
      <td align="center">10</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">8</td>
      <td align="center">13</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">9</td>
      <td align="center">16</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">10</td>
      <td align="center">19</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">11</td>
      <td align="center">22</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">12</td>
      <td align="center">25</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">13</td>
      <td align="center">28</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">14</td>
      <td align="center">31</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">15</td>
      <td align="center">33</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">16</td>
      <td align="center">35</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">17</td>
      <td align="center">37</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">18</td>
      <td align="center">39</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">19</td>
      <td align="center">41</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">20</td>
      <td align="center">43</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">21</td>
      <td align="center">45</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">22</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">23</td>
      <td align="center">2</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">24</td>
      <td align="center">3</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">25</td>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">26</td>
      <td align="center">6</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">27</td>
      <td align="center">7</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">28</td>
      <td align="center">9</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">29</td>
      <td align="center">12</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">30</td>
      <td align="center">15</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">31</td>
      <td align="center">18</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">32</td>
      <td align="center">21</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">33</td>
      <td align="center">24</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">34</td>
      <td align="center">27</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">35</td>
      <td align="center">30</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">36</td>
      <td align="center">32</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">37</td>
      <td align="center">34</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">38</td>
      <td align="center">36</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">39</td>
      <td align="center">38</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">40</td>
      <td align="center">40</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">41</td>
      <td align="center">42</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">42</td>
      <td align="center">44</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">43</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">44</td>
      <td align="center">1</td>
      <td align="center">4</td>
      <td align="center">48</td>
    </tr>
    <tr>
      <td align="center">45</td>
      <td align="center">2</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">46</td>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">47</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">48</td>
      <td align="center">7</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">49</td>
      <td align="center">8</td>
      <td align="center">5</td>
      <td align="center">32</td>
    </tr>
    <tr>
      <td align="center">50</td>
      <td align="center">11</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">51</td>
      <td align="center">14</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">52</td>
      <td align="center">17</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">53</td>
      <td align="center">20</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">54</td>
      <td align="center">23</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">55</td>
      <td align="center">26</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">56</td>
      <td align="center">29</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">57</td>
      <td align="center">52</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">58</td>
      <td align="center">51</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">59</td>
      <td align="center">50</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">60</td>
      <td align="center">49</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">61</td>
      <td align="center">48</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">62</td>
      <td align="center">47</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">63</td>
      <td align="center">46</td>
      <td align="center">6</td>
      <td align="center">0</td>
    </tr>

  </tbody>
</table>

      </section>
      <section anchor="app_tables_offset" numbered="true" toc="default">
        <name>Offset Code Table</name>

<table anchor="offset-code">
  <name>Offset Code</name>
  <thead>
    <tr>
      <th>State</th>
      <th>Symbol</th>
      <th>Number_Of_Bits</th>
      <th>Base</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">0</td>
      <td align="center">0</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">1</td>
      <td align="center">6</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">2</td>
      <td align="center">9</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">3</td>
      <td align="center">15</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">4</td>
      <td align="center">21</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">5</td>
      <td align="center">3</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">6</td>
      <td align="center">7</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">7</td>
      <td align="center">12</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">8</td>
      <td align="center">18</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">9</td>
      <td align="center">23</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">10</td>
      <td align="center">5</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">11</td>
      <td align="center">8</td>
      <td align="center">4</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">12</td>
      <td align="center">14</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">13</td>
      <td align="center">20</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">14</td>
      <td align="center">2</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">15</td>
      <td align="center">7</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">16</td>
      <td align="center">11</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">17</td>
      <td align="center">17</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">18</td>
      <td align="center">22</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">19</td>
      <td align="center">4</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">20</td>
      <td align="center">8</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">21</td>
      <td align="center">13</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">22</td>
      <td align="center">19</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">23</td>
      <td align="center">1</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">24</td>
      <td align="center">6</td>
      <td align="center">4</td>
      <td align="center">16</td>
    </tr>
    <tr>
      <td align="center">25</td>
      <td align="center">10</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">26</td>
      <td align="center">16</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">27</td>
      <td align="center">28</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">28</td>
      <td align="center">27</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">29</td>
      <td align="center">26</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">30</td>
      <td align="center">25</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>
    <tr>
      <td align="center">31</td>
      <td align="center">24</td>
      <td align="center">5</td>
      <td align="center">0</td>
    </tr>

  </tbody>
</table>

      </section>
    </section>
    <section anchor="changes" numbered="true" toc="default">
      <name>Changes since RFC 8478</name>
      <t>The following are the changes in this document relative
      to RFC 8478:</t>

      <ul spacing="normal">
        <li>Applied erratum <xref target="Err5786" format="default"/>.</li>
        <li>Clarified forward compatibility regarding dictionaries.</li>
        <li>Clarified application of Block_Maximum_Size.</li>
        <li>Added structured media type suffix registration.</li>
        <li>Clarified that the content checksum is always 4 bytes.</li>
        <li>Clarified handling of reserved and corrupt inputs.</li>
        <li>Added fragment identifier considerations to the media type
        registration.</li>
      </ul>
    </section>

    <section anchor="ack" numbered="false" toc="default">
      <name>Acknowledgments</name>
      <t>zstd was developed by Yann Collet.</t>
      <t><contact fullname="Felix Handte"/> and
      <contact fullname="Nick Terrell"/> provided feedback that went into
      this revision and RFC 8478. RFC 8478 also received contributions from
      <contact fullname="Bobo Bose-Kolanu"/>,
      <contact fullname="Kyle Nekritz"/>, and
      <contact fullname="David Schleimer"/>.</t>
    </section>
  </back>
      </rfc>