Home

Awesome

THIS TEXT IS A STUB, WRITTEN AS IF IT'S ALREADY REAL, BUT IT WANTS YOUR OPINIONS AND EDITS. IT IS IN NO WAY FINAL, BUT RATHER A FIRST DRAFT AT SOMETHING TO GET THE THINKY BITS OF OUR OPENTYPE BRAINS GOING. CFF IS FANTASTIC, AND THERE IS A LOT OF SIMPLIFICATION POSSIBLE FOR CFF THAT IS EXCLUSIVELY INTENDED FOR OPENTYPE WRAPPING. WHAT DOES A SIMPLIFICATION LIKE THAT LOOK LIKE?

The CFF Table

OpenType supports embedding of CFF data blocks as defined in Adobe's Tech Note 5176 ("The Compact Font Format Specification"). For more information on this specification, please consult this Tech Note.

In addition to generic CFF, OpenType supports simplified CFF for font data that has been specifically intended to be used inside an OpenType wrapper. CFF data that is not intended to ever be used outside of OpenType context uses the four byte OpenType magic number instead of the four byte CFF Header. This simplification differs from the full CFF specification in the following way:

CFF structureRole in CFFRole in OpenType-specific CFF
Initial four bytesCFF headerOpenType magic number
Name INDEXFont names for all fonts in the CFFNot used
Top DICT INDEXDICT information that applies to all fonts in the CFFOffsets to the rest of the CFF data
String INDEXAll strings needed for making the CFF human-readableNot used
Global Subr INDEXGlobal subroutines for all fontsFont subroutines
EncodingsEncodings used by all fontsNot used
CharsetsCharacter sets used by all fontsNot used
FDSelectglyph-to-FontDICT table used by CID fontsNot used
CharStrings INDEXGlyph outlinesGlyph outlines
Font DICT INDEXFont DICT structures used by CID fontsNot used
Private DICTFont-specific DICT metadataNot used
Local Subr INDEXFont-specific subroutinesSecondary font subroutines
Copyrights and TrademarksNo longer usedNot used

Data Layout for an OpenType-specific CFF block

A CFF block intended purely for use inside an OpenType wrapped has the following data layout:

Data structurecomments
OpenType magic numberMust be the four bytes 0x5F 0x0F 0x3C 0xF5.
Top DICT INDEXEncodes not-covered-by-OpenType metadata, and contains the offsets to the CharStrings and Local Subroutines structures.
Global Subr INDEXMay be used, must be a single byte 0x00 if not used.
CharStrings INDEXContains the glyph outlines for this font. Must use the Type 2 CharString format.
Local Subr INDEXMay be used, must be a single byte 0x00 if not used.

Data Types

Like all other OpenType data, OT-CFF data uses Motorola-style byte ordering (Big Endian). As a compact data format, OT-CFF does not incorporate any alignment restrictions, and as such there are no padding bytes used anywhere inside the CFF block itself, although the block as table may be padded to meet the alignment requirements of the OpenType wrapper.

Data objects are specified by byte offsets that are relative to some reference point within the CFF data. These offsets are 1 to 4 bytes in length. This documentation uses the convention of enclosing the reference point in parentheses and uses a reference point of (0) to indicate an offset relative to the start of the CFF data and (self) to indicate an offset relative to the first byte of the data structure containing the offset.

CFF uses five primary data formats, with Type 2 Charstrings specifying an additional data format for operator and operand information. In addition to the established OpenType Data Types (OpenType spec section reference here?), the following additional types are used:

NameRangeDescription
Offsetvariable1, 2, 3, or 4 byte numeral representing a byte offset.
Offsize1-41 byte unsigned numeral that specifies the size of (local) Offset(s)
DICT datavariablesemantic bytecode representing either DICT keys or numerals in several encodings

DICT data

DICT data encodes key-value pair represented in a compact tokenized format that is similar to that used to represent Type 2 charstrings. DICT keys are encoded as 1 or 2 byte operators, with dictionary values encoded as numeric operands that represent either integer or real values. An operator is preceded by the operand(s) that specify its value. A DICT is encoded as the concatenation of operand(s)/operator bytes.

Representing numbers in a DICT

Five different numeric encodings are defined, for efficiently representing various number ranges as well as real (as opposed to intenger) numbers. The following table shows these encodings, using the convention that the first byte of the operand is b0, the second byte is b1, and so on. The table is ordered by the number of bytes that need to be read to decode the number:

Sizeb0 rangevalue rangevalue calculation
132-246-107 to 107b0 - 139
2247-250108 to 1131(b0-247) * 256 + b1 + 108
2251-254-1131 to -108-(b0-251) * 256 - b1 - 108
328–32768 to 32767b1<<8 + b2
529-(2^31) to (2^31)-1b1<<24 + b2<<16 + b3<<8 + b4

Note: The 1, 2, and 3 byte integer formats used here are identical to those used in Type 2 CharStrings.

Some examples of the integer formats are listed here.

ValueEncoding
08b
100ef
-10027
1000fa 7c
-1000fe 7c
100001c 27 10
-100001c d8 f0
1000001d 00 01 86 a0
-1000001d ff fe 79 60

The real (as opposed to integer) number operand begins with a byte value of 30 (0x1E) followed by a variable-length sequence of bytes. Each byte encodes two 4-bit nibbles, with the first nibble of a pair stored in the most significant 4 bits of a byte, and the second nibble of a pair stored in the least significant 4 bits of a byte. The interpretation of each nibble is as follows:

NibbleRepresents
0-9digit 0-9
Adecimal point
Bpositive exponent
Cnegative exponent
DNot used
Eminus sign
Fend of number

A real number is terminated by one (or two) 0xF nibbles so that it is always padded to a full byte.

Some examples of the real formats are listed here.

ValueEncoding
–2.251E for the operator, followed by E2 (-2) A2 (.2) 5F (5, end of number)
0.140541E–31E for the operator, followed by 0A (0.) 140541 (140541) C3 (E-3) FF (end of number)

Representing operators in a DICT

Operators and operands may be distinguished by inspection of their first byte: 0–21 specify operators and 28, 29, 30, and 32–254 specify operands (numbers). Byte values 22–27, 31, and 255 are reserved values in DICT structures (unlike in Type 2 CharStrings).

OT-CFF follows the same stack restrictions as the generic CFF specification, such that an operator may not be preceded by more than 48 operands.

Operators may have one or more operands with an implicit format, formally declared in this specification for each operator. Possible operand interpretations are:

TypeDescription
numberInteger or real number
booleanInteger type with the values 0 (false) and 1 (true)
arrayA list of one or more numbers
deltaA number, or delta-encoded array of numbers (see below)

The length of arrays is determined by counting the operands preceding the operator.

The second and subsequent numbers in a delta-encoded array are encoded as the difference between successive values. For example, an array a0, a1, ..., an will be encoded as: a0 (a1–a0) (a2–a1), ..., (an–a(n–1)).

DICT operators may either use 1 or 2 bytes. 2-byte operators always start with the escape byte 12 (0x0C).

Unlike generic CFF, OT-CFF does not use default values for unspecified DICT keys. Most values for which generic CFF has default values are fields that are already encoded by OpenType font in OpenType tables, and as such should not be duplicated inside the CFF DICT.

The list of DICT operators is as follows:

Operators that may be used in an OT-CFF Top DICT are:

Operator byte codeOperator nameDescriptionOperand format
10StdHW...number
11StdVW...number
12 12StemSnapH...number
12 13StemSnapV...number
12 19InitialRandomSeedSeed value for the pseudo-random number generator (used if any of the CharStrings involve random numbers)number (default=0)
17CharStringsOffset to the CharStrings INDEXoffset(0)
19SubrsOffset to the local subroutinesoffset(self)

The defaultWidthX and nominalWidthX operators supply width values for glyphs. If a glyph width equals the defaultWidthX value it can be omitted from the charstring, otherwise the glyph width is computed by adding the charstring width to nominalWidthX value. If nominalWidthX is carefully chosen the bulk of the widths in the charstrings can be reduced from 2-byte to single-byte numbers, thereby saving space.

I'VE OMITTED THE UniqueID FIELD, BECAUSE FOR AN OPENTYPE-WRAPPED CFF THIS VALUE HAS NO MEANING

FOR PURE OPENTYPE CFF, THE DEFAULT AND NOMINAL WIDTH VALUES ARE ALREADY COVERED BY PROPER OPENTYPE METRICS, AND SO HAVE BEEN OMITTED HERE

AS A HISTORICAL TYPE1 CONSTRUCT, I'VE TAKEN THE LIBERTY TO OMIT BLUE ZONE INFORMATION. IF, FOR SOME REASON, THIS SHOULD STILL BE PART OF A MODERN SPEC, THEN THEY WILL NEED EXPLICIT DESCRIPTIONS ON WHAT THEY'RE FOR AND WHY OPENTYPE DOESN'T COVER THIS, AS THEY ARE ABSOLUTELY NOT COVERED BY TECH NOTES 5176 AND 5177

INDEX data structures

An INDEX is an array of variable-sized objects. It comprises a header, an object data blob, and an offset array for locating entries inside the object data blob.

The offset array specifies offsets within the object data. An object is retrieved by indexing the offset array and fetching the object at the specified offset. The object’s length can be determined by subtracting its offset from the next offset in the offset array. An additional offset pointing to "the end of the array" is added at the end of the offset array so that the length of the last object may be determined by subtracting its offset value from this last offset. As such, An INDEX may be skipped by jumping to the offset specified by the last element of the offset array.

The INDEX structure is as follows:

typenamedescription
USHORTcountNumber of objects stored in the INDEX's data region
OffsizeoffSizeThe number of bytes used for offsets in the offset array.
Offset[count+1]offsetThe array of offsets into the object data. Values start at 1 (see note below).
BYTE[]dataObject data byte sequence

Offsets in the offset array are relative to the byte that precedes the object data. As such, the first element of the offset array always has value 1, not 0. This ensures that every object has a corresponding offset which is always nonzero and permits the efficient implementation of dynamic object loading.

An empty INDEX is represented by a count field with a value "0". In this case, no additional fields need to be specified. Thus, the total size of an empty INDEX is 2 bytes with sequence 00 00.

OT-CFF uses the OpenType magic number instead of a CFF Header

Where the generic CFF specification requires the CFF block to start with a four byte CFF header, the OT-CFF block must start with the four byte OpenType magic number 0x5F0F3CF5.

Note that if a simplified CFF block is interpreted as generic CFF (for instance, if something copies the block out without rewriting it to proper generic CFF format for embedding in PostScript or PDF documents), the OpenType magic number will be interpreted as a CFF Header structure, leading to the following invalid parse result:

FieldValue
Major version95
Minor version15
Header structure size in bytes60
Global number of bytes used fo offset values95

This should lead to an explicit parse error because the Offsize field may only contain values 1, 2, 3, or 4. Thus, a simplified OpenType-CFF block will not work outside of an OpenType context.

The Top DICT INDEX

The top-level dictionary INDEX is an INDEX structure following the byte layout explained in the secion on INDEX structures, and is used for describing several not-covered-by-OpenType values, as well as to locate the byte positions of the starts of the CharStrings INDEX (for glyph outlines) and local subroutines (which glyph outlines may rely on).

Top DICT data

See the section on Data Types for which values may be encoded in the Top DICT's object data area.

The Global Subr INDEX

This INDEX structure has an object data block that contains all (fragments of) Type 2 CharStrings that may be used as subroutines ("Subr") by glyphs defined in the CharStrings INDEX object data block.

If no global subroutines are used, this table must be encoded as a single byte 0 (0x00).

Sub routine numbers in Type 2 CharStrings are skewed by a number called the "subr number bias", which is calculated from the subroutine count in either the local or global subr INDEX structures. This bias is calculated as follows:

USHORT bias;
USHORT nSubrs = subrINDEX.count;

if (nSubrs < 1240)
  bias = 107;
else if (nSubrs < 33900)
  bias = 1131;
else
  bias = 32768;

For correct subroutine selection the calculated bias must be added to a subroutine's operand before accessing the appropriate subroutine INDEX.

This technique allows subroutine numbers to be specified using negative as well as positive numbers, thereby fully utilizing the available number ranges, thus saving space.

The CharStrings INDEX

This INDEX structure has an object data block that contains all glyph outlines available to the OpenType font that this CFF block is housed in.

Glyph outlines must be defined using Type 2 CharStrings, and are accessed by GID as defined in the OpenType cmap table(s).

The first CharString (GID 0) must be the .notdef glyph. The number of glyphs encoded in the CharString INDEX object data block must be in agreement with the numGlyphs field in the maxp table.

Type 2 CharStrings

The official specification for Type 2 Charstrings is [Adobe's Tech Note 5177, "The Type 2 CharString format"](https://partners.adobe.com/public/developer/en/font/5177.Type 2.pdf). Please consult this Tech Note for information on how to encode/decode data in Type 2 CharString format.

The local Subr INDEX

This INDEX structure has an object data block that contains all (fragments of) Type 2 CharStrings that may be used as local subroutines by glyphs defined in the CharStrings INDEX object data block.

If no local subroutines are used, this table must be encoded as a single byte 0 (0x00).

Local subroutines, like global subroutines, require a "bias" correction before resolving. See the section on Global Subroutines for the description of how to calculate this bias.