CBDF Specification: Control Character Assignment Table

Version 1.0 (Phase II)

Document: 07-Control-Characters • Date: 2026-03-25

1. Overview

The 32 ASCII control characters (0x00-0x1F) serve as the command vocabulary of the CBDF format. They appear in two contexts:

  • A. STRUCTURAL: Separating major sections and sub-sections of the document. These appear at the document level (between sections) and within the Styles section (between sub-tables).
  • B. INLINE: Appearing within the Text section to switch styles, insert elements, and mark semantic boundaries. These are interspersed with printable UTF-8 text.

Plain-Text Extraction Rule

A plain-text reader strips ALL bytes 0x00-0x1F from the text section EXCEPT:

  • 0x09 (TAB) = Horizontal Tab
  • 0x0A (LINE_BREAK) = Line Feed

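The bytes that remain are the readable UTF-8 text. Two substitutions are permitted by convention instead of plain stripping: 0x0B (PARA_BREAK) may be replaced with a double LF, and 0x0D (HORIZ_RULE) may optionally be replaced with "---".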
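
The stripping rule above can be sketched as a single pass over the Text section. This is a minimal illustration, not a reference implementation: the function name cbdf_extract_plain is hypothetical, and the sketch applies the rule literally, byte by byte. It does not apply the optional "---" substitution, and it does not skip command payloads, whose bytes may fall outside 0x00-0x1F.

```c
#include <stddef.h>

/* Hypothetical helper, not defined by CBDF: copies src to dst, dropping
 * every byte in 0x00-0x1F except TAB (0x09) and LINE_BREAK (0x0A).
 * dst must have room for n bytes. Returns the number of bytes written. */
size_t cbdf_extract_plain(const unsigned char *src, size_t n,
                          unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char b = src[i];
        /* Keep everything at or above 0x20, plus the two preserved controls. */
        if (b >= 0x20 || b == 0x09 || b == 0x0A)
            dst[out++] = b;
    }
    return out;
}
```

Because the output is never longer than the input, a caller can size dst to the length of the Text section.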

2. Master Assignment Table

Hex  | Dec | ASCII | CBDF Name       | Command                | Payload                     | Phase
0x00 | 0   | NUL   | NOP             | No Operation / Padding | None                        | II
0x01 | 1   | SOH   | SUBJECT_START   | Start of Heading       | None                        | II
0x02 | 2   | STX   | TEXT_START      | Start of Text          | None                        | II
0x03 | 3   | ETX   | TEXT_END        | End of Text            | None                        | II
0x04 | 4   | EOT   | DOC_END         | End of Transmission    | None                        | II
0x05 | 5   | ENQ   | RESERVED_05     | Reserved               | -                           | III
0x06 | 6   | ACK   | RESERVED_06     | Reserved               | -                           | III
0x07 | 7   | BEL   | RESERVED_ALERT  | Reserved (Alert)       | -                           | III
0x08 | 8   | BS    | RESERVED_BKSP   | Reserved (Backspace)   | -                           | III
0x09 | 9   | HT    | TAB             | Horizontal Tab         | None                        | II
0x0A | 10  | LF    | LINE_BREAK      | Line Feed              | None                        | II
0x0B | 11  | VT    | PARA_BREAK      | Paragraph Break        | None                        | II
0x0C | 12  | FF    | PAGE_BREAK      | Page Break             | None                        | II
0x0D | 13  | CR    | HORIZ_RULE      | Horizontal Rule        | [StyleIndex:1]              | II
0x0E | 14  | SO    | LINK_START      | Link Start             | [Type:1][Len:1][Target:N]   | II
0x0F | 15  | SI    | LINK_END        | Link End               | None                        | II
0x10 | 16  | DLE   | DATA_ESCAPE     | Data Escape            | [Len:2 LE][Raw data:N]      | II
0x11 | 17  | DC1   | STYLE_TEXT      | Apply Text Style       | [Index:1]                   | II
0x12 | 18  | DC2   | STYLE_CONTAINER | Apply Container Style  | [Index:1]                   | II
0x13 | 19  | DC3   | STYLE_TABLE     | Apply Table Style      | [Index:1]                   | II
0x14 | 20  | DC4   | STYLE_END       | End Style / Pop        | None                        | II
0x15 | 21  | NAK   | ELEMENT_ID      | Element ID             | [ID:1]                      | II
0x16 | 22  | SYN   | IMAGE           | Insert Image           | [ImageDefIndex:1]           | II
0x17 | 23  | ETB   | BLOCK_END       | End of Block           | None                        | II
0x18 | 24  | CAN   | RESERVED_HIDE   | Reserved (Hide/Show)   | -                           | III
0x19 | 25  | EM    | ITEM_BLOCK      | Start Item Block       | [Type:1][StyleIndex:1]      | II
0x1A | 26  | SUB   | AI_PROMPT       | AI Prompt              | [Type:1][Len:2 LE][UTF-8:N] | II
0x1B | 27  | ESC   | ESCAPE          | Escape (Extended Cmd)  | [Code:1][Payload:variable]  | III
0x1C | 28  | FS    | SECTION_SEP     | File Separator         | None                        | II
0x1D | 29  | GS    | GROUP_SEP       | Group Separator        | None                        | II
0x1E | 30  | RS    | RECORD_SEP      | Record Separator       | None                        | II
0x1F | 31  | US    | UNIT_SEP        | Unit Separator         | None                        | II
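
A parser typically needs the Payload column as a lookup so it knows how many bytes to consume after each command byte. The sketch below is one possible encoding of that column. The function name and the negative return codes are assumptions made for illustration. ELEMENT_ID is treated as variable-length because Section 3 describes a 0xFF escape to an extended ID.

```c
/* Hypothetical lookup derived from the Payload column of the table.
 * Returns:
 *   >= 0  number of fixed payload bytes that follow the command byte
 *   -1    variable-length payload (a length or escape byte decides)
 *   -2    reserved for Phase III, or not a control character at all */
int cbdf_fixed_payload_len(unsigned char cmd)
{
    switch (cmd) {
    case 0x05: case 0x06: case 0x07: case 0x08:  /* reserved */
    case 0x18: case 0x1B:
        return -2;
    case 0x0E:  /* LINK_START  [Type:1][Len:1][Target:N]      */
    case 0x10:  /* DATA_ESCAPE [Len:2 LE][Raw data:N]         */
    case 0x15:  /* ELEMENT_ID  [ID:1], extended when ID==0xFF */
    case 0x1A:  /* AI_PROMPT   [Type:1][Len:2 LE][UTF-8:N]    */
        return -1;
    case 0x0D: case 0x11: case 0x12: case 0x13: case 0x16:
        return 1;   /* single index byte */
    case 0x19:
        return 2;   /* ITEM_BLOCK [Type:1][StyleIndex:1] */
    default:
        return cmd <= 0x1F ? 0 : -2;
    }
}
```
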
[Figure: Control Characters Visual Map]

3. Detailed Command Specifications

0x00 NOP (NUL): No Operation / Padding

Ignored by the parser. Used for byte alignment in style records or as filler in fixed-size fields.

0x01 SUBJECT_START (SOH): Styled Subject

Marks the beginning of the styled subject text within the Text section. Only meaningful when the document type is "email." The plain subject in the meta section is used for inbox listings; the styled subject is rendered when the email is opened.

[STX] [SOH] [DC1 style_index] Subject text here [DC4] Body text... [ETX]

0x02 TEXT_START (STX): Start of Text

Marks the beginning of the document body text. Mandatory in every document that has a Text section.

0x03 TEXT_END (ETX): End of Text

Marks the end of the document body text. ETX is technically redundant (the length prefix is authoritative) but serves as a VALIDATION CHECK: the parser can verify that ETX appears at the position indicated by the length prefix. If it doesn't, the document may be corrupt.
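
The validation check can be sketched as follows. The layout of the length prefix is not described in this document, so the sketch assumes that the prefix value counts every byte of the Text section from STX through ETX inclusive. The function name cbdf_text_bounds_ok is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check. section points at the first byte of the Text section
 * and prefix_len is the value read from its length prefix. Assumption: the
 * prefix counts STX through ETX inclusive, so ETX sits at prefix_len - 1. */
bool cbdf_text_bounds_ok(const unsigned char *section, size_t prefix_len)
{
    if (section == NULL || prefix_len < 2)
        return false;                       /* too short for STX + ETX */
    return section[0] == 0x02               /* TEXT_START (STX) */
        && section[prefix_len - 1] == 0x03; /* TEXT_END (ETX)   */
}
```

A false result signals possible corruption. What the parser does next (reject, or fall back to the length prefix) is left to the implementation.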

0x04 DOC_END (EOT): End of Document

Explicit end-of-file marker. Optional -- the parser may also use the section length prefix or actual EOF.

0x05-0x08 Reserved (Phase III)

Reserved for interactive features: 0x05 ENQ (resource request), 0x06 ACK (acknowledgment), 0x07 BEL (alert/sound), 0x08 BS (typing/animation). NOT functional in Phase II.

0x09 TAB (HT): Horizontal Tab

Standard tab character. PRESERVED in plain-text extraction.

0x0A LINE_BREAK (LF): Line Feed

Standard Unix line break. PRESERVED in plain-text extraction.

0x0B PARA_BREAK (VT): Paragraph Break

Marks a semantic paragraph boundary. Renderers should insert vertical spacing larger than a line break. In plain-text extraction the byte is stripped; by convention, extractors emit a double LF (0x0A 0x0A) in its place.

0x0C PAGE_BREAK (FF): Page Break

Starts a new rendering page or screen. Used in multi-page documents or long-form content.

0x0D HORIZ_RULE (CR): Horizontal Rule

Inserts a horizontal divider line (equivalent to HTML <hr>). Payload: [Style Index: 1 byte] -- references a Container Border Style. Index 0 = default rule (1px mid-gray line).

Design Note

0x0D was previously CARRIAGE_RETURN. Since CBDF is rendered by custom software (not a terminal), only LINE_BREAK (0x0A) is needed for line breaks. Windows-style CR+LF is unnecessary. Freeing 0x0D for HORIZ_RULE provides a high-value command for a commonly used element.

0x0E LINK_START (SO): Link Start

Marks the beginning of a hyperlink. Payload: [Type: 1 byte] [Length: 1 byte] [Target: N bytes]

Type  | Description
0     | URL (UTF-8 string)
1     | QWeb Page ID
2     | Mailbox Address (7 bytes)
3     | Action/Script trigger (1 byte action ID)
4-255 | Reserved
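
Reading the LINK_START payload might look like the sketch below. The struct and function names are hypothetical. The function only splits the payload into its three fields and does not interpret the target by type.

```c
#include <stddef.h>

/* Hypothetical view of a LINK_START payload. */
typedef struct {
    unsigned char type;           /* 0 = URL, 1 = QWeb page, 2 = mailbox, 3 = action */
    unsigned char len;            /* target length in bytes (0-255) */
    const unsigned char *target;  /* points into the input buffer, not copied */
} cbdf_link_t;

/* p points at the Type byte that follows 0x0E; avail is the number of
 * bytes left in the Text section. Returns the number of payload bytes
 * consumed, or 0 if the payload is truncated. */
size_t cbdf_parse_link(const unsigned char *p, size_t avail, cbdf_link_t *out)
{
    if (avail < 2)
        return 0;                 /* need Type and Length */
    size_t len = p[1];
    if (avail - 2 < len)
        return 0;                 /* target runs past the buffer */
    out->type = p[0];
    out->len = p[1];
    out->target = p + 2;
    return 2 + len;
}
```

Because the Length field is a single byte, a link target can be at most 255 bytes long.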

0x0F LINK_END (SI): Link End

Marks the end of a hyperlink started by SO (0x0E).

0x10 DATA_ESCAPE (DLE): Data Escape

Bypasses control character interpretation for a block of raw binary data. Payload: [Length: 2 bytes LE] [Raw data: N bytes]. In Phase II this is primarily a safety mechanism; most binary data should be placed in the Resources section.
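
A parser that does not need the raw data can step over it using the length field alone. The sketch below shows one way to do that; the function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper. p points at the first length byte that follows
 * 0x10; avail is the number of bytes left. Returns the total payload size
 * (2 length bytes + N data bytes), or 0 if the payload is truncated. */
size_t cbdf_skip_data_escape(const unsigned char *p, size_t avail)
{
    if (avail < 2)
        return 0;
    /* Little-endian: low byte first, high byte second. */
    size_t len = (size_t)p[0] | ((size_t)p[1] << 8);
    if (avail - 2 < len)
        return 0;
    return 2 + len;
}
```

The skipped bytes are never inspected, so control values such as 0x03 or 0x1C inside the raw data cannot be mistaken for commands.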

0x11 STYLE_TEXT (DC1): Apply Text Style

Switches the current text formatting to the specified style. Payload: [Index: 1 byte] (0-255, index into Text Styles sub-table). The style remains active until another DC1, a DC4 (pop), or ETX.

0x12 STYLE_CONTAINER (DC2): Apply Container Style

Starts a new container (panel, div-equivalent) with the specified composite style. Payload: [Index: 1 byte]. The container holds all content until the matching ETB (0x17). A DC4 (0x14) pops the container's style but does not close the block (see Section 5).

0x13 STYLE_TABLE (DC3): Apply Table Style

Starts a table with the specified style. Payload: [Index: 1 byte]. Cells separated by US (0x1F), rows by RS (0x1E), end with ETB (0x17).
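
As an illustration of the separator roles, the sketch below counts the rows and cells of a table body. It rests on two assumptions that the text above does not state: RS and US act strictly as separators (no trailing separator), and the cells contain plain text only, so no command payload byte can equal 0x1E or 0x1F. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts rows and cells in a table body, meaning the
 * bytes after DC3 [index] up to, but not including, the closing ETB.
 * An empty body counts as one row holding one empty cell. */
void cbdf_table_shape(const unsigned char *body, size_t n,
                      size_t *rows, size_t *cells)
{
    *rows = 1;
    *cells = 1;
    for (size_t i = 0; i < n; i++) {
        if (body[i] == 0x1E) {          /* RS: new row, with its first cell */
            (*rows)++;
            (*cells)++;
        } else if (body[i] == 0x1F) {   /* US: next cell in the current row */
            (*cells)++;
        }
    }
}
```

A full parser would skip command payloads before testing for separators, for example a STYLE_TEXT index byte that happens to equal 0x1E.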

Design Note

DC3 was assigned to TABLE STYLE (not layer targeting as Gemini proposed). Layer targeting is handled as a PROPERTY of the container style record (the composite style includes a layer ID field) rather than a separate inline command.

0x14 STYLE_END (DC4): End Style / Pop

Reverts to the previous style on the style stack. Each DC1, DC2, or DC3 pushes a style; DC4 pops back to what was active before.

0x15 ELEMENT_ID (NAK): Element ID

Assigns a stable, unique ID to the NEXT element. Payload: [ID: 1 byte] (0-254; 0xFF escapes to 2-byte LE extended ID). IDs MUST be unique and stable across re-serialization.
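
Decoding the ID might look like the sketch below. It assumes the two extended-ID bytes follow the 0xFF escape byte immediately, which is the natural reading of the payload description but is not spelled out. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical decoder. p points at the ID byte that follows 0x15.
 * Returns the number of payload bytes consumed (1 or 3), or 0 if the
 * payload is truncated. Assumption: 0xFF is followed directly by the
 * 2-byte little-endian extended ID. */
size_t cbdf_parse_element_id(const unsigned char *p, size_t avail,
                             unsigned *id)
{
    if (avail < 1)
        return 0;
    if (p[0] != 0xFF) {             /* short form: 0-254 */
        *id = p[0];
        return 1;
    }
    if (avail < 3)
        return 0;                   /* escape byte without its two ID bytes */
    *id = (unsigned)p[1] | ((unsigned)p[2] << 8);
    return 3;
}
```
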

0x16 IMAGE (SYN): Insert Image

Inserts an image at the current position. Payload: [Image Definition Index: 1 byte]. Indirection chain: SYN [index] -> Image Definition -> Resource ID -> data.

0x17 BLOCK_END (ETB): End of Block

Closes the current block-level element (container, table, nav bar, or item block). Pairs with DC2, DC3, or EM.

0x18 RESERVED_HIDE (CAN): Reserved (Phase III)

Possible future use: "Initially Hidden" -- element is parsed but not rendered until user action reveals it. NOT functional in Phase II.

0x19 ITEM_BLOCK (EM): Start Item Block

Begins a structured item block. Payload: [Type: 1 byte] [Style Index: 1 byte].

Type  | Description
0     | Unordered list (bullets)
1     | Ordered list (numbered)
2     | Nav bar (horizontal/vertical navigation)
3     | Definition list (term/definition pairs)
4-255 | Reserved

0x1A AI_PROMPT (SUB): AI Prompt

Embeds an AI prompt that the receiver's client evaluates. Payload: [Type: 1 byte] [Length: 2 bytes LE] [UTF-8 prompt text: N bytes].

Type  | Description
0     | Style prompt (AI generates a visual style)
1     | Image prompt (AI generates an image)
2     | Layout prompt (AI modifies page layout)
3-255 | Reserved

Maximum prompt length: 65535 bytes (2-byte LE length field). If the client has no AI capability, it applies the user's configured default style/behavior.
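
A sketch of reading the AI_PROMPT payload follows. The struct and function names are hypothetical. The fallback to the user's default style or behavior is left to the caller, which can apply it when the client has no AI capability or when cbdf_prompt_type_defined returns false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an AI_PROMPT payload. */
typedef struct {
    unsigned char type;          /* 0 = style, 1 = image, 2 = layout */
    size_t len;                  /* prompt length in bytes, 0-65535 */
    const unsigned char *text;   /* UTF-8 prompt; points into the input */
} cbdf_prompt_t;

/* p points at the Type byte that follows 0x1A. Returns the number of
 * payload bytes consumed, or 0 if the payload is truncated. */
size_t cbdf_parse_ai_prompt(const unsigned char *p, size_t avail,
                            cbdf_prompt_t *out)
{
    if (avail < 3)
        return 0;                /* need Type plus 2 length bytes */
    size_t len = (size_t)p[1] | ((size_t)p[2] << 8);  /* little-endian */
    if (avail - 3 < len)
        return 0;
    out->type = p[0];
    out->len = len;
    out->text = p + 3;
    return 3 + len;
}

/* Types 3-255 are reserved in the table above. */
bool cbdf_prompt_type_defined(unsigned char type)
{
    return type <= 0x02;
}
```
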

0x1B ESCAPE (ESC): Extended Commands (Phase III)

Reserved for extended command sequences. Payload: [Command Code: 1 byte] [Payload: variable per command].

Pre-defined ESCAPE sub-commands (Phase III):

Code      | Name             | Payload             | Description
0x01      | ESC_FRAME        | [FrameDefIndex:1]   | Insert iframe/embed
0x02      | ESC_LAYER_TARGET | [LayerID:1]         | Direct content to layer
0x03      | ESC_COMMENT      | [Len:2 LE][UTF-8:N] | Parser-ignored comment
0x04      | ESC_SUPER_ON     | None                | Superscript toggle on
0x05      | ESC_SUPER_OFF    | None                | Superscript toggle off
0x06      | ESC_SUB_ON       | None                | Subscript toggle on
0x07      | ESC_SUB_OFF      | None                | Subscript toggle off
0x08-0xFF | Reserved         | -                   | Future extended commands

0x1C SECTION_SEP (FS): File Separator

Separates the major sections of a CBDF document. Always present even if a section is empty. Exception: if the meta section contains an EOF flag, NO FS markers follow.

0x1D GROUP_SEP (GS): Group Separator

Separates style sub-tables within the Styles section.

0x1E RECORD_SEP (RS): Record Separator

Separates individual records within a style sub-table. Also used within the Text section to separate TABLE ROWS inside a DC3 table context.

Dual-Use Safety Note

There is no ambiguity because the contexts are non-overlapping. RS only appears in styles mode (between records) or inside a DC3 table block (row separator). A parser that tracks its context will never misinterpret RS.

0x1F UNIT_SEP (US): Unit Separator

Separates items within a structured item block (started by EM 0x19), such as list items and nav bar items. Also separates the cells within a row of a DC3 table.

4. Phase Summary

Phase II (26 active commands):

NOP, SUBJECT_START, TEXT_START, TEXT_END, DOC_END, TAB, LINE_BREAK, PARA_BREAK, PAGE_BREAK, HORIZ_RULE, LINK_START, LINK_END, DATA_ESCAPE, STYLE_TEXT, STYLE_CONTAINER, STYLE_TABLE, STYLE_END, ELEMENT_ID, IMAGE, BLOCK_END, ITEM_BLOCK, AI_PROMPT, SECTION_SEP, GROUP_SEP, RECORD_SEP, UNIT_SEP

Phase III reserved (6 commands):

RESERVED_05 (0x05), RESERVED_06 (0x06), RESERVED_ALERT (0x07), RESERVED_BKSP (0x08), RESERVED_HIDE (0x18), ESCAPE (0x1B)

Preserved for plain-text extraction (2 commands): TAB (0x09), LINE_BREAK (0x0A)

5. Style Stack Behavior

The text section maintains a STYLE STACK. Commands that push/pop:

PUSH (new style context):

  • STYLE_TEXT [index] -- pushes text style
  • STYLE_CONTAINER [index] -- pushes container style (also opens block)
  • STYLE_TABLE [index] -- pushes table style (also opens block)

POP (revert to previous):

  • STYLE_END -- pops one level (reverts style, does NOT close block)
  • BLOCK_END -- closes block AND pops associated style

Example nesting:

[STYLE_CONTAINER 0x01]  -- push container style #1 (stack: [C1])
  [STYLE_TEXT 0x00]     -- push text style #0 (stack: [C1, T0])
    Normal text
    [STYLE_TEXT 0x03]   -- push text style #3 (stack: [C1, T0, T3])
      Bold text
    [STYLE_END]         -- pop (stack: [C1, T0])
    Normal again
  [STYLE_END]           -- pop (stack: [C1])
[BLOCK_END]             -- close container, pop (stack: [])
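
The push/pop rules can be modeled with a small fixed-size stack. This is a sketch under stated assumptions, not a prescribed implementation: the type and function names are hypothetical, the depth limit of 32 is arbitrary, and the behavior of BLOCK_END while text styles are still open above the block is not specified here, so the sketch rejects that case.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_MAX 32  /* hypothetical limit; the spec does not state one */

typedef enum { KIND_TEXT, KIND_CONTAINER, KIND_TABLE } style_kind_t;

typedef struct {
    style_kind_t kind;
    unsigned char index;
} style_entry_t;

typedef struct {
    style_entry_t items[STACK_MAX];
    size_t depth;
} style_stack_t;

/* STYLE_TEXT, STYLE_CONTAINER, STYLE_TABLE: push a new style context. */
bool style_push(style_stack_t *s, style_kind_t kind, unsigned char index)
{
    if (s->depth == STACK_MAX)
        return false;
    s->items[s->depth].kind = kind;
    s->items[s->depth].index = index;
    s->depth++;
    return true;
}

/* STYLE_END: pop one level; fails on an empty stack. */
bool style_end(style_stack_t *s)
{
    if (s->depth == 0)
        return false;
    s->depth--;
    return true;
}

/* BLOCK_END: close the block and pop its style. Assumption: the block's
 * own style must be on top, i.e. inner text styles were already popped. */
bool block_end(style_stack_t *s)
{
    if (s->depth == 0 || s->items[s->depth - 1].kind == KIND_TEXT)
        return false;
    s->depth--;
    return true;
}
```

Replaying the nesting example (push C1, push T0, push T3, two STYLE_END calls, one BLOCK_END) leaves the stack empty, matching the stack states annotated above.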

6. Cross-Model Alignment Notes

This table resolves the following disagreements between models:

  • a) STYLE_TABLE (0x13): Assigned to TABLE STYLE, not layer targeting. Layer IDs are a property of the container style record.
  • b) AI_PROMPT (0x1A): Uses [Type:1][Len:2 LE][Text:N] format. 2-byte length supports prompts up to 65535 bytes.
  • c) Phase III reservations: 6 commands reserved for future interactive features. NOT functional in Phase II.
  • d) Nav bar: Uses ITEM_BLOCK (0x19) with type byte, covering lists, nav bars, and other repeating-item structures with a single control code.
  • e) DATA_ESCAPE (0x10): Includes 2-byte LE length prefix so the parser knows exactly how many bytes to skip.

7. C Enumeration

The following enum can be used directly in C parser/serializer code:

typedef enum {
    /* -- Padding & Terminators -- */
    NOP               = 0x00,  /* No operation / alignment padding      */
    SUBJECT_START     = 0x01,  /* Styled subject begins (inside STX)    */
    TEXT_START        = 0x02,  /* Start of text body                    */
    TEXT_END          = 0x03,  /* End of text body (validation marker)  */
    DOC_END           = 0x04,  /* End of document                       */

    /* -- Reserved for Phase III -- */
    RESERVED_05       = 0x05,  /* Future: interactive resource request  */
    RESERVED_06       = 0x06,  /* Future: interactive acknowledgment    */
    RESERVED_ALERT    = 0x07,  /* Future: client alert / sound          */
    RESERVED_BKSP     = 0x08,  /* Future: typing / animation effect     */

    /* -- Whitespace -- */
    TAB               = 0x09,  /* Horizontal tab                        */
    LINE_BREAK        = 0x0A,  /* Line feed (Unix line break)           */
    PARA_BREAK        = 0x0B,  /* Paragraph break (extra spacing)       */
    PAGE_BREAK        = 0x0C,  /* Page break / new page                 */
    HORIZ_RULE        = 0x0D,  /* Horizontal rule: [StyleIndex]         */

    /* -- Links -- */
    LINK_START        = 0x0E,  /* Hyperlink start: [Type][Len][Target]  */
    LINK_END          = 0x0F,  /* Hyperlink end                         */

    /* -- Data -- */
    DATA_ESCAPE       = 0x10,  /* Raw binary data: [Len:2 LE][Data]     */

    /* -- Style Application -- */
    STYLE_TEXT        = 0x11,  /* Apply text style: [Index]             */
    STYLE_CONTAINER   = 0x12,  /* Apply container style: [Index]        */
    STYLE_TABLE       = 0x13,  /* Apply table style: [Index]            */
    STYLE_END         = 0x14,  /* Pop style stack / revert              */

    /* -- Elements -- */
    ELEMENT_ID        = 0x15,  /* Assign stable ID: [ID]                */
    IMAGE             = 0x16,  /* Insert image: [ImageDefIndex]         */
    BLOCK_END         = 0x17,  /* Close current block element           */

    /* -- Reserved for Phase III -- */
    RESERVED_HIDE     = 0x18,  /* Future: initially hidden / toggle     */

    /* -- Structured Content -- */
    ITEM_BLOCK        = 0x19,  /* List/nav/etc: [Type][StyleIndex]      */
    AI_PROMPT         = 0x1A,  /* AI prompt: [Type][Len:2 LE][UTF-8]    */

    /* -- Reserved for Phase III -- */
    ESCAPE            = 0x1B,  /* Extended commands: [Code][Payload]    */

    /* -- Structural Separators -- */
    SECTION_SEP       = 0x1C,  /* Between major sections (FS)           */
    GROUP_SEP         = 0x1D,  /* Between style sub-tables (GS)         */
    RECORD_SEP        = 0x1E,  /* Between records / table rows (RS)     */
    UNIT_SEP          = 0x1F   /* Between items or table cells (US)     */
} cbdf_cmd_t;

Link types:

typedef enum {
    LINK_URL          = 0x00,  /* URL string                            */
    LINK_QWEB_PAGE    = 0x01,  /* QWeb page ID                          */
    LINK_MAILBOX      = 0x02,  /* Mailbox address (7 bytes)             */
    LINK_ACTION       = 0x03   /* Logic section action trigger          */
} cbdf_link_type_t;

Item block types:

typedef enum {
    ITEM_UNORDERED    = 0x00,  /* Unordered list (bullets)              */
    ITEM_ORDERED      = 0x01,  /* Ordered list (numbered)               */
    ITEM_NAV          = 0x02,  /* Navigation bar                        */
    ITEM_DEFINITION   = 0x03   /* Definition list (term/value pairs)    */
} cbdf_item_type_t;

AI prompt types:

typedef enum {
    AI_STYLE          = 0x00,  /* Generate a visual style               */
    AI_IMAGE          = 0x01,  /* Generate an image                     */
    AI_LAYOUT         = 0x02   /* Modify page layout                    */
} cbdf_ai_type_t;