Appendix A — Encoding primitives

TL;DR. This appendix defines the self-delimiting field encoding used by every phux wire payload: field-tagged TLVs, eight wire types (varint, signed varint, fixed-32, fixed-64, bytes, message, list, tagged), and the extensibility rules that let decoders skip unknown fields by wire type and length. Fixed-width values are big-endian and varints are LEB128; the format is protobuf-shaped but tailored to phux’s tagged unions.


1. Field encoding

Every payload is encoded as a sequence of fields. Fields are self-delimiting: a decoder can skip an unknown field without knowing its semantics.

A field is { field_id: varint, wire_type: u8, value: ... }, where wire_type determines how value is encoded:

wire_type  Name     Encoding
0          VARINT   LEB128 unsigned integer
1          SVARINT  LEB128 zig-zag signed integer
2          FIXED32  4 bytes, big-endian
3          FIXED64  8 bytes, big-endian
4          BYTES    varint length …
5          MESSAGE  varint length …
6          LIST     varint length …
7          TAGGED   varint tag …
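
The two varint wire types can be sketched as below. This is a minimal illustration, not the phux codec: the function names are hypothetical, and the zig-zag mapping is assumed to follow the usual Protocol Buffers convention (0, -1, 1, -2 map to 0, 1, 2, 3), which the table does not spell out.

```rust
/// Append `value` as an unsigned LEB128 varint (wire_type 0):
/// seven payload bits per byte, least-significant group first,
/// high bit set on every byte except the last.
pub fn write_varint(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Read an unsigned LEB128 varint from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None`
/// if the input is truncated or does not fit in 64 bits.
pub fn read_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in input.iter().enumerate().take(10) {
        let chunk = u64::from(byte & 0x7f);
        // The tenth byte can only contribute the top bit of a u64.
        if i == 9 && chunk > 1 {
            return None;
        }
        value |= chunk << (7 * i);
        if (byte & 0x80) == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Zig-zag mapping for SVARINT (wire_type 1), assumed protobuf-style:
/// small magnitudes of either sign become small unsigned values.
pub fn zigzag_encode(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

/// Inverse of `zigzag_encode`.
pub fn zigzag_decode(z: u64) -> i64 {
    ((z >> 1) as i64) ^ -((z & 1) as i64)
}
```

Under these assumptions an SVARINT value would be written as `write_varint(&mut out, zigzag_encode(n))` and read back with `zigzag_decode` applied to the result of `read_varint`.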

Messages and tagged unions are encoded as a sequence of fields, each prefixed with its field_id and wire_type. Decoders match fields by field_id, not by position, and skip an unknown field_id by consuming its value as dictated by the declared wire_type.
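
The skip rule could look roughly like the sketch below. All names here are hypothetical. It assumes that BYTES, MESSAGE and LIST values are a varint length followed by exactly that many bytes, and it deliberately refuses to skip TAGGED values because their full layout is not given in this section.

```rust
#[derive(Debug, PartialEq)]
pub enum SkipError {
    Truncated,
    BadVarint,
    UnknownWireType(u8),
    /// TAGGED (wire_type 7) is left out of this sketch.
    UnsupportedTagged,
}

/// Read an unsigned LEB128 varint; returns (value, bytes consumed).
fn read_varint(input: &[u8]) -> Result<(u64, usize), SkipError> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate() {
        if i == 10 || (i == 9 && (byte & 0x7f) > 1) {
            return Err(SkipError::BadVarint);
        }
        value |= u64::from(byte & 0x7f) << (7 * i);
        if (byte & 0x80) == 0 {
            return Ok((value, i + 1));
        }
    }
    Err(SkipError::Truncated)
}

/// Total encoded size (header plus value) of the field at the front of
/// `input`, computed from the wire_type alone.
pub fn field_len(input: &[u8]) -> Result<usize, SkipError> {
    let (_field_id, id_len) = read_varint(input)?;
    let wire_type = *input.get(id_len).ok_or(SkipError::Truncated)?;
    let value = &input[id_len + 1..];
    let value_len = match wire_type {
        0 | 1 => read_varint(value)?.1, // VARINT, SVARINT
        2 => 4,                         // FIXED32
        3 => 8,                         // FIXED64
        4 | 5 | 6 => {
            // Assumed layout: varint length, then that many bytes.
            let (len, n) = read_varint(value)?;
            if len > (value.len() - n) as u64 {
                return Err(SkipError::Truncated);
            }
            n + len as usize
        }
        7 => return Err(SkipError::UnsupportedTagged),
        other => return Err(SkipError::UnknownWireType(other)),
    };
    if value.len() < value_len {
        return Err(SkipError::Truncated);
    }
    Ok(id_len + 1 + value_len)
}

/// Find the first VARINT field with `wanted_id`, skipping every other
/// field without interpreting it.
pub fn find_varint(mut payload: &[u8], wanted_id: u64) -> Result<Option<u64>, SkipError> {
    while !payload.is_empty() {
        let (field_id, id_len) = read_varint(payload)?;
        let total = field_len(payload)?;
        if field_id == wanted_id && payload[id_len] == 0 {
            let (value, _) = read_varint(&payload[id_len + 1..])?;
            return Ok(Some(value));
        }
        payload = &payload[total..];
    }
    Ok(None)
}
```

The point of the design is that `find_varint` never needs a schema for the fields it passes over: the wire_type byte in each header is enough to locate the next field.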

This format is intentionally similar in spirit to Protocol Buffers’ wire format, but is designed around the specific concerns of this protocol:

  • Big-endian for hex-dump readability and “network feel”.
  • No varint-only restriction on integers; fixed widths exist where natural (e.g. timestamps, color channels) so the wire matches the conceptual width.
  • A first-class TAGGED wire type for tagged unions, so they don’t have to be reified as oneof-style hacks.
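
To illustrate the first two points, here is a small sketch of the fixed-width wire types. The helper names are hypothetical; only the widths and the byte order come from the table above.

```rust
/// Append a FIXED32 value (wire_type 2): 4 bytes, most significant first.
pub fn write_fixed32(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_be_bytes());
}

/// Append a FIXED64 value (wire_type 3): 8 bytes, most significant first.
pub fn write_fixed64(out: &mut Vec<u8>, value: u64) {
    out.extend_from_slice(&value.to_be_bytes());
}

/// Read a FIXED32 value from the front of `input`, if 4 bytes are present.
pub fn read_fixed32(input: &[u8]) -> Option<u32> {
    let b = input.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Render bytes the way a hex dump would show them.
pub fn hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

With big-endian order, the value 0x11223344 appears in a dump as `11 22 33 44`, in the same order a reader would write it; a little-endian encoding would show `44 33 22 11`.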

A canonical hex dump of a HELLO_OK selecting version 0.1.0 will be included in crates/phux-protocol/tests/snapshots/hello_ok_v0_1_0.snap once the codec exists.