////
WireProto Specification © 2024 by Brent Saner is licensed under Creative Commons Attribution-ShareAlike 4.0 International. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/
////

= WireProto Specification
Brent Saner <bts@square-r00t.net>
Last rendered {localdatetime}
:doctype: book
:docinfo: shared
:data-uri:
:imagesdir: images
:sectlinks:
:sectnums:
:sectnumlevels: 7
:toc: preamble
:toc2: left
:idprefix:
:toclevels: 7
:source-highlighter: rouge
:docinfo: shared
:this_protover: 1
:this_protover_hex: 0x00000001
:lib_ver: master
:lib_ver_ref: branch

[id="license"]
== License

++++
include::LICENSE.html[]
++++

In a nutshell, this means you can:

* Use it in commercial/proprietary/internal works...
* Expand upon/change the specification...
** (As long as it is released under the same Creative Commons license)

As long as you attribute the original (this document). This can be as simple as something like:

====
Based on WireProto version <protocol version> as found at https://wireproto.io/.
====

More detail certainly helps, though; you may want to mention the exact date you "forked" it, etc.

Please see the full text as collapsed above or https://creativecommons.org/licenses/by-sa/4.0/legalcode.en[the online version^] of the license for full legal copy.

NOTE: In the event of the embedded text in this document differing from the online version, the online version is assumed to take precedence as the valid license applicable to this work.

[id="proto"]
== Protocol
The WireProto data packing API is a custom wire protocol/message format designed for incredibly performant, unambiguous, predictable, platform-agnostic, client-agnostic communication. It is based heavily on the https://github.com/openssh/openssh-portable/blob/master/PROTOCOL.key[OpenSSH "v1" key format^] https://git.r00t2.io/r00t2/go_sshkeys/src/branch/master/_ref/KEY_GUIDE.html#v1_plain_2[(example/details)] packing method.

It supports arbitrary binary values, which means they can be anything according to the implementation-specific details; a common practice is to encode ("marshal") a Go struct to JSON bytes, and set that as a WireProto field's value.

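The JSON practice mentioned above can be sketched as follows. This is only an illustration: the `User` type and the `fieldValue` helper are hypothetical and are not part of WireProto or its reference library; any encoding that yields bytes would serve equally well.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical application type; it is not defined by WireProto.
type User struct {
	Name string `json:"name"`
	UID  uint32 `json:"uid"`
}

// fieldValue marshals v to JSON bytes that could then be used as a
// field's value. (Hypothetical helper, for illustration only.)
func fieldValue(v any) ([]byte, error) {
	return json.Marshal(v)
}

func main() {
	val, err := fieldValue(User{Name: "alice", UID: 1000})
	if err != nil {
		panic(err)
	}
	// The resulting bytes are opaque to WireProto itself.
	fmt.Printf("%d bytes: %s\n", len(val), val)
}
```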
It supports both static construction/parsing/dissection and stream approaches in a single format, as well as multiple commands per request message/multiple answers per response message.

*All* packed uint32 values are big-endian.

This specification's <<proto_ver>> is `{this_protover}` (`{this_protover_hex}`).

[id="lib"]
=== Library
This protocol specification is accompanied by a reference library for Golang, https://git.r00t2.io/r00t2/go_wireproto["WireProto"^] (https://git.r00t2.io/r00t2/wireproto[_source_^]):

++++
<a href="https://pkg.go.dev/go.pkg.dev/r00t2.io/wireproto">
<img src="https://pkg.go.dev/badge/go.pkg.dev/r00t2.io/wireproto.svg"
alt="Go Reference">
</a>
++++

[id="ytho"]
=== Why a Custom Message Format?
Because existing ones (e.g. JSON, XML, YAML) are slow/bloaty, inaccurate, and/or inflexible. They struggle with binary or arbitrary data (or, in e.g. XML's case, require intermediate conditional encoding/decoding).

If it can be represented as bytes (which all digital data can), WireProto can send and receive it.

Additionally:

* https://protobuf.dev/[*Protobuf*^] has performance issues (yes, really; protobufs have large overhead) and is restrictive on data types for future-proofing.
* https://go.dev/blog/gob[*Gob*^] is very language-limiting and does not support e.g. nil pointers and cyclical values.
* https://capnproto.org/[Cap'n Proto^] has wide language support and excellent performance but is terribly non-idiomatic, requiring the code to be generated from the schema and not vice versa (which is only ideal if you have only one communication interface).
* https://en.wikipedia.org/wiki/JSON_streaming[JSON streams^] have no delimiters defined, which is inconvenient when using a parser that does not know when the message ends/is complete, or that expects a standalone JSON object.

[TIP]
====
WireProto is only used for binary packing/unpacking; this means it can be used with any e.g. https://pkg.go.dev/net#Conn[`net.Conn`^] (and even has helper functions explicitly to facilitate this), storage on-disk, etc.

Thus it is transport/storage-agnostic, and can be used with a https://pkg.go.dev/net#Dial[TCP socket, UDP socket, IPC (InterProcess Communication)/UDS (UNIX Domain Socket) handle,^] https://pkg.go.dev/crypto/tls#Dial[TLS-tunneled TCP socket^], etc.
====

[id="msg"]
== Message Format

[TIP]
====
Throughout this document, you may see references to things like `LF`, `SOH`, and so forth.

These refer to _ASCII control characters_. You will also see many values represented in hex.

You can find more details about this (along with a full ASCII reference) https://square-r00t.net/ascii.html[here^]. Note that the socket API fully supports UTF-8; just be sure that your <<alloc_size, Size Allocators>> are aligned to the byte count, not the character count.
====

Each *message* is generally composed of:

* The <<msg_respstatus>>footnote:responly[Response messages only.]
* A <<cksum, Checksum>>footnote:optclient[Optional for Request.]footnote:reqsrv[Required for Response.]
* A <<hdrs_msgstart>>
* A <<proto_ver>>
* A <<msg_grp>> <<alloc_cnt>>
* A <<msg_grp>> <<alloc_size>>
* A <<hdrs_bodystart>>
* One (or more) <<msg_grp>>(s), each of which contains:
** One (or more) <<msg_grp_rec>>(s), each of which contains:
*** One (or more) <<msg_grp_rec_kv, Field/Value pair>>(s), each of which contains:
**** A <<msg_grp_rec_kv_nm>>
**** A <<msg_grp_rec_kv_val>>
**** A <<msg_grp_recresp, Copy Record>>footnote:responly[]
* A <<hdrs_bodyend>>
* A <<hdrs_msgend>>

[id="msg_respstatus"]
=== Response Status
Response messages have an additional byte prepended: a status indicator.
This allows client programs to quickly bail in the case of an error if no further parsing is desired.

The status will be indicated by one of <<hdrs_respstart, two values>>: an ASCII `ACK` (`0x06`) if all requests were processed successfully, or an ASCII `NAK` (`0x15`) if one or more errors were encountered across all records.

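As a rough sketch of the "quick bail" described above, a client could inspect only the first byte of a response before deciding whether to parse further. The `responseOK` function and the constant names are hypothetical and are not taken from the reference library.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	respACK byte = 0x06 // ASCII ACK: all requests succeeded
	respNAK byte = 0x15 // ASCII NAK: one or more errors were encountered
)

// responseOK looks only at the status byte of a response message.
// (Hypothetical helper; it does not validate the rest of the message.)
func responseOK(resp []byte) (bool, error) {
	if len(resp) == 0 {
		return false, errors.New("empty response")
	}
	switch resp[0] {
	case respACK:
		return true, nil
	case respNAK:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status byte 0x%02x", resp[0])
	}
}

func main() {
	ok, err := responseOK([]byte{0x15, 0x1b})
	fmt.Println(ok, err)
}
```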
[id="proto_ver"]
=== Protocol Version
The protocol version is a packed uint32 that denotes which version of this protocol specification is being used.

It is maintained separately from the *library* version/repo tags.

The current protocol version (as demonstrated in this document) is `{this_protover}` (`{this_protover_hex}`).

NOTE: Version `0` is reserved for current `HEAD` of the `master` branch of this specification and should be considered experimental.

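Packing and unpacking the version can be sketched with Go's `encoding/binary` package, relying on the rule that all packed uint32 values are big-endian. The `packUint32` and `unpackUint32` names are hypothetical helpers rather than the reference library's API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// protoVersion mirrors the version documented here (1, i.e. 0x00000001).
const protoVersion uint32 = 1

// packUint32 returns v as four big-endian bytes.
func packUint32(v uint32) []byte {
	b := make([]byte, 4)
	binary.BigEndian.PutUint32(b, v)
	return b
}

// unpackUint32 reads a big-endian uint32 from the first four bytes of b.
func unpackUint32(b []byte) (uint32, error) {
	if len(b) < 4 {
		return 0, fmt.Errorf("need 4 bytes, have %d", len(b))
	}
	return binary.BigEndian.Uint32(b), nil
}

func main() {
	packed := packUint32(protoVersion)
	fmt.Printf("%x\n", packed) // prints "00000001"

	ver, err := unpackUint32(packed)
	if err != nil {
		panic(err)
	}
	fmt.Println(ver) // prints "1"
}
```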
[id="msg_grp"]
=== Record Group
A record group contains one or more related <<msg_grp_rec, Records>>. It is common to only have a single Record Group.

Its structure is:

. <<msg_grp_rec>> <<alloc_cnt>>
. <<msg_grp_rec>> <<alloc_size>>
. One (or more) <<msg_grp_rec, Records>>

[id="msg_grp_rec"]
==== Record
A record contains one or more related <<msg_grp_rec_kv, Field/Value Pairs (FVP)>>. It is typical to only have a single Record.

Its structure is:

. <<msg_grp_rec_kv>> <<alloc_cnt>>
. <<msg_grp_rec_kv>> <<alloc_size>>
. One (or more) <<msg_grp_rec_kv, Field/Value Pairs>>

[IMPORTANT]
====
For response messages, the record's size allocator (but NOT the count allocator) includes the <<msg_grp_recresp, Copy Record>> size for each response record copy!footnote:responly[]
====

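The record structure above might be framed as in the following sketch for a *request* record. It assumes the size allocator covers exactly the bytes of the already-packed field/value pairs that follow it, and that allocators are big-endian like every other packed uint32. `packRecord` is a hypothetical helper, not the reference library's API; a response record would additionally need the Copy Record size added to the size allocator.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packRecord frames already-packed field/value pairs as a request record:
// a count allocator, a size allocator, then the pairs themselves.
// (Hypothetical helper; the pairs are treated as opaque byte slices.)
func packRecord(fvps [][]byte) []byte {
	var size int
	for _, p := range fvps {
		size += len(p)
	}
	out := make([]byte, 0, 8+size)
	out = binary.BigEndian.AppendUint32(out, uint32(len(fvps))) // count allocator
	out = binary.BigEndian.AppendUint32(out, uint32(size))      // size allocator
	for _, p := range fvps {
		out = append(out, p...)
	}
	return out
}

func main() {
	// Placeholder bytes stand in for two packed field/value pairs.
	rec := packRecord([][]byte{{0xaa, 0xbb}, {0xcc}})
	fmt.Printf("%x\n", rec) // prints "0000000200000003aabbcc"
}
```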
[id="msg_grp_rec_kv"]
===== Field/Value Pair (Key/Value Pair)
A field/value pair (also referred to as a key/value pair) contains a matched <<msg_grp_rec_kv_nm>> and its <<msg_grp_rec_kv_val>>.

Its structure is:

. <<msg_grp_rec_kv_nm>> <<alloc_size>>
. <<msg_grp_rec_kv_val>> <<alloc_size>>
. A single <<msg_grp_rec_kv_nm>>
. A single matching <<msg_grp_rec_kv_val>>

[IMPORTANT]
====
Unlike most/all other <<alloc>> for other sections/levels, the field name and value allocators are consecutive <<alloc_size, Size Allocators>>! This is because there is only one field name and one value per field/value pair, so no count is needed.
====

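The four-part layout above can be sketched as follows: two consecutive big-endian size allocators, then the name bytes, then the value bytes. `packFVP` is a hypothetical helper and not the reference library's API. Note that the name's size is its length in bytes, not in characters.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf8"
)

// packFVP packs one field/value pair as:
// name size, value size, name bytes, value bytes.
// (Hypothetical helper, for illustration only.)
func packFVP(name string, value []byte) ([]byte, error) {
	if !utf8.ValidString(name) {
		return nil, errors.New("field name must be valid UTF-8")
	}
	out := make([]byte, 0, 8+len(name)+len(value))
	out = binary.BigEndian.AppendUint32(out, uint32(len(name)))  // name size, in bytes
	out = binary.BigEndian.AppendUint32(out, uint32(len(value))) // value size, in bytes
	out = append(out, name...)
	out = append(out, value...)
	return out, nil
}

func main() {
	fvp, err := packFVP("k", []byte("vv"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", fvp) // prints "00000001000000026b7676"
}
```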
[id="msg_grp_rec_kv_nm"]
====== Field Name
The field name is usually from a finite set of allowed names. The <<msg_grp_rec_kv_val>>, while written as bytes, often contains a data structure defined by the field name. (A field name is closer to a "value type".) It *must* be a UTF-8 string.

Its structure is:

. The name in bytes

[id="msg_grp_rec_kv_val"]
====== Field Value
A field's value is, on the wire, just a series of bytes. The actual content of those bytes, including any structure or encoding, likely depends on the paired <<msg_grp_rec_kv_nm>>.

Its structure is:

. The value in bytes

[id="msg_grp_recresp"]
===== Copy Record (Response Copy of Request)
This contains a "copy" of the original/request's <<msg_grp_rec>> that this record is in response to.

It is a variant of a <<msg_grp_rec>> used exclusively in responses, and is tied to (included in) each response's <<msg_grp_rec_kv, FVP>>.

Its structure is:

. <<msg_grp_rec_kvcpy>> <<alloc_cnt>>
. <<msg_grp_rec_kvcpy>> <<alloc_size>>
. One (or more) <<msg_grp_rec_kvcpy, Field/Value Pairs (Response Copy)>>

[id="msg_grp_rec_kvcpy"]
====== Field/Value Pair (Key/Value Pair) (Response Copy)
A field/value pair (also referred to as a key/value pair) contains a matched <<msg_grp_rec_kv_nm>> and its <<msg_grp_rec_kv_val>>.

It is a variant of a <<msg_grp_rec_kv, Field/Value Pair>> used exclusively in response copies of the original request's FVP.

Its structure is:

. <<msg_grp_rec_kv_nm>> <<alloc_size>>
. <<msg_grp_rec_kv_val>> <<alloc_size>>
. A single <<msg_grp_rec_kv_nm>>
. A single matching <<msg_grp_rec_kv_val>>

[IMPORTANT]
====
Unlike most/all other <<alloc>> for other sections/levels, the field name and value allocators are consecutive <<alloc_size, Size Allocators>>! This is because there is only one field name and one value per field/value pair, so no count is needed.
====

[id="cksum"]
== Checksums
Checksums are optional for the client but the server will *always* send them. *If present* in the request, the server will verify that the checksum matches the message body (<<hdrs_bodystart, body start>> to <<hdrs_bodyend, body end>>, headers included). If the checksum does not match, an error will be returned.

They are represented as a big-endian-packed uint32.

The checksum must be prefixed with a <<hdrs_cksum>>. If no checksum is provided, this prefix must *not* be included in the sequence.

[TIP]
====
You can quickly check if a checksum is present by checking the first byte in requests or the second byte in responses. If it is `ESC` (`0x1b`), a checksum is provided. If it is `SOH` (`0x01`), one was *not* provided.
====

The checksum method used is the https://users.ece.cmu.edu/~koopman/crc/crc32.html[IEEE 802.3 CRC-32^], which should be natively available for all/most client implementations as it is perhaps the most ubiquitous of CRC-32 variants. (Polynomial `0x04c11db7`, reversed polynomial `0xedb88320`.)

To confirm you are using the correct CRC-32 implementation (as there are a *ton* of "CRC-32" algorithms and methods out there), use the following validations:

.CRC-32 Validations
[cols="^.^2m,3m,^.^1m,^.^2m,^.^2m",options="header"]
|===
| String ^.^| Bytes | Checksum (integer) | Checksum (bytes, little-endian) | Checksum (bytes, big-endian)

| FooBarBazQuux | 0x466f6f42617242617a51757578 | 983022564 | 0xe4bb973a | 0x3a97bbe4
| 0123456789abcdef | 0x30313233343536373839616263646566 | 1757737011 | 0x33f0c468 | 0x68c4f033
|===

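In Go, the standard library's `hash/crc32` package with its IEEE table uses the same reversed polynomial (`0xedb88320`), so a check against the table above might look like the sketch below. `checksumField` is a hypothetical helper, not the reference library's API, and it assumes the caller passes exactly the byte range the checksum is meant to cover.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const hdrCKSUM byte = 0x1b // ESC, the CKSUM header prefix

// checksumField returns the CKSUM prefix followed by the big-endian
// CRC-32 (IEEE) of body. (Hypothetical helper, for illustration only.)
func checksumField(body []byte) []byte {
	out := []byte{hdrCKSUM}
	return binary.BigEndian.AppendUint32(out, crc32.ChecksumIEEE(body))
}

func main() {
	// Inputs and expected integers are copied from the validation table.
	checks := []struct {
		in   string
		want uint32
	}{
		{"FooBarBazQuux", 983022564},
		{"0123456789abcdef", 1757737011},
	}
	for _, c := range checks {
		got := crc32.ChecksumIEEE([]byte(c.in))
		fmt.Printf("%-16s got=%d want=%d match=%t\n", c.in, got, c.want, got == c.want)
	}

	// Prefix plus packed checksum, as it would precede a message.
	fmt.Printf("%x\n", checksumField([]byte("FooBarBazQuux")))
}
```

If any line reports `match=false`, the CRC-32 variant in use is not the one this specification expects.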
[id="hdrs"]
== Headers
Certain sections are wrapped with an identifying header. Those headers are included below for reference.

[id="hdrs_respstart"]
=== `RESPSTART` Byte Sequence
Responses have a <<msg_respstatus>>.footnote:responly[]

It is either an `ACK` (`0x06`) or `NAK` (`0x15`).

[id="hdrs_cksum"]
=== `CKSUM` Header Prefix
A checksum, if provided, will have a prefix header of `ESC` (`0x1b`).

[id="hdrs_msgstart"]
=== `MSGSTART` Header Prefix
The message start header indicates the start of a message.

It is an `SOH` (`0x01`).

[id="hdrs_bodystart"]
=== `BODYSTART` Header Prefix
The body start header indicates that actual data/records follow.

It is an `STX` (`0x02`).

[id="hdrs_bodyend"]
=== `BODYEND` Sequence
The body end sequence indicates the end of data/records.

It is an `ETX` (`0x03`).

[id="hdrs_msgend"]
=== `MSGEND` Sequence
The message end sequence indicates that a message in its entirety has ended.

It is an `EOT` (`0x04`).

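For reference, the header bytes above could be collected as Go constants, together with the checksum-presence check described in the Checksums tip. The constant names and the `hasChecksum` function are hypothetical and are not taken from the reference library.

```go
package main

import (
	"errors"
	"fmt"
)

// Header bytes as listed in this section (names are illustrative).
const (
	hdrACK       byte = 0x06 // RESPSTART: success
	hdrNAK       byte = 0x15 // RESPSTART: one or more errors
	hdrCKSUM     byte = 0x1b // ESC
	hdrMSGSTART  byte = 0x01 // SOH
	hdrBODYSTART byte = 0x02 // STX
	hdrBODYEND   byte = 0x03 // ETX
	hdrMSGEND    byte = 0x04 // EOT
)

// hasChecksum reports whether msg carries a checksum by checking the first
// byte of a request or the second byte of a response.
func hasChecksum(msg []byte, isResponse bool) (bool, error) {
	idx := 0
	if isResponse {
		idx = 1 // skip the response status byte
	}
	if len(msg) <= idx {
		return false, errors.New("message too short")
	}
	switch msg[idx] {
	case hdrCKSUM:
		return true, nil
	case hdrMSGSTART:
		return false, nil
	}
	return false, fmt.Errorf("unexpected byte 0x%02x at offset %d", msg[idx], idx)
}

func main() {
	present, err := hasChecksum([]byte{hdrACK, hdrCKSUM, 0, 0, 0, 0, hdrMSGSTART}, true)
	fmt.Println(present, err)
}
```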
[id="alloc"]
== Allocators
There are two types of allocators included for each following sequence of bytes: `count allocators` and `size allocators`.

They can be used by clients to determine the size of destination buffers, and are used by the server to efficiently unpack requests.

They are usually paired together with the count allocator preceding the size allocator, but not always (e.g. <<msg_grp_rec_kv>> have two <<alloc_size>>).

All allocators are unsigned 32-bit integers, big-endian-packed.

[id="alloc_cnt"]
=== Count Allocator
Count allocators indicate *how many* child objects are contained.

[id="alloc_size"]
=== Size Allocator
Size allocators indicate *how large* (in bytes) all child objects are when combined. This includes e.g. separators, etc.

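A reader might consume a count/size allocator pair as in this sketch, which assumes the common layout of a count allocator followed immediately by a size allocator and then the child bytes. `splitChildren` is a hypothetical helper, not the reference library's API. Checking the size against the bytes actually available guards against a corrupt or hostile allocator.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitChildren reads a count allocator and a size allocator from the front
// of buf, then returns the count, the child bytes, and whatever follows them.
func splitChildren(buf []byte) (count uint32, children, rest []byte, err error) {
	if len(buf) < 8 {
		return 0, nil, nil, errors.New("short buffer: need 8 allocator bytes")
	}
	count = binary.BigEndian.Uint32(buf[0:4])
	size := binary.BigEndian.Uint32(buf[4:8])
	if uint64(size) > uint64(len(buf)-8) {
		return 0, nil, nil, fmt.Errorf("size allocator %d exceeds %d remaining bytes", size, len(buf)-8)
	}
	end := 8 + int(size)
	return count, buf[8:end], buf[end:], nil
}

func main() {
	// Two children totalling three bytes, followed by one trailing byte.
	buf := []byte{0, 0, 0, 2, 0, 0, 0, 3, 0xaa, 0xbb, 0xcc, 0x03}
	count, children, rest, err := splitChildren(buf)
	if err != nil {
		panic(err)
	}
	fmt.Printf("count=%d children=%x rest=%x\n", count, children, rest)
}
```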
[id="ref"]
== Reference Model and Examples
For a more visual explanation, given the following e.g. Golang structs from the https://pkg.go.dev/r00t2.io/wireproto[Golang reference library^] (`wireproto.Request{}` and `wireproto.Response{}`):

[id="ref_single"]
=== Single/Simple

[id="ref_single_req"]
==== Single/Simple Request
[%collapsible]
.Example Message Structure (Simple Request)
====
[source,go]
----
include::https://git.r00t2.io/r00t2/go_wireproto/raw/{lib_ver_ref}/{lib_ver}/test_obj_simple_req.go[]
----
====

Would then serialize as (in hex):

[%collapsible]
.Annotated Hex
====
[source,text]
----
include::docs/data/request.simple.txt[]
----
====

Or, non-annotated:
[source,text]
----
include::docs/data/request.simple.hex[]
----

[id="ref_single_resp"]
==== Single/Simple Response
[%collapsible]
.Example Message Structure (Simple Response)
====
[source,go]
----
include::https://git.r00t2.io/r00t2/go_wireproto/raw/{lib_ver_ref}/{lib_ver}/test_obj_simple_resp.go[]
----
====

Would then serialize as (in hex):

[%collapsible]
.Annotated Hex
====
[source,text]
----
include::docs/data/response.simple.txt[]
----
====

Or, non-annotated:
[source,text]
----
include::docs/data/response.simple.hex[]
----

[id="ref_multi"]
=== Multiple/Many/Complex
Multiple commands, parameters, etc. can be specified in one message.

[id="ref_multi_req"]
==== Complex Request
[%collapsible]
.Example Message Structure (Multiple/Many Requests, Single Message)
====
[source,go]
----
include::https://git.r00t2.io/r00t2/go_wireproto/raw/{lib_ver_ref}/{lib_ver}/test_obj_multi_req.go[]
----
====

Would then serialize as (in hex):

[%collapsible]
.Annotated Hex
====
[source,text]
----
include::docs/data/request.multi.txt[]
----
====

Or, non-annotated:
[source,text]
----
include::docs/data/request.multi.hex[]
----

[id="ref_multi_resp"]
==== Complex Response
[%collapsible]
.Example Message Structure (Response to Multiple/Many Requests, Single Message)
====
[source,go]
----
include::https://git.r00t2.io/r00t2/go_wireproto/raw/{lib_ver_ref}/{lib_ver}/test_obj_multi_resp.go[]
----
====

Would then serialize as (in hex):

[%collapsible]
.Annotated Hex
====
[source,text]
----
include::docs/data/response.multi.txt[]
----
====

Or, non-annotated:
[source,text]
----
include::docs/data/response.multi.hex[]
----