= WPILib Packed Struct Serialization Specification, Version 1.0
WPILib Developers <wpilib@wpi.edu>
Revision 1.0 (0x0100), 6/8/2023
:toc:
:toc-placement: preamble
:sectanchors:

A simple format and schema for serialization of packed, fixed-size structured data.

[[motivation]]
== Motivation

Schema-based serialization formats such as Protobuf and FlatBuffers are extremely flexible and can handle data type evolution, complex nested data structures, variable-size or repeated data, optional fields, etc. However, this flexibility comes at a cost in both serialized data size and processing overhead. Many simple data structures, such as screen coordinates or robot poses, are fixed in size and can be stored much more compactly and serialized/deserialized much more quickly, especially on embedded or real-time platforms.

Simply storing a C-style packed structure is very compact and fast, but information about the layout of the structure and the meaning of each member must be communicated separately so that other tools, such as interactive dashboards that analyze individual structure members, can introspect the data. This standard layout and schema provides a standardized way to communicate that information and enables dynamic decoding.

Python's `struct` module uses a character-based approach to describe the data layout of structures, but has no provision for naming each member to communicate intent or meaning.

[[references]]
== References

[[c-struct-declaration]]
* Struct declaration, https://en.cppreference.com/w/c/language/struct

[[definitions]]
== Definitions

[[schema]]
== Schema

The schema is a text-based format with syntax similar to the list of member declarations in a C structure. The C syntax is flexible, easy to parse, and matches the intent of specifying a fixed-size structure.

Each member of the struct is defined by a single declaration, which is either a standard declaration or a bit-field declaration. Declarations are separated by semicolons; the last declaration may optionally have a trailing semicolon. Empty declarations (e.g. two semicolons back-to-back or separated only by whitespace) are allowed and are ignored. Unlike in C structures, commas cannot be used to declare multiple members of the same type; each member needs its own declaration. Declarations may also start and end with whitespace.
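The separation rules above can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function name `split_declarations` is hypothetical, and the sketch assumes a semicolon never appears inside a declaration, which holds for the grammar described in this document.

```python
from typing import List


def split_declarations(schema: str) -> List[str]:
    """Split a schema into declarations, dropping empty ones.

    Surrounding whitespace is stripped from each declaration, and empty
    declarations (";;" or "; ;") are ignored, so a trailing semicolon is
    harmless.
    """
    parts = (part.strip() for part in schema.split(";"))
    return [part for part in parts if part]
```

For example, `split_declarations("bool b; ; int16 i;")` yields the two declarations `bool b` and `int16 i`.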

[[variable]]
=== Standard Declaration

Standard declarations declare a member of a certain type or a fixed-size array of that type. The structure of a standard declaration is:

* optional enum specification (integer data types only)
* optional whitespace
* type name
* whitespace
* identifier name
* optional array size, consisting of:
** optional whitespace
** `[`
** optional whitespace
** size of array
** optional whitespace
** `]`

The type name may be one of these:

[cols="1,1,3", options="header"]
|===
|Type Name|Description|Payload Data Contents
|`bool`|boolean|single byte (0=false, 1=true)
|`char`|character|single byte (assumed UTF-8)
|`int8`|integer|1-byte (8-bit) signed value
|`int16`|integer|2-byte (16-bit) signed value
|`int32`|integer|4-byte (32-bit) signed value
|`int64`|integer|8-byte (64-bit) signed value
|`uint8`|unsigned integer|1-byte (8-bit) unsigned value
|`uint16`|unsigned integer|2-byte (16-bit) unsigned value
|`uint32`|unsigned integer|4-byte (32-bit) unsigned value
|`uint64`|unsigned integer|8-byte (64-bit) unsigned value
|`float` or `float32`|float|4-byte (32-bit) IEEE-754 value
|`double` or `float64`|double|8-byte (64-bit) IEEE-754 value
|===

If the type name is not one of the above, it must be the name of another struct.
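As an illustration only, the primitive types in the table can be mapped onto format characters of Python's `struct` module to look up their payload sizes. The names `STRUCT_CODES` and `primitive_size` are hypothetical and not defined by this specification; the `<` prefix selects little-endian byte order with standard sizes and no alignment.

```python
import struct

# Schema type name -> Python struct format character (assumed mapping).
STRUCT_CODES = {
    "bool": "?", "char": "c",
    "int8": "b", "int16": "h", "int32": "i", "int64": "q",
    "uint8": "B", "uint16": "H", "uint32": "I", "uint64": "Q",
    "float": "f", "float32": "f",
    "double": "d", "float64": "d",
}


def primitive_size(type_name: str) -> int:
    """Return the payload size in bytes of a primitive schema type."""
    return struct.calcsize("<" + STRUCT_CODES[type_name])
```

A type name that is missing from the mapping raises `KeyError`; under the rule above, such a name would have to refer to another struct.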

Examples of valid standard declarations:

* `bool value` (boolean value, 1 byte)
* `double arr[4]` (array of 4 doubles, 32 bytes total)
* `enum {a=1, b=2} int8 val` (enumerated value, 1 byte)
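One way to recognize a standard declaration is a single regular expression that follows the structure listed above. This is a sketch under assumptions: the names `DECL_RE` and `parse_standard` are hypothetical, identifiers are assumed to follow C rules (which this document does not state explicitly), and the enum specification is only located here, not validated.

```python
import re
from typing import Optional, Tuple

DECL_RE = re.compile(
    r"""
    ^\s*
    (?:(?:enum\s*)?\{[^}]*\})?          # optional enum specification
    \s*
    (?P<type>[A-Za-z_]\w*)              # type name
    \s+
    (?P<name>[A-Za-z_]\w*)              # identifier name
    (?:\s*\[\s*(?P<size>\d+)\s*\])?     # optional array size
    \s*$
    """,
    re.VERBOSE,
)


def parse_standard(decl: str) -> Optional[Tuple[str, str, Optional[int]]]:
    """Return (type, name, array_size) or None if decl does not match."""
    m = DECL_RE.match(decl)
    if m is None:
        return None
    size = int(m.group("size")) if m.group("size") else None
    return m.group("type"), m.group("name"), size
```

With this sketch, `double arr[4]` parses to `("double", "arr", 4)`, while a C-style `int32 a, b` is rejected because commas cannot declare multiple members.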

[[enum]]
==== Enum Specification

Integer declarations may have an enum specification to give meaning to specific values. Values that are not listed may still be communicated, but have no defined meaning. The structure of an enum specification is:

* optional `enum`
* optional whitespace
* `{`
* zero or more enum values, consisting of:
** optional whitespace
** identifier
** optional whitespace
** `=`
** optional whitespace
** integer value
** optional whitespace
** comma (optional for last value)
* optional whitespace
* `}`

Examples of valid enum specifications:

* `enum{}`
* `enum { a = 1 }`
* `enum{a=1,b=2,}`
* `{a=1}`

Examples of invalid enum specifications:

* `enum` (no `{}`)
* `enum{=2}` (missing identifier)
* `enum{a=1,b,c}` (missing values)
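The valid and invalid examples above can be checked with a small parser. This is a hedged sketch: `parse_enum` and the two patterns are hypothetical names, identifiers are assumed to follow C rules, and an optional leading minus sign on integer values is an assumption that the text does not confirm.

```python
import re
from typing import Dict, Optional

ENUM_RE = re.compile(r"^\s*(?:enum)?\s*\{(?P<body>[^{}]*)\}\s*$")
VALUE_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(-?\d+)\s*$")


def parse_enum(spec: str) -> Optional[Dict[str, int]]:
    """Return a name -> value mapping, or None if spec is invalid."""
    m = ENUM_RE.match(spec)
    if m is None:
        return None
    items = m.group("body").split(",")
    # An empty body or a trailing comma leaves one empty final item.
    if items[-1].strip() == "":
        items.pop()
    values = {}
    for item in items:
        vm = VALUE_RE.match(item)
        if vm is None:
            return None  # missing identifier, "=", or value
        values[vm.group(1)] = int(vm.group(2))
    return values
```

The function returns an empty mapping for `enum{}` and `None` for each of the invalid examples listed above.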

[[bit-field]]
=== Bit-field Declaration

Bit-field declarations declare a member with an explicit width in bits. The structure of a bit-field declaration is:

* optional enum specification (integer data types only)
* optional whitespace
* type name; must be boolean or one of the integer data types
* whitespace
* identifier name
* optional whitespace
* colon (`:`)
* optional whitespace
* integer number of bits; minimum 1; maximum 1 for boolean types; for integer types, the maximum is the width of the type (e.g. 32 for `int32`)

As with non-bit-field integer declarations, an enum can be specified for integer bit-fields (e.g. `enum {a=1, b=2} uint32 value : 2`).

It is not possible to have an array of bit-fields.

Examples of valid bit-field declarations:

* `bool value : 1`
* `enum{a=1,b=2}int8 value:2`

Examples of invalid bit-field declarations:

* `double val:2` (must be integer or boolean)
* `int32 val[2]:2` (cannot be array)
* `bool val:3` (bool must be 1 bit)
* `int16 val:17` (bit-field larger than storage size)
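The width limits above can be enforced with a short validator. This is a minimal sketch, assuming C-style identifiers; `INT_WIDTHS`, `BITFIELD_RE`, and `parse_bitfield` are hypothetical names, and the contents of the enum specification are not validated here.

```python
import re
from typing import Optional, Tuple

INT_WIDTHS = {
    "int8": 8, "int16": 16, "int32": 32, "int64": 64,
    "uint8": 8, "uint16": 16, "uint32": 32, "uint64": 64,
}

BITFIELD_RE = re.compile(
    r"^\s*(?P<enum>(?:enum\s*)?\{[^}]*\})?\s*"
    r"(?P<type>[A-Za-z_]\w*)\s+(?P<name>[A-Za-z_]\w*)"
    r"\s*:\s*(?P<bits>\d+)\s*$"
)


def parse_bitfield(decl: str) -> Optional[Tuple[str, str, int]]:
    """Return (type, name, bits) or None if decl is not a valid bit-field."""
    m = BITFIELD_RE.match(decl)
    if m is None:
        return None  # also rejects arrays such as "int32 val[2]:2"
    type_name, bits = m.group("type"), int(m.group("bits"))
    if type_name == "bool":
        if m.group("enum"):
            return None  # enum specifications are for integer types only
        max_bits = 1
    else:
        max_bits = INT_WIDTHS.get(type_name, 0)  # 0 rejects float/double
    if not 1 <= bits <= max_bits:
        return None
    return type_name, m.group("name"), bits
```

Each of the invalid examples listed above is rejected by this sketch, and both valid examples are accepted.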

[[layout]]
== Data Layout

Members are stored in the same order they appear in the schema. Multi-byte members are stored in little-endian byte order. Members are not aligned to any particular boundary; no byte-level padding is present in the data.

[source]
----
bool b;
int16 i;
----

results in a 3-byte encoding:

`bbbbbbbb iiiiiiii iiiiiiii`

where the first `iiiiiiii` is the least significant byte of `i`.
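The same 3-byte layout can be reproduced with Python's `struct` module, shown here purely as an illustration. The function name `encode_example` is hypothetical; the `<` prefix requests little-endian byte order with no alignment padding, which matches the rules above.

```python
import struct


def encode_example(b: bool, i: int) -> bytes:
    # "<" = little-endian, standard sizes, no padding between members.
    # "?" = bool (1 byte), "h" = int16 (2 bytes).
    return struct.pack("<?h", b, i)


encoded = encode_example(True, 0x1234)
```

Here `encoded` is three bytes long: the byte for `b`, then the least significant byte of `i` (`0x34`), then its most significant byte (`0x12`).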

[[layout-array]]
=== Array Data Layout

For array members, the individual items of the array are stored consecutively with no padding between items.

[source]
----
int16 i[2];
----

results in a 4-byte encoding:

`i0i0i0i0 i0i0i0i0 i1i1i1i1 i1i1i1i1`

where `i0` is the first element of the array and `i1` is the second element.
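As a small illustration, a repeat count in a Python `struct` format string produces the same consecutive layout. The function name `encode_int16_pair` is hypothetical.

```python
import struct


def encode_int16_pair(first: int, second: int) -> bytes:
    # "<2h" = two little-endian int16 values stored back to back.
    return struct.pack("<2h", first, second)
```

Each element occupies two bytes, least significant byte first, so the first element fills bytes 0 and 1 and the second fills bytes 2 and 3.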

[[layout-nested-structure]]
=== Nested Structure Data Layout

Nested structures also have no surrounding padding.

Given the `Inner` schema

[source]
----
int16 i;
int8 x;
----

an outer schema of

[source]
----
char c;
Inner s;
bool b;
----

results in a 5-byte encoding:

`cccccccc iiiiiiii iiiiiiii xxxxxxxx bbbbbbbb`
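Because nested structures add no padding, the outer layout is simply the inner members spliced in place. The sketch below illustrates this by concatenating Python `struct` format strings; `INNER_FORMAT`, `OUTER_FORMAT`, and `encode_outer` are hypothetical names.

```python
import struct

INNER_FORMAT = "hb"                       # int16 i; int8 x;
OUTER_FORMAT = "<c" + INNER_FORMAT + "?"  # char c; Inner s; bool b;


def encode_outer(c: bytes, i: int, x: int, b: bool) -> bytes:
    """Encode the outer struct; c must be a single byte."""
    return struct.pack(OUTER_FORMAT, c, i, x, b)
```

`struct.calcsize(OUTER_FORMAT)` evaluates to 5, matching the 5-byte encoding shown above.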

[[layout-bit-field]]
=== Bit-Field Data Layout

Multiple adjacent bit-fields of the same integer type width are packed together into the minimum number of elements of that type. The bit-fields are packed, starting from the least significant bit, in the order they appear in the schema. Individual bit-fields must not span multiple elements of the underlying type; if a bit-field is larger than the remaining space in the current element, a new element of that type is started and the bit-field starts from the least significant bit of the new element. Unused bits should be set to 0 during serialization and must be ignored during deserialization.

Boolean bit-fields are always a single bit wide. The underlying data type is `uint8` by default, but if a boolean bit-field immediately follows a bit-field of another integer type (and fits), it is packed into that type.

[source]
----
int8 a:4;
int16 b:4;
----

results in a 3-byte encoding:

`0000aaaa 0000bbbb 00000000`

as the integer type widths are different, even though the bits would fit.

[source]
----
int16 a:4;
uint16 b:5;
bool c:1;
int16 d:7;
----

results in a 4-byte encoding:

`bbbbaaaa 000000cb 0ddddddd 00000000`

as `c` is packed into the preceding `int16`, and `d` is too large to fit in the remaining bits of the first element.

[source]
----
uint8 a:4;
int8 b:2;
bool c:1;
int16 d:1;
----

results in a 3-byte encoding:

`0cbbaaaa 0000000d 00000000`

as `d` is an `int16`, versus the 8-bit types of the previous values.

[source]
----
bool a:1;
bool b:1;
int8 c:2;
----

results in a 1-byte encoding:

`0000ccba`

as `c` is an `int8`, which has the same width as the default `uint8` storage of the boolean bit-fields.

[source]
----
bool a:1;
bool b:1;
int16 c:2;
----

results in a 3-byte encoding:

`000000ba 000000cc 00000000`

as `c` is an `int16`, which is wider than the default `uint8` storage of the boolean bit-fields.

Bit-fields do not "look inside" nested structures. Given the `Inner` schema

[source]
----
int8 a:1;
----

and an outer schema of

[source]
----
int8 b:1;
Inner s;
int8 c:1;
----

the result is a 3-byte encoding:

`0000000b 0000000a 0000000c`
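The packing rules in this section can be sketched as a small Python function that encodes one run of adjacent bit-fields. This is an illustrative reading of the rules, not a reference implementation: `WIDTHS` and `pack_bitfields` are hypothetical names, and any standard member or nested structure is assumed to end the run, so the caller would invoke the function once per run.

```python
from typing import Iterable, Tuple

# Storage width in bits; bool defaults to a uint8-sized element.
WIDTHS = {"bool": 8, "int8": 8, "uint8": 8, "int16": 16, "uint16": 16,
          "int32": 32, "uint32": 32, "int64": 64, "uint64": 64}


def pack_bitfields(fields: Iterable[Tuple[str, int, int]]) -> bytes:
    """Pack (type_name, bits, value) tuples for adjacent bit-fields."""
    out = bytearray()
    unit_width = 0  # width of the open storage element; 0 = none open
    used = 0        # bits already occupied in the open element
    unit = 0        # accumulated value of the open element

    def flush() -> None:
        nonlocal unit_width, used, unit
        if unit_width:
            out.extend(unit.to_bytes(unit_width // 8, "little"))
        unit_width = used = unit = 0

    for type_name, bits, value in fields:
        fits = bool(unit_width) and used + bits <= unit_width
        if type_name == "bool":
            same_unit = fits  # bool joins any open element with room
        else:
            same_unit = fits and WIDTHS[type_name] == unit_width
        if not same_unit:
            flush()
            unit_width = WIDTHS[type_name]
        # Mask to the field width (two's complement for negatives),
        # then place the field above the bits already used.
        unit |= (int(value) & ((1 << bits) - 1)) << used
        used += bits
    flush()
    return bytes(out)
```

For instance, the `int16 a:4; uint16 b:5; bool c:1; int16 d:7;` example above, with every field set to all ones, would be passed as `[("int16", 4, 15), ("uint16", 5, 31), ("bool", 1, 1), ("int16", 7, 127)]` and produces four bytes.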

[[layout-character-arrays]]
=== Character Array (String) Data Layout

Character arrays, as with other arrays, must be fixed length. The text they contain should be UTF-8. If a string is shorter than the length of the character array, the string starts at the first byte of the array, and any unused bytes at the end of the array must be filled with 0.

[source]
----
char s[4];
----

with a string of "a" results in:

`01100001 00000000 00000000 00000000`

with a string of "abcd" results in:

`01100001 01100010 01100011 01100100`
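The zero-fill rule can be sketched in Python as follows. The function name `encode_char_array` is hypothetical, and raising an error for text that does not fit is a choice made for this sketch; this document does not say how oversized strings should be handled.

```python
def encode_char_array(text: str, length: int) -> bytes:
    """Encode text as UTF-8 into a fixed-length, zero-filled array."""
    data = text.encode("utf-8")
    if len(data) > length:
        # Assumption of this sketch: reject rather than truncate.
        raise ValueError("string does not fit in the character array")
    # Pad on the right with zero bytes up to the array length.
    return data.ljust(length, b"\x00")
```

With `length` set to 4, the string "a" encodes to the byte `0x61` followed by three zero bytes, and "abcd" fills the array completely with no terminator.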