Reduce memory usage of the static flatbuffer API

This contains several changes that work together and are hard to
separate out.

1) Instead of requiring the whole contents of a sub-message or vector to
   be aligned, express the alignment requirement at an offset into the
   message.  This lets us leave the length field in a message, for
   example, at 4 byte alignment when the body requires 8 byte
   alignment.  This enables better packing of fields.
2) From James: don't pre-reserve space for vectors with 0 length.  They
   will trigger a re-allocation anyway when they are used since there
   is no space allocated, so pre-allocating doesn't help.
3) Remove padding at the end of messages and require the allocator to
   handle it instead.  We used to allocate kSize + kAlign and then
   manually align things, which resulted in wasted space.
4) Automatically add any extra padding after a vector to the vector.
   For some small vectors, this lets us use the padding for the vector
   rather than allocating more space.
5) Shrink the code generated for the object offsets by adding constexpr
   variables holding the previous object size rather than inlining it.
   This results in a much faster build since clang-format was
   struggling with the large fields at build time.
6) Sort fields in a flatbuffer by alignment to pack them better.
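
As a rough illustration of item 1, the sketch below compares aligning
a whole vector (length field plus body) against aligning only the body
at an offset.  The helper names (PaddedSizeSketch,
AlignWithOffsetSketch, and the two BodyStart functions) are
hypothetical and written for this example; they are not the
aos::fbs implementation, and the 4 byte length / 8 byte body layout is
an assumption taken from the example in item 1.

```cpp
#include <cstddef>

// Hypothetical helper (not the aos::fbs code): rounds size up to the next
// multiple of alignment.
constexpr size_t PaddedSizeSketch(size_t size, size_t alignment) {
  return (size + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: returns the smallest position p >= offset such that
// (p + align_offset) is a multiple of alignment.
constexpr size_t AlignWithOffsetSketch(size_t offset, size_t alignment,
                                       size_t align_offset) {
  return PaddedSizeSketch(offset + align_offset, alignment) - align_offset;
}

// Assumed layout for the example: a 4 byte length field followed by a body
// that needs 8 byte alignment.
constexpr size_t kLengthSize = 4;
constexpr size_t kBodyAlign = 8;

// Whole-object alignment: the length field itself is placed at 8 byte
// alignment, and the body is then padded up to 8 bytes again.
constexpr size_t WholeObjectBodyStart(size_t offset) {
  return PaddedSizeSketch(PaddedSizeSketch(offset, kBodyAlign) + kLengthSize,
                          kBodyAlign);
}

// Offset alignment: only the body (kLengthSize bytes into the object) has to
// be 8 byte aligned, so the length field may sit at 4 byte alignment.
constexpr size_t OffsetAlignedBodyStart(size_t offset) {
  return AlignWithOffsetSketch(offset, kBodyAlign, kLengthSize) + kLengthSize;
}

// Starting right after a 4 byte prefix, the body lands at 16 with
// whole-object alignment and at 8 with offset alignment.
static_assert(WholeObjectBodyStart(4) == 16);
static_assert(OffsetAlignedBodyStart(4) == 8);
```

In this example the offset form saves 8 bytes when the vector starts
at offset 4 and costs nothing when it starts at offset 0.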

Change-Id: I5af440855e3425be31fa7f30c68af552fcf06cb2
Signed-off-by: Austin Schuh <austin.schuh@bluerivertech.com>
Signed-off-by: James Kuszmaul <james.kuszmaul@bluerivertech.com>
diff --git a/aos/flatbuffers/builder.h b/aos/flatbuffers/builder.h
index b4aefbe..be41b63 100644
--- a/aos/flatbuffers/builder.h
+++ b/aos/flatbuffers/builder.h
@@ -94,7 +94,6 @@
 
  private:
   size_t Alignment() const override { return flatbuffer_.t.Alignment(); }
-  size_t AbsoluteOffsetOffset() const override { return 0; }
   size_t NumberOfSubObjects() const override { return 1; }
   void SetPrefix() {
     // We can't do much if the provided buffer isn't at least 4-byte aligned,
@@ -103,13 +102,16 @@
     CHECK_EQ(reinterpret_cast<size_t>(buffer_.data()) % alignof(uoffset_t), 0u);
     *reinterpret_cast<uoffset_t *>(buffer_.data()) = flatbuffer_start_;
   }
-  // Because the allocator API doesn't provide a way for us to request a
-  // strictly aligned buffer, manually align the start of the actual flatbuffer
-  // data if needed.
+  // Manually aligns the start of the actual flatbuffer to handle the alignment
+  // offset.
   static size_t BufferStart(std::span<uint8_t> buffer) {
-    return aos::fbs::PaddedSize(
+    CHECK_EQ(reinterpret_cast<size_t>(buffer.data()) % T::kAlign, 0u)
+        << "Failed to allocate data of length " << buffer.size()
+        << " with alignment " << T::kAlign;
+
+    return aos::fbs::AlignOffset(
                reinterpret_cast<size_t>(buffer.data()) + sizeof(uoffset_t),
-               T::kAlign) -
+               T::kAlign, T::kAlignOffset) -
            reinterpret_cast<size_t>(buffer.data());
   }