Switch Logger over to using the new context UUID

We got a lovely kersplat when Jim rebooted one of the pis and I tried
to restart the logger.  We've seen this crash occasionally in a bunch of
different spots.

F0313 17:40:26.905807  1893 log_writer.cc:677] Check failed: node_state_[f.data_node_index].header_valid : Can't write data before the header on channel { "name": "/pi2/aos", "type": "aos.message_bridge.Timestamp", "frequency": 10, "max_size": 200, "num_senders": 2, "source_node": "pi2", "destination_nodes": [ { "name": "roborio", "timestamp_logger": "LOCAL_AND_REMOTE_LOGGER", "timestamp_logger_nodes": [ "roborio" ], "priority": 1, "time_to_live": 5000000 }, { "name": "laptop", "priority": 1, "time_to_live": 5000000 } ], "logger": "LOCAL_AND_REMOTE_LOGGER", "logger_nodes": [ "roborio", "laptop" ] }
*** Check failure stack trace: ***
    @   0x53c888  google::LogMessage::Fail()
    @   0x53e418  google::LogMessage::SendToLog()
    @   0x53c508  google::LogMessage::Flush()
    @   0x53cbec  google::LogMessageFatal::~LogMessageFatal()
    @   0x4e2f00  aos::logger::Logger::LogUntil()
    @   0x4e5784  aos::logger::Logger::StartLogging()
    @   0x4d90f0  _ZNSt17_Function_handlerIFvvEZ4mainEUlvE_E9_M_invokeERKSt9_Any_data
    @   0x4f6cac  aos::ShmEventLoop::Run()
    @   0x4d4288  main
    @ 0xb6c1e580  __libc_start_main

This is because we are using the ServerStatistics message for the boot
UUID of a node.  If the logger starts up and there is data on a channel
from a node which isn't currently active, we don't have a way of telling
which boot that data is from.  That means we can't log the header, so we
can't log the data.

Instead of using the ServerStatistics message, use the new
remote_boot_uuid attached to each message's context.  That tells us
reliably which boot the message is from.  Since it is attached to the
message, it can never get out of sync.
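
As a rough sketch of the idea (not the actual AOS code): if every
message carries the sender's boot UUID, the logger can compare it with
the last UUID seen for that node and rewrite the header only when it
changes.  BootUuid, NodeState, WriteHeader, and ObserveMessage below are
hypothetical stand-ins for aos::UUID, node_state_, MaybeWriteHeader(),
and the per-message check in LogUntil().

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for aos::UUID: 16 raw bytes.
using BootUuid = std::array<std::uint8_t, 16>;

// Hypothetical per-node state, loosely modeled on the logger's node state.
struct NodeState {
  std::optional<BootUuid> source_node_boot_uuid;
  bool header_valid = false;
  int headers_written = 0;

  // Returns true if the UUID changed or was seen for the first time.
  // Any change invalidates the previously written header.
  bool SetBootUUID(const BootUuid &new_uuid) {
    if (source_node_boot_uuid.has_value() &&
        *source_node_boot_uuid == new_uuid) {
      return false;
    }
    source_node_boot_uuid = new_uuid;
    header_valid = false;
    return true;
  }

  // Stand-in for writing a header; it requires a known boot UUID.
  void WriteHeader() {
    assert(source_node_boot_uuid.has_value());
    header_valid = true;
    ++headers_written;
  }
};

// Handles one message from a remote node.  Because the UUID travels with
// the message, the header can always be brought up to date before the data
// is written.  Returns true when it is safe to log the data.
inline bool ObserveMessage(NodeState &state,
                           const BootUuid &remote_boot_uuid) {
  if (state.SetBootUUID(remote_boot_uuid)) {
    state.WriteHeader();
  }
  return state.header_valid;
}
```

With this shape, data left in a queue by a node that has since gone away
still identifies its own boot, so the header can be written without
waiting on ServerStatistics.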

The downside here is that we are adding another 16 bytes to each message
that is being sent.  That is close to doubling our per-message overhead,
but should still be significantly less than a simple flatbuffer.
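
For a sense of where the 16 bytes come from: a UUID is 128 bits, so the
raw form is 16 bytes, while the canonical hyphenated text form is 36
characters.  The sketch below is illustrative only; kUuidBinarySize,
kUuidStringSize, and UuidToString are hypothetical names, not the
aos::UUID API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Sizes of the two UUID representations (names are hypothetical).
constexpr std::size_t kUuidBinarySize = 16;
constexpr std::size_t kUuidStringSize = 36;

// Formats 16 raw bytes as lowercase hex in the 8-4-4-4-12 layout.
inline std::string UuidToString(
    const std::array<std::uint8_t, kUuidBinarySize> &bytes) {
  static constexpr char kHex[] = "0123456789abcdef";
  std::string result;
  result.reserve(kUuidStringSize);
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    // Hyphens go before bytes 4, 6, 8, and 10.
    if (i == 4 || i == 6 || i == 8 || i == 10) {
      result.push_back('-');
    }
    result.push_back(kHex[bytes[i] >> 4]);
    result.push_back(kHex[bytes[i] & 0xf]);
  }
  assert(result.size() == kUuidStringSize);
  return result;
}
```

Carrying the raw bytes rather than the text form saves 20 bytes per UUID
before any flatbuffer framing is counted.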

Another option would have been to drop the data.  But for parameter
messages from other nodes, which are low frequency and critical for
operating the system, we would then not be able to reproduce the state
reliably.

Another alternative would be to make the ServerStatistics message
complicated enough to track which messages came from which boot.  This
would likely end up being whack-a-mole, trying to figure out how to
describe which messages of various vintages in the queues came from
which boots.

Change-Id: Idc531ca1ff1628c38efc4877661f121c94641e78
diff --git a/aos/events/logging/log_reader.cc b/aos/events/logging/log_reader.cc
index 098b78c..91bb93f 100644
--- a/aos/events/logging/log_reader.cc
+++ b/aos/events/logging/log_reader.cc
@@ -1177,8 +1177,8 @@
              nullptr) {
     flatbuffers::FlatBufferBuilder fbb;
     fbb.ForceDefaults(true);
-    flatbuffers::Offset<flatbuffers::String> boot_uuid_offset =
-        event_loop_->boot_uuid().PackString(&fbb);
+    flatbuffers::Offset<flatbuffers::Vector<uint8_t>> boot_uuid_offset =
+        event_loop_->boot_uuid().PackVector(&fbb);
 
     RemoteMessage::Builder message_header_builder(fbb);
 
diff --git a/aos/events/logging/log_writer.cc b/aos/events/logging/log_writer.cc
index 400ad8d..0b08c94 100644
--- a/aos/events/logging/log_writer.cc
+++ b/aos/events/logging/log_writer.cc
@@ -241,7 +241,8 @@
   WriteHeader();
 
   LOG(INFO) << "Logging node as " << FlatbufferToJson(event_loop_->node())
-            << " start_time " << last_synchronized_time_;
+            << " start_time " << last_synchronized_time_ << " boot uuid "
+            << event_loop_->boot_uuid();
 
   // Force logging up until the start of the log file now, so the messages at
   // the start are always ordered before the rest of the messages.
@@ -451,19 +452,6 @@
         break;
       }
 
-      // Update the boot UUID as soon as we know we are connected.
-      if (!connection->has_boot_uuid()) {
-        VLOG(1) << "Missing boot_uuid for node " << aos::FlatbufferToJson(node);
-        break;
-      }
-
-      if (!node_state_[node_index].has_source_node_boot_uuid ||
-          node_state_[node_index].source_node_boot_uuid !=
-              connection->boot_uuid()->string_view()) {
-        node_state_[node_index].SetBootUUID(
-            connection->boot_uuid()->string_view());
-      }
-
       if (!connection->has_monotonic_offset()) {
         VLOG(1) << "Missing monotonic offset for setting start time for node "
                 << aos::FlatbufferToJson(node);
@@ -631,6 +619,9 @@
   // reboots which may have happened.
   WriteMissingTimestamps();
 
+  int our_node_index = aos::configuration::GetNodeIndex(
+      event_loop_->configuration(), event_loop_->node());
+
   // Write each channel to disk, one at a time.
   for (FetcherStruct &f : fetchers_) {
     while (true) {
@@ -653,6 +644,16 @@
         break;
       }
       if (f.writer != nullptr) {
+        // Only check if the boot UUID has changed if this is data from another
+        // node.  Our UUID can't change without restarting the application.
+        if (our_node_index != f.data_node_index) {
+          // And update our boot UUID if the UUID has changed.
+          if (node_state_[f.data_node_index].SetBootUUID(
+                  f.fetcher->context().remote_boot_uuid)) {
+            MaybeWriteHeader(f.data_node_index);
+          }
+        }
+
         // Write!
         const auto start = event_loop_->monotonic_now();
         flatbuffers::FlatBufferBuilder fbb(f.fetcher->context().size +
@@ -719,12 +720,8 @@
             flatbuffers::GetRoot<RemoteMessage>(f.fetcher->context().data);
 
         CHECK(msg->has_boot_uuid()) << ": " << aos::FlatbufferToJson(msg);
-        if (!node_state_[f.contents_node_index].has_source_node_boot_uuid ||
-            node_state_[f.contents_node_index].source_node_boot_uuid !=
-                msg->boot_uuid()->string_view()) {
-          node_state_[f.contents_node_index].SetBootUUID(
-              msg->boot_uuid()->string_view());
-
+        if (node_state_[f.contents_node_index].SetBootUUID(
+                UUID::FromVector(msg->boot_uuid()))) {
           MaybeWriteHeader(f.contents_node_index);
         }
 
diff --git a/aos/events/logging/log_writer.h b/aos/events/logging/log_writer.h
index 0e88b4f..f5b55a7 100644
--- a/aos/events/logging/log_writer.h
+++ b/aos/events/logging/log_writer.h
@@ -197,7 +197,7 @@
 
     // This is an initial UUID that is a valid UUID4 and is pretty obvious that
     // it isn't valid.
-    std::string source_node_boot_uuid = "00000000-0000-4000-8000-000000000000";
+    UUID source_node_boot_uuid = UUID::Zero();
 
     aos::SizePrefixedFlatbufferDetachedBuffer<LogFileHeader> log_file_header =
         aos::SizePrefixedFlatbufferDetachedBuffer<LogFileHeader>::Empty();
@@ -208,30 +208,23 @@
     // follow.  This is cleared when boot_uuid is known to not match anymore.
     bool header_valid = false;
 
-    // Sets the source_node_boot_uuid, properly updating everything.
-    void SetBootUUID(const UUID &new_source_node_boot_uuid) {
-      new_source_node_boot_uuid.CopyTo(source_node_boot_uuid.data());
-      header_valid = false;
-      has_source_node_boot_uuid = true;
-
-      flatbuffers::String *source_node_boot_uuid_string =
-          log_file_header.mutable_message()->mutable_source_node_boot_uuid();
-      CHECK_EQ(source_node_boot_uuid.size(),
-               source_node_boot_uuid_string->size());
-      memcpy(source_node_boot_uuid_string->data(), source_node_boot_uuid.data(),
-             source_node_boot_uuid.size());
-    }
-    void SetBootUUID(std::string_view new_source_node_boot_uuid) {
+    // Sets the source_node_boot_uuid, properly updating everything.  Returns
+    // true if it changed, false otherwise.
+    bool SetBootUUID(const UUID &new_source_node_boot_uuid) {
+      if (has_source_node_boot_uuid &&
+          source_node_boot_uuid == new_source_node_boot_uuid) {
+        return false;
+      }
       source_node_boot_uuid = new_source_node_boot_uuid;
       header_valid = false;
       has_source_node_boot_uuid = true;
 
       flatbuffers::String *source_node_boot_uuid_string =
           log_file_header.mutable_message()->mutable_source_node_boot_uuid();
-      CHECK_EQ(source_node_boot_uuid.size(),
-               source_node_boot_uuid_string->size());
-      memcpy(source_node_boot_uuid_string->data(), source_node_boot_uuid.data(),
-             source_node_boot_uuid.size());
+      CHECK_EQ(UUID::kStringSize, source_node_boot_uuid_string->size());
+      source_node_boot_uuid.CopyTo(source_node_boot_uuid_string->data());
+
+      return true;
     }
   };
 
diff --git a/aos/events/logging/logger_test.cc b/aos/events/logging/logger_test.cc
index 6ffa16c..17078dc 100644
--- a/aos/events/logging/logger_test.cc
+++ b/aos/events/logging/logger_test.cc
@@ -419,7 +419,7 @@
   // channel.
   bool shared;
   // sha256 of the config.
-  std::string sha256;
+  std::string_view sha256;
 };
 
 class MultinodeLoggerTest : public ::testing::TestWithParam<struct Param> {
@@ -1900,8 +1900,8 @@
           }
 
           ASSERT_TRUE(header.has_boot_uuid());
-          EXPECT_EQ(header.boot_uuid()->string_view(),
-                    pi2_event_loop->boot_uuid().ToString());
+          EXPECT_EQ(UUID::FromVector(header.boot_uuid()),
+                    pi2_event_loop->boot_uuid());
 
           EXPECT_EQ(pi1_context->queue_index, header.remote_queue_index());
           EXPECT_EQ(pi2_context->remote_queue_index,
@@ -1980,8 +1980,8 @@
           }
 
           ASSERT_TRUE(header.has_boot_uuid());
-          EXPECT_EQ(header.boot_uuid()->string_view(),
-                    pi1_event_loop->boot_uuid().ToString());
+          EXPECT_EQ(UUID::FromVector(header.boot_uuid()),
+                    pi1_event_loop->boot_uuid());
 
           EXPECT_EQ(pi2_context->queue_index, header.remote_queue_index());
           EXPECT_EQ(pi1_context->remote_queue_index,
@@ -2199,25 +2199,24 @@
   ConfirmReadable(pi1_single_direction_logfiles_);
 }
 
+constexpr std::string_view kCombinedConfigSha1(
+    "8d17eb7c2347fd4a8a9a2e0f171a91338fe4d5dd705829c39497608075a8d6fc");
+constexpr std::string_view kSplitConfigSha1(
+    "a6235491429b7b062e5da35c1d0d279c7e7e33cd70787f231d420ab831959744");
+
 INSTANTIATE_TEST_CASE_P(
     All, MultinodeLoggerTest,
-    ::testing::Values(
-        Param{
-            "multinode_pingpong_combined_config.json", true,
-            "47511a1906dbb59cf9f8ad98ad08e568c718a4deb204c8bbce81ff76cef9095c"},
-        Param{"multinode_pingpong_split_config.json", false,
-              "ce3ec411a089e5b80d6868bdb2ff8ce86467053b41469e50a09edf3c0110d80"
-              "f"}));
+    ::testing::Values(Param{"multinode_pingpong_combined_config.json", true,
+                            kCombinedConfigSha1},
+                      Param{"multinode_pingpong_split_config.json", false,
+                            kSplitConfigSha1}));
 
 INSTANTIATE_TEST_CASE_P(
     All, MultinodeLoggerDeathTest,
-    ::testing::Values(
-        Param{
-            "multinode_pingpong_combined_config.json", true,
-            "47511a1906dbb59cf9f8ad98ad08e568c718a4deb204c8bbce81ff76cef9095c"},
-        Param{"multinode_pingpong_split_config.json", false,
-              "ce3ec411a089e5b80d6868bdb2ff8ce86467053b41469e50a09edf3c0110d80"
-              "f"}));
+    ::testing::Values(Param{"multinode_pingpong_combined_config.json", true,
+                            kCombinedConfigSha1},
+                      Param{"multinode_pingpong_split_config.json", false,
+                            kSplitConfigSha1}));
 
 // TODO(austin): Make a log file where the remote node has no start time.
 
diff --git a/aos/events/simulated_event_loop_test.cc b/aos/events/simulated_event_loop_test.cc
index 22d4028..39dbe99 100644
--- a/aos/events/simulated_event_loop_test.cc
+++ b/aos/events/simulated_event_loop_test.cc
@@ -650,10 +650,9 @@
          channel_index = channel.first](const RemoteMessage &header) {
           VLOG(1) << aos::FlatbufferToJson(&header);
           EXPECT_TRUE(header.has_boot_uuid());
-          EXPECT_EQ(header.boot_uuid()->string_view(),
+          EXPECT_EQ(UUID::FromVector(header.boot_uuid()),
                     simulated_event_loop_factory.GetNodeEventLoopFactory(pi2)
-                        ->boot_uuid()
-                        .ToString());
+                        ->boot_uuid());
 
           const aos::monotonic_clock::time_point header_monotonic_sent_time(
               chrono::nanoseconds(header.monotonic_sent_time()));
@@ -1430,10 +1429,9 @@
        &simulated_event_loop_factory, pi2, network_delay, &pi2_pong_event_loop,
        &pi1_remote_timestamp](const RemoteMessage &header) {
         EXPECT_TRUE(header.has_boot_uuid());
-        EXPECT_EQ(header.boot_uuid()->string_view(),
+        EXPECT_EQ(UUID::FromVector(header.boot_uuid()),
                   simulated_event_loop_factory.GetNodeEventLoopFactory(pi2)
-                      ->boot_uuid()
-                      .ToString());
+                      ->boot_uuid());
         VLOG(1) << aos::FlatbufferToJson(&header);
         if (header.channel_index() == reliable_channel_index) {
           ++reliable_timestamp_count;
@@ -1506,7 +1504,7 @@
                : "/pi1/aos/remote_timestamps/pi2/test/aos-examples-Ping",
       [&timestamp_count, &expected_boot_uuid](const RemoteMessage &header) {
         EXPECT_TRUE(header.has_boot_uuid());
-        EXPECT_EQ(UUID::FromString(header.boot_uuid()), expected_boot_uuid);
+        EXPECT_EQ(UUID::FromVector(header.boot_uuid()), expected_boot_uuid);
         VLOG(1) << aos::FlatbufferToJson(&header);
         ++timestamp_count;
       });
diff --git a/aos/events/simulated_network_bridge.cc b/aos/events/simulated_network_bridge.cc
index eb7f243..40b468f 100644
--- a/aos/events/simulated_network_bridge.cc
+++ b/aos/events/simulated_network_bridge.cc
@@ -159,8 +159,8 @@
                                     send_node_factory_->boot_uuid());
       }
 
-      flatbuffers::Offset<flatbuffers::String> boot_uuid_offset =
-          send_node_factory_->boot_uuid().PackString(&fbb);
+      flatbuffers::Offset<flatbuffers::Vector<uint8_t>> boot_uuid_offset =
+          send_node_factory_->boot_uuid().PackVector(&fbb);
 
       RemoteMessage::Builder message_header_builder(fbb);
 
diff --git a/aos/network/message_bridge_server_lib.cc b/aos/network/message_bridge_server_lib.cc
index 18df3fa..77fe1e4 100644
--- a/aos/network/message_bridge_server_lib.cc
+++ b/aos/network/message_bridge_server_lib.cc
@@ -129,8 +129,8 @@
         aos::Sender<RemoteMessage>::Builder builder =
             peer.timestamp_logger->MakeBuilder();
 
-        flatbuffers::Offset<flatbuffers::String> boot_uuid_offset =
-            server_status.BootUUID(peer.node_index).PackString(builder.fbb());
+        flatbuffers::Offset<flatbuffers::Vector<uint8_t>> boot_uuid_offset =
+            server_status.BootUUID(peer.node_index).PackVector(builder.fbb());
 
         RemoteMessage::Builder remote_message_builder =
             builder.MakeBuilder<RemoteMessage>();
diff --git a/aos/network/remote_message.fbs b/aos/network/remote_message.fbs
index b43f055..6d2a8d1 100644
--- a/aos/network/remote_message.fbs
+++ b/aos/network/remote_message.fbs
@@ -27,8 +27,11 @@
   // Queue index of this message on the remote node.
   remote_queue_index:uint32 = 4294967295 (id: 7);
 
+  // Old UUID with a string UUID.
+  old_boot_uuid:string (id: 8);
+
   // UUID for this boot.
-  boot_uuid:string (id: 8);
+  boot_uuid:[uint8] (id: 9);
 }
 
 root_type RemoteMessage;