Bound the max number of messages saved for matching
We have seen log files with plenty of data but no matching timestamps.
This can happen if timestamps are simply not logged, or if a large
number of messages are never delivered.
In this case, TimestampMapper accumulates all the data in the hope
that a matching timestamp shows up at some point in the future.
Eventually this is futile and just results in a memory explosion. I've
seen logs exhaust 64 GB of RAM, which is pretty ridiculous.
Instead, bound the amount of time we save these messages by the TTL
plus an expected network delay. This is a bit scary since if we throw
the data out too early, we'll declare an early end of the log, and
someone might not notice. That still seems like a safer failure mode
right now than producing unreadable logs.
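The eviction idea can be sketched roughly as below. This is a minimal
standalone illustration, not the actual TimestampMapper code; the
`BoundedMatchQueue` and `Message` names are hypothetical, and the
real implementation tracks the bound per channel via `time_to_live`:

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

// Hypothetical sketch: queue messages while waiting for a matching
// timestamp, but evict anything older than TTL + expected network
// delay relative to the newest message, so memory stays bounded.
struct Message {
  std::chrono::nanoseconds monotonic_time;
  int data;
};

class BoundedMatchQueue {
 public:
  BoundedMatchQueue(std::chrono::nanoseconds time_to_live,
                    std::chrono::nanoseconds network_delay)
      : horizon_(time_to_live + network_delay) {}

  // Queue a message, then drop anything too old to ever match.
  void Push(Message m) {
    messages_.push_back(m);
    const std::chrono::nanoseconds cutoff = m.monotonic_time - horizon_;
    while (!messages_.empty() &&
           messages_.front().monotonic_time < cutoff) {
      // Dropping here is what can look like an early end of the log.
      messages_.pop_front();
    }
  }

  std::size_t size() const { return messages_.size(); }

 private:
  const std::chrono::nanoseconds horizon_;
  std::deque<Message> messages_;
};
```

With a 1 s TTL and 100 ms assumed network delay, a message arriving at
t = 10 s evicts everything queued before t = 8.9 s.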
Change-Id: I4e0d93e77e5ae3b3b2ee1e62e829009cd56b09be
Signed-off-by: Austin Schuh <austin.linux@gmail.com>
diff --git a/aos/events/logging/logfile_utils.h b/aos/events/logging/logfile_utils.h
index 4c82e4d..e91b51a 100644
--- a/aos/events/logging/logfile_utils.h
+++ b/aos/events/logging/logfile_utils.h
@@ -727,6 +727,8 @@
// Bool tracking per channel if a message is delivered to the node this
// NodeData represents.
bool delivered = false;
+ // The TTL for delivery.
+ std::chrono::nanoseconds time_to_live = std::chrono::nanoseconds(0);
};
// Vector with per channel data.
@@ -792,6 +794,7 @@
// Timestamp of the last message returned. Used to make sure nothing goes
// backwards.
BootTimestamp last_message_time_ = BootTimestamp::min_time();
+ BootTimestamp last_popped_message_time_ = BootTimestamp::min_time();
// Time this node is queued up until. Used for caching.
BootTimestamp queued_until_ = BootTimestamp::min_time();