Drop times which are only slightly out of order

Given our problem statement, if we have 2 nodes, node A and node B, we
get a different answer depending on which node we fix and solve for.
The easiest way to see this is to pick a valid solution (say, pick node
A), and then use each node's timestamp from that solution as the
solution node and timestamp, and re-solve.  You will see that the
solution isn't the same for each node.

 Times can't be compared when solving for node 3.
   3802923.526119119sec vs 3802923.526119320sec -> -201ns
   420.018034692sec vs 420.018034624sec -> 68ns
   420.024212276sec vs 420.024212477sec -> -201ns
   420.059764778sec vs 420.059765248sec -> -470ns
   420.003547819sec vs 420.003548020sec -> -201ns
   424.066674159sec vs 424.066674359sec -> -200ns

If you algebraically solve the cost for 2 nodes, you get:
  cost = (tb - ta - (m1 ta + b1))^2 + (ta - tb - (m2 tb + b2))^2

Take the partial derivatives for each node, set them to 0, and solve,
and you get:
 tb = (1 + (1 + m1)^2)/(2 + m1 + m2) ta - (b2 + (1 + m1) b1)/(2 + m1 + m2)
and
 tb = (2 + m1 + m2)/(1 + (m2 + 1)^2) ta - (b1 + (1 + m2) b2)/(1 + (m2 + 1)^2)

which are different solutions except when, m1 == m2.

This effect should be small, so for now, lets just drop anything that is
close but slightly out of order.

Longer term, I'd love the problem statement to be better defined to not
have this problem (feels like something is rotated), but that's a
problem for later.

Change-Id: I2c124cd1962e32c9915f60029206b6bd2d0bf6b4
diff --git a/aos/network/multinode_timestamp_filter.cc b/aos/network/multinode_timestamp_filter.cc
index e9834c1..84de0f7 100644
--- a/aos/network/multinode_timestamp_filter.cc
+++ b/aos/network/multinode_timestamp_filter.cc
@@ -591,6 +591,31 @@
   return dt;
 }
 
+// Returns the amount of time that ta and tb are out of order by.  The primary
+// direction is defined to be the direction of the average of the offsets.  So,
+// if the average is +, and we get 1 - outlier, the absolute value of that -
+// outlier is the invalid distance.
+chrono::nanoseconds InvalidDistance(
+    const std::vector<monotonic_clock::time_point> &ta,
+    const std::vector<monotonic_clock::time_point> &tb) {
+  // Use an int128 so we have no concern about number of times or size of the
+  // difference.
+  absl::int128 sum = 0;
+  for (size_t i = 0; i < ta.size(); ++i) {
+    if (ta[i] == monotonic_clock::min_time ||
+        tb[i] == monotonic_clock::min_time) {
+      continue;
+    }
+    sum += (ta[i] - tb[i]).count();
+  }
+  // Pick the direction and sign to return.
+  if (sum < 0) {
+    return MaxElapsedTime(tb, ta);
+  } else {
+    return MaxElapsedTime(ta, tb);
+  }
+}
+
 // Class to efficiently track up to 64 bit sets.  It uses a uint64 as the
 // backing store, and ffs() to find the first bit set efficiently so we can
 // iterate through the set.
@@ -880,11 +905,24 @@
           solution_index = node_a_index;
           break;
         case TimeComparison::kInvalid:
+          // If times are close enough, drop the invalid time.
+          if (InvalidDistance(times, solution) < chrono::nanoseconds(500)) {
+            VLOG(1) << "Times can't be compared by "
+                    << InvalidDistance(times, solution).count() << "ns";
+            for (size_t i = 0; i < times.size(); ++i) {
+              VLOG(1) << "  " << times[i] << " vs " << solution[i] << " -> "
+                      << (times[i] - solution[i]).count() << "ns";
+            }
+            VLOG(1) << "Ignoring because it is close enough.";
+            next_node_filter->Consume();
+            break;
+          }
           // Somehow the new solution is better *and* worse than the old
           // solution...  This is an internal failure because that means time
           // goes backwards on a node.
           CHECK_EQ(times.size(), solution.size());
-          LOG(INFO) << "Times can't be compared.";
+          LOG(INFO) << "Times can't be compared by "
+                    << InvalidDistance(times, solution).count() << "ns";
           for (size_t i = 0; i < times.size(); ++i) {
             LOG(INFO) << "  " << times[i] << " vs " << solution[i] << " -> "
                       << (times[i] - solution[i]).count() << "ns";