Add code to train a simple turret controller

This uses Soft Actor-Critic with automatic temperature adjustment to
train a controller for last year's turret.  This is a stepping stone to
training a more complicated controller for the swerve.  I still need to
parallelize it, play with the hyperparameters, and teach it how to go
to a nonzero goal.

Change-Id: I1357b5fbf8549acac4ee0b94ef8f2636867c28ad
Signed-off-by: Austin Schuh <austin.linux@gmail.com>
diff --git a/tools/python/requirements.txt b/tools/python/requirements.txt
index 2b806f5..241425a 100644
--- a/tools/python/requirements.txt
+++ b/tools/python/requirements.txt
@@ -44,3 +44,6 @@
 
 # Experience buffer for reinforcement learning
 flashbax
+
+# Experiment tracking
+aim