AGV Control Optimization with Machine Learning

A cooperation between Linköping University and Toyota Material Handling


Designing a controller for a new product is a challenging task and may take several months of hard work before the desired performance is achieved. The main focus of this project is to use a machine-learning-based approach to tune a PID controller. The controller is used to control an Automated Guided Vehicle (AGV) manufactured by Toyota. The system is modelled and built from three main modules: a simulator, a controller, and a machine-learning-based parameter tuner. The project is an assignment from the course TSRT10 - Reglerteknisk projektkurs at Linköping University, created in collaboration with Toyota Material Handling.
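The interaction between the three modules can be pictured as a loop: the tuner proposes PID parameters, the controller drives the simulator with them, and the resulting cost feeds back to the tuner. The sketch below illustrates this structure only; all class names, the toy AGV dynamics, and the random-search "tuner" standing in for the learning agent are assumptions, not the project's actual code.

```python
import random

class Simulator:
    """Toy AGV model: the steering signal reduces a scalar path deviation."""
    def __init__(self):
        self.error = 1.0  # initial deviation from the path

    def step(self, u, dt=0.1):
        self.error += -u * dt
        return self.error

class PIDController:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev = 0.0

    def control(self, error, dt=0.1):
        self.integral += error * dt
        deriv = (error - self.prev) / dt
        self.prev = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

def evaluate(params, steps=100):
    """Run one episode and return the accumulated squared error (cost)."""
    sim, pid = Simulator(), PIDController(*params)
    cost, err = 0.0, sim.error
    for _ in range(steps):
        u = pid.control(err)
        err = sim.step(u)
        cost += err ** 2
    return cost

# The tuner (naive random search standing in for the RL agent)
# proposes parameter sets and keeps the best-performing one.
random.seed(0)
best = min(([random.uniform(0, 2) for _ in range(3)] for _ in range(50)),
           key=evaluate)
```

With all gains zero the deviation never shrinks, so any reasonable proposal from the tuner lowers the cost.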

Watch the video to see the results

Project Members


The Team:

  • Carl Hampus Hedén
  • Mahdi Najafi
  • Alfred Boman
  • Adam Kagebeck
  • Karl Blomkvist
  • Viktor Ekström
  • Rasmus Björk

Post Development

Path following

A significant improvement to the path-following algorithm would be to ensure that the AGV can handle crossings. As of now, the path-following algorithm always chooses the closest node; in essence, if a node exists that is closer to the AGV's position but steers the AGV towards an unintended part of the path, the current algorithm will take this shortcut. This was also the major flaw when following paths from the RRT, as its sharp corners often caused the path follower to cut corners. An improved path-following algorithm would consider not only the closest node but also its relative position in the path, so as to avoid shortcuts.
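One way to realize this idea is to restrict the nearest-node search to a window just ahead of the last matched node, so a geometrically closer node from a later part of the path is never chosen. The function and parameter names below are illustrative assumptions, not the project's actual interface.

```python
import math

def closest_node_windowed(position, path, last_index, window=5):
    """Pick the nearest path node, but only among nodes from the last
    matched index up to `window` nodes ahead. This prevents the follower
    from locking onto a closer node further along the path (a shortcut)
    at crossings and sharp RRT corners."""
    lo = last_index
    hi = min(len(path), last_index + window + 1)
    return min(range(lo, hi),
               key=lambda i: math.dist(position, path[i]))

# Usage: a path that passes near (0, 0) twice, i.e. a crossing.
path = [(0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0.1, 0.1)]
pos = (0.08, 0.08)
naive = min(range(len(path)), key=lambda i: math.dist(pos, path[i]))
windowed = closest_node_windowed(pos, path, last_index=1)
# The naive search jumps to the later pass through the crossing (index 7);
# the windowed search stays at the intended node (index 2).
```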

Physical Electrical Motor Development

An interesting area for further work is the modelling of the electrical motor. Many disturbances can be modelled through the motor, for instance the position of the load that the AGV is carrying and changes in friction between the floor and the wheels. If a physical model of the motor, for instance a state-space model, were developed, measurements of the wheel speed and estimates of the required torque could perhaps be used as observations for the agent. The agent could then be trained to recognize these disturbances and improve its ability to handle these scenarios. This would make the control more robust, which could be beneficial in environments with many unpredictable occurrences.
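As a sketch of what such a state-space model could look like, the textbook DC motor model below has armature current and wheel speed as states, discretized with forward Euler. All parameter values are illustrative assumptions, not data from the Toyota AGV.

```python
# Illustrative parameters: resistance, inductance, back-EMF and torque
# constants, inertia, viscous friction (all assumed values).
R, L_ind, Ke, Kt, J, b = 1.0, 0.5, 0.01, 0.01, 0.01, 0.1

def motor_step(i, w, v, load_torque=0.0, dt=0.001):
    """One Euler step of  L di/dt = v - R i - Ke w
                          J dw/dt = Kt i - b w - tau_load."""
    di = (v - R * i - Ke * w) / L_ind
    dw = (Kt * i - b * w - load_torque) / J
    return i + dt * di, w + dt * dw

# Simulate a constant-voltage step; a load disturbance applied halfway
# through lowers the steady-state speed. An agent observing wheel speed
# and estimated torque could learn to detect exactly this kind of change.
i, w = 0.0, 0.0
speeds = []
for k in range(4000):
    tau = 0.0 if k < 2000 else 0.002
    i, w = motor_step(i, w, v=1.0, load_torque=tau)
    speeds.append(w)
```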

Machine Learning Methods

When it comes to the machine learning methods, there are quite a few things that could be experimented with further. There are a great number of possibilities when designing the networks and the many parameters used for the agent and its training. The architecture of the actor and critic networks can be implemented in many different ways, with different numbers and types of layers. Exploring several network structures was not part of this project and could be interesting to pursue further. This, combined with changing the number of hidden units, could yield an agent that performs better.
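Such an architecture sweep can be expressed as a list of hidden-layer configurations to iterate over. The minimal NumPy sketch below assumes 8 observations and 3 actions purely for illustration; the project's actual network dimensions and framework are not specified here.

```python
import numpy as np

def build_mlp(sizes, rng):
    """Create weight matrices for an MLP with the given layer sizes,
    e.g. sizes=[obs_dim, 64, 64, act_dim]. Varying the depth and the
    hidden-unit counts is the kind of sweep suggested above."""
    return [rng.standard_normal((m, n)) * 0.1
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(weights, x):
    """tanh hidden layers, linear output layer."""
    for W in weights[:-1]:
        x = np.tanh(x @ W)
    return x @ weights[-1]

rng = np.random.default_rng(0)
for hidden in ([32], [64, 64], [128, 64, 32]):   # candidate architectures
    actor = build_mlp([8] + hidden + [3], rng)   # 8 observations -> 3 actions
    a = forward(actor, np.zeros(8))              # one action vector per net
```

Each candidate would then be trained and compared on the same episodes, so any performance difference can be attributed to the architecture.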

Machine Learning Implementation

In this project, the agent gives actions in the form of control parameters that directly control the AGV. With the existing system, this could easily be changed so that the agent instead tunes already existing PID parameters, meaning that the agent only fine-tunes the system. Another interesting implementation is to remove the controller completely and let the agent give its actions directly as the control signals for the AGV.
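The two alternative agent interfaces can be contrasted in a few lines. The baseline gains and function names below are illustrative assumptions; neither variant is the project's actual implementation.

```python
# Assumed hand-tuned baseline PID gains for the fine-tuning variant.
BASE_KP, BASE_KI, BASE_KD = 1.0, 0.1, 0.05

def apply_residual_action(action):
    """Fine-tuning variant: the agent's action is a small offset added
    to the existing PID parameters, so the agent only adjusts a
    controller that already works."""
    dkp, dki, dkd = action
    return BASE_KP + dkp, BASE_KI + dki, BASE_KD + dkd

def direct_control(action):
    """Controller-free variant: the agent's action is used directly as
    the control signal, with no PID in the loop."""
    return float(action[0])

kp, ki, kd = apply_residual_action((0.2, 0.0, -0.01))
u = direct_control([0.7])
```

The fine-tuning variant keeps the safety net of a working baseline controller, while the direct variant gives the agent full authority over the AGV.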


One observation during the project was that the agent tended to have trouble finding good parameter values when there was a large time delay in the motor. Another interesting area of future research would be a more formal study of whether some disturbances are more difficult than others for the agent to handle.
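A transport delay of the kind that troubled the agent is easy to reproduce with a FIFO buffer of past commands. The first-order motor model below is an illustrative stand-in, not the project's simulator.

```python
from collections import deque

class DelayedMotor:
    """First-order speed dynamics with a transport delay on the command:
    the motor only sees the voltage applied `delay_steps` ticks ago."""
    def __init__(self, delay_steps, dt=0.01, tau=0.1):
        self.buf = deque([0.0] * delay_steps)  # pending commands
        self.w = 0.0
        self.dt, self.tau = dt, tau

    def step(self, u):
        self.buf.append(u)
        u_eff = self.buf.popleft()             # command from delay_steps ago
        self.w += self.dt * (u_eff - self.w) / self.tau
        return self.w

# Compare step responses with and without delay: the delayed motor sits
# still for the first ticks and then lags behind, which is what makes
# the tuning problem harder.
m0, m5 = DelayedMotor(0), DelayedMotor(5)
r0 = [m0.step(1.0) for _ in range(20)]
r5 = [m5.step(1.0) for _ in range(20)]
```

Sweeping `delay_steps` while holding the other disturbances fixed would be one way to run the formal comparison suggested above.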

Real world implementation

The results of this study are based entirely on simulator output. For a more robust conclusion, the agent and its training would need to be integrated into a real-world system.