SUMMARY OF: Deep Model Predictive Control with Online Learning for Complex Physical Systems

- arXiv:1905.10094v1 [cs.LG]
**Source**: https://arxiv.org/abs/1905.10094 (May 2019).

**Literature Review** by S. Seal [sayani.seal@mail.mcgill.ca]

Edited by Q.Dang, D.Wu [di.wu5@mail.mcgill.ca]

**Disclaimer:** *This website contains copyrighted material, and its use is not always specifically authorized by the copyright owner. Take all necessary steps to ensure that the information you receive from the post is correct and verified.*

#### 1. Motivation

Flow control is required in many application fields, such as energy, transportation, health, and security. Although fluid flow is high-dimensional, involves multi-layer physics, and is nonlinear, it can be approximated by a few dominant low-dimensional features. Because the performance of a model predictive controller (MPC) depends strongly on the accuracy of its prediction model, complex systems that are intractable to model make it difficult to design such a controller, even though MPC would otherwise be efficient for the application. The article [1] presents a DeepMPC controller in which sensor-based, observable, low-rank system states are used to build a data-driven predictive model based on a recurrent neural network (RNN), which is then used for real-time MPC in fluid flow control.

#### 2. Main Contributions

i. A DeepMPC architecture is implemented for a complex fluid flow system exhibiting broadband phenomena.

ii. Instead of assuming access to the full system state, the “surrogate” predictive model uses only observable system states for future prediction. The method thus trades off accuracy against efficiency while still capturing the essential mechanisms of the physical system.

iii. The proposed learning approach for the RNN utilizes limited past information from the sensors.

Figure 1: DeepMPC with surrogate RNN prediction model presented in [1].

#### 3. Method

#### A. DeepMPC

i. Finite-horizon open-loop control problem with quadratic cost. Penalties are assigned to the deviation from the reference trajectory, to the control input, and to any variation in the control input. The last of the three restricts sudden changes in the control input.

ii. A surrogate system state prediction model, based on a deep RNN architecture, is generated using control-relevant, observable, sensor-based system states. For this flow control model, the states are **lift** and *drag*.

1. RNN-based predictive model design:

a. Decoder:

i. Performs the actual prediction task

ii. Consists of N cells, one for each of the N time steps in the prediction horizon

b. Encoder:

i. Predicts latent states and thereby accounts for long-term dynamics.

2. The RNN-based MPC problem is solved using a gradient-based optimization method.

3. The gradient with respect to the control inputs is calculated using backpropagation through time.

iii. Training RNN:

1. Offline three-stage training [2] with time-series data of observable system states.

2. Training data, i.e., time-series data of the lift and the drag, are generated using a random but continuously varying rotation control sequence applied to the cylinder(s).
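The quadratic cost and the gradient-based solve described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the weights `alpha` and `beta`, the scalar linear model `toy_surrogate` (a stand-in for the RNN), and the finite-difference gradient (a stand-in for backpropagation through time). It is not the implementation from [1].

```python
def mpc_cost(y_pred, y_ref, u, u_prev, alpha=0.1, beta=1.0):
    """Quadratic horizon cost: tracking error + control effort + control variation.

    alpha and beta are hypothetical weights chosen for illustration.
    """
    cost = 0.0
    last_u = u_prev
    for y, r, u_k in zip(y_pred, y_ref, u):
        cost += (y - r) ** 2                  # deviation from the reference
        cost += alpha * u_k ** 2              # size of the control input
        cost += beta * (u_k - last_u) ** 2    # change in the control input
        last_u = u_k
    return cost


def toy_surrogate(y0, u, a=0.9, b=0.5):
    """Hypothetical scalar stand-in for the RNN: y_{k+1} = a*y_k + b*u_k."""
    y, preds = y0, []
    for u_k in u:
        y = a * y + b * u_k
        preds.append(y)
    return preds


def optimize_controls(y0, y_ref, u_prev, steps=300, lr=0.02, eps=1e-6):
    """Plain gradient descent on the control sequence.

    Gradients are approximated by forward finite differences here;
    the reviewed method uses backpropagation through time instead.
    """
    u = [u_prev] * len(y_ref)

    def total(u_seq):
        return mpc_cost(toy_surrogate(y0, u_seq), y_ref, u_seq, u_prev)

    for _ in range(steps):
        base = total(u)
        grad = []
        for i in range(len(u)):
            bumped = u[:i] + [u[i] + eps] + u[i + 1:]
            grad.append((total(bumped) - base) / eps)
        u = [u_i - lr * g for u_i, g in zip(u, grad)]
    return u


if __name__ == "__main__":
    reference = [1.0] * 5
    controls = optimize_controls(0.0, reference, 0.0)
    print(controls, mpc_cost(toy_surrogate(0.0, controls), reference, controls, 0.0))
```

In a receding-horizon setting, only the first element of the optimized sequence would typically be applied before the problem is solved again at the next time step.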

#### 4. Result Summarization

#### A. Setup:

A detailed simulation model of the full system is used instead of a real physical system. The model is solved with the OpenFOAM solver using a finite volume discretization.

#### B. Experiments:

**Objective:** The objective is to control the cylinder(s) such that the lift follows a given reference trajectory.

Four flow (laminar regime) control models with different complexity levels are considered:

i. One cylinder: Flow around a single cylinder

1. The RNN prediction is evaluated on an exemplary control input sequence and is accurate for both lift and drag, except for a very short period at the start of the experiment.

2. Tracking control is demonstrated successfully: a scheduled lift sequence is maintained for 20 sec with a bounded rotation control input.

3. The Reynolds number (Re) is assumed to be 100.

4. **Training dataset:**

a. A random rotation between -2 and +2 is chosen every 0.5 sec, so high input frequencies are avoided.

b. Intermediate control inputs are computed using spline interpolation every 0.1 sec.

c. A time series with 110 000 data points, corresponding to 11 000 sec, is used for RNN training.
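The training-input recipe above (a random value every 0.5 sec, interpolated onto a 0.1 sec grid) can be sketched as follows. The review does not say which spline was used, so the Catmull-Rom spline here is an assumption, as are the function names and the fixed seed. Note that a Catmull-Rom curve can slightly overshoot the [-2, +2] range between knots.

```python
import random


def catmull_rom(p0, p1, p2, p3, t):
    """Catmull-Rom cubic between p1 and p2 for t in [0, 1] (assumed spline type)."""
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3
    )


def random_control_sequence(duration, knot_dt=0.5, sample_dt=0.1, u_max=2.0, seed=0):
    """Draw a random rotation every knot_dt seconds and interpolate every sample_dt."""
    rng = random.Random(seed)
    per_knot = round(knot_dt / sample_dt)        # samples per knot interval
    n_knots = round(duration / knot_dt) + 1
    knots = [rng.uniform(-u_max, u_max) for _ in range(n_knots)]
    padded = [knots[0]] + knots + [knots[-1]]    # repeat end points as neighbours
    samples = []
    for i in range(n_knots - 1):
        p0, p1, p2, p3 = padded[i:i + 4]
        for j in range(per_knot):
            samples.append(catmull_rom(p0, p1, p2, p3, j / per_knot))
    return knots, samples


if __name__ == "__main__":
    knots, samples = random_control_sequence(duration=10.0)
    print(len(knots), len(samples))
```

With `duration=11000.0` the same function yields 110 000 samples, which matches the dataset size quoted above (11 000 sec at 0.1 sec per sample). If SciPy is available, `scipy.interpolate.CubicSpline` would be a natural replacement for the hand-written interpolation.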

ii. Fluidic Pinball: Control the flow around three cylinders, two of which can be rotated while the third is fixed, as shown in Figure 2.

Figure 2: The system is controlled by rotating cylinders 1 and 2 with their respective angular velocities [1].

1. The objective is to follow three given lift trajectories, one for each cylinder, by rotating cylinders 1 and 2.

2. One Reynolds number is taken as the base case, and two further, chaotic cases at other Reynolds numbers are analyzed.

3. **Training dataset:**

a. A random rotation between -2 and +2 is chosen for each cylinder every 0.5 sec.

b. Intermediate control inputs are computed using spline interpolation every 0.005 sec.

c. Time series with 150 000, 200 000 and 800 000 data points are used for the three cases, respectively.

4. To improve performance for the two more chaotic cases, knowledge of the physical system's characteristics is used: the training data are augmented with inputs and corresponding lift data mirrored about the horizontal axis. This reduces the tracking error by 50%.

5. Robustness is tested by performing five identical experiments, using 10%, 15% and 100% of the symmetrized training data points. No trend is observed with respect to the amount of training data.
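The symmetry-based augmentation can be sketched as below. The exact mapping is not given in this review, so the sketch rests on stated assumptions: cylinders 1 and 2 are mirror images of each other about the horizontal axis, cylinder 3 lies on that axis, and mirroring therefore swaps cylinders 1 and 2 and flips the sign of every rotation and every lift value. The function names and the sample layout are hypothetical.

```python
def mirror_sample(controls, lifts):
    """Reflect one sample about the horizontal axis (assumed geometry).

    controls: (u1, u2) rotations of cylinders 1 and 2
    lifts:    (l1, l2, l3) lift values of the three cylinders
    """
    u1, u2 = controls
    l1, l2, l3 = lifts
    # Mirroring swaps cylinders 1 and 2 and reverses rotation and lift signs.
    return (-u2, -u1), (-l2, -l1, -l3)


def symmetrize(dataset):
    """Return the original samples followed by their mirrored counterparts."""
    return dataset + [mirror_sample(u, l) for u, l in dataset]


if __name__ == "__main__":
    data = [((1.0, -0.5), (0.2, 0.3, -0.1))]
    print(symmetrize(data))
```

Under these assumptions the augmentation doubles the number of training samples without any additional simulation.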

Figure 3: DeepMPC lift tracking performance for laminar flow around rotating cylinders [1].

Figure 4: Re = 100 with online update [1].

6. Finally, online data are collected from the feedback loop at each time step, with new data gathered over 25 sec for each update. The 500 data points in each interval are used to further train the RNN surrogate model. This significantly improves the performance of DeepMPC compared with panel (a) of Figure 3 [1]. The online update of the RNN reduces both the tracking error and the control cost.
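The batching logic of the online update can be sketched as a small buffer that collects one sample per time step and triggers further training once 500 samples are available. The class name, the `retrain` callback, and the choice to train on each new batch alone are assumptions made for illustration; the actual training step on the RNN is left abstract.

```python
class OnlineUpdateBuffer:
    """Collect closed-loop samples and hand them over in fixed-size batches."""

    def __init__(self, batch_size=500, retrain=None):
        self.batch_size = batch_size   # 500 points per update interval
        self.retrain = retrain         # hypothetical callback that trains the model
        self.samples = []
        self.updates = 0

    def add(self, control, observation):
        """Store one (control, observation) pair from the feedback loop."""
        self.samples.append((control, observation))
        if len(self.samples) == self.batch_size:
            if self.retrain is not None:
                self.retrain(list(self.samples))   # pass a copy of the batch
            self.samples.clear()
            self.updates += 1


if __name__ == "__main__":
    sizes = []
    buffer = OnlineUpdateBuffer(retrain=lambda batch: sizes.append(len(batch)))
    for step in range(1250):
        buffer.add(control=0.0, observation=(0.0, 0.0))
    print(buffer.updates, sizes, len(buffer.samples))
```

Whether each update should use only the newest batch or a mix of old and new data is a design choice that the review does not settle.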

#### 5. Suggested Future Work

The surrogate RNN prediction model proposed for DeepMPC in this article could be useful for many practical engineering problems in which the complete system description is too complicated and makes the related control problems difficult to solve. The method can be used to model a system through targeted observable states that predominantly define its behaviour. This can improve the real-time implementation of MPC for complex nonlinear systems.

**REFERENCES:**

[1] K. Bieker, S. Peitz, S. L. Brunton, J. N. Kutz, and M. Dellnitz, “Deep model predictive control with online learning for complex physical systems,” arXiv preprint arXiv:1905.10094, 2019.

[2] I. Lenz, R. Knepper, and A. Saxena, “DeepMPC: Learning Deep Latent Features for Model Predictive Control,” in *Robotics: Science and Systems XI*, 2015.

————————–

<*End of Review*>