Machine Learning Techniques in System Identification
System identification is the area within Automatic Control that deals with estimating models of dynamical systems from observed input-output data. The area is well established, with textbooks, software and industrial applications. It traditionally builds on standard statistical techniques such as Maximum Likelihood parameter estimation, criterion minimization, and consistency and accuracy analysis.
Machine learning is an area with roots in artificial intelligence and statistical learning which in many respects has the same goals as system identification. It is therefore essential and fruitful to explore and exploit the common ground. This is the background and rationale of this CADICS project. It is conducted partly in cooperation with an Italian team in Padova and is also linked to the project LEARN, which holds a European Research Council advanced grant.
The CADICS researchers working on this project are Lennart Ljung, Henrik Ohlsson and Tianshi Chen.
Estimation of Impulse Response Functions, Regularization and "Gaussian Processes"
The impulse response g of a linear system is defined by y(t) = g1 u(t-1) + g2 u(t-2) + g3 u(t-3) + .... The coefficients gk can be estimated by traditional system identification techniques from parametric input-output models. They can also be estimated by linear regression, as so-called FIR models. Adding a regularization term to the corresponding least squares criterion gives a better bias/variance trade-off, provided the regularization matrix is tuned carefully. This is closely related to Bayesian posterior estimates. It also corresponds to the so-called Gaussian Process regression approach studied in machine learning. The figure below shows the fit between the estimated and true impulse responses (100% means a perfect fit) over a set of 5000 test systems. Each system is marked by a cross, and the box spans the 25th to 75th percentiles, with the median marked by a red line. The left plot is for the conventional approach and the right one is for the regularized approach. Clearly the latter has the potential to provide more robust estimates.
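The regularized FIR estimate described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the project's implementation: the function names, the exponentially decaying regularization matrix P, the noise level and the fit formula are all choices made here for the example and are not given in the text. The estimate minimizes |Y - Phi g|^2 + sigma2 * g' inv(P) g, which coincides with the Bayesian posterior mean when g is given a zero-mean Gaussian prior with covariance P and the noise has variance sigma2.

```python
import numpy as np


def fir_regressors(u, y, n):
    """Stack y(t) = g_1 u(t-1) + ... + g_n u(t-n) for t = n, ..., N-1."""
    N = len(u)
    Phi = np.column_stack([u[n - k:N - k] for k in range(1, n + 1)])
    return Phi, y[n:]


def decaying_kernel(n, c=1.0, lam=0.9):
    """Illustrative regularization matrix P[i, j] = c * lam**max(i, j).

    Assumption: this particular structure is picked for the example only.
    """
    idx = np.arange(1, n + 1)
    return c * lam ** np.maximum.outer(idx, idx)


def regularized_fir(Phi, Y, P, sigma2):
    """Minimize |Y - Phi g|^2 + sigma2 * g' inv(P) g without inverting P."""
    n = Phi.shape[1]
    return np.linalg.solve(P @ Phi.T @ Phi + sigma2 * np.eye(n), P @ Phi.T @ Y)


def fit_percent(g_true, g_hat):
    """Assumed fit measure: 100 * (1 - |g - g_hat| / |g - mean(g)|)."""
    return 100.0 * (1.0 - np.linalg.norm(g_true - g_hat)
                    / np.linalg.norm(g_true - g_true.mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, N, sigma2 = 50, 200, 0.1
    g = 0.8 ** np.arange(1, n + 1)           # made-up true impulse response
    u = rng.standard_normal(N)               # white-noise input
    y = np.convolve(u, np.concatenate(([0.0], g)))[:N]
    y += np.sqrt(sigma2) * rng.standard_normal(N)

    Phi, Y = fir_regressors(u, y, n)
    g_ls = np.linalg.lstsq(Phi, Y, rcond=None)[0]   # conventional FIR estimate
    g_reg = regularized_fir(Phi, Y, decaying_kernel(n, lam=0.8), sigma2)
    print("least squares fit:", fit_percent(g, g_ls))
    print("regularized fit:  ", fit_percent(g, g_reg))
```

In this sketch c, lam and sigma2 are fixed by hand. The careful tuning of the regularization matrix mentioned above would replace those hand-picked values, for instance by choosing them from the data.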