Master Thesis Code
by Simon Moser
Part 4: Extended Kalman Filter

Last updated: 14. March 2024

Introduction

This document is the continuation of Part 2: Kalman Filter II and Part 3: RTS Smoother. Here, the state vector will be extended to include sensitivity.

Since the estimation of the sensitivity is nonlinear, the Kalman filter will be extended to an Extended Kalman Filter (EKF). The theory is based on Becker (2023) while the superscript notation of Simon (2006) is used.

I don't use boldface for vectors or matrices, nor do I mark them in any other way that would distinguish them from scalars. This is because I think it is simply obvious whether a variable is a vector or a matrix. If you don't think it is obvious, I guess this is your problem. (Not a real quote, but close to what Dr. Franz Müller often said.)

Background

The main idea of an EKF is to linearize the nonlinear system model and/or the nonlinear measurement model at a working point. This can be done by using a first-order Taylor expansion. The linearized models are then used in the standard Kalman filter equations.

According to Taylor's theorem, a function \( f(x) \) can be represented around a working point \( x_0 \) by an infinite Taylor series expansion (Taylor, 1717, as cited in Becker, 2023).

A finite first-order Taylor expansion then gives an approximation of the nonlinear function \( f(x) \) (Becker, 2023):

\[ f(x) \approx f(x_0) + \frac{\partial f(x_0)}{\partial x} (x - x_0) \]

A Jacobian matrix is simply the matrix of all first-order partial derivatives of a vector-valued function.
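To make this concrete, here is a small sketch (in Python/NumPy, independent of the MATLAB examples below) that compares the analytic Jacobian of a toy function against finite differences and checks the first-order Taylor approximation; the function \( f \) is a made-up example:

```python
import numpy as np

def f(x):
    # toy nonlinear vector-valued function f: R^2 -> R^2
    return np.array([x[0]**2, x[0] * x[1]])

def jacobian_analytic(x):
    # matrix of first-order partial derivatives of f
    return np.array([[2 * x[0], 0.0],
                     [x[1],     x[0]]])

def jacobian_numeric(f, x, eps=1e-6):
    # forward-difference approximation, column by column
    J = np.zeros((f(x).size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x)) / eps
    return J

x0 = np.array([2.0, 3.0])
assert np.allclose(jacobian_analytic(x0), jacobian_numeric(f, x0), atol=1e-4)

# first-order Taylor approximation around x0 is accurate near x0
x = x0 + np.array([0.01, -0.02])
approx = f(x0) + jacobian_analytic(x0) @ (x - x0)
assert np.allclose(approx, f(x), atol=1e-3)
```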

Redefinitions

EKF state transition matrix

The state transition matrix is no longer given by the matrix \( F \) but by the nonlinear function \( f(x) \). The state transition model is then given by

\[ x_k = f\left(x_{k-1}\right) + w_k \]

where \( w_k \) is the process noise.

The state extrapolation equation is then given by

\[ \hat{x}_{f,k}^- = f\left(\hat{x}_{f,k-1}^+\right). \]

To propagate uncertainty while keeping the Gaussian PDF, the Jacobian of the state transition model is used. The Jacobian can be written as

\[ \frac{\partial f}{\partial x} \ \text {or} \ \frac{\partial f(x)}{\partial x}. \]

Note
Sometimes the Jacobian is also written as \( F \) in the literature. But this can be a bit confusing since \( F \) is also used for the state transition matrix in the linear case.

Then, the covariance extrapolation equation is given by

\[ P_{f,k}^- = \frac{\partial f}{\partial x} P_{f,k-1}^+ \left(\frac{\partial f}{\partial x}\right)^T + Q. \]

EKF observation matrix

The observation model is no longer given by the matrix \( H \) but by the nonlinear function \( h(x) \). The observation model is then given by

\[ z_k = h\left(x_k\right) + v_k \]

where \( v_k \) is the measurement noise.

The state update equation is then given by

\[ \hat{x}_{f,k}^+ = \hat{x}_{f,k}^- + K_{f,k} \left(z_k - h\left(\hat{x}_{f,k}^-\right)\right). \]

To propagate uncertainty while keeping the Gaussian PDF, the Jacobian of the observation model is used. The Jacobian can be written as

\[ \frac{\partial h}{\partial x} \ \text {or} \ \frac{\partial h(x)}{\partial x}. \]

Note
Sometimes the Jacobian is also written as \( H \) in the literature. But this can be a bit confusing since \( H \) is also used for the observation matrix in the linear case.

Then, the Kalman gain equation is given by

\[ K_{f,k} = P_{f,k}^- \left(\frac{\partial h}{\partial x}\right)^T \left(\frac{\partial h}{\partial x} P_{f,k}^- \left(\frac{\partial h}{\partial x}\right)^T + R\right)^{-1} \]

and the covariance update equation is given by

\[ P_{f,k}^+ = \left(I - K_{f,k} \frac{\partial h}{\partial x}\right) P_{f,k}^- \left(I - K_{f,k} \frac{\partial h}{\partial x}\right)^T + K_{f,k} R K_{f,k}^T. \]
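Putting the update equations together, a single EKF measurement update can be sketched as follows (Python/NumPy; the two-state system and the measurement model \( h(x) = x_1^2 \) are hypothetical examples, not the model used later in this document):

```python
import numpy as np

# one EKF measurement update (Joseph-form covariance update), sketched
# for a hypothetical 2-state system with nonlinear measurement h(x) = x1^2
x_prior = np.array([2.0, 1.0])      # a priori state estimate
P_prior = np.diag([0.5, 0.5])       # a priori covariance
R = np.array([[0.1]])               # measurement noise covariance

def h(x):
    return np.array([x[0]**2])

def dH(x):
    # Jacobian of h, evaluated at the a priori estimate
    return np.array([[2 * x[0], 0.0]])

z = np.array([4.2])                 # incoming (made-up) measurement

H = dH(x_prior)
S = H @ P_prior @ H.T + R           # innovation covariance
K = P_prior @ H.T @ np.linalg.inv(S)  # Kalman gain
x_post = x_prior + K @ (z - h(x_prior))
I = np.eye(2)
# Joseph form keeps the covariance symmetric and positive semi-definite
P_post = (I - K @ H) @ P_prior @ (I - K @ H).T + K @ R @ K.T

assert np.allclose(P_post, P_post.T)
assert np.all(np.linalg.eigvalsh(P_post) >= -1e-12)
assert np.trace(P_post) < np.trace(P_prior)  # update reduces uncertainty
```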

Comparison to the linear Kalman filter

Components

Component | LKF | EKF
State | \( x \): \( n \times 1 \) | \( x \): \( n \times 1 \)
State Transition | \( F \): \( n \times n \) | \( f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n} \)
Process Noise | \( Q \): \( n \times n \) | \( Q \): \( n \times n \)
Covariance | \( P \): \( n \times n \) | \( P \): \( n \times n \)
Observation Model | \( H \): \( m \times n \) | \( h: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m} \)
Measurement Noise | \( R \): \( m \times m \) | \( R \): \( m \times m \)
Kalman Gain | \( K \): \( n \times m \) | \( K \): \( n \times m \)

Equations

This table is based on Becker (2023), p. 255.

Equation | LKF | EKF
State extrapolation (prediction) | \( \hat{x}_{f,k+1}^- = F \hat{x}_{f,k}^+ \) | \( \hat{x}_{f,k+1}^- = f\left(\hat{x}_{f,k}^+\right) \)
Covariance extrapolation (prediction) | \( P_{f,k+1}^- = F P_{f,k}^+ F^T + Q \) | \( P_{f,k+1}^- = \frac{\partial f}{\partial x} P_{f,k}^+ \left(\frac{\partial f}{\partial x}\right)^T + Q \)
Kalman gain (update) | \( K_{f,k} = P_{f,k}^- H^T \left(H P_{f,k}^- H^T + R\right)^{-1} \) | \( K_{f,k} = P_{f,k}^- \left(\frac{\partial h}{\partial x}\right)^T \left(\frac{\partial h}{\partial x} P_{f,k}^- \left(\frac{\partial h}{\partial x}\right)^T + R\right)^{-1} \)
State update | \( \hat{x}_{f,k}^+ = \hat{x}_{f,k}^- + K_{f,k} \left(z_k - H \hat{x}_{f,k}^-\right) \) | \( \hat{x}_{f,k}^+ = \hat{x}_{f,k}^- + K_{f,k} \left(z_k - h\left(\hat{x}_{f,k}^-\right)\right) \)
Covariance update | \( P_{f,k}^+ = \left(I - K_{f,k} H\right) P_{f,k}^- \left(I - K_{f,k} H\right)^T + K_{f,k} R K_{f,k}^T \) | \( P_{f,k}^+ = \left(I - K_{f,k} \frac{\partial h}{\partial x}\right) P_{f,k}^- \left(I - K_{f,k} \frac{\partial h}{\partial x}\right)^T + K_{f,k} R K_{f,k}^T \)

Single-Axis Movement Estimation

Now, the previously used single-axis movement estimation will be extended to include sensitivity.

EKF Equations

State Extrapolation

The new state vector is given by

\[ x = \begin{bmatrix} p \\ v \\ a \\ b \\ s \end{bmatrix}, \]

where \( p \) is the position, \( v \) is the velocity, \( a \) is the acceleration of the object and \( b \) and \( s \) are the bias and sensitivity of the accelerometer, respectively.

The state transition model remains linear and is given by

\[ f(x) = F x, \quad F = \begin{bmatrix} 1 & \Delta t & \frac{1}{2} \Delta t^2 & 0 & 0 \\ 0 & 1 & \Delta t & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}. \]

Therefore, there is no need to linearize the state transition model.

The initial state is given by

\[ x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}. \]

Covariance Extrapolation

The estimate covariance is identical to that of an LKF and is given by

\[ P = \begin{bmatrix} \sigma_p^2 & \sigma_{pv} & \sigma_{pa} & \sigma_{pb} & \sigma_{ps} \\ \sigma_{pv} & \sigma_v^2 & \sigma_{va} & \sigma_{vb} & \sigma_{vs} \\ \sigma_{pa} & \sigma_{va} & \sigma_a^2 & \sigma_{ab} & \sigma_{as} \\ \sigma_{pb} & \sigma_{vb} & \sigma_{ab} & \sigma_b^2 & \sigma_{bs} \\ \sigma_{ps} & \sigma_{vs} & \sigma_{as} & \sigma_{bs} & \sigma_s^2 \end{bmatrix}. \]

The a priori estimate of the initial covariance is given by

\[ P_0 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.59^2 & 0 \\ 0 & 0 & 0 & 0 & 0.03^2 \end{bmatrix}, \]

where the position and velocity are assumed to be known perfectly. According to the datasheet, the accelerometer has a bias which is expected to be normally distributed around \( 0 \) with a standard deviation of \( 0.59 \ \mathrm{\frac{m}{s^2}} \). The sensitivity is expected to be normally distributed around \( 1 \) with a standard deviation of \( 3 \% \).

Process Noise

The process noise is identical to that of an LKF and is given by

\[ Q = \begin{bmatrix} \frac{\Delta t^4}{4} & \frac{\Delta t^3}{2} & \frac{\Delta t^2}{2} & 0 & 0 \\ \frac{\Delta t^3}{2} & \Delta t^2 & \Delta t & 0 & 0 \\ \frac{\Delta t^2}{2} & \Delta t & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \sigma_a^2, \]

where \( \sigma_a^2 \) is set to \( 0.0025 \ \mathrm{\frac{m^2}{s^4}} \). The process noise of the bias and sensitivity is assumed to be zero.

Note
For real sensors, bias and sensitivity are expected to change over time and/or temperature. Setting these to zero is a simplification.
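The \( 3 \times 3 \) kinematic block of \( Q \) is just the outer product of the noise gain vector \( G = \left[\frac{\Delta t^2}{2}, \Delta t, 1\right]^T \), which maps an acceleration disturbance into position, velocity, and acceleration, scaled by \( \sigma_a^2 \). A quick sketch (in Python/NumPy rather than MATLAB, with the values from above) verifies this structure:

```python
import numpy as np

dt = 0.1           # example time step (assumed value)
sigma_a2 = 0.0025  # sigma_a^2 in m^2/s^4, as stated above

# the kinematic part of Q is the outer product G G^T scaled by sigma_a^2
G = np.array([dt**2 / 2, dt, 1.0])
Q = np.zeros((5, 5))
Q[:3, :3] = np.outer(G, G) * sigma_a2  # bias/sensitivity entries stay zero

# entry-by-entry form as written in the equation above
Q_expected = np.array([
    [dt**4 / 4, dt**3 / 2, dt**2 / 2],
    [dt**3 / 2, dt**2,     dt       ],
    [dt**2 / 2, dt,        1.0      ],
]) * sigma_a2
assert np.allclose(Q[:3, :3], Q_expected)
assert np.allclose(Q[3:, :], 0) and np.allclose(Q[:, 3:], 0)
```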

Measurement Equation

The measurement vector is given by

\[ z = \begin{bmatrix} p_{\text{meas}} \\ v_{\text{meas}} \\ a_{\text{meas}} \end{bmatrix} = \begin{bmatrix} p \\ v \\ a \cdot s + b \end{bmatrix}. \]

This means that the measurements of the position and velocity are identical to the actual state. This obviously holds only in our case, where we have pseudo-measurements. The acceleration measurement model is close to a real-world accelerometer, while still ignoring temperature and other influences.

The Jacobian is given by

\begin{align*} \frac{\partial h}{\partial x} &= \begin{bmatrix} \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial h_m}{\partial x_1} & \cdots & \frac{\partial h_m}{\partial x_n} \end{bmatrix} \\[3mm] &= \begin{bmatrix} \frac{\partial (p)}{\partial (p)} & \frac{\partial (p)}{\partial (v)} & \frac{\partial (p)}{\partial (a)} & \frac{\partial (p)}{\partial (b)} & \frac{\partial (p)}{\partial (s)} \\ \frac{\partial (v)}{\partial (p)} & \frac{\partial (v)}{\partial (v)} & \frac{\partial (v)}{\partial (a)} & \frac{\partial (v)}{\partial (b)} & \frac{\partial (v)}{\partial (s)} \\ \frac{\partial (a \cdot s + b)}{\partial (p)} & \frac{\partial (a \cdot s + b)}{\partial (v)} & \frac{\partial (a \cdot s + b)}{\partial (a)} & \frac{\partial (a \cdot s + b)}{\partial (b)} & \frac{\partial (a \cdot s + b)}{\partial (s)} \end{bmatrix} \\[3mm] &= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & s & 1 & a \end{bmatrix}. \end{align*}
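As a sanity check on the derivation above, the analytic Jacobian can be compared against finite differences (a Python/NumPy sketch; the state values are arbitrary):

```python
import numpy as np

def h(x):
    # measurement model: position and velocity are observed directly,
    # the accelerometer reads a*s + b
    p, v, a, b, s = x
    return np.array([p, v, a * s + b])

def dH(x):
    # analytic Jacobian derived above, evaluated at x
    p, v, a, b, s = x
    return np.array([
        [1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, s, 1, a],
    ], dtype=float)

# finite-difference check at an arbitrary state [p, v, a, b, s]
x = np.array([1.0, 2.0, 0.5, 0.1, 1.02])
eps = 1e-7
J_num = np.column_stack([
    (h(x + eps * e) - h(x)) / eps for e in np.eye(5)
])
assert np.allclose(dH(x), J_num, atol=1e-5)
```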

Measurement Uncertainty

The measurement uncertainty is identical to that of an LKF and is given by

\[ R_k = \begin{bmatrix} \sigma_{p_{\text{pseudo}_k}}^2 & 0 & 0 \\ 0 & \sigma_{v_{\text{pseudo}_k}}^2 & 0 \\ 0 & 0 & \sigma_a^2 \end{bmatrix}, \]

where the variances \( \sigma_{p_{\text{pseudo}_k}}^2 \) and \( \sigma_{v_{\text{pseudo}_k}}^2 \) are set accordingly to reflect the certainty of the pseudo-measurements. The acceleration measurement noise is assumed to be normally distributed around \( 0 \) with a standard deviation of \( 0.0449 \ \mathrm{\frac{m}{s^2}} \) (result from the Allan Deviation Experiments).

Matlab Implementation Filter

A working example can be found in the example derivation4ExtendedKalmanFilter.m. The following code is a simplified version of the example.

% observations
z; % to be defined prior
% time step
dt; % to be defined prior
% number of observations
N = width(z);
% initialize state vector
x = nan(5, N);
x_0 = [0; 0; 0; 0; 1];
% initialize covariance matrix
P = nan(5, 5, N);
P_0 = zeros(5, 5);
P_0(4, 4) = 0.59^2;
P_0(5, 5) = 0.03^2;
% state transition matrix
F = [1, dt, 0.5*dt^2, 0, 0; ...
     0, 1,  dt,       0, 0; ...
     0, 0,  1,        0, 0; ...
     0, 0,  0,        1, 0; ...
     0, 0,  0,        0, 1];
% process noise matrix
Q = [dt^4/4, dt^3/2, dt^2/2, 0, 0; ...
     dt^3/2, dt^2,   dt,     0, 0; ...
     dt^2/2, dt,     1,      0, 0; ...
     0,      0,      0,      0, 0; ...
     0,      0,      0,      0, 0];
% measurement noise matrix
% (0.0449 m/s^2 from the Allan deviation experiments)
R = [0.5, 0,    0; ...
     0,   0.01, 0; ...
     0,   0,    0.0449^2];
% iterate over observations
for k = 1:N
    if k == 1
        % initial prediction
        x(:, k) = F * x_0;
        P(:, :, k) = F * P_0 * F' + Q;
    else
        % prediction
        x(:, k) = F * x(:, k-1);
        P(:, :, k) = F * P(:, :, k-1) * F' + Q;
    end
    % nonlinear measurement model and its Jacobian
    % at the a priori estimate
    h = [x(1, k); ...
         x(2, k); ...
         x(3, k) * x(5, k) + x(4, k)];
    dH = [1, 0, 0,       0, 0; ...
          0, 1, 0,       0, 0; ...
          0, 0, x(5, k), 1, x(3, k)];
    % handle missing measurements
    if isnan(z(1, k))
        % no position measurement
        dH(1, :) = 0;
        z(1, k) = 0;
    end
    if isnan(z(2, k))
        % no velocity measurement
        dH(2, :) = 0;
        z(2, k) = 0;
    end
    % Kalman gain
    K = P(:, :, k) * dH' / (dH * P(:, :, k) * dH' + R);
    % update (Joseph form)
    x(:, k) = x(:, k) + K * (z(:, k) - h);
    P(:, :, k) = (eye(5) - K * dH) * P(:, :, k) * (eye(5) - K * dH)' + K * R * K';
end

Results after Filtering

The results of the filtering can be seen in the following figure. The position, velocity, and acceleration are shown in the first three subplots. The bias and sensitivity errors are shown in the last subplot.

Figure: Results of the extended Kalman filter

RTS Smoothing Equations

Since our extrapolation model is linear, the RTS smoother equations are identical to the LKF and are given by

\begin{align*} \mathcal{I}_{f,k+1}^- &= \left(P_{f,k+1}^-\right)^{-1} \\ K_{b,k} &= P_{f,k}^+ F^T \mathcal{I}_{f,k+1}^- \\ \hat{x}_{b,k}^+ &= \hat{x}_{f,k}^+ + K_{b,k}(\hat{x}_{b,k+1}^+ - \hat{x}_{f,k+1}^-) \\ P_{b,k}^+ &= P_{f,k}^+ - K_{b,k}(P_{f,k+1}^- - P_{b,k+1}^+) K_{b,k}^T. \end{align*}
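A single backward recursion step can be sketched as follows (Python/NumPy rather than MATLAB; the forward and smoothed quantities at \( k+1 \) are made-up placeholder values, not results from the example above):

```python
import numpy as np

# one backward RTS step for a hypothetical 2-state system
F = np.array([[1.0, 0.1],
              [0.0, 1.0]])
x_fp = np.array([1.0, 0.5])           # forward a posteriori state at k
P_fp = np.diag([0.2, 0.3])            # forward a posteriori covariance at k
x_fm_next = F @ x_fp                  # forward a priori state at k+1
P_fm_next = F @ P_fp @ F.T + 0.01 * np.eye(2)
x_bp_next = np.array([1.06, 0.48])    # smoothed state at k+1 (assumed)
P_bp_next = 0.8 * P_fm_next           # smoothed covariance at k+1 (assumed)

# backward gain, state, and covariance as in the equations above
K_b = P_fp @ F.T @ np.linalg.inv(P_fm_next)
x_bp = x_fp + K_b @ (x_bp_next - x_fm_next)
P_bp = P_fp - K_b @ (P_fm_next - P_bp_next) @ K_b.T

assert np.allclose(P_bp, P_bp.T)
# smoothing can only reduce (or keep) the filtered uncertainty
assert np.trace(P_bp) <= np.trace(P_fp) + 1e-12
```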

Matlab Implementation Smoother

A working example can be found in the example derivation4ExtendedKalmanFilter.m. The following code is a simplified version of the example.

% iterate backwards over observations
for k = N-1:-1:1
    if k == N-1
        % initialization: the last smoothed estimate equals the filtered one
        x_bp(:, N) = x_fp(:, N);
        P_bp(:, :, N) = P_fp(:, :, N);
    end
    I = inv(P_fm(:, :, k+1));
    K = P_fp(:, :, k) * F' * I;
    x_bp(:, k) = x_fp(:, k) + K * (x_bp(:, k+1) - x_fm(:, k+1));
    P_bp(:, :, k) = P_fp(:, :, k) - K * (P_fm(:, :, k+1) - P_bp(:, :, k+1)) * K';
end
Note
The suffixes _fm, _fp, and _bp denote the forward a priori, forward a posteriori, and smoothed (backward a posteriori) estimates, respectively.

Results after Smoothing

The results of the smoothing can be seen in the following figure. The position, velocity, and acceleration are shown in the first three subplots. The bias and sensitivity errors are shown in the last subplot.

Figure: Results of the extended Kalman filter and RTS smoother

Conclusion

The Extended Kalman Filter (EKF) was derived and applied to a single-axis movement estimation. The results show that the EKF is able to estimate the bias and sensitivity of the accelerometer. However, in our tests the EKF with a sensitivity state performed worse than an LKF without one. With smoothing, on the other hand, the EKF-RTS approach performed slightly better than the LKF-RTS approach.