The American Society of Mechanical Engineers is probably not the first source you'd consult for a fundamental paper in applied mathematics. Nonetheless, in 1960 and 1961, ASME's Journal of Basic Engineering published not one, but two highly influential mathematics papers--papers that helped man set foot on the moon in the late 1960s and that today help commercial airline pilots get from New York to Seattle without winding up in, say, San Diego.
Titled "A New Approach to Linear Filtering and Prediction Problems" and "New Results in Linear Filtering and Prediction Theory," the two papers--the first by Rudolf Kalman, then of the Research Institute for Advanced Studies in Baltimore, the second by Kalman and Richard Bucy, then of the Johns Hopkins Applied Physics Laboratory--introduced a technique now widely known as Kalman filtering. In the past 30 years, Kalman filtering (often also called Kalman-Bucy filtering) has established itself as a workhorse among techniques in signal processing and control theory.
Kalman filtering addresses an age-old question: How do you get accurate information out of inaccurate data? More pressingly, how do you update a "best" estimate for the state of a system as new, but still inaccurate, data pour in? Much as a coffee filter serves to keep undesirable grounds out of your morning mug, the Kalman filter is designed to strip unwanted noise out of a stream of data.
The applications are endless. Kalman filtering has proved useful in navigational and guidance systems, radar tracking, sonar ranging, and satellite orbit determination, to name just a few areas. Kalman and Bucy's original papers have generated thousands of other papers on aspects and applications of filtering. Their work has also stimulated mathematical research in such areas as numerical methods for linear algebra.
The basic idea behind the Kalman filter, which is relatively simple, is described most easily in a discrete setting. (Loosely speaking, Kalman knocked off the discrete case in the first paper, and he and Bucy dealt with the technically more difficult continuous case in the second paper.)
Suppose you have a system, described by an m-dimensional "state" vector $x$, which you observe at times $k = 0, 1, 2, \ldots$ . Each of your observations is corrupted by noise, however, so that what you actually see at time $k$ is $y(k) = x(k) + n(k)$, where $y$ is the observed value and $n$ is the noise. (More generally, $y(k) = f(k)x(k) + n(k)$, where $f(k)$ is some linear transformation. For example, what you observe might be a single number, such as the bearing to a target, whose state vector has coordinates for position and velocity.) What you want is a "best" estimate, in some suitable sense, for $x(k)$ based on the observed values $y(0), y(1), \ldots, y(k)$ and knowledge of the statistical properties of the noise.
Thus stated, the problem can't be solved. The missing ingredient is an equation that connects the current state, $x(k)$, with the previous state, $x(k - 1)$. In Kalman filtering, the connection is assumed to be linear, and there is an additional "noise" vector representing uncertainties in the dynamics of the system: $x(k) = M(k)x(k - 1) + u(k)$, where $M(k)$ is a linear transformation describing the time-evolution of the system and $u(k)$ has known (or postulated) statistical properties.
Moreover, Kalman filtering assumes that the noise is all Gaussian, with mean value 0, and "white." In effect, then, everything you need to know about $n(k)$ and $u(k)$ is contained in their respective covariance matrices.
Out of all this, Kalman plucked an eye-opening conclusion: The best estimate for the state $x(k)$, together with a covariance matrix for the error in the state-vector estimate, can be obtained recursively from the previous best estimate and its covariance matrix. In effect, the Kalman filter uses each new observation to update a probability distribution for the state of the system, with no need ever to refer back to any earlier observations.
Conceptually, this is analogous to updating a running average, $A_n$, of numbers $a_1, a_2, \ldots$ by the recursive formula $A_n = ((n - 1)/n)A_{n-1} + (1/n)a_n$ rather than by starting over each time with the formula $A_n = (1/n)(a_1 + a_2 + \cdots + a_n)$.
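The analogy is easy to check in a few lines of Python (the function and variable names here are ours, chosen for illustration):

```python
def running_average(samples):
    """Average a stream recursively: A_n = ((n-1)/n) * A_{n-1} + (1/n) * a_n."""
    avg = 0.0
    for n, a in enumerate(samples, start=1):
        avg = ((n - 1) / n) * avg + (1 / n) * a  # past samples never revisited
    return avg

data = [2.0, 4.0, 6.0, 8.0]
print(running_average(data))   # matches sum(data) / len(data), i.e. 5.0 (up to rounding)
```

Only the running value and the count are carried forward, which is exactly the property the Kalman filter has for its state estimate.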
The significance is clear: The Kalman filter does no more work for the millionth estimate than it does for the first. Moreover, the covariance computations do not depend on the observed values $y(k)$ but involve only the covariance matrices of the noise; in many cases, therefore, that part of the filter can be precomputed. The net result is an algorithm tailored to real-time applications, where data keep coming in and decisions have to be made on the spot.
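To make the recursion concrete, here is a toy one-dimensional Kalman filter in Python. This is our own illustrative sketch, not code from the original papers: the scalars M, Q, and R stand in for the transformation $M(k)$ and the covariances of $u(k)$ and $n(k)$, here taken constant in time.

```python
import random

# Toy scalar model: x(k) = M*x(k-1) + u(k), y(k) = x(k) + n(k),
# with u ~ N(0, Q) and n ~ N(0, R). All names are ours, for illustration.
M, Q, R = 1.0, 0.01, 0.25   # dynamics; process- and measurement-noise variances

def kalman_step(x_est, P, y):
    """One recursive update: previous best estimate and its variance in, new ones out."""
    # Predict: propagate the previous estimate through the dynamics.
    x_pred = M * x_est
    P_pred = M * P * M + Q
    # Update: blend prediction and observation via the Kalman gain.
    K = P_pred / (P_pred + R)          # the gain involves only the covariances
    x_new = x_pred + K * (y - x_pred)  # correct by the weighted discrepancy
    P_new = (1 - K) * P_pred
    return x_new, P_new

random.seed(0)
truth, x_est, P = 5.0, 0.0, 1.0
for _ in range(200):
    truth = M * truth + random.gauss(0, Q ** 0.5)   # true state drifts
    y = truth + random.gauss(0, R ** 0.5)           # noisy observation
    x_est, P = kalman_step(x_est, P, y)             # no past observations needed

print(round(x_est - truth, 2), round(P, 3))  # small error; P well under R
```

Note that the gain K and the variance P never touch the observed values, which is the precomputation point made above.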
"It's used in just about every inertial navigation system," says James Burrows of Boeing Computer Services in Seattle. Navigation systems do more than help the pilot keep track of where the plane is headed: "Essentially every system on the airplane gets some information from the inertial navigator," Burrows explains.
Hexad, the "fault-tolerant" gyro system on the new Boeing 777, uses a Kalman filter. The system, which was developed by Honeywell Inc., consists of six ring-laser gyros, each of which acts as a primary inertial rate sensor for a particular direction. It is the redundancy of Honeywell's design--three gyros (ideally) suffice for navigation--that produces the fault tolerance: A single malfunctioning component is easily identified and safely ignored.
One of the functions of the Kalman filter, Burrows says, is to estimate the errors of each gyro with respect to the others. The result should be a substantial reduction in the number of "false alarms," which make for unnecessary (and thus expensive) maintenance.
Important as it is, inertial navigation is not the only use for Kalman filters. "Anything that moves, if it's automated, is a candidate for a Kalman filter," says Blaise Morton, a research scientist at Honeywell's Systems and Research Center in Minneapolis. One possible use, still well in the future, is in the "smart" highway systems that will eventually orchestrate the smooth movement of cars at high speeds in tightly bunched (read: tailgating) groups known as platoons. "That's a natural problem just begging for a Kalman filter," Morton says.
Another possibility, under study at Boeing and elsewhere, is a problem known as active noise control. "Noise" in this case refers to ordinary unwanted sounds--from an airplane's engine or a neighbor's stereo, for example.
Richard Burkhart at Boeing is particularly interested in eliminating noise that penetrates the cockpit or cabin of an airplane. The idea is to create zones of quiet by broadcasting "antinoise" to cancel the unwanted noise. A Kalman filter is needed to estimate the coefficients used in constructing the antinoise. The usual implementation of the filter isn't good enough, says Burkhart: The dimension of the state vector is in the thousands, which makes the customary matrix computations impractical. Burkhart and colleagues have come up with a nontraditional implementation of the Kalman filter that gets around this computational bottleneck.
Even though specific applications invariably call for specialized versions, part of the advantage of the Kalman filter is that, by now at least, it's a standard technique. Engineers can treat it as off-the-shelf technology. If the Kalman filter weren't around, Morton speculates, people would probably wind up using ad hoc algorithms for each application. In effect, "you'd have to reinvent the wheel every time" a new problem came up, he says.
That's close to where things stood 33 years ago. Before the development of the Kalman filter, practitioners were limping along with a device known as the Wiener filter, which was developed in the 1940s by Norbert Wiener of MIT in response to some of the very practical technological problems that arose during the Second World War. The Wiener filter was a way of obtaining best estimates by analyzing time series in the frequency domain--in other words, by taking the Fourier transform of everything in sight.
Wiener's theory was profoundly influential, but the filter itself had a serious limitation: It applied only to "stationary" time series--systems whose dynamics were constant. During the 1950s, with the advent of guided missiles and the first rumblings of the space program, the need to treat nonstationary problems became pressing. Wiener filtering alone wasn't getting the job done.
The Kalman filter solved the problem. Kalman filtering applies to both stationary and nonstationary problems. Moreover, by staying strictly in the time domain, the Kalman filter avoids some of the analytic and computational pitfalls of Wiener filtering. The newer theory's algebraic approach also lends itself more readily to multidimensional problems than does Wiener's analytic machinery.
Even so, the new approach didn't catch on right away. Bucy recalls that "no one was very interested in the papers originally." But then researchers at NASA latched onto the Kalman filter as a way of dealing with problems in satellite orbit determination. Kalman filtering rapidly became a mainstay of aerospace engineering. It was used, for example, in the Ranger, Mariner, and Apollo missions of the 1960s.
In particular, the on-board computer that guided the descent of the Apollo 11 lunar module to the moon had a Kalman filter. That computer was also communicating with a system of four Doppler radar stations on Earth that were monitoring the module's position. It was important that estimates from all sources be good: The Earth-based estimates were used to adjust the on-board system; if they had disagreed too much with the on-board estimates, the mission would have had to be aborted.
It could have happened. According to William Lear, an aerospace engineer who was then at TRW in Redondo Beach, California, NASA contacted him about nine months before Apollo 11's scheduled launch because their Earth-based tracking program wasn't working. Lear, who now works for Draper Labs at the Johnson Space Center, wrote a 21-state Kalman filter program, which went into the Doppler radar system. The final check of the program, Lear recalls, was done the day before Armstrong, Aldrin, and Collins took off.
Since then, the Kalman filter has found its way into data streams everywhere, from seismology to bioengineering to econometrics. "In fact," Bucy says, "I think sometimes it's overused."
But that's the fate of any tool for which there's essentially no alternative. (It happened with the Wiener filter as well, until Kalman and Bucy came along.) The main drawback to Kalman filtering is its reliance on linearity: The theory requires it, but real-world problems don't always play by the rules.
"There's a certain range of quite commonly encountered problems that it [the Kalman filter] handles effectively, but it certainly doesn't handle any serious nonlinearity," says Mark Davis, a control theorist at Imperial College in London. Researchers have developed a method, called the "extended" Kalman filter, that adjusts a linear approximation at each filter update. That works on some problems where the nonlinear effects are small. However, says Bucy, "there are problems where this is not the case, and people use the filter and they have troubles with it. They try all sorts of fixes, but basically the problem is such that the linear theory does not apply."
That would seem to militate for the creation of a new way of filtering, one not restricted by linearity. (After all, as the complaint goes, if we can put a man on the moon, why can't we find a decent nonlinear filter?) Actually, much of the theory is already in place. However, Davis says, "It hasn't really been employed very much because it's just too complicated."
But if history teaches any lesson, it's that everything could change with the appearance of one or two new papers. The challenge might be to estimate which journal will publish them.
September 17, 1999