Machine Learning’s ‘Amazing’ Ability to Predict Chaos


Long-term prediction is difficult because of the "butterfly effect," as the pioneers of chaos theory discovered over a century ago. In a complex system (the weather, the economy, or just about anything else), even the slightest perturbation can set off a chain of events that produces a dramatically different future. Because we can never pin down the state of such systems accurately enough to forecast how they will behave, we live in a world of uncertainty.

Now, though, the machines are here to help.

Researchers have used machine learning, the same computational method that has recently led to successes in artificial intelligence, to predict the future evolution of chaotic systems out to startlingly far-off horizons in a series of results published in the journals Physical Review Letters and Chaos. Outside experts are hailing the strategy as innovative and likely to find widespread adoption.

Herbert Jaeger, a professor of computer science at Jacobs University in Bremen, Germany, said: "I find it absolutely astounding how far into the future they forecast" a system's chaotic evolution.

The research was conducted by veteran chaos theorist Edward Ott and four colleagues at the University of Maryland. They used a machine-learning technique called reservoir computing to "learn" the dynamics of the Kuramoto-Sivashinsky equation, an archetypal chaotic system. The evolving solution to this equation behaves like a flame front, flickering as it advances through a combustible medium. According to Jaideep Pathak, Ott's graduate student and the paper's lead author, the equation also describes drift waves in plasmas and other phenomena, and serves as "a test bed for investigating turbulence and spatiotemporal chaos."

After training itself on past data from the evolving solution to the Kuramoto-Sivashinsky equation, the researchers' reservoir computer could closely predict how the flamelike system would continue to evolve out to eight "Lyapunov times" into the future, roughly eight times further ahead than previous methods allowed. The Lyapunov time is the characteristic time it takes for two nearly identical states of a chaotic system to diverge exponentially, so it typically sets the horizon of predictability.
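The exponential divergence that defines the Lyapunov time is easy to see in a toy system. The sketch below (my illustration, not from the paper) iterates the logistic map, a standard one-variable chaotic system, from two almost identical starting points and records how fast the gap between them grows:

```python
import numpy as np

def logistic(x):
    # Fully chaotic logistic map (r = 4), a classic toy chaotic system.
    return 4.0 * x * (1.0 - x)

# Two trajectories that start almost identically.
x, y = 0.3, 0.3 + 1e-12
gaps = []
for _ in range(30):
    x, y = logistic(x), logistic(y)
    gaps.append(abs(x - y))

# The gap grows roughly exponentially (doubling per step on average for
# this map) until it saturates at the size of the attractor; that
# saturation is what caps the prediction horizon at a few Lyapunov times.
print(gaps[0], gaps[10], gaps[-1])
```

Measured in Lyapunov times, every chaotic system shows this same pattern: a tiny initial error, exponential growth, then total decorrelation.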

Chaos theorist Holger Kantz of the Max Planck Institute for the Physics of Complex Systems in Dresden, Germany, praised the eight-Lyapunov-time forecast as "really fantastic." The machine-learning technique, he said, "is, in a sense, nearly as good as knowing the truth."

The algorithm knows nothing about the Kuramoto-Sivashinsky equation itself; it only sees recorded data about the equation's evolving solution. This is what makes the machine-learning approach so powerful: dynamicists' attempts to describe and forecast chaotic systems are often hampered by not knowing the equations that govern them. Ott and colleagues' results suggest you don't need the equations, only data. Kantz said the study suggests that one day we might be able to forecast weather with machine-learning algorithms rather than sophisticated atmospheric models.

Beyond weather forecasting, experts say the machine-learning technique could help monitor cardiac rhythms for signs of impending heart attacks and monitor neuronal firing patterns in the brain for signs of impending spikes. More speculatively, it might also help predict rogue waves, which endanger ships, and possibly even earthquakes.

Ott is particularly hopeful that the new tools will prove useful for giving advance warning of solar storms, like the one that erupted across 35,000 kilometers of the sun's surface in 1859. That magnetic outburst created auroras visible around the world and knocked out some telegraph systems, while generating enough voltage in other lines that they could operate with their power switched off. Experts warn that if such a solar storm struck Earth today, it would severely damage our technological infrastructure. If you knew a storm was coming, Ott said, you could simply shut off the power and turn it back on later.

He, Pathak, and their colleagues Brian Hunt, Michelle Girvan, and Zhixin Lu (who is now at the University of Pennsylvania) achieved their results by combining existing tools. Six or seven years ago, when the powerful algorithm known as "deep learning" was just beginning to conquer AI tasks like image and speech recognition, they started reading up on machine learning and thinking of clever ways to apply it to chaos. They learned of a handful of promising results that predated the deep-learning revolution. In the early 2000s, Jaeger and fellow German chaos theorist Harald Haas used a network of randomly connected artificial neurons, the "reservoir" in reservoir computing, to learn the dynamics of three chaotically coevolving variables. After training on the three series of numbers, the network could predict the future values of the three variables out to an impressively distant horizon.

However, when there were more than a few interacting variables, the computations became unmanageable. Ott and his colleagues needed a more efficient scheme to make reservoir computing relevant for large chaotic systems, which have huge numbers of interrelated variables. Every position along the front of an advancing flame, for instance, has velocity components in three spatial directions.

The simple solution took years to find. "What we exploited was the localization of the interactions" in spatially extended chaotic systems, Pathak said. Locality means that variables in one place are influenced by variables at nearby places but not by places far away. "By employing it," Pathak said, "we can basically divide up the problem into parts." The task can then be parallelized: one reservoir of neurons learns about one patch of a system, another reservoir learns about the next patch, and so on, with slight overlaps between neighboring domains to account for their interactions.
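The locality trick can be sketched in a few lines. This toy decomposition (my illustration; the patch size and overlap are arbitrary, not the paper's values) splits a 1-D row of grid points into patches that share a few points with their neighbors, so each patch's reservoir sees a sliver of the adjacent regions:

```python
import numpy as np

def overlapping_patches(values, patch_size, overlap):
    """Split a 1-D array of grid values into patches that extend
    `overlap` points into each neighboring region, so each local
    reservoir sees a little of the adjacent patches."""
    patches = []
    n = len(values)
    for start in range(0, n, patch_size):
        lo = max(0, start - overlap)
        hi = min(n, start + patch_size + overlap)
        patches.append(values[lo:hi])
    return patches

grid = np.arange(20)   # 20 spatial points along the "flame front"
patches = overlapping_patches(grid, patch_size=5, overlap=2)
for p in patches:
    print(p)
```

Each patch would then get its own reservoir, and the overlapping points are how neighboring reservoirs stay consistent with one another.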

Given enough computing resources, parallelization lets the reservoir computing approach handle chaotic systems of virtually any size.

Ott described reservoir computing as a three-step procedure. Say you want to use it to forecast the evolution of a fire. First, you measure the height of the flame at five different points along the flame front, and you keep measuring the height at these points as the flickering flame advances over time. You feed these data streams into randomly chosen artificial neurons in the reservoir. The input data triggers the neurons to fire, which triggers connected neurons to fire in turn, sending a cascade of signals through the network.
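This first step can be sketched with the usual reservoir-computing conventions: a fixed random coupling matrix among the neurons, a fixed random input matrix, and a tanh update. Everything here (the sizes, the scalings, the sine stand-in for flame-height measurements) is illustrative, not the Maryland group's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_reservoir = 5, 300   # 5 flame heights, 300 random neurons
A = rng.uniform(-1, 1, (n_reservoir, n_reservoir))     # random neuron-to-neuron links
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))        # scale spectral radius below 1
W_in = rng.uniform(-1, 1, (n_reservoir, n_inputs))     # random input-to-neuron links

def drive(r, u):
    # One tick: the 5 measurements nudge the neurons, which nudge each other.
    return np.tanh(A @ r + W_in @ u)

r = np.zeros(n_reservoir)
for t in range(100):
    u = np.sin(0.1 * t + np.arange(n_inputs))   # stand-in measurement stream
    r = drive(r, u)
print(r[:3])
```

The coupling and input matrices are never trained; they stay random. Only the readout, built in step two, is learned.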

The second step is to make the neural network learn the dynamics of the evolving flame front from the input data. To do this, as data is fed into the system you also measure the signal strengths of several randomly chosen neurons in the reservoir. Weighting and combining these signals in five different ways produces five numbers as outputs. The goal is to tune the weights of the various signals that go into calculating the outputs until those outputs consistently match the next set of inputs: the five new heights measured along the flame front a moment later. What you want, Ott explained, is for the output to become the input at a slightly later time.

To find the correct weights, the algorithm compares each set of outputs, the predicted flame heights at the five points, against the next set of inputs, the actual measured heights, increasing or decreasing the weights of the various signals according to how their combination would have produced the right values. As the weights are tuned, the predictions improve, until the algorithm can reliably forecast the flame's state one time step later.
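In practice this weight-tuning is usually done in one shot by linear (ridge) regression rather than incremental updates: collect the reservoir states seen during training, then solve for the output weights that best map each state to the next measurement. A minimal sketch, using a one-dimensional sine signal as a stand-in for the five flame heights (ridge regression is the standard reservoir-computing recipe, though the paper's exact procedure may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
n_res = 200
A = rng.uniform(-1, 1, (n_res, n_res))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
W_in = rng.uniform(-1, 1, (n_res, 1))

# Training signal: a sine wave stands in for the flame-height data.
ts = np.sin(0.2 * np.arange(600))

r = np.zeros(n_res)
states, targets = [], []
for t in range(len(ts) - 1):
    r = np.tanh(A @ r + W_in @ ts[t:t+1])
    states.append(r)
    targets.append(ts[t + 1])       # the output should become the next input

R = np.array(states[50:])           # drop the initial transient
Y = np.array(targets[50:])

# Ridge regression: weights that map reservoir states to next-step values.
beta = 1e-6
W_out = np.linalg.solve(R.T @ R + beta * np.eye(n_res), R.T @ Y)

pred = R @ W_out
rmse = np.sqrt(np.mean((pred - Y) ** 2))
print("train RMSE:", rmse)
```

Because the reservoir itself stays fixed, the only thing "learned" is this linear readout, which is what makes training so cheap.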

In the third step, Ott said, you actually make the prediction. Having learned the system's dynamics, the reservoir can reveal how the system will evolve. The network essentially asks itself what will happen: outputs are fed back into the network as the new inputs, whose outputs are fed back in as inputs, and so on, projecting forward how the heights at the five positions on the flame front will change. Other reservoirs working in parallel predict the evolution of the height elsewhere in the flame.
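Putting the three steps together, the closed-loop sketch below trains on a sine wave (again a stand-in for the real flame-height data, with my own illustrative sizes) and then lets the reservoir run on its own predictions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_res = 200
A = rng.uniform(-1, 1, (n_res, n_res))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
W_in = rng.uniform(-1, 1, (n_res, 1))

ts = np.sin(0.2 * np.arange(600))

# Steps one and two: drive the reservoir, then fit next-step output weights.
r = np.zeros(n_res)
states = []
for t in range(len(ts) - 1):
    r = np.tanh(A @ r + W_in @ ts[t:t+1])
    states.append(r.copy())
R, Y = np.array(states[50:]), ts[51:]
W_out = np.linalg.solve(R.T @ R + 1e-6 * np.eye(n_res), R.T @ Y)

# Step three: feed each prediction back in as the next input.
preds = []
u = ts[-1:]
for _ in range(60):
    r = np.tanh(A @ r + W_in @ u)
    u = np.array([r @ W_out])
    preds.append(u[0])

truth = np.sin(0.2 * np.arange(600, 660))
print("closed-loop RMSE:", np.sqrt(np.mean((np.array(preds) - truth) ** 2)))
```

For a genuinely chaotic signal, this feedback loop is where the Lyapunov clock starts ticking: the prediction tracks the truth for a while and then inevitably diverges.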

A plot in the researchers' PRL paper, which was published in January, shows how closely their prediction of the flamelike solution to the Kuramoto-Sivashinsky equation matches the true solution, out to eight Lyapunov times before chaos finally wins and the actual and predicted states diverge.

The standard approach to forecasting a chaotic system is to measure its conditions as accurately as possible at a single moment, use that data to calibrate a physical model, and then evolve the model forward. Because errors grow exponentially, as a ballpark estimate you would have to measure a typical system's initial conditions 100,000,000 times more accurately to predict its future evolution eight times further ahead.

As a result, Ulrich Parlitz of the Max Planck Institute for Dynamics and Self-Organization in Göttingen, Germany, who, like Jaeger, applied machine learning to low-dimensional chaotic systems in the early 2000s, called machine learning "a highly helpful and powerful technique." He believes it is not just effective in the example the Maryland team presents, but applicable to many other processes and systems. In a paper soon to be published in Chaos, Parlitz and a collaborator applied reservoir computing to predict the dynamics of "excitable media," such as cardiac tissue. Parlitz expects that deep learning, while more complicated and computationally expensive than reservoir computing, will also work well for taming chaos, as will other machine-learning approaches. Recently, researchers at the Massachusetts Institute of Technology and ETH Zurich achieved results similar to the Maryland team's using a "long short-term memory" neural network, which has recurrent loops that enable it to retain temporary information for a long time.

Since the work reported in their PRL paper, Ott, Pathak, Girvan, Lu, and their collaborators have come closer to a practical implementation of their prediction technique. In new research accepted for publication in Chaos, they showed that combining the data-driven, machine-learning approach with traditional model-based prediction yields better forecasts of chaotic systems such as the Kuramoto-Sivashinsky equation. Ott sees this as a more likely avenue for improving weather prediction and similar efforts, since we don't always have complete high-resolution data or perfect physical models. "What we should do is employ the good information that we have where we have it," he said, and where there is ignorance, use machine learning to fill in the gaps. The reservoir's predictions can essentially calibrate the models; in the case of the Kuramoto-Sivashinsky equation, reliable predictions extend out to 12 Lyapunov times.

Depending on the system, a Lyapunov time can last anywhere from a few milliseconds to millions of years. (For the weather, it's a few days.) The shorter it is, the more quickly similar states diverge toward dissimilar futures, and the more fragile the system, the more vulnerable to the butterfly effect. Nature is full of chaotic systems that go haywire more or less quickly. Strangely enough, chaos itself is hard to define. "It's a word that most people in dynamical systems use, but they kind of hold their noses while saying it," said Amie Wilkinson, a professor of mathematics at the University of Chicago. "You feel a little goofy for calling something chaotic when there is no accepted mathematical definition or necessary and sufficient conditions," she remarked. Kantz agreed that no definition is simple. In some cases, he noted, tuning a single parameter can flip a system from chaotic to stable or vice versa.

Wilkinson and Kantz both characterize chaos in terms of stretching and folding, much like the repeated stretching and folding of dough in making puff pastry. Each patch of dough stretches horizontally under the rolling pin, nearby points pulling apart exponentially quickly in the two horizontal directions. Then the dough is folded over and flattened, compressing nearby patches in the vertical direction. The weather, wildfires, the stormy surface of the sun, and all other chaotic systems act just this way, according to Kantz. Stretching is needed to produce the exponential divergence of trajectories, while folding, which arises from nonlinear relationships between the system's variables, keeps trajectories from running off to infinity.

The stretching and compressing in the various dimensions correspond to a system's positive and negative "Lyapunov exponents," respectively. In another recent paper in Chaos, the Maryland team showed that their reservoir computer could successfully learn the values of these characteristic exponents from data about a system's evolution. Exactly why reservoir computing is so good at learning the dynamics of chaotic systems is not yet well understood, beyond the idea that the computer tunes its own formulas in response to data until the formulas replicate the system's dynamics. The technique works so well, in fact, that Ott and some of the other Maryland researchers now intend to use chaos theory to better understand the internal workings of neural networks.
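For a system simple enough to handle analytically, the largest Lyapunov exponent can be checked directly: it is the average logarithmic stretch per step. The sketch below (my illustration, unrelated to the reservoir method) estimates it for the fully chaotic logistic map, where the exact value is ln 2 ≈ 0.693:

```python
import numpy as np

# Estimate the largest Lyapunov exponent of x -> 4x(1-x) by averaging
# log|f'(x)| = log|4(1 - 2x)| along a long trajectory.
x = 0.3
stretches = []
for _ in range(100_000):
    stretches.append(np.log(abs(4.0 * (1.0 - 2.0 * x))))
    x = 4.0 * x * (1.0 - x)

lyapunov = np.mean(stretches)
print(lyapunov)   # close to ln 2 ~= 0.693
```

A positive average stretch is the signature of chaos; the reservoir computer's achievement is recovering numbers like this from observed data alone, without access to f or its derivative.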