
CrustalTrudger t1_j7u44hi wrote

In the simplest sense, you're guaranteed to get a pattern, but one we already know: seismic hazard is highest around plate margins. Beyond that, sure, there's been a lot of interest in whether various machine learning or AI approaches might have value in forecasting. For example, there's been interest in using such approaches for "nowcasting" (e.g., Rundle et al., 2022), which basically tries to leverage ML techniques to figure out where in the seismic cycle a particular area might be, and thus improve the temporal resolution of our forecasts, i.e., narrow down how far into the future we might expect a large earthquake on a given system.
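To make the "where in the seismic cycle" idea concrete, here's a minimal sketch of the natural-time flavor of nowcasting: score a region by comparing the count of small earthquakes since the last large one against the counts accumulated in past completed cycles. This is a toy illustration with made-up numbers, not the actual method or code from Rundle et al.:

```python
# Toy sketch of natural-time "nowcasting" (simplified illustration, not the
# published algorithm). The earthquake potential score (EPS) for a region is
# the fraction of past large-event cycles whose small-earthquake count was
# <= the count accumulated since the region's most recent large earthquake.

def earthquake_potential_score(past_cycle_counts, current_count):
    """past_cycle_counts: small-quake counts between consecutive large quakes
    (one entry per completed historical cycle).
    current_count: small quakes observed since the last large quake.
    Returns a value in [0, 1]; higher = apparently further along the cycle."""
    if not past_cycle_counts:
        raise ValueError("need at least one completed cycle")
    below = sum(1 for n in past_cycle_counts if n <= current_count)
    return below / len(past_cycle_counts)

# Hypothetical numbers: five completed cycles with these small-event counts,
# and 110 small events since the last large earthquake.
print(earthquake_potential_score([80, 120, 95, 150, 100], 110))  # 0.6
```

The appeal is that it sidesteps calendar time entirely (event counts stand in for elapsed time), but note it still needs several *completed* cycles per region to calibrate against, which circles back to the data problem below.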

Ultimately though, anyone who's even dabbled with ML approaches (and specifically with supervised-learning approaches, which are largely what's relevant for forecasting) will recognize that the outcomes are typically only as good as the training data you can provide, and this is where we hit a pretty big stumbling block. We are considering processes that, in many cases, have temporal scales of 100s to 1000s of years at minimum, but may also have significant variations occurring over 100,000- to 1,000,000-year timescales. In terms of relatively robust and complete datasets from global seismology records, we have maybe 50 years of data. The paleoseismology and archaeoseismology records are important for forecasting, but also very spotty, so we are missing huge amounts of detail; trying to include them in a training dataset is pretty problematic. Beyond that, there are significant problems generally with expecting a method that is agnostic to the mechanics of a system to extrapolate behaviors from a severely limited training dataset.
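A quick back-of-the-envelope sketch of why the ~50-year record is so limiting for supervised learning (all numbers here are illustrative assumptions, not measured values): if large earthquakes on a given fault recur on the order of every 500 years and the instrumental catalog spans ~50 years, most faults contribute zero positive examples to any training set.

```python
import math
import random

# Illustrative assumptions (not real catalog values):
RECURRENCE_YEARS = 500   # assumed mean recurrence interval for large events
CATALOG_YEARS = 50       # rough length of the instrumental record
N_FAULTS = 1000          # hypothetical number of fault systems

# Treat large-event occurrence on each fault as a Poisson process, so
# P(at least one event inside the catalog window) = 1 - exp(-T / tau).
p_event = 1 - math.exp(-CATALOG_YEARS / RECURRENCE_YEARS)

random.seed(0)
faults_with_event = sum(1 for _ in range(N_FAULTS) if random.random() < p_event)

print(f"P(large event observed per fault) = {p_event:.3f}")
print(f"{faults_with_event}/{N_FAULTS} faults contribute any positive label")
```

Under these assumptions only ~10% of faults yield even one "large earthquake happened" label in the whole training window, and the class imbalance gets far worse once you account for the 100,000+ year variability mentioned above, which the 50-year window can't sample at all.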

At the end of the day, sure, you could pump global seismicity into a variety of ML or AI techniques (and people have), but it's hard to expect reasonable performance when you can only train such methods with fractions of a percent of the data needed to adequately characterize the system, beyond very specific use cases like those highlighted above.
