marcus_hk t1_j2xqe50 wrote

>Are there other recent deep learning based alternatives?

Structured State Space Models

Transformers seem best suited to forming associations among discrete elements; that's what self-attention is, after all. Where transformers perform well over very long ranges (in audio generation, for example), there is typically heavy use of Fourier transforms and CNNs as "feature extractors," and the transformer does not process the raw data directly.

The S4 model linked above treats time-series data not as discrete samples but as a continuous signal, and consequently it performs much better on very long sequences.
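To make the idea concrete, here is a minimal sketch of the underlying state space view: a continuous-time system x'(t) = A x(t) + B u(t), y(t) = C x(t), discretized (here with the bilinear transform) and run as a linear recurrence over the input sequence. The random A, B, C below are placeholders for illustration only; the actual S4 model uses a structured, HiPPO-based parameterization and a convolutional formulation for efficiency.

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of the continuous-time system."""
    N = A.shape[0]
    I = np.eye(N)
    inv = np.linalg.inv(I - (dt / 2) * A)
    A_bar = inv @ (I + (dt / 2) * A)
    B_bar = inv @ (dt * B)
    return A_bar, B_bar

def run_ssm(u, A, B, C, dt=1.0):
    """Scan the discretized SSM over a 1-D input sequence u."""
    A_bar, B_bar = discretize(A, B, dt)
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:                      # linear recurrence over time steps
        x = A_bar @ x + B_bar[:, 0] * u_t
        ys.append(C @ x)
    return np.array(ys)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 4                                        # state size (toy example)
    A = rng.normal(size=(N, N)) - N * np.eye(N)  # roughly stable dynamics
    B = rng.normal(size=(N, 1))
    C = rng.normal(size=(N,))
    u = np.sin(np.linspace(0, 10, 200))          # toy input signal
    y = run_ssm(u, A, B, C, dt=0.05)
    print(y.shape)                               # (200,)
```

Because the step size dt is explicit, the same learned parameters can in principle be applied to signals sampled at different rates, which is part of why the continuous-signal view is a natural fit for raw time-series data.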
