
bushrod t1_iyjxns1 wrote

The analysis relates to time series prediction problems. Isn't it fair to say vision and language do not fall under that umbrella?

17

mtocrat t1_iyk1n65 wrote

Consider spoken language, and you're back in the realm of time series. Obviously, simple statistical methods can't deal with that, though.

13

bushrod t1_iyk33jc wrote

Right, even though language is a form of time series, in practice it doesn't use time series prediction (TSP) methods. Not surprisingly, though, transformers are being applied to TSP problems.

6

Warhouse512 t1_iykw25k wrote

Eh, consider predicting where pedestrians are going, or predicting next frames in general. Even images have temporal forecasting use cases.

3

ThePhantomPhoton t1_iyk2wnq wrote

I think you have a good argument for images, but language is more challenging, because we rely on positional encodings (a kind of "time") to provide contextual clues, and those beat out the following form of statistical language model: Pr{x_{t+1} | x_0, x_1, ..., x_t} (Edit: that is, predicting the next word in a sequence given all the preceding words in the sequence)
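To make that conditional concrete, here is a rough sketch of a count-based next-word model. It is only an illustration: conditioning on the full history is rarely estimable from counts, so this assumes a fixed-order (n-gram) truncation of the context, which is my simplification and not something stated above. The function names and the toy corpus are made up for the example.

```python
from collections import Counter, defaultdict


def fit_ngram(tokens, order=2):
    """Count how often each token follows each context of (order - 1) tokens."""
    counts = defaultdict(Counter)
    k = order - 1
    for i in range(k, len(tokens)):
        context = tuple(tokens[i - k:i])
        counts[context][tokens[i]] += 1
    return counts


def next_token_probs(counts, history, order=2):
    """Estimate Pr{x_{t+1} | last (order - 1) tokens of history} from counts."""
    k = order - 1
    # Assumption: truncate the history to the last k tokens (Markov approximation).
    context = tuple(history[-k:]) if k > 0 else ()
    following = counts.get(context, Counter())
    total = sum(following.values())
    if total == 0:
        return {}  # unseen context: this simple model has nothing to say
    return {tok: n / total for tok, n in following.items()}


# Hypothetical toy corpus, just to exercise the functions.
corpus = "the cat sat on the mat the cat ran".split()
counts = fit_ngram(corpus, order=2)
probs = next_token_probs(counts, ["on", "the"], order=2)
```

With this toy corpus, "the" is followed by "cat" twice and "mat" once, so `probs` puts two thirds of the mass on "cat". Note that the model only ever sees a fixed window, which is exactly the limitation that longer-context models are meant to get around.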

2