Submitted by No_Captain_856 t3_ys6pve in MachineLearning
_d0s_ t1_ivxk6y9 wrote
It's also used for the spatial embedding of patches in an image.
Besides the positional embedding, transformers also use the attention mechanism, which can be beneficial for some problems on its own.
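A minimal sketch of that idea (assuming a ViT-style setup where a learned positional embedding is simply added to each patch embedding; all shapes and names here are illustrative, not from a specific library):

```python
# Illustrative only: learned positional embeddings added to image-patch embeddings,
# the 2D analogue of positional encoding for sequences. Not a specific library's API.
import numpy as np

rng = np.random.default_rng(0)
num_patches, dim = 196, 64                        # e.g. a 14x14 grid of patches

patch_emb = rng.normal(size=(num_patches, dim))   # stand-in for linearly projected patches
pos_emb = rng.normal(size=(num_patches, dim))     # one (learned) vector per patch position

tokens = patch_emb + pos_emb                      # position info injected by simple addition
print(tokens.shape)                               # (196, 64) -> fed to the transformer blocks
```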
No_Captain_856 OP t1_ivxkvie wrote
So if I remove the positional encoding, can I use it for non-sequential data, like gene expression or such? Thanks!
eigenham t1_ivxwhgg wrote
The attention head is a set-to-set mapping. It takes the input set, compares each input element to a context set (which can be the input set itself, or another set), and, based on those comparisons, outputs a new set of the same size as the input set.
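A minimal NumPy sketch of that set-to-set view (query/key/value projections are omitted for brevity; names are illustrative):

```python
# Minimal sketch of attention as a set-to-set mapping: each input element is
# compared against a context set, and the output has one element per input.
import numpy as np

def attention(inputs, context):
    # inputs:  (n, d) set of elements to be transformed
    # context: (m, d) set being attended over (pass `inputs` itself for self-attention)
    d = inputs.shape[-1]
    scores = inputs @ context.T / np.sqrt(d)           # (n, m) pairwise comparisons
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the context set
    return weights @ context                           # (n, d): one output per input element

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))     # an unordered "set" of 5 elements
y = attention(x, x)             # self-attention: the context set is the input set itself
print(y.shape)                  # (5, 8) -- same size as the input set
```

Note that without a positional encoding this mapping is permutation-equivariant: shuffling the input set just shuffles the output rows the same way, which is exactly why it can make sense for unordered data.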
Out of curiosity, how were you thinking of using that for gene expression?
No_Captain_856 OP t1_ivxwq1s wrote
I wasn't; it didn't seem like the best model to me, at least conceptually. Anyway, my thesis supervisor asked me to, and I wasn't so sure about its applicability to that kind of data, or about what using it in that context would even mean 🤷🏻‍♀️
eigenham t1_ivy0qhf wrote
I mean, it definitely captures relationships between parts of the input data in ways that many other models cannot. It also can't do everything.
As with most real-world problems, there's a question of how you will represent the relevant information in data structures that best suit the ML methods you intend to use. Similarly, there's a question of whether those methods will do what you want, given the data in that form.
Despite the fact that transformers are killing it for ordered data, I'd say their flexibility to deal with unordered data is definitely of interest for real world problems where representations are tricky.
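As a purely hypothetical illustration of that representation question (nothing in the thread prescribes this), one could treat a gene-expression profile as an unordered token set with no positional encoding at all; every name and shape below is made up for the sketch:

```python
# Hypothetical sketch only: a gene-expression profile as an unordered set of tokens
# (one per gene), with no positional encoding. Illustrative, not a recommendation.
import numpy as np

rng = np.random.default_rng(0)
num_genes, dim = 1000, 32

gene_emb = rng.normal(size=(num_genes, dim))   # stand-in for learned per-gene embeddings
expression = rng.random(num_genes)             # one (normalized) expression value per gene

tokens = gene_emb * expression[:, None]        # (num_genes, dim): an unordered token set
print(tokens.shape)                            # tokens like these could feed a standard
                                               # self-attention stack, no positions needed
```

Whether you scale, concatenate, or bin the expression values is exactly the kind of representation choice being discussed above.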