alfredr OP t1_jcumejc wrote

I'm an outsider interested in learning the landscape so my intent is to leave the question open-ended, but I'm broadly interested in architectural things like layer-design, attention mechanisms, regularization, model compression, as well as bigger picture considerations like interpretability, explainability, and fairness.

9

millenial_wh00p t1_jcun8jw wrote

Be wary of open-ended questions about AI/ML research in the current "gold rush" environment. If you're into explainability and interpretability, some folks are looking into combinatorial methods for features and their interactions to predict data coverage. That, plus Anthropic's papers, starts to open up some new ground in interpretability for CV.

https://arxiv.org/pdf/2201.12428.pdf

11
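To make the combinatorial-coverage idea above concrete, here's a minimal sketch of 2-way (pairwise) combinatorial coverage for tabular features: the fraction of all possible value pairs, over declared feature domains, that actually show up in a dataset. This is an illustrative toy, not code from the linked paper; the function name, the feature names, and the dataset are all made up for the example.

```python
from itertools import combinations, product

def pairwise_coverage(rows, domains):
    """Fraction of all 2-way feature-value combinations (over the
    declared domains) that actually appear in the data rows."""
    covered = 0
    total = 0
    features = sorted(domains)
    for f1, f2 in combinations(features, 2):
        possible = set(product(domains[f1], domains[f2]))  # all value pairs
        seen = {(row[f1], row[f2]) for row in rows}        # pairs in the data
        covered += len(possible & seen)
        total += len(possible)
    return covered / total

# Toy dataset: three binary features (hypothetical values).
rows = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 1, "c": 1},
]
domains = {"a": {0, 1}, "b": {0, 1}, "c": {0, 1}}
print(pairwise_coverage(rows, domains))  # 8 of 12 possible pairs covered
```

Low coverage flags feature interactions the data never exercises, which is one way to reason about where a model's behavior is untested.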

alfredr OP t1_jcuoqg4 wrote

Point taken on the "gold rush". My background is CS theory, so the incorporation of combinatorial methods feels right at home. Along these lines, are you aware of any work incorporating (combinatorial) logic verification into generative language models? The end goal would be improved argument synthesis (e.g., mathematical proofs).

4

millenial_wh00p t1_jcuq0zo wrote

No, unfortunately most of my work is with tabular data, with a bit of computer vision. I haven't looked into any application of language models in that area. In theory, tokenization in language models shouldn't be much different from features in tabular/imagery data. There probably are some parallels worth exploring there; I'm just not aware of any papers.

7

Expensive-Type2132 t1_jcvddj4 wrote

If you're outside of the community, it might be more beneficial to look at applied papers to get an understanding of tasks, objective functions, datasets, training strategies, etc., especially during this period when there isn't much architectural diversity. But, nevertheless, read whatever you're motivated to read!

2