notforrob t1_j2lnrhg wrote
Reply to [D] What are good ways of incorporating non-sequential context into a transformer model? by abc220022
Assuming your goal is autoregressive sequence prediction, I would just modify the start-of-sequence token. For example: use some reasonable model that takes the non-sequential context and produces a vector, then add that vector to the learned start-of-sequence token embedding. Future time steps will be able to attend to the start-of-sequence token as needed to retrieve the context.
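A minimal sketch of that idea in PyTorch (not from the original comment; module and parameter names here are hypothetical, and the context encoder is stubbed with a single linear layer):

    import torch
    import torch.nn as nn

    class ContextualSOS(nn.Module):
        """Folds a non-sequential context vector into the start-of-sequence embedding."""

        def __init__(self, context_dim: int, d_model: int):
            super().__init__()
            # Learned start-of-sequence embedding.
            self.sos = nn.Parameter(torch.zeros(1, 1, d_model))
            # Any reasonable model mapping the context to a d_model vector;
            # a single linear layer is a placeholder here.
            self.context_proj = nn.Linear(context_dim, d_model)

        def forward(self, token_embeddings: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
            # token_embeddings: (batch, seq_len, d_model); context: (batch, context_dim)
            batch = token_embeddings.size(0)
            sos = self.sos.expand(batch, 1, -1) + self.context_proj(context).unsqueeze(1)
            # Prepend the context-conditioned SOS token; later positions attend to it.
            return torch.cat([sos, token_embeddings], dim=1)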
If you're only using a transformer encoder, and not doing the autoregressive thing, I would just add an additional token to the input. I would most likely use a learned position encoding for that context token rather than the normal sequential position embedding. Any time step will be able to attend to this special token and take advantage of the context clue you're providing.
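A similar sketch for the encoder-only case (again an assumption on my part, not code from the comment): the context gets its own token with a learned embedding standing in for a position encoding.

    import torch
    import torch.nn as nn

    class ContextToken(nn.Module):
        """Prepends a context token to an encoder's input sequence."""

        def __init__(self, context_dim: int, d_model: int):
            super().__init__()
            self.context_proj = nn.Linear(context_dim, d_model)
            # Learned embedding reserved for the context token, used in place
            # of the usual sequential position encoding.
            self.context_pos = nn.Parameter(torch.zeros(1, 1, d_model))

        def forward(self, embedded_inputs: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
            # embedded_inputs: (batch, seq_len, d_model), with sequential position
            # encodings already added; context: (batch, context_dim).
            ctx = self.context_proj(context).unsqueeze(1) + self.context_pos
            # Every other position can now attend to this extra token.
            return torch.cat([ctx, embedded_inputs], dim=1)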
notforrob t1_je1lowh wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
This inspired me to ask GPT-4:
"Can you generate a leetcode easy problem that has never been seen?"
And then ask it to solve the problem it creates. In the few cases I tried, it failed miserably.