firejak308 t1_iqvqnii wrote

Thanks for this explanation! I've heard the general reasoning that "transformers have variable weights" before, but I didn't quite understand the significance of that until you provided the concrete example of relationships between x1 and x3 in one input, versus x1 and x2 in another input.
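
For anyone else reading, here is a minimal sketch of what "variable weights" can mean in practice. It is not taken from the explanation above. It assumes plain scaled dot-product self-attention with identity query/key projections, and the toy vectors and the helper name `attention_weights` are made up for illustration. The code that computes the weights is the same for both inputs, yet x1 leans on x3 in one input and on x2 in the other.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(tokens):
    """Return the self-attention weight matrix for a list of token vectors.

    Assumption: queries and keys are the tokens themselves (identity
    projections), so the only thing that changes between calls is the input.
    """
    d = len(tokens[0])
    weights = []
    for q in tokens:
        # Scaled dot product between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    return weights


# Input A: x3 points the same way as x1, x2 is orthogonal to it.
input_a = [[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]]
# Input B: x2 points the same way as x1, x3 is orthogonal to it.
input_b = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]

weights_a = attention_weights(input_a)
weights_b = attention_weights(input_b)

# Row 0 holds the weights x1 places on x1, x2, x3.
print("input A, x1 row:", [round(w, 3) for w in weights_a[0]])
print("input B, x1 row:", [round(w, 3) for w in weights_b[0]])
```

In a fixed-weight dense layer, the coefficient linking position 1 to position 3 would be the same number for every input. Here that coefficient is recomputed from the input each time, which is why the strong link moves from x3 to x2.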
