Submitted by Longjumping_Essay498 t3_1003d7w in MachineLearning
Longjumping_Essay498 OP t1_j2f9r9s wrote
Reply to comment by currentscurrents in [Discussion] is attention an explanation? by Longjumping_Essay498
Let's say for some examples we dig into these attention maps and find that some head has a consistent pattern for which words it attends to. For example, in GPT some heads focus on parts of speech. Will a head do that reliably for all examples? What do you think? Can we manually evaluate and categorize what the heads have learned?
IntelArtiGen t1_j2fcirf wrote
You can just say "the network evaluated that it needed to give more attention to these parts to perform the task". You can speculate about why, but you can't be sure.
currentscurrents t1_j2fduvv wrote
You can get some information this way, but not everything you would want to know. You can try it yourself with BertViz.
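For anyone who wants to try it, something like this should work (a rough sketch to run in a notebook; the checkpoint name and example sentence are just placeholders, so check the BertViz README for your version):

```python
# Minimal sketch: inspect per-head attention with BertViz (run in Jupyter/Colab).
from transformers import AutoTokenizer, AutoModel
from bertviz import head_view

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The dog chased the ball because it was bored", return_tensors="pt")
outputs = model(**inputs)

# outputs.attentions: one (batch, heads, seq_len, seq_len) tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
head_view(outputs.attentions, tokens)  # interactive per-layer, per-head view
```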
The information you do get can be useful though. For example, in image processing you can use the attention map from an object classifier to get a rough idea of where the object is in the image.
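As a sketch of that idea (my own example, not a fixed recipe: the checkpoint name and image path are placeholders, and the 14x14 reshape assumes a standard ViT with 16x16 patches on 224x224 inputs):

```python
# Rough sketch: use the [CLS] token's attention in a ViT classifier as a
# crude localization heatmap.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224", output_attentions=True
)

image = Image.open("cat.jpg")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Last layer attention: (batch, heads, tokens, tokens); token 0 is [CLS].
attn = outputs.attentions[-1].mean(dim=1)  # average over heads
cls_to_patches = attn[0, 0, 1:]            # [CLS] attention to the 196 patches
heatmap = cls_to_patches.reshape(14, 14)   # 224 / 16 = 14 patches per side
# Upsample `heatmap` to the image size and overlay it to see where the model looks.
```

Averaging heads like this is crude; attention rollout tends to give cleaner maps, but the idea is the same.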