
akore654 t1_irsttia wrote

To use the language analogy: say you have a sequence of 100 words. Each of those words comes from a vocabulary of a certain size (~50,000 words for English). So at each position in the sequence you can choose any of those 50,000 words, which gives 50,000^100 possible sequences.

You can see how this explodes in terms of the number of unique combinations. It is the same thing for the 16x16 grid with a vocabulary of 1024 possible discrete vectors: 256 positions with 1024 choices each, so 1024^256 possible grids.
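
For a rough sense of scale, here is a quick Python sketch of that counting argument. The helper name `num_sequences` is just mine for illustration, and it assumes every position can independently take any entry of the vocabulary.

```python
def num_sequences(vocab_size: int, length: int) -> int:
    """Count distinct sequences when each of `length` positions
    can independently hold any of `vocab_size` tokens."""
    return vocab_size ** length


# Text analogy: 100 words, vocabulary of roughly 50,000 words.
text_count = num_sequences(50_000, 100)

# Image case: 16x16 grid = 256 positions, codebook of 1024 discrete vectors.
grid_count = num_sequences(1024, 16 * 16)

# Python integers are arbitrary precision, so these are exact.
print(len(str(text_count)))  # number of decimal digits: 470
print(len(str(grid_count)))  # number of decimal digits: 771
```

Both counts are far too large to enumerate, which is the "explosion" in question.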

I'm not entirely sure what motivates it; I just know it's a fairly successful method for text generation. Hope that helps.
