andreichiffa t1_j7t9ul8 wrote

I am pretty sure that was an Anthropic paper first (Predictability and Surprise in Large Generative Models). Makes me truly wonder WTF exactly is going on at Google lately.

As to your question, no one has stacked enough attention layers yet, but there is a very high probability that someone will. Someone already mentioned the ability to spell, but it could also help with things such as hands, the number of hands/feet/legs/arms/paws/tails, and other details that make a lot of generated images today disturbing.

The issue will most likely be with funding enough data, given that, unlike text, most images on the internet are copyrighted (cough Getty cough).

6

currentscurrents t1_j7wk84r wrote

While those are on the same topic, they're very different papers. The Anthropic paper spends most of its time going on about safety/bias/toxicity, while the Google paper is focused on more useful things like the technical abilities of the models.

1