
-ZeroRelevance- t1_iytvl5z wrote

It can’t really rhyme due to how the model perceives words. Everything is broken up into tokens, which may represent whole words, parts of words, or even individual letters. That inconsistency makes it extremely hard for LLMs to pick up on the spelling and wording patterns that quality poetry relies on, and basically forces the model to learn common patterns by rote. There is more discussion on that here. However, this isn’t a hopeless problem. The obvious solution to me, which is also discussed in the prior link, is simply to encode each letter as its own token. That does lead to several improvements, but it’s ultimately a tradeoff between length and quality, because encoding each character individually means you need far more tokens (3-4x) to represent the same amount of text. A rough sketch of that tradeoff is below.
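
As a concrete illustration, here is a minimal Python sketch of that length tradeoff. The greedy tokenizer and the tiny `toy_vocab` are made up for the example (not a real BPE vocabulary or any library's API), but the resulting token counts land roughly in the 3-4x range mentioned above.

```python
# Toy comparison of subword-style tokenization vs character-level tokenization.
# The vocabulary below is a made-up stand-in for a real BPE vocabulary; it only
# exists to show how the same text needs far more tokens at the character level.

def subword_tokenize(text, vocab, max_len=8):
    """Greedy longest-match tokenization against a small vocabulary,
    falling back to single characters when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(len(text) - i, max_len), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

def char_tokenize(text):
    """Character-level tokenization: every character is its own token."""
    return list(text)

# Leading spaces are part of the pieces, mimicking how BPE vocabularies
# usually merge the space into the following word.
toy_vocab = {" the", " cat", " sat", " on", " mat"}
line = " the cat sat on the mat"

subword = subword_tokenize(line, toy_vocab)
chars = char_tokenize(line)

print(subword)  # [' the', ' cat', ' sat', ' on', ' the', ' mat']
print(f"{len(subword)} subword tokens vs {len(chars)} character tokens")
# 6 subword tokens vs 23 character tokens -- roughly the 3-4x blowup
```

Note how the subword view hides letter boundaries entirely: the model never "sees" that "cat" and "mat" end in the same two characters, which is exactly the information rhyming depends on.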

3