Headz0r t1_jdz5x1q wrote on March 28, 2023 at 7:01 AM

Reply to comment by eamonious in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy

How do you define the difficulty of a word?

eamonious t1_je02go2 wrote on March 28, 2023 at 1:25 PM

I work with a database that draws on experimental data on response times in human trials from Wash U St. Louis. Alternatively, you can use the standardized grade-level vocab lists that exist in a number of states. Frequency data is also associated.

Obviously there’s no silver bullet for defining it, but I think we all have some intuition, or recognize a degree of objectivity, about what a reasonably correct ordering of a random selection of words would look like, based on our understanding of language (including two-word terms like “ice cream”, idiomatic phrases like “rain cats and dogs”, or borrowed expressions like “deja vu” and “savoir faire”). In my mind, that means GPT should also be able to develop such an intuition. I encourage people to try this with GPT. In my experience, it doesn’t perform well, at least by any human intuition standard.

What’s interesting to me is the possibility that the model defines the “difficulty of words” as it itself experiences them: words that are, for whatever reason, more “difficult” for the model itself to assess.

Sorry, I’ll try to report back with something more concrete.

evanthebouncy OP t1_je0d4mj wrote on March 28, 2023 at 2:41 PM

You might be better off asking it binary questions, such as which of two words is more common and which is rarer.

Then attempt to sort the words using those pairwise answers.
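
A rough sketch of what that could look like in Python, in case it helps. Everything named here is an assumption for illustration: `more_common` is a hypothetical stand-in for the actual model call (in practice it would send a prompt like "Which word is more common: A or B?" and parse the reply), and `TOY_COUNTS` holds made-up numbers, not real frequency data.

```python
from functools import cmp_to_key

# Made-up illustrative counts so the sketch runs without a model.
# These are NOT real frequency data.
TOY_COUNTS = {"dog": 900, "ice cream": 400, "deja vu": 60, "savoir faire": 5}


def more_common(a: str, b: str) -> str:
    """Hypothetical stand-in for the model's answer to a binary question.

    Returns whichever of the two words is judged more common.
    Replace the body with a real model call and reply parsing.
    """
    return a if TOY_COUNTS[a] >= TOY_COUNTS[b] else b


def compare(a: str, b: str) -> int:
    """Turn the binary answer into a comparator for sorting."""
    if a == b:
        return 0
    # Negative means `a` sorts first, i.e. `a` is the more common word.
    return -1 if more_common(a, b) == a else 1


def sort_by_commonness(words):
    """Order words from most common to rarest using pairwise answers only."""
    return sorted(words, key=cmp_to_key(compare))


ranked = sort_by_commonness(["savoir faire", "dog", "deja vu", "ice cream"])
print(ranked)
```

One caveat: a real model's answers may not be consistent (it could say A > B, B > C, and C > A), in which case the sorted order depends on which pairs happen to get compared. Asking every pair and ranking by number of "wins" would be a more robust variant, at the cost of more queries.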