ddofer t1_j1375g8 wrote

  1. There are some nice big datasets. In fact, I cleaned up an existing one from Reddit for use:

https://www.kaggle.com/datasets/danofer/sarcasm

  2. Regarding this being a challenging task: it's not as hard as you'd think. There's a much harder related problem in humor, though: predicting how sarcastic or funny something is. LLMs do very badly at it.

We presented a paper at EMNLP about this, and about predicting winning jokes in games of Cards Against Humanity :)

"Cards Against AI: Predicting Humor in a Fill-in-the-blank Party Game"

https://arxiv.org/abs/2210.13016

https://github.com/ddofer/CAH
