Aphix t1_jc799qb wrote

All good. You might want to link the other ones you posted in the comments of those other posts, too (Smarty-GPT, etc.); most of the time you can just link the GitHub directly from the post here.


gargolito t1_jc889m6 wrote

So... what are perplexity filters?


usc-ur OP t1_jca4ot7 wrote

Sure! The idea is that you build a language model from a given corpus (let's say the BNC) and then use a similarity measure, in this case perplexity (though it could be another one), to test how well your sample (a sentence) "fits" the model's distribution. Since we assume the distribution is correct, this lets us identify malformed sentences. You can also check the paper here:
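
To make the idea above concrete, here is a minimal sketch of a perplexity filter in Python. It is not the code from the paper or from any of the linked projects: it assumes a toy add-one-smoothed bigram model over whitespace tokens, a three-sentence stand-in for a real corpus such as the BNC, and hypothetical function names (`train_bigram`, `perplexity`, `perplexity_filter`). The threshold is also an assumption and would need tuning on real data.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(sentence, unigrams, bigrams):
    """Perplexity of a sentence under an add-one-smoothed bigram model."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams above zero.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n_transitions = len(tokens) - 1
    return math.exp(-log_prob / n_transitions)


def perplexity_filter(sentences, unigrams, bigrams, threshold):
    """Keep only sentences whose perplexity is at or below the threshold."""
    return [s for s in sentences
            if perplexity(s, unigrams, bigrams) <= threshold]


if __name__ == "__main__":
    # Toy stand-in for a reference corpus (assumption, not real BNC data).
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the rug",
        "a cat lay on the mat",
    ]
    unigrams, bigrams = train_bigram(corpus)
    for s in ["the cat sat on the mat", "mat the on sat cat the"]:
        print(s, "->", round(perplexity(s, unigrams, bigrams), 2))
```

The scrambled sentence contains only bigrams the model has never seen, so it gets a higher perplexity than the well-formed one and would be dropped by a suitably chosen threshold. A real filter would swap in a stronger language model, but the scoring and thresholding steps stay the same.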