Submitted by KD_A t3_127pbst in MachineLearning
Jean-Porte t1_jeg5xpd wrote
How does this compare to Huggingface zero shot NLI pipelines, eg https://huggingface.co/sileod/deberta-v3-base-tasksource-nli ?
KD_A OP t1_jegfh7i wrote
Great question! I have no idea lol.
More seriously, it depends on what you mean by "compare". CAPPr w/ powerful GPT-3+ models is likely gonna be more accurate. But you need to pay to hit OpenAI endpoints, so it's not a fair comparison IMO.
If you can't pay to hit OpenAI endpoints, then a fairer comparison would be CAPPr + GPT-2, specifically the smallest one on HuggingFace, or whatever's closest in inference speed to something like bart-large-mnli. But then another issue pops up: GPT-2 was not explicitly trained on the NLI/MNLI task the way bart-large-mnli was. So I'd need to finetune GPT-2 (small) on MNLI to make a fairer comparison.
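To make the comparison concrete, the core idea behind CAPPr-style classification can be sketched in a few lines: score each candidate completion by the average token log-probability a causal language model assigns to it given the prompt, then pick the argmax. This is only a toy sketch of the idea, not the cappr package's actual API; the mock log-prob function below is a hypothetical stand-in for a real model like GPT-2:

```python
import math

def mock_token_logprobs(prompt, completion_tokens):
    # Hypothetical stand-in for a causal LM: assigns each completion
    # token a higher log-probability if it appears in the prompt.
    # A real implementation would query GPT-2 / an OpenAI endpoint.
    prompt_words = set(prompt.lower().split())
    return [
        math.log(0.5) if tok.lower() in prompt_words else math.log(0.1)
        for tok in completion_tokens
    ]

def cappr_style_predict(prompt, completions):
    # Score each completion by its average token log-probability
    # given the prompt, then return the highest-scoring completion.
    def avg_logprob(completion):
        tokens = completion.split()
        logprobs = mock_token_logprobs(prompt, tokens)
        return sum(logprobs) / len(logprobs)
    return max(completions, key=avg_logprob)

label = cappr_style_predict(
    "This movie was fantastic I loved it The overall feeling is",
    ["fantastic", "terrible"],
)
print(label)  # → fantastic
```

The averaging step is what lets completions of different token lengths compete fairly; with a raw sum, longer completions would always lose.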
If I had a bunch of compute and time, I'd like to benchmark (or find benchmarks) for the following text classification approaches, varying the amount of training data if feasible, and ideally on tasks which are more realistic than SuperGLUE:
- similarity embeddings
- S-BERT
- GPT-3+ (they claim their ada model is quite good)
- sampling
- MNLI-trained models
- CAPPr
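Of the approaches above, the similarity-embeddings baseline is simple enough to sketch end to end: embed the input text and each label, then pick the label with the highest cosine similarity. The bag-of-words embedder below is a hypothetical stand-in for a real sentence encoder like S-BERT:

```python
import math
from collections import Counter

def bow_embed(text):
    # Hypothetical stand-in for a sentence encoder: a sparse
    # bag-of-words vector keyed by lowercase tokens. A real setup
    # would use dense embeddings from an S-BERT-style model.
    return Counter(text.lower().split())

def cosine(u, v):
    # Cosine similarity between two sparse vectors (dicts).
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def classify_by_similarity(text, labels):
    # Pick the label whose embedding is closest to the text's.
    text_vec = bow_embed(text)
    return max(labels, key=lambda lab: cosine(text_vec, bow_embed(lab)))

print(classify_by_similarity(
    "the pitcher threw a perfect game last night",
    ["baseball game", "stock market news"],
))  # → baseball game
```

Swapping in dense embeddings only changes bow_embed and cosine; the argmax-over-labels structure is the same, which is part of why this baseline is so cheap to run compared to the model-per-query approaches.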