
FirstOrderCat t1_iyw6ute wrote

Reply to comment by Ambiwlans in bit of a call back ;) by GeneralZain

I follow NLP/LLM papers; people will certainly release an arXiv paper, and likely submit it to a conference, with only a few % improvement.

2

Ambiwlans t1_iyw78uu wrote

What metric? A 5% reduction in errors or a 5% improvement in score? I mean, one might be a lot bigger than the other.

LLMs are basically DOA now, waiting on GPT-4 in a few months anyway, unless they offer something really novel.

4

FirstOrderCat t1_iyw8tuu wrote

Here is a recent paper where they improved the previous SOTA on GSM8K by 2 points (78 -> 80): https://arxiv.org/pdf/2211.12588v3.pdf
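As a side note on the metric question above, that same 78 -> 80 jump reads very differently depending on how you slice it. A quick back-of-the-envelope (illustrative arithmetic only, not numbers from the paper beyond the 78 and 80):

```typescript
// Illustrative arithmetic only: three ways to read the same 78 -> 80 jump.
const before = 78; // % of GSM8K problems solved before
const after = 80;  // % solved after

const absolutePoints = after - before;                                 // 2 percentage points
const relativeScoreGain = 100 * (after - before) / before;             // ~2.6% higher score
const relativeErrorReduction = 100 * ((100 - before) - (100 - after))
                                   / (100 - before);                   // ~9.1% fewer errors

console.log({ absolutePoints, relativeScoreGain, relativeErrorReduction });
```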


>LLMs are basically DOA now, waiting on GPT-4 in a few months anyway, unless they offer something really novel.

Why are you so confident? Current GPT is very far from doing any useful work; it can't replace a programmer, lawyer, or accountant. There is a huge space for improvement before these models reach some form of AGI and replace knowledge workers.

2

Ambiwlans t1_iywjrxk wrote

>Why are you so confident?

I never made any claim of strong AGI any time soon, dude. And GPT-4 certainly will not be strong AGI.

Although automation is taking jobs today.

6

FirstOrderCat t1_iywkjp6 wrote

Yes, hand-coded automation empowered by LLMs can take many jobs.

0

Madrawn t1_iyxwdi2 wrote

The current codex-davinci model from OpenAI still blows me away.

I basically asked it nicely to write me a VS Code plugin that takes the selected text, prompts the user for instructions, sends it off to the edit-API endpoint, and replaces the text with the response. That includes the changes to package.json needed to expose the setting where you put the API key, plus a prompt to fill in that setting if it's empty.

All that in around seven prompts, and in only two of them did I have to make changes: in one it fucked up a bracket, and in another it forgot to read the API key setting before checking it.

It's not perfect, you still need to be able to code to check for errors, but it's already more helpful than some of my colleagues.
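For anyone curious what that kind of plugin boils down to, here is a minimal sketch (not the code Codex actually generated): it assumes a recent VS Code extension host with global `fetch`, the since-deprecated OpenAI `/v1/edits` endpoint with the `code-davinci-edit-001` model, and a hypothetical `codexEdit.apiKey` setting; all names are illustrative.

```typescript
// extension.ts -- minimal sketch of the plugin described above.
import * as vscode from 'vscode';

export function activate(context: vscode.ExtensionContext) {
  const command = vscode.commands.registerCommand('codexEdit.editSelection', async () => {
    const editor = vscode.window.activeTextEditor;
    if (!editor || editor.selection.isEmpty) {
      vscode.window.showWarningMessage('Select some text first.');
      return;
    }

    // Read the API key from settings first (the step the model initially forgot).
    const apiKey = vscode.workspace.getConfiguration('codexEdit').get<string>('apiKey');
    if (!apiKey) {
      vscode.window.showErrorMessage('Set codexEdit.apiKey in your settings.');
      return;
    }

    // Prompt the user for the edit instruction.
    const instruction = await vscode.window.showInputBox({
      prompt: 'What should I do with the selection?',
    });
    if (!instruction) {
      return;
    }

    const input = editor.document.getText(editor.selection);

    // Send selection + instruction to the (deprecated) edits endpoint.
    const response = await fetch('https://api.openai.com/v1/edits', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
      body: JSON.stringify({ model: 'code-davinci-edit-001', input, instruction }),
    });
    const data = (await response.json()) as { choices?: { text: string }[] };
    const edited = data.choices?.[0]?.text;
    if (!edited) {
      vscode.window.showErrorMessage('No edit returned from the API.');
      return;
    }

    // Replace the selection with the API's response.
    await editor.edit(builder => builder.replace(editor.selection, edited));
  });

  context.subscriptions.push(command);
}

export function deactivate() {}
```

The matching package.json changes would just declare the command under `contributes.commands` and the `codexEdit.apiKey` setting under `contributes.configuration`.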

6