antonivs t1_je82r3j wrote

Well, I do need to be a bit vague. The main DSL has about 50 instructions corresponding to actions to be performed. There's also a separate sub-DSL, with about 25 instructions, that represents key features of the domain model, allowing particular scenarios to be defined and then recognized during execution.

Both DSLs are almost entirely linear and declarative, so there's no nested structure, and the only control flow is a conditional branch instruction in the top-level DSL, to support conditional execution and looping. The UI essentially acts as a wizard, so that users don't have to deal with low-level detail.
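
To give a feel for the shape, here's a purely made-up sketch in Python; the instruction names are invented for illustration and aren't our actual DSL:

```python
# Hypothetical sketch only -- opcodes are invented, not the real instruction set.
# A program is a flat list of (opcode, args) pairs; the only control flow is
# BRANCH_IF, which jumps to another index in the same list, giving both
# conditional execution and looping.
program = [
    ("LOAD_RECORD", {"source": "incoming"}),
    ("APPLY_RULE",  {"rule": "validate"}),
    ("BRANCH_IF",   {"condition": "validation_failed", "target": 0}),
    ("EMIT_RESULT", {"channel": "output"}),
]

def run(program, env, handlers):
    """Execute instructions in order; BRANCH_IF is the only jump."""
    pc = 0
    while pc < len(program):
        op, args = program[pc]
        if op == "BRANCH_IF":
            pc = args["target"] if env.get(args["condition"]) else pc + 1
        else:
            handlers[op](env, args)  # each opcode maps to a domain action
            pc += 1
```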

There are various ideas for the GPT model, including suggesting instructions when creating a program, self-healing when something breaks, and ultimately generating programs from scratch based on data we already collect anyway.

NLP will probably end up being part of it as well; for that, we'd likely use the fine-tuning approach with an existing language model, as you suggested.
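
For the curious, a minimal sketch of that fine-tuning approach, assuming the Hugging Face transformers and datasets libraries; the base model, file name, and hyperparameters are placeholders, not what we'd actually use:

```python
# Minimal causal-LM fine-tuning sketch -- model, paths, and settings are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# One training example (e.g. an NL-request/DSL-program pair) per line.
dataset = load_dataset("text", data_files={"train": "dsl_corpus.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```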

2

antonivs t1_je7ws1v wrote

I was referring to what the OpenAI GPT models are trained on. For GPT-3, that involved about 45 TB of raw text data before filtering, part of which was Common Crawl, a multi-petabyte corpus obtained from 8 years of web crawling.

On top of that, 16% of its corpus was books, totaling about 67 billion tokens.

2

antonivs t1_je1cuw1 wrote

My description may have been misleading. They did the pretraining in this case. The training corpus wasn't natural language; it was a large set of executable definitions written in a company DSL, created by customers via a web UI.

4

antonivs t1_je0pb85 wrote

Our product involves a domain-specific language, which customers typically interact with via a web UI, to control the behavior of execution. The first model this guy trained generates that DSL, so customers can enter a natural language request instead of going through a multi-step GUI flow.
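
At inference time that boils down to ordinary text generation; here's a hedged sketch where the checkpoint name and prompt format are invented for illustration:

```python
# NL-to-DSL generation sketch -- "our-dsl-model" is a placeholder checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("our-dsl-model")
model = AutoModelForCausalLM.from_pretrained("our-dsl-model")

request = "When a new order arrives, validate it and notify the customer."
prompt = f"### Request:\n{request}\n### Program:\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
# Keep only the newly generated tokens -- the DSL program text.
program_text = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                                skip_special_tokens=True)
print(program_text)
```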

They've tried using it for docs too; that worked well.

2

antonivs t1_jdyp1zw wrote

> I wouldn't get worried about training these models from scratch. Very few people are going to need those skills.

Not sure about that, unless you also mean that there are relatively few ML developers in general.

After the ChatGPT fuss began, one of our developers trained a GPT model on a couple of different subsets of our company's data, using one of the open-source GPT packages, which is obviously behind GPT-3, 3.5, or 4. He got very good results though, to the point that we're working on productizing it. Not every model needs to be trained on an internet-sized corpus.
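
For a sense of scale, here's what "train a small GPT from scratch on domain data" looks like, sketched with Hugging Face classes; sizes and file names are invented, and this isn't the package or configuration our developer actually used:

```python
# From-scratch pretraining sketch -- config, sizes, and paths are invented.
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, GPT2Config,
                          GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments)

# A tiny model by GPT-3 standards: tens of millions of parameters, not 175B.
config = GPT2Config(vocab_size=50257, n_positions=512,
                    n_embd=256, n_layer=6, n_head=8)
model = GPT2LMHeadModel(config)  # randomly initialized -- no pretrained weights

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")  # reuse the tokenizer only
tokenizer.pad_token = tokenizer.eos_token

dataset = load_dataset("text", data_files={"train": "company_subset.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="scratch-gpt", num_train_epochs=10,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A domain corpus several orders of magnitude smaller than a web crawl can still be plenty for a narrow task like this.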

49

antonivs t1_j0tkr0n wrote

As other comments have pointed out, you can’t.

It’s important to note that the term “ignition” here is misleading. It’s being used to imply that some physically important threshold has been reached, but that’s not true.

“Ignition” in this context is simply an arbitrary name for a symbolic point on the reaction efficiency chart. It has no physical meaning. No actual breakthrough in fusion physics has occurred, simply an improvement in efficiency.
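
To put a number on that, assuming this thread is about the December 2022 NIF result: the “ignition” criterion was just the reported target gain crossing 1,

```latex
% Target gain: fusion energy out divided by laser energy delivered to the target.
% Reported figures: about 3.15 MJ out for 2.05 MJ in.
G = \frac{E_{\text{fusion}}}{E_{\text{laser}}}
  \approx \frac{3.15\,\mathrm{MJ}}{2.05\,\mathrm{MJ}} \approx 1.5
```

while the facility reportedly drew on the order of 300 MJ from the grid to fire the lasers, so end to end the process remains deeply energy-negative.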

9