Submitted by Carrasco_Santo t3_11yntgf in MachineLearning

I'm seeing the various attempts, all valid and very welcome, to create general-conversation chat models at the level of ChatGPT 4.0 or similar.

But I would find it very helpful if some of the current attempts to build a general conversation chat at the level of GPT-4 instead produced a specialized conversation chat (weak artificial intelligence, but strong in its niche; a term I'm coining now) that did everything ChatGPT 4.0 does in the area of informatics, mainly programming.

That is, instead of having a 7B model, for example, dedicated to general conversation that works only so-so, we would have a 7B model designed specifically for computing. That would let each of us have a private computing teacher on our own PC, one that teaches us and writes code on request, a teacher who "knows everything" about computers.

I'm doing a postgraduate course in Artificial Intelligence Engineering and I'm just starting to enter this world; there is a lot I still have to learn. If I had the knowledge and the equipment for this, I would create a model just for this purpose.

1

Comments


Nondzu t1_jd8uh3n wrote

I'm looking for the same thing as you: a model designed specifically for programming, or at least with capabilities similar to ChatGPT with DAN. Training such a model from scratch would be an incredible challenge. However, it seems to me that a similar model may already exist; I just haven't found it yet. It would be great to simply be able to load it into LLaMA and use it.

5

Carrasco_Santo OP t1_jd9dkr6 wrote

A fully trained model with "knowledge" of programming and AI, able to interact in natural language, would make the perfect at-home information technology tutor.

1

darkshenron t1_jd9lmz6 wrote

I was looking for something similar and realised you can just apply an appropriate system prompt to GPT-4 to narrow its focus. Some variant of: “You are a helpful programming assistant. You help users answer questions related to programming in the Python language. If the question is not related to programming, you decline to answer.”
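To make the idea concrete, here's a minimal sketch of how that prompt would be wired up. The message structure follows the OpenAI chat-completions format (a `system` message prepended to the user's question); the question text is just an example, and the actual API call is left out so this only builds the request.

```python
# Sketch of narrowing a general model's focus with a system prompt.
SYSTEM_PROMPT = (
    "You are a helpful programming assistant. You help users answer "
    "questions related to programming in the Python language. If the "
    "question is not related to programming, you decline to answer."
)

def build_messages(question: str) -> list[dict]:
    """Prepend the narrowing system prompt to the user's question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("How do I reverse a list in Python?")
print(messages[0]["role"])  # system
```

The same `messages` list can then be passed to any chat-completions-style endpoint; every request carries the narrowing instruction, so no fine-tuning is needed.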

3

currentscurrents t1_jdaq9xo wrote

Right, but you're still loading the full GPT4 to do that.

The idea is that domain-specific chatbots might have better performance at a given model size. You can see this with StableDiffusion models, the ones trained on just a few styles have much higher quality than the base model - but only for those styles.

This is basically the idea behind mixture of experts.
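For anyone unfamiliar, here's a toy numpy illustration of that mixture-of-experts idea: a gating network scores each input and mixes the outputs of specialized "experts". The experts and gating weights here are trivial stand-ins chosen for illustration, not trained models.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over expert scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Two hypothetical domain experts operating on a 3-dim input (stand-ins).
experts = [
    lambda x: x * 2.0,   # "code" expert
    lambda x: x + 1.0,   # "chat" expert
]

def moe_forward(x, gate_weights):
    # The gate decides how much each expert contributes for this input.
    scores = gate_weights @ x              # shape: (num_experts,)
    mix = softmax(scores)
    outputs = np.stack([e(x) for e in experts])
    return mix @ outputs                   # weighted sum of expert outputs

x = np.array([1.0, 0.0, -1.0])
gate = np.array([[1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
print(moe_forward(x, gate))
```

In a real MoE language model the experts are subnetworks trained jointly with the gate, and the gate typically routes to only the top-k experts per token, which is what keeps compute low while capacity grows.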

2

linverlan t1_jd9vckl wrote

I just wrote this computer science domain chatbot, it’s probably SOTA. You can just copy the code below and run it locally on your own machine. Let me know if you have any dependency issues, I can share a yaml file.

from googlesearch import search  # from the `google` package (pip install google)
import sys

# Use the command-line arguments as the question, biased toward Stack Overflow.
query = ' '.join(sys.argv[1:]) + ' stackoverflow'
out = list(search(query, num=1, stop=1))  # keep only the top result

if out:
    print(f"Answer is probably at: {out[0]}")
else:
    print("No results found.")
3

currentscurrents t1_jdaqd09 wrote

Google search uses BERT; you're just calling a language model via an API.

2

linverlan t1_jddepw6 wrote

lol you got me there. Although we are probably saving some compute by not generating.

1

Desticheq t1_jd9c1fe wrote

I'm looking to apply PEFT techniques to an LLM for my Regis AI extension, which works on top of LeetCode. While GPT is fine for hints and general conversation, there are other applications, like code improvement or complexity estimation, where I might benefit from a customized model.
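For readers who haven't met PEFT: methods like LoRA adapt a frozen pretrained weight matrix by learning only a small low-rank update. Here's a toy numpy sketch of just that forward pass; the dimensions, rank, and scaling are illustrative, not taken from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2   # toy sizes; in practice r << min(d_out, d_in)
alpha = 4.0                # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, init to zero

def lora_forward(x):
    # Base path plus scaled low-rank path. Because B starts at zero,
    # the adapted model is identical to the base model at initialization.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical at init
```

Training then updates only `A` and `B` (2 * r * d parameters per matrix) while `W` stays frozen, which is why PEFT makes customizing a large model feasible on modest hardware.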

2