WarAndGeese t1_jdy5z29 wrote on March 28, 2023 at 1:20 AM

Reply to comment by tt54l32v in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9

I'll call them applications rather than neural networks or LLMs for simplicity.

The first application is just what OP is doing and what people are talking about in this thread, that is, asking for sources.

The second application has access to research paper databases, presumably through some API. For each citation that the first application outputs, the second application queries it against the databases. If it gets a match, it returns a success. If it does not find the paper (this could be because it doesn't exist or because the title was too different from that of a real paper; either case is reasonable), it reports that the paper was not found. For each paper that was not found, it outputs "This paper does not exist, please correct your citation". That output is then fed back into the first application.
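A minimal sketch of that loop, assuming the paper database is just an in-memory list of titles and the first application is any function that takes the feedback messages and returns a list of cited titles. The names `check_citations`, `verify_loop`, `first_app`, and `max_rounds` are made up for illustration; a real version would call an actual paper database API instead of doing a set lookup.

```python
from typing import Callable, Iterable

NOT_FOUND_MESSAGE = "This paper does not exist, please correct your citation"


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(title.lower().split())


def check_citations(titles: Iterable[str], known_titles: Iterable[str]) -> dict[str, bool]:
    """Second application: report, per title, whether it was found in the database."""
    known = {normalize(t) for t in known_titles}
    return {t: normalize(t) in known for t in titles}


def verify_loop(
    first_app: Callable[[list[str]], list[str]],  # hypothetical stand-in for the LLM
    known_titles: Iterable[str],                  # hypothetical stand-in for the database
    max_rounds: int = 3,
) -> list[str]:
    """Ask for citations, check them, and feed 'not found' messages back."""
    known_titles = list(known_titles)
    feedback: list[str] = []
    verified: list[str] = []
    for _ in range(max_rounds):
        citations = first_app(feedback)
        results = check_citations(citations, known_titles)
        verified = [t for t in citations if results[t]]
        missing = [t for t in citations if not results[t]]
        if not missing:
            break
        # This is the output that gets fed back into the first application.
        feedback = [f'"{t}": {NOT_FOUND_MESSAGE}' for t in missing]
    return verified
```

The round limit is there so that a first application which keeps inventing papers cannot loop forever; after the last round, only the citations that were actually found are returned.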

Now, this second application could be a sort of database query or it could just consist of a second neural network being asked "Does this paper exist?". The former might work better but the latter would also work.

The separation is for simplicity's sake; I guess you could have one neural network doing both things. As long as each call to the neural network is well defined, it doesn't really matter. The neural network wouldn't have memory between calls, so functionally it should be the same. Nevertheless, I say two in the same way that you can have two microservices running in a web application: it can be easier to maintain and just easier to think about.

tt54l32v t1_jdyc1h3 wrote on March 28, 2023 at 2:06 AM

So the second app might fare better leaning towards a search engine instead of an LLM, but some LLM would ultimately be better, to allow for less precise matches against a specific set of searched words.
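For what it's worth, a plain search-style lookup can tolerate some imprecision without any LLM. A rough sketch using Python's standard `difflib`, where `closest_title` and the `cutoff` value of 0.8 are illustrative assumptions rather than anything tuned:

```python
import difflib
from typing import Iterable, Optional


def closest_title(query: str, known_titles: Iterable[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the known title most similar to the query, or None if nothing is close enough."""
    # Compare case-insensitively, but return the title in its original form.
    by_lowered = {t.lower(): t for t in known_titles}
    matches = difflib.get_close_matches(query.lower(), list(by_lowered), n=1, cutoff=cutoff)
    return by_lowered[matches[0]] if matches else None
```

This only catches character-level differences such as punctuation or small typos. A citation that paraphrases a real title would still be missed, which is where something LLM-like or embedding-based might do better.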

Seems like the faster and more seamless one could make this, the closer we get to AGI. To create and think, it almost needs to hallucinate and then check for accuracy. Is any of this already taking place in any models?