visarga t1_j9y7624 wrote

Does it do only one round of retrieval?

3

davidmezzetti OP t1_j9y7tmq wrote

Yes, the current version does a single round: it runs one embeddings query for each message. I plan to handle threaded conversations shortly. In that scenario, the chat history will be included in the prompt.
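To make the difference concrete, here is a rough sketch of how a prompt might be assembled with and without chat history. This is not the application's actual code: `build_prompt`, `fake_search`, and the prompt layout are all hypothetical names and choices made for illustration.

```python
from typing import Callable, List, Optional, Tuple


def build_prompt(
    question: str,
    search: Callable[[str], List[str]],
    history: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Assemble a prompt from retrieved context, optional history, and the question."""
    # Single round of retrieval: one query for the current message only
    context = "\n".join(search(question))
    parts = [f"Context:\n{context}"]

    # Hypothetical threaded mode: prior (user, assistant) turns go into the prompt
    if history:
        turns = []
        for user, assistant in history:
            turns.append(f"User: {user}")
            turns.append(f"Assistant: {assistant}")
        parts.append("Conversation so far:\n" + "\n".join(turns))

    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


def fake_search(query: str) -> List[str]:
    """Stand-in for the embeddings query; always returns one fixed passage."""
    return ["The index is stored on local disk."]
```

Calling `build_prompt("Where is the index?", fake_search)` gives the single-message behavior. Passing a list of earlier turns as `history` gives the threaded variant.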

1

dancingnightly t1_ja2lfup wrote

So is the current version mostly RAG plus WebGPT-style semantic search feeding a GPT answer?

Big fan of your recent work.

2

davidmezzetti OP t1_ja345mn wrote

Thank you.

This application is RAG with a local vector index combined with an LLM from the FLAN-T5 series of models.

The whole solution can be locally hosted with no remote runtime API dependencies.
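As a toy illustration of that flow (not the application's real code), the sketch below uses word counts in place of a real embedding model and leaves the model call as a `generate` function you pass in, where a locally loaded FLAN-T5 model would go. `embed`, `LocalIndex`, and `answer` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class LocalIndex:
    """In-memory vector index; nothing leaves the machine."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.vectors = [embed(d) for d in self.documents]

    def search(self, query, limit=1):
        # Rank stored documents by similarity to the query vector
        q = embed(query)
        ranked = sorted(
            zip(self.documents, self.vectors),
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:limit]]


def answer(question, index, generate):
    """Retrieve context, build a prompt, and hand it to a local generator."""
    context = " ".join(index.search(question))
    prompt = (
        "Answer the question using the context.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )
    return generate(prompt)
```

Because both the index and the `generate` callable run in-process, nothing in this sketch calls a remote API.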

1