
[deleted] t1_j3i3rwu wrote

[deleted]

2

universal_explainer OP t1_j3ij9jz wrote

Hey, thanks for trying it out!

First, do you mind sharing an example of different queries that return the same results? I have not been able to reproduce that (unless, of course, the queries are semantically similar, in which case that would be expected).

Also, of course exact search is far superior if you know the title of the paper you are looking for! In that regime, Google Scholar wins every time. However, semantic search might be better if you either a) can't remember the title but do remember some of the content or b) are simply looking to explore papers based on a handful of keywords.

Finally, the size of the database has no bearing on the quality of the embeddings, since I'm using OpenAI's pretrained model. There is no notion of "popularity", except that the 10 papers with the highest cosine similarity to the query embedding are then ranked by a citation score (if one is available).
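
To make the ranking concrete, here is a rough sketch of that two-stage idea: take the 10 nearest papers by cosine similarity, then reorder only those by citation score. This is an illustration rather than the actual code behind the tool. The function names, the `embedding` and `citations` fields, and the choice to put papers without a citation score last are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def search(query_embedding, papers, k=10):
    """Return the k most similar papers, reordered by citation score.

    `papers` is a list of dicts with hypothetical keys:
    "embedding" (list of floats) and "citations" (int or None).
    """
    # Stage 1: semantic retrieval. Only similarity decides who gets in.
    scored = [(cosine_similarity(query_embedding, p["embedding"]), p) for p in papers]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top_k = [paper for _, paper in scored[:k]]

    # Stage 2: rerank the shortlist by citations, highest first.
    # Assumption: papers with no citation score go last. The sort is
    # stable, so ties keep their similarity order from stage 1.
    def citation_key(paper):
        citations = paper.get("citations")
        return (citations is None, -(citations or 0))

    top_k.sort(key=citation_key)
    return top_k


if __name__ == "__main__":
    toy_papers = [
        {"title": "A", "embedding": [1.0, 0.0], "citations": 5},
        {"title": "B", "embedding": [0.9, 0.1], "citations": 50},
        {"title": "C", "embedding": [0.0, 1.0], "citations": 1000},
        {"title": "D", "embedding": [0.8, 0.2], "citations": None},
    ]
    for paper in search([1.0, 0.0], toy_papers, k=3):
        print(paper["title"], paper["citations"])
```

In the toy data, paper C has by far the most citations but is unrelated to the query, so it never makes the shortlist. Citations only reorder papers that are already semantically close.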

3

fakesoicansayshit t1_j47vdrd wrote

Man, connect it to a chatbot after fine-tuning it with the citation numbers as a human-feedback input and you've got yourself an uncensored, local ML assistant!

Willing to share embeddings?

1