Submitted by Cyalas t3_zg3bsd in MachineLearning

Hello!

I've developed a project, NaimAI, to help PhDs and scientists with their scientific literature reviews. To describe it briefly, it has 3 main features: (1) it searches in papers, (2) it structures abstracts into objectives, methods and results, and (3) it automatically generates a (pseudo) literature review.

I wrote a Medium article that goes through the details.

GitHub repo: https://github.com/yassinekdi/naimai

I've also created a subreddit, just in case: r/naimai4science

I'd be happy to have your opinion about it and hopefully this could be useful!

92

Comments

Nameless1995 t1_izg72td wrote

Seems like it needs some work. I searched for some titles but didn't get any results (one was from arXiv, 2019). I did get a result for https://arxiv.org/abs/1707.02786, but it only shows the author name with "et al.", without the title. The review seemed pretty basic, just a summarization of the abstract. (I am also a bit confused by your description: are you trying to implement a literature review where other related papers are suggested and described, or a review in the sense of what reviewers at a conference provide?) And even the structured abstract didn't structure much, at least for this paper. I don't know, maybe I got unlucky with the specific papers I tried.

Edit: okay, I see what you are doing with "review". You are basically generating "citation sentences". I am not sure how useful that is as a feature, since writing them takes minimal effort in practice. But some may find it useful.

13

Cyalas OP t1_izgb4wc wrote

For the results, you need to click on a result to get the structured abstract along with the title. The results shown are just the sentences that might be relevant to your research, classified as objectives, methods or results.

For the structured abstract, did you try other results?

6

Nameless1995 t1_izgd34j wrote

I think I was using it the wrong way by giving specific titles. Using keywords gives more interesting results. It looks nice, but I'm not sure where I stand on it. For example, I tried "attention" (perhaps too broad a search term) and got only a few papers (even Transformers is missing). Should there be a paging mechanism? Also, sorting doesn't seem to work either (sorting by date didn't change anything).

7

Cyalas OP t1_izgikyg wrote

As I explain in the Medium article, it's mainly TF-IDF that is used. I tried language models, but they were a bit slow to process. The code is available if you want to take a peek :) But I'll see what I can do!

Sorting by date would not change anything if the initial results are already sorted.
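For readers unfamiliar with the approach mentioned above, here is a minimal sketch of TF-IDF keyword search using only the Python standard library. It is not NaimAI's actual code: the function names, the smoothed IDF formula and the three sample abstracts are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vector(tokens, idf):
    # Weight each known term by count * idf, then L2-normalize.
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    # Returns (idf table, one normalized TF-IDF vector per document).
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): terms found in every document stay positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [tfidf_vector(tokens, idf) for tokens in tokenized]


def search(query, docs, idf, vectors, top_k=3):
    # Cosine similarity reduces to a dot product on normalized vectors.
    q = tfidf_vector(tokenize(query), idf)
    scored = []
    for i, vec in enumerate(vectors):
        score = sum(w * vec.get(t, 0.0) for t, w in q.items())
        if score > 0:
            scored.append((score, i))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [docs[i] for _, i in scored[:top_k]]


# Hypothetical sample abstracts, not taken from the NaimAI corpus.
ABSTRACTS = [
    "We propose an attention mechanism for neural machine translation.",
    "This paper studies coreference resolution at the word level.",
    "We measure sediment transport in river flows.",
]
IDF, VECTORS = build_index(ABSTRACTS)
```

Calling `search("attention", ABSTRACTS, IDF, VECTORS)` returns only the first abstract, while a query whose words never occur in the corpus returns an empty list. That is one plausible reason an exact title search can come back empty with a purely lexical index.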

4

Nameless1995 t1_izgkbq7 wrote

I don't think it was sorted, IIRC, but maybe I missed something. Also, is there a way to sort in both ascending and descending order?

5

Cyalas OP t1_izgp27h wrote

I'm afraid there is not. I thought people were generally interested in the most recent papers.
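As a rough idea of what a two-way date sort could look like, here is a small sketch. The record layout (`title`, `published`) and the `newest_first` flag are hypothetical and not taken from the naimai code.

```python
from datetime import date


def sort_by_date(papers, newest_first=True):
    # sorted() is stable and returns a new list; reverse flips the direction.
    return sorted(papers, key=lambda p: p["published"], reverse=newest_first)


# Hypothetical records for illustration only.
PAPERS = [
    {"title": "Paper B", "published": date(2021, 11, 7)},
    {"title": "Paper A", "published": date(2017, 7, 10)},
    {"title": "Paper C", "published": date(2022, 3, 15)},
]

newest = sort_by_date(PAPERS)                       # C, B, A
oldest = sort_by_date(PAPERS, newest_first=False)   # A, B, C
```

Sorting a list that is already in the requested order returns it unchanged, which would match the "nothing happened" observation if the default results were already newest first.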

2

arhetorical t1_izfi323 wrote

Sounds pretty cool! How recent are the papers that it searches? Does it automatically pull from arXiv or something?

6

Cyalas OP t1_izfoiz7 wrote

There are recent papers (2022). But no, I processed the papers beforehand with the algorithms, as detailed on the GitHub page :)

4

snow-blade t1_izh6u3a wrote

This is really cool. I would, however, want to see an option where I could select a paper that falls into multiple categories. For instance, I was searching for a paper that may cover applications of NLP in history reconstruction, but I couldn't find any if I put, let's say, "history" as the keyword. This search term is very vague and may apply to any result. It would be cool if one could access the results in multiple fields while searching for a vague/general theme.

2

Cyalas OP t1_izii37v wrote

Thanks, you're right, I've gotten this remark before. I'll see what I can do, hopefully with the open source community (since I'm a bit busy at the moment) :)

2

koiRitwikHai t1_izi177a wrote

I searched for a famous paper in my field (from a reputed conference)

It showed no results.

Paper --> Ai --> word level coreference resolution in EMNLP 2021

2

Cyalas OP t1_izihkmi wrote

I'm afraid I didn't include it in my processed papers, then. But I'm planning to get and process more papers, hopefully with the help of the open source community!

1

Just_CurioussSss t1_izi8rhr wrote

In your article, you mentioned that "The search is mainly based on a v0 semantic algorithm (using TfIdf model mainly).... So the usage was pretty slow and the models were heavy (not the best user experience)."

Quick question: have you heard of tensor search? It uses two key algorithms, CLIP and SBERT, where every component of the tensor can be associated with specific parts of a document, image, or video. Not only can this improve search semantics, it can also provide other key information like localization and explainability, without using text as an intermediate representation.

You can look them up: https://github.com/marqo-ai/marqo
Website: https://www.marqo.ai

2

Just_CurioussSss t1_izi8sim wrote

Also, TF-IDF is lexical/algorithmic search (i.e. keyword-based search). It's faster, but its results are less accurate and less relevant than those of tensor-based search. On the other hand, Marqo, with tensor-based search (where you can get the vectors from SBERT, for example), allows semantic search by understanding the meaning of the text rather than the keywords. Thus, users can search with questions, related terms, or directly with images, audio or video (or any combination thereof), giving a better user experience and more relevant results.
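To make the lexical vs. vector distinction concrete, here is a toy sketch of ranking by cosine similarity over dense vectors. The 3-dimensional vectors below are invented by hand for illustration; they are not SBERT or CLIP outputs, and this is not how Marqo is implemented.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector, doc_vectors):
    # Rank document titles by similarity to the query vector, best first.
    ranked = sorted(
        doc_vectors.items(),
        key=lambda item: cosine(query_vector, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked]


# Made-up embeddings: in a real system these would come from an encoder model.
DOC_VECTORS = {
    "Self-attention networks for translation": [0.9, 0.1, 0.0],
    "Sediment transport in rivers": [0.0, 0.2, 0.9],
}
# Pretend embedding of the query "transformers", which shares no keyword
# with either title.
QUERY_VECTOR = [0.8, 0.2, 0.1]

RANKED = semantic_search(QUERY_VECTOR, DOC_VECTORS)
```

Here the query shares no word with the top-ranked title, which is the case a keyword index cannot handle. The trade-off is that producing the vectors requires running a model.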

1

Cyalas OP t1_izihswz wrote

Thanks for your comments :)

I used tensor-based search before, with a Faiss index and fine-tuned BERT models (it's still in the code). As I mentioned in my article, that slowed the process down a bit: each time a field is chosen, the BERT model is loaded, which takes about 4 more seconds. That's why I switched to TF-IDF. But I plan to optimize the tensor-search part more (I'll check out Marqo!), hopefully with the help of the open source community :)
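One common way to avoid paying a per-field load cost on every request is to cache the loaded model in memory. The sketch below assumes a long-lived server process; `load_model`, `search_field` and the dictionary standing in for a model are hypothetical placeholders, not naimai functions.

```python
import time
from functools import lru_cache

LOAD_CALLS = []  # records every real (uncached) load, for demonstration


@lru_cache(maxsize=None)
def load_model(field):
    # Placeholder for an expensive load (e.g. reading model weights from disk).
    LOAD_CALLS.append(field)
    time.sleep(0.01)  # stand-in for the multi-second load described above
    return {"field": field}


def search_field(field, query):
    # The first call for a field pays the load cost; later calls hit the cache.
    model = load_model(field)
    return f"searched {query!r} with the {model['field']} model"


search_field("physics", "turbulence")
search_field("physics", "sediment transport")
search_field("nlp", "coreference")
```

After these three calls, `LOAD_CALLS` holds `["physics", "nlp"]`: the second physics query reused the cached model. The cost is memory, since every loaded model stays resident unless `maxsize` is bounded.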

2

Due-Wall-915 t1_izft0rg wrote

Can I upload the list of papers ?

1

Cyalas OP t1_izftu03 wrote

On the website, under the "custom" tab.

To do it with the naimai package, you can check the GitHub repo :)

1