
koolaidman123 t1_iy6hhbj wrote

sparse retrieval isn't mutually exclusive with deep learning. splade v2 and colbert v2 count as sparse methods because they still produce high-dimensional sparse vectors, but both leverage bert models to create those sparse representations
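to make the "neural model, sparse output" point concrete, here's a toy numpy sketch of SPLADE-style pooling (log-saturated relu over per-token vocabulary logits, max-pooled over the sequence). the logits are random stand-ins for real bert output, and the function name is made up for the sketch:

```python
import numpy as np

VOCAB_SIZE = 30522  # BERT wordpiece vocabulary size

def toy_splade_encode(token_logits: np.ndarray) -> np.ndarray:
    """SPLADE-style pooling: log(1 + relu(logits)) per token,
    then max over sequence positions. The result is a vocab-sized
    vector that is mostly zeros, i.e. sparse, even though a neural
    model produced the logits."""
    activated = np.log1p(np.maximum(token_logits, 0.0))
    return activated.max(axis=0)

# fake "BERT" output: 4 tokens, each with logits over the vocab,
# shifted negative so almost all dimensions stay at zero
rng = np.random.default_rng(0)
logits = rng.normal(loc=-3.0, scale=1.0, size=(4, VOCAB_SIZE))
logits[:, [101, 2054, 2003]] = 5.0  # a few strongly activated terms

vec = toy_splade_encode(logits)
print(vec.shape)        # (30522,)
print((vec > 0).sum())  # only a small fraction of dims are nonzero
```

the output lives in vocabulary space like a bm25 vector would, which is why it indexes with an inverted index despite coming from bert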

also cross-encoders aren't considered retrievers, but rerankers
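the retriever/reranker split falls out of how the models consume their inputs. a bi-encoder scores query and document independently, so document vectors can be precomputed and indexed; a cross-encoder scores the pair jointly, so it can only rerank a short candidate list. a toy sketch (the encoders here are deliberately fake stand-ins, not real models):

```python
import numpy as np

docs = ["sparse retrieval", "dense retrieval",
        "neural reranking", "cooking pasta"]

def embed(text: str) -> np.ndarray:
    """Stand-in bi-encoder: bag-of-characters embedding. A real
    retriever would use BERT etc., but the key property is the same:
    each text is encoded independently, so document vectors can be
    precomputed and put in an index."""
    vec = np.zeros(128)
    for ch in text:
        vec[ord(ch) % 128] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def cross_score(query: str, doc: str) -> float:
    """Stand-in cross-encoder: needs the (query, doc) pair at once,
    so nothing can be precomputed -- hence it reranks a short
    candidate list rather than searching the whole corpus."""
    return float(len(set(query.split()) & set(doc.split())))

query = "sparse neural retrieval"
q = embed(query)
# stage 1 (retrieval): cheap independent scoring over the corpus
scores = [float(q @ embed(d)) for d in docs]
topk = sorted(range(len(docs)), key=lambda i: -scores[i])[:3]
# stage 2 (reranking): expensive pairwise scoring on candidates only
reranked = sorted(topk, key=lambda i: -cross_score(query, docs[i]))
print([docs[i] for i in reranked])
```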

2

[deleted] t1_iy6ze2s wrote

[deleted]

1

koolaidman123 t1_iy89kz0 wrote

Colbert v2 is literally listed in the beir sparse leaderboards...

Sparse refers to the embedding vector(s), not the model

And ranking/reranking refer to the same thing, but that's still distinct from retrieval, which was my point

1

DinosParkour t1_iy8ei4w wrote

ColBERT is on both the dense and the sparse leaderboard :^)

I would describe it as a (sparse?) collection of dense embeddings per query/document, so it's hard to classify it between the two (although I'm leaning more toward the dense categorization).
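that "collection of dense embeddings" structure is what ColBERT's late-interaction (MaxSim) scoring operates on: each text is a matrix of per-token dense vectors rather than a single vector, which is exactly why the dense/sparse label is ambiguous. a toy sketch with random stand-in embeddings (shapes and pooling are the real idea, the numbers are not):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late interaction: for every query token vector,
    take its max cosine similarity over all document token vectors,
    then sum those maxima."""
    # normalize rows so the dot products are cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T  # shape: (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

rng = np.random.default_rng(0)
query_emb = rng.normal(size=(5, 128))  # 5 query tokens, dim 128
doc_emb = rng.normal(size=(40, 128))   # 40 doc tokens, dim 128
print(maxsim_score(query_emb, doc_emb))
```

note the score is bounded by the number of query tokens (each max cosine is at most 1), and scoring a document against itself hits that bound exactly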

2

koolaidman123 t1_iy8hbxd wrote

splade-v2 and sparta are exclusively on the sparse leaderboards, and they both use bert

the point is to dispel the notion that sparse retrieval somehow = no dl involved. that's conflating dense retrieval with neural retrieval

1