
koolaidman123 t1_iy6hhbj wrote

sparse retrieval isn't mutually exclusive with deep learning. splade v2 and colbert v2 count as sparse methods because they still produce high-dimensional sparse vectors, but both leverage bert models to create those sparse representations
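to make the "neural model, sparse output" point concrete, here's a toy numpy sketch of SPLADE-style pooling (log-saturated relu over per-token vocabulary logits, max-pooled over the sequence). the logits are random stand-ins for real bert output, and the function name is made up for the sketch:

```python
import numpy as np

VOCAB_SIZE = 30522  # BERT wordpiece vocabulary size

def toy_splade_encode(token_logits: np.ndarray) -> np.ndarray:
    """SPLADE-style pooling: log(1 + relu(logits)) per token,
    then max over sequence positions. The result is a vocab-sized
    vector that is mostly zeros, i.e. sparse, even though a neural
    model produced the logits."""
    activated = np.log1p(np.maximum(token_logits, 0.0))
    return activated.max(axis=0)

# fake "BERT" output: 4 tokens, each with logits over the vocab,
# shifted negative so almost all dimensions stay at zero
rng = np.random.default_rng(0)
logits = rng.normal(loc=-3.0, scale=1.0, size=(4, VOCAB_SIZE))
logits[:, [101, 2054, 2003]] = 5.0  # a few strongly activated terms

vec = toy_splade_encode(logits)
print(vec.shape)        # (30522,)
print((vec > 0).sum())  # only a small fraction of dims are nonzero
```

the output lives in vocabulary space like a bm25 vector would, which is why it indexes with an inverted index despite coming from bert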

also cross-encoders aren't considered retrievers, but rerankers
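the retriever/reranker split falls out of how the models consume their inputs. a bi-encoder scores query and document independently, so document vectors can be precomputed and indexed; a cross-encoder scores the pair jointly, so it can only rerank a short candidate list. a toy sketch (the encoders here are deliberately fake stand-ins, not real models):

```python
import numpy as np

docs = ["sparse retrieval", "dense retrieval",
        "neural reranking", "cooking pasta"]

def embed(text: str) -> np.ndarray:
    """Stand-in bi-encoder: bag-of-characters embedding. A real
    retriever would use BERT etc., but the key property is the same:
    each text is encoded independently, so document vectors can be
    precomputed and put in an index."""
    vec = np.zeros(128)
    for ch in text:
        vec[ord(ch) % 128] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def cross_score(query: str, doc: str) -> float:
    """Stand-in cross-encoder: needs the (query, doc) pair at once,
    so nothing can be precomputed -- hence it reranks a short
    candidate list rather than searching the whole corpus."""
    return float(len(set(query.split()) & set(doc.split())))

query = "sparse neural retrieval"
q = embed(query)
# stage 1 (retrieval): cheap independent scoring over the corpus
scores = [float(q @ embed(d)) for d in docs]
topk = sorted(range(len(docs)), key=lambda i: -scores[i])[:3]
# stage 2 (reranking): expensive pairwise scoring on candidates only
reranked = sorted(topk, key=lambda i: -cross_score(query, docs[i]))
print([docs[i] for i in reranked])
```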

2

[deleted] t1_iy6ze2s wrote

[deleted]

1

koolaidman123 t1_iy89kz0 wrote

Colbert v2 is literally listed in the beir sparse leaderboards...

Sparse refers to the embedding vector(s), not the model

And ranking/reranking refer to the same thing, but that's still distinct from retrieval, which was my point

1

DinosParkour t1_iy8ei4w wrote

ColBERT is on both the dense and the sparse leaderboard :^)

I would describe it as a (sparse?) collection of dense embeddings per query/document, so it's hard to classify it between the two (although I'm leaning more toward the dense categorization).
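that "collection of dense embeddings" structure is what ColBERT's late-interaction (MaxSim) scoring operates on: each text is a matrix of per-token dense vectors rather than a single vector, which is exactly why the dense/sparse label is ambiguous. a toy sketch with random stand-in embeddings (shapes and pooling are the real idea, the numbers are not):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late interaction: for every query token vector,
    take its max cosine similarity over all document token vectors,
    then sum those maxima."""
    # normalize rows so the dot products are cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T  # shape: (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

rng = np.random.default_rng(0)
query_emb = rng.normal(size=(5, 128))  # 5 query tokens, dim 128
doc_emb = rng.normal(size=(40, 128))   # 40 doc tokens, dim 128
print(maxsim_score(query_emb, doc_emb))
```

note the score is bounded by the number of query tokens (each max cosine is at most 1), and scoring a document against itself hits that bound exactly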

2

koolaidman123 t1_iy8hbxd wrote

splade-v2 and sparta are exclusively on the sparse leaderboards, and they both use bert

the point is to dispel the notion that sparse retrieval somehow = no dl involved. that's conflating dense retrieval with neural retrieval

1