Submitted by silverstone1903 t3_10iucs0 in MachineLearning
Hi all,
I have an e-commerce product data. It contains product description and product type. I’m using embeddings with ANN (annoy) to find similar products. However, I don’t know how to implement evaluation of vector search results. There are some metrics such as hit rate, recall but like I said above I’m confused to use them. Most of the examples I come across has a label (interaction data, explicit score etc.) therefore they can calculate metrics. Any ideas or recommendations will be appreciated!
vwings t1_j5gowwn wrote
For such retrieval systems, you would usually use Top-1, Top-5 or Top-something accuracy. Concretely, you have a list of product types (embeddings) in your database (let's say 100 or 100,000 whatever). Then you get your product description, you embed it with your ANN and then you compare it with all product type embs. then you check on which rank the correct product type ends up. From that you can calculate mean rank or top-k accuracy ..