jonas__m t1_j45cym1 wrote

Data Shapley is one option, but it can be computationally expensive. If you’re looking for practical code to run on real data, here are some tutorials on finding the least useful data:

https://docs.cleanlab.ai/stable/tutorials/image.html

https://docs.cleanlab.ai/stable/tutorials/outliers.html

as well as the MOST useful data to label next (or collect an extra label for):

https://github.com/cleanlab/examples/blob/master/active_learning_multiannotator/active_learning.ipynb
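
To make the cost of Data Shapley concrete, here is a toy Monte Carlo sketch in plain Python. It is not what the cleanlab tutorials above do; the function names, the 1-nearest-neighbor utility, the zero score for an empty training set, and the tiny dataset are all assumptions made for illustration. Each sampled permutation re-scores the model once per training point, which is why the method gets expensive on real datasets.

```python
import random

def knn_accuracy(train, val):
    """Utility: accuracy of a 1-nearest-neighbor classifier on val.
    Assumption: an empty training set scores 0.0."""
    if not train:
        return 0.0
    correct = 0
    for x, y in val:
        nearest = min(train, key=lambda p: abs(p[0] - x))
        correct += nearest[1] == y
    return correct / len(val)

def monte_carlo_data_shapley(train, val, num_permutations=2000, seed=0):
    """Estimate each training point's Shapley value by averaging its
    marginal contribution to the utility over random orderings."""
    rng = random.Random(seed)
    n = len(train)
    values = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        subset = []
        prev_score = knn_accuracy(subset, val)
        for i in order:
            subset.append(train[i])
            score = knn_accuracy(subset, val)  # one model evaluation per added point
            values[i] += score - prev_score
            prev_score = score
    return [v / num_permutations for v in values]

# Toy 1-D data as (feature, label); the last training point is deliberately mislabeled.
train = [(0.0, 0), (0.2, 0), (0.9, 1), (1.1, 1), (0.1, 1)]
val = [(0.02, 0), (0.25, 0), (0.95, 1), (1.05, 1)]

values = monte_carlo_data_shapley(train, val)
least_useful = min(range(len(train)), key=lambda i: values[i])
print(values, least_useful)
```

In this toy setup the mislabeled point ends up with the lowest estimated value. The total work is roughly num_permutations × len(train) utility evaluations, and with a real model each evaluation means retraining, so the shortcuts in the tutorials above are usually the more practical starting point.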
