120pi t1_j080qtj wrote

Since I'm getting the downvote love here, let me add some context. A human reader would take "white man" to mean Caucasian, not a man who is dressed in all-white clothing, has his skin painted white, or has little melanin. Annotating "white" in this context when training an NER model would not make sense if the goal is to identify color entities; labeling "white-skinned" or "light-skinned" would make sense as a color annotation.

A Finnish accountant during tax season and a Finnish-American surfer in Hawaii probably have different levels of melanin in their skin, but both are "white" (racially).
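To make the annotation point concrete, here's a rough sketch of what a labeled training example might look like. The sentence, the `COLOR` label name, and the `span_of` helper are all made up for illustration; the `(start, end, label)` character-offset layout is just one common convention, not any particular tool's required format.

```python
# Hypothetical example sentence: "white" appears twice, but only one is a color.
text = "The white man wore a white shirt."


def span_of(text, phrase, word, label):
    """Return (start, end, label) for `word` where it occurs inside `phrase`."""
    phrase_start = text.index(phrase)
    start = phrase_start + phrase.index(word)
    return (start, start + len(word), label)


# Only the "white" in "white shirt" is annotated as a color entity.
# The "white" in "white man" is deliberately left unlabeled.
color_spans = [span_of(text, "white shirt", "white", "COLOR")]

for start, end, label in color_spans:
    print(text[start:end], label, (start, end))
```

The first "white" (in "white man") gets no `COLOR` span at all, which is the contextual distinction a human annotator would make.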

1

EatTheRichBabies t1_j0crvix wrote

Nah, this is a super ambiguous example that even humans don't agree on. Maybe try something like "buffalo buffalo buffalo" :) or a sentence like "the tortoise leapfrogged the hare": what animals were involved in the race? Should be 2, not 3.

Doesn't mean specialized NERs aren't better tho, just that this "white man" example ain't a good test.
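For what it's worth, the leapfrog sentence above is easy to turn into a quick sanity check. This is only a toy sketch with a made-up three-word lexicon (`ANIMALS`), not a real NER model: it shows how naive substring matching finds a phantom "frog" while whole-token matching gets the expected two animals.

```python
import re

# Toy lexicon, purely an assumption for this example.
ANIMALS = {"tortoise", "hare", "frog"}


def substring_animals(text):
    """Naive approach: report any lexicon word found anywhere in the text."""
    lowered = text.lower()
    return sorted(a for a in ANIMALS if a in lowered)


def token_animals(text):
    """Match whole tokens only, so 'leapfrogged' does not count as 'frog'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted(set(tokens) & ANIMALS)


sentence = "The tortoise leapfrogged the hare."
print(substring_animals(sentence))  # ['frog', 'hare', 'tortoise']
print(token_animals(sentence))      # ['hare', 'tortoise']
```

Token matching passes here only because the trap is a substring; it would still fail on something like the buffalo sentence, where the same token plays different roles.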

3