
120pi t1_j056sml wrote

Generative models are not really the most effective approach for NER. Here is an example along the lines of the one someone else provided:

What are the colors in the following sentence: "The white man like his burgers medium rare. He doesn't mind getting the red blood on his new green shirt." Return as JSON.

The colors in the sentence "The white man like his burgers medium rare. He doesn't mind getting the red blood on his new green shirt." are:

White

Red

Green

Here is the same information in JSON format:

{ "colors": [ "White", "Red", "Green" ] } 

I hope this helps! Let me know if you have any other questions.

A properly trained NER model would not have made the mistake of labeling a racial token as a color.

−4

Odd_Science t1_j05z49z wrote

You call it a mistake, but I (as a human, yes, really) would have included it in the list.

8

NoRexTreX t1_j0604jd wrote

Really? Is that convention or just a common design choice? Is it because white people are not literally white, just relatively white?

2

120pi t1_j080qtj wrote

Since I'm getting the downvote love here, let me add some context. A human reader would take "white man" to mean Caucasian, not a man who is dressed in all-white clothing, has his skin painted white, or has little melanin. When training an NER model to identify color entities, annotating "white" in this context would not make sense; labeling "white-skinned" or "light-skinned" would make sense as a color annotation.

A Finnish accountant during tax season and a Finnish-American surfer in Hawaii probably have different levels of melanin in their skin, but both are "white" (racially).
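To make the annotation choice concrete, here is a rough sketch of how the example sentence might be tagged under this reading. The `COLOR` label, the simplified B/O tag scheme, and the tokenization are all assumptions for illustration, not taken from any particular dataset or tool, and an annotator who disagrees would simply tag "white" as `B-COLOR` too.

```python
# Hypothetical token-level annotation of the example sentence.
# "O" means outside any entity; "B-COLOR" marks a single-token color entity.
annotated = [
    ("The", "O"), ("white", "O"), ("man", "O"), ("like", "O"),
    ("his", "O"), ("burgers", "O"), ("medium", "O"), ("rare", "O"),
    (".", "O"), ("He", "O"), ("doesn't", "O"), ("mind", "O"),
    ("getting", "O"), ("the", "O"), ("red", "B-COLOR"), ("blood", "O"),
    ("on", "O"), ("his", "O"), ("new", "O"), ("green", "B-COLOR"),
    ("shirt", "O"), (".", "O"),
]

def extract_entities(pairs, label):
    """Return the tokens tagged as single-token entities of the given label."""
    return [token for token, tag in pairs if tag == f"B-{label}"]

colors = extract_entities(annotated, "COLOR")
print(colors)
```

Under this scheme "white" is left as `O` because it is read as a racial descriptor, while "red" and "green" are tagged as colors.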

1

EatTheRichBabies t1_j0crvix wrote

Nah, this is a super ambiguous example that even humans don't agree on. Maybe try something like "buffalo buffalo buffalo" :) or a sentence like "the tortoise leapfrogged the hare": what animals were involved in the race? Should be 2 and not 3.

Doesn't mean specialized NERs aren't better tho, just that this white man example ain't a good test.
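A quick sketch of why the tortoise sentence is a cleaner test, assuming the trap is the "frog" hidden inside "leapfrogged" (that's my reading of "2 and not 3"). The `ANIMALS` lexicon and both function names are made up for illustration; this is a toy comparison of matching strategies, not a real NER model.

```python
import re

# Hypothetical toy lexicon, just enough for this sentence.
ANIMALS = {"tortoise", "hare", "frog"}

def substring_match(text):
    """Naive approach: flag any lexicon entry that appears anywhere in the text."""
    lowered = text.lower()
    return sorted(animal for animal in ANIMALS if animal in lowered)

def token_match(text):
    """Word-level approach: only flag lexicon entries that appear as whole words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted(set(tokens) & ANIMALS)

sentence = "The tortoise leapfrogged the hare"
print(substring_match(sentence))  # ['frog', 'hare', 'tortoise']
print(token_match(sentence))      # ['hare', 'tortoise']
```

Unlike the "white man" case, there's a clear right answer here, so a system that returns 3 animals is just wrong rather than making a debatable call.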

3