
FaustusC t1_ixh5ydr wrote

But that's the thing: unless someone's getting hit with completely falsified evidence, the arrest itself doesn't become less valid. It's irrelevant to the data whether or not a crime is uncovered because of a biased interaction or an unbiased one. The prediction model itself will still function correctly. The issue isn't measuring the data, it's getting you to start acknowledging data accuracy. A crime doesn't cease to be a crime just because it wasn't noticed for the right reasons.

1

rvkevin t1_ixjrv88 wrote

>But that's the thing: unless someone's getting hit with completely falsified evidence, the arrest itself doesn't become less valid.

It still doesn’t represent actual crime; it represents the crime that police enforced (i.e., crime surfaced through police interactions). For example, if white and black people carry illegal drugs at the same rate, yet police stop and search black people more often, arrest data will show a disproportionate amount of drug crime among black people, and a model trained on that data will steer more resources to black neighborhoods even though the underlying crime rate doesn’t merit that response.
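
To put rough numbers on that, here is a minimal sketch. Every figure in it (population size, carry rate, search rates) is a made-up assumption for illustration, not real data, and `expected_arrests` is a hypothetical helper. Both groups carry at the same rate; only the search rate differs.

```python
def expected_arrests(population, carry_rate, search_rate):
    """Expected drug arrests: people searched times the fraction of them carrying."""
    return population * search_rate * carry_rate


# Hypothetical inputs: identical carry rates, unequal search rates.
population = 10_000
carry_rate = 0.10
search_rates = {"group_a": 0.05, "group_b": 0.15}

arrests = {
    group: expected_arrests(population, carry_rate, rate)
    for group, rate in search_rates.items()
}

# Ratio of recorded arrests between the two groups.
arrest_ratio = arrests["group_b"] / arrests["group_a"]

# Hit rate (arrests per search) recovers the carry rate, which is equal for both.
hit_rates = {
    group: arrests[group] / (population * search_rates[group])
    for group in search_rates
}

print(arrests, arrest_ratio, hit_rates)
```

Under these assumptions the arrest counts differ by the same factor as the search rates, while the arrests-per-search figure is identical for both groups. Every one of those arrests is valid, and the totals still misstate where the drugs are.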

> It's irrelevant to the data whether or not a crime is uncovered because of a biased interaction or an unbiased one.

How is a prediction model supposed to function when it doesn’t have an accurate picture of where crime occurs? If you tell the model that all of the crime happens in area A because you don’t enforce area B that heavily, how is the model supposed to know that it’s missing a crucial variable? Take speed-trap towns that get something like 50% of their funding from enforcing the speed limit on a mile-long stretch of highway. How is the system supposed to know that speeding isn’t disproportionately worse there, despite the mountain of traffic tickets given out?
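
A toy version of that blind spot, as a sketch only: assume a hypothetical model that sets each area's patrol share equal to its share of recorded arrests, and assume recorded arrests scale with both true crime and patrol presence. The areas, the numbers, and the `reallocate` function are all invented for illustration.

```python
def reallocate(patrol_share, true_crime, rounds=5):
    """Repeatedly set each area's patrol share to its share of recorded arrests."""
    history = [dict(patrol_share)]
    for _ in range(rounds):
        # Assumption: recorded arrests grow with real crime AND with patrol presence.
        arrests = {area: true_crime[area] * patrol_share[area] for area in patrol_share}
        total = sum(arrests.values())
        patrol_share = {area: arrests[area] / total for area in arrests}
        history.append(patrol_share)
    return history


# Hypothetical setup: equal true crime, but area A starts out patrolled 4x as heavily.
true_crime = {"A": 100, "B": 100}
initial_share = {"A": 0.8, "B": 0.2}

history = reallocate(initial_share, true_crime)
final_share = history[-1]
print(final_share)
```

With equal true crime, the patrol split stays at the initial 80/20 in every round. Nothing in the arrest data tells this model that area B has just as much crime, so the starting bias is never corrected.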

>The issue isn't measuring the data, it's getting you to start acknowledging data accuracy.

How you measure the data is crucial because it’s easy to introduce selection bias into it. What you are proposing is exactly how that bias gets introduced, since you don’t even seem to be aware it’s an issue. It is about more than whether each individual arrest has merit. The whole issue is that you are selecting a sample of crime to feed into the model, and that sample is not gathered in an unbiased way. Instead of measuring crime, you want to measure arrests, which are not the same thing.

1