AutoModerator t1_ivbejcc wrote

Welcome to r/science! This is a heavily moderated subreddit in order to keep the discussion on science. However, we recognize that many people want to discuss how they feel the research relates to their own personal lives, so to give people a space to do that, personal anecdotes are now allowed as responses to this comment. Any anecdotal comments elsewhere in the discussion will continue to be removed and our normal comment rules still apply to other comments.

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.


TaserLord t1_ivbhs60 wrote

I'd love to see this research broadened - is this specific to Facebook, or do other platforms have the same tendencies? Does the strength of this effect vary with platform?


-Wesley- t1_ivchbuz wrote

If I read the article and associated paper correctly, it’s based on the police departments' own publishing of arrests. Nothing to do with FB selecting what you see.


-Wesley- t1_ivcii2b wrote

Read the article and associated paper. If cops arrest 100 people for assault (30 are black and 70 other), then the cops' own FB page will publish 3 of the black cases and 5 of the other cases. So now it looks like 60% of the cases in town are black people when it’s actually half that amount. The paper does a good job of differentiating between crimes.


mrbeamis t1_ivculpg wrote

And what's the black/white arrest rate?


Maktesh t1_ivdl5vc wrote

I'd also be curious if this pertains to the types of crimes.

Certain types of crimes are just more "exciting." (e.g. CCTV footage of Walmart shenanigans.) I wouldn't be surprised if the social media sharing has to do with how unfamiliar or bizarre a certain event is to the reposter.


Perunov t1_ivdwn50 wrote

Wait, so crimes that were reported but didn't result in any arrest would automatically "over-report" because they're comparing different stats? Or did I misunderstand something?

Part I offenses, along with murder, include burglary, assault and robbery. Robbery's resolve rate in 2020 is what, around 30 percent? Larceny is 17%? So if you throw those rates together you already can't compare rates of "reported" to "arrest made", and I can't find any mention of trying to correct for this (I don't even know if you can automatically control for this without manual mapping of reports).


adam_demamps_wingman t1_ivdxkdc wrote

In American law enforcement, apparently the only thing required to determine a suspect is Black is announcing the subject is Black.

That’s part of the reason why the FBI states no racial inferences should be made using its annual crime report. Some other reasons not to trust the annual crime report? Incomplete and intentionally inaccurate reporting to the FBI.

BTW, the FBI has stated for decades that white supremacist organizations have been infiltrating law enforcement agencies at all levels of government. Any racial data produced by them should be rejected out of hand.

EDIT: here is an early FBI report on penetration of law enforcement by white supremacist organizations issued in 2006. A redacted copy became public in 2017. This unredacted copy was made public in 2020.


Yes_hes_that_guy t1_ivdy8z6 wrote

This isn’t about the platform itself. It’s about the departments’ choices of which arrests to report on publicly. This study does focus on Facebook because it’s the most popular platform, making the dataset larger, but it isn’t about algorithms like I initially assumed before reading the study.


Perunov t1_ivdzmta wrote

It does not seem to be.

Quote (emphasis mine):

> we identify nearly 100,000 posts that report on the race of individuals suspected of or arrested for crimes

So "suspected" is included in addition to arrests. If someone is suspected but an arrest is never made, the initial "suspect" post will be counted but it won't result in a matching "arrest" record for ethnic counts. Or are they misrepresenting the methodology?

It would be more useful to just compare reported arrest with actual arrest records and see how large the variation is, but then they also mention how unreliable some agencies' reporting on ethnicity is.


jorg2 t1_ive5fvw wrote

I mean, this report is a nice representation of this fact, no? Police departments are 25% more likely to post about the arrest of a black suspect, this seems like the proof you need to point at a hidden anti-black agenda amongst a significant portion of them.


carpeson t1_ive7n2j wrote

Haven't read the discussion but that comes on top of a very likely emphasis on focusing on neighborhoods and suspects that show a higher concentration of skin-pigmentation.


peer-reviewed-myopia t1_ived3nk wrote

Regardless of how I feel about this topic, this study is layered with so many questionable assumptions and manipulations of data, it's hard to take any of their conclusions seriously.

With an initial sample of 15,851 agencies and 11,058,289 posts, there better be good reason for excluding ~63% and ~94% of them respectively.

> We took several steps to clean the data.
>
> - A very small number of agency-months report 10,000, 20,000, 30,000, 40,000, 50,000 or 60,000 arrests for a given crime type. Because these figures are improbably large, we assumed they actually reflect missing values.
> - In the rows the total number of arrests for at least one crime type is smaller than the sum of race-specific arrests for that crime, we replaced the total number of arrests with the sum of race-specific arrests.
> - We dropped incidents in which the most serious offense was not a UCR Part I offense.
> - We drop rows for suspects who lack race information.
>
> We then compute agency-level measures of the proportion of reported offenders who are Black based on the remaining rows.

Wow. I guess they really did "clean" the data.
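For anyone trying to follow what the quoted cleaning steps would do to the data, here is a minimal Python sketch of the four rules as they are worded in the quote. The field names (`is_part1`, `race`), the function names, and the choice to apply the first rule to the total count are all assumptions made for illustration; the paper's actual data layout and code are not shown in this thread.

```python
# Hypothetical sketch of the quoted cleaning rules; names and layout are assumed.
PLACEHOLDER_COUNTS = {10_000, 20_000, 30_000, 40_000, 50_000, 60_000}


def clean_arrest_total(total, by_race):
    """Apply the first two rules to one agency-month count for one crime type.

    total:   reported total arrests (int, or None if already missing)
    by_race: dict mapping a race label to its reported arrest count
    Returns the cleaned total; None means "treated as missing".
    """
    # Rule 1: improbably large round totals are treated as missing values.
    if total in PLACEHOLDER_COUNTS:
        return None
    # Rule 2: if the total is smaller than the race-specific sum, use the sum.
    race_sum = sum(by_race.values())
    if total is not None and total < race_sum:
        return race_sum
    return total


def keep_incident(incident):
    """Rules 3 and 4: keep only Part I incidents whose suspect race is known."""
    return incident["is_part1"] and incident["race"] is not None


def proportion_black(incidents):
    """Share of the remaining reported offenders who are Black (None if empty)."""
    kept = [i for i in incidents if keep_incident(i)]
    if not kept:
        return None
    return sum(i["race"] == "Black" for i in kept) / len(kept)
```

Note that rules 3 and 4 shrink the denominator of the final proportion, which is why the amount of data dropped matters for how the result is read.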


Strazdas1 t1_ivenqk1 wrote

A more surprising thing is that US law enforcement maintains 14 000 Facebook pages. That sounds excessive. Also this means they examined an average of 8 posts per page, per 10 year period. So less than 1 post per year if they were being representative.


Strazdas1 t1_ivenuej wrote

There's also the confidentiality issue. They will not share footage of something that could be considered a breach of the person's data, so for example financial crimes are pretty much nonexistent on such pages.


shitposts_over_9000 t1_iveogs3 wrote

Community moderation 101:

Topics that directly affect the community will attract more attention by said community.

Topics that contain tropes will perform better than ones that do not.

Outrage & shock will outperform Tropes.

Taken in order:

Random crimes like car break-ins get discussed as a warning to others, then celebrated when the perpetrator is finally caught, far more than crimes that don't directly affect people. That happens on platforms like Facebook but far less on platforms like Twitter. Demographically those crimes are almost never evenly distributed in the first place. Where I live car break-ins are 27yr old white junkies on average, across town not so much.

Who discusses crime online also varies by demographic and it mostly tends to be the poor and the upper middle class when a crime is committed somewhere unexpected.

The tropes kind of write themselves these days. Any time you see security video of a gas station robbery the ones where the assailant doesn't know how to hold the gun properly will always outperform one that is less colorful. Bonus points if they obviously have more gold jewellery than the cash value of the take from the robbery.

Outrage and shock have also become a self-fulfilling prophecy of sorts. In areas with very low case close rates it is much easier for things like the assaults on elderly Asians to happen because it is unlikely anyone will ever be punished. It also means that it is a good bet that intimidating or eliminating a witness gives better chances of not being caught by the cops. Again, demographics of who is doing this vary from place to place a little, but less than in the previous example because serious gangs are usually a prerequisite and those tend to have very little ethnic diversity.

Which platforms have more or less of this depends greatly on what the platform is primarily used for.

Twitter is for arguing politics and complaining to the social media managers of large corporations so it sees less than Facebook which is more general, but not as much as something like nextdoor which has a very local focus.

Places like Reddit are also influencing this. When popular sites start issuing ban warnings and performing admin removals over links to government issued crime statistics, it motivates people to post individual examples as news to that platform and, via the Streisand effect, creates additional discussions of said statistics and the platform's political motivation in suppressing them.

TL;DR: the type of platform affects the level of attention, and the people most likely to discuss crime they were the victim of are far from evenly distributed, so neither are the perpetrators. The type and details of the crime heavily affect its likelihood of being discussed as well.


Strazdas1 t1_iveoz0o wrote

Another reason not to trust the report - they keep changing the system so most departments can't even report the data properly and get excluded. It got so bad that last year they didn't even release the report due to less than half the departments sending in the data.


carpeson t1_ivetw0f wrote

Very good point to bring up. Let's look at the difference of "focusing more police activity somewhere where it is needed more" and the resulting better probability of crime being discovered there.

This comparison of two different ideas can be seen if you look at two numbers - the first is the number of crimes actually committed - let's say it's 0.05 in neighborhoods with less pigmentation (for different reasons - like more money and better education just to mention a few) and 0.2 in neighborhoods with more pigmentation. (numbers are made up but that shouldn't matter) Now the above number of crimes actually being committed is a hidden one - we can never know the exact number because we won't be able to find every crime that is being committed. Now we have somewhat of an idea that higher pigmentation neighborhoods are more prone to crime so we focus more police activity there simply because we assume the probability of us finding a crime being committed is higher than in neighborhoods with less pigmentation. Now what happens is we get a second number, let's call it "crimes actually discovered by police" (we work with fixed numbers but in reality we would work with probabilities here). This number is way lower in richer neighborhoods because police activity isn't as thorough there - we get 0.02. The number in poorer neighborhoods is along the lines of 0.1.

You see that there are two reasons numbers regarding committed crimes can be higher. Firstly because there are more crimes committed, but also because we have more police activity there. EVEN though it makes sense to have more police activity in regions where people are expected to commit more crimes, this also pushes the statistics towards a higher discovery rate in such regions compared to other regions. This interaction is easily overlooked but should be known by researchers of the field.
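The thought experiment above can be checked with a few lines of Python. This is only a sketch that reuses the made-up rates from the comment (0.05 and 0.2 committed, 0.02 and 0.1 discovered); the variable and function names are invented here for illustration.

```python
# Made-up rates taken from the comment above; purely illustrative.
actual = {"richer": 0.05, "poorer": 0.20}      # crimes actually committed
discovered = {"richer": 0.02, "poorer": 0.10}  # crimes discovered by police


def discovery_rate(area):
    """Fraction of committed crimes that end up discovered in an area."""
    return discovered[area] / actual[area]


# How much more crime the poorer area has, in truth and in the records.
true_ratio = actual["poorer"] / actual["richer"]
recorded_ratio = discovered["poorer"] / discovered["richer"]

print(f"discovery rate, richer area: {discovery_rate('richer'):.2f}")
print(f"discovery rate, poorer area: {discovery_rate('poorer'):.2f}")
print(f"true crime ratio (poorer/richer): {true_ratio:.1f}")
print(f"recorded crime ratio (poorer/richer): {recorded_ratio:.1f}")
```

With these numbers the discovery rate is 0.4 in the richer area and 0.5 in the poorer one, so the recorded gap between the two areas (a factor of 5) is larger than the true gap (a factor of 4).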

If you didn't understand my question previously I now hope you know what I was referring to.


Strazdas1 t1_ivf52v0 wrote

> UCR part 1

Part I Offenses include murder, rape, aggravated assault, robbery, burglary, larceny, motor vehicle theft, arson, human trafficking – commercial sex acts, and human trafficking – involuntary servitude.

So basically any arrests for things that are lighter than that got dropped from the data.

The second bullet point is that if the sum of race-categorized arrests is not the same as the sum of arrests in the dataset, they replaced the total with the sum of the race-categorized ones. This theoretically shouldn't be an issue and is probably down to rounding errors.


howelftw t1_ivfkgk6 wrote

Yes, I agree with you: it's not about the platform itself, it's about the departments' choices.

The algorithm of these platforms shows the people what they want to see and the people nowadays are more interested in discrimination and racial abuse.


vem777 t1_ivfr344 wrote

It is a very good move by the FBI because there shouldn't be any racial preferences shown about crimes and what kind of crimes a person has committed.

If any kind of racial data is produced it should be rejected and shouldn't be considered on first hand.


PDAP-JoshChamberlain t1_ivfusrc wrote

I work at an org which locates police data for consumption by researchers / journalists. We've been asked to help compare press releases to underlying data for similar [...] It's wild how many departments use Facebook as their website!


kysiseen t1_ivgl8cz wrote

Yes I agree with you that all the ML based companies suffer from the same bias.

They are the ones who promote racism and other kinds of discrimination, and after that they are the ones who report on these kinds of things first and want to gather more and more of the spotlight.


carpeson t1_ivix3mi wrote

Do you mean we have qualitative interviews of people who support consistency between arrest rates and the race of people being arrested? I don't quite see where you are going with this one but it doesn't quite resonate with what I previously said.

I was talking about a nonlinear relation between the actual crimes committed and the crimes discovered - this 'non-linearity' is moderated by the amount of police activity. Meaning: even though poorer neighborhoods commit more crimes and therefore get more police activity this in turn also leads to a higher discovery rate of crimes. The discovery rate is higher in poor neighborhoods compared to rich neighborhoods. Remember we are talking about discovery rate, not about crime rate.


carpeson t1_ivix8nh wrote

Hey UP, do you have any tips on how I could have formulated the 'non-linearity' between crime-rate and crime-discovery-rate (moderated by police activity) differently? What I mean seems to go over some people's heads - so maybe I missed something or didn't explain something well enough.


i_have_thick_loads t1_ivj7o57 wrote

>I was talking about a nonlinear relation between the actual crimes committed and the crimes discovered - this 'non-linearity' is moderated by the amount of police activity

And again, police presence is driven by homicide rate and reports to police.

>The discovery rate is higher in poor neighborhoods compared to rich neighborhoods.

We know what you said. Just because you typed out a stupid thought experiment doesn't make your concern valid. There are independent measures of crime against which you could measure a reasonably appropriate level of police activity. One measure is homicide rate. And there's more or less a general factor for crime. Areas with higher homicides will most probably have higher rates of other crime.


carpeson t1_ivjktpv wrote

>And again, police presence is driven by homicide rate and reports to police.

I understand where you are coming from, I really do, but this discussion is not about what is driving police presence but about a possibly-not-considered interaction between the "local arrest rates" (=the documented crime) vs. the "real number of crimes committed" and the moderator of "police activity".

You keep explaining to me - over and over again - that our moderator is determined by x, y, z and I am very proud of your attempt of taking part in the discussion and this information is surely great for any later (and very basic) modeling but at this point it is redundant at best.


>We know what you said. Just because you typed out a stupid thought experiment doesn't make your concern valid.

The subreddit you are dwelling on right now is called r/science; a heavily moderated forum for people that want to talk about science. The thought experiment was solely for the purpose of understanding; its redundancy can be measured based on the general r/science user's ability to know what a "non-linear interaction between x and y, moderated by z" means. I must also question not only your temper but also your comprehension of basic research - the field I wanted to expand my ideas on, where questions are asked not because they have a direct implication for anything but because we like to understand things. Coincidentally also my field of research in the real world.

>There are independent measures of crime to which you could measure reasonably appropriate level of police activity. One measure is homicide rate. And there's more or less a general factor for crime.

Now we are getting somewhere. Let's take the general factor for crime and call it g. g is great but how do we actually find out the g of a zoned area? Well we can't but we can approximate it. Let's call this approximation g'. Does this approximation take into account that more police activity from a higher g will probably also result in a g' that is closer (maybe even further away - we don't know yet) to the original g? Well what I propose is we find out whether or not the distance between g and g' (let's call that distance d) is always the same regardless of how big g is. What I propose as a reasonable hypothesis is that d is moderated by police activity.
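One hedged way to write that hypothesis down is below. The symbol $p$ for police activity and the linear form with an interaction term are assumptions introduced here for illustration, not something taken from the study or stated in this thread.

```latex
d = \lvert g - g' \rvert, \qquad
d = \beta_0 + \beta_1\, g + \beta_2\, p + \beta_3\, (g \cdot p) + \varepsilon
```

Under this sketch, "d is always the same regardless of how big g is" corresponds to $\beta_1 = \beta_3 = 0$, while a moderating role for police activity would show up as $\beta_3 \neq 0$.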

Please be so kind as to NOT give me another analysis of what factors contribute to police activity, because I feel like I am talking to a toddler. What I really wanted was people coming up with interesting study designs or recommendations for papers that tackled similar ideas. I don't want to conduct the study myself but a few hours ago I was still interested in talking about it. I should have researched the topic myself - it would have saved some time, but I would probably have missed the opportunity to talk to other scientists who might criticise my approach or my idea in a well thought out manner.

>Areas with higher homicides will most probably have higher rates of other crime.

Let's have a little quiz, shall we? Why is this information redundant to me?

Yes, because it talks not about what I was talking about but rather about a general positive correlation between g and g'. Well done.

Give me a break.


i_have_thick_loads t1_ivk68nx wrote

Yes; you continue claiming I didn't understand your point that police presence mediates a lower actual-minus-documented crime gap in low income urban settings, but this is unlikely. Homicide rates are measurement invariant, and because there's a positive manifold for criminality, you should be able to extract theoretical actual crime rates from homicide rates plus a few other hopefully somewhat orthogonal input variables (and measurement invariant ones, such as stolen vehicles reported to insurance companies or law enforcement?). The gap between theoretical crime and documented crime would give you the evidence for which regions have the highest crime gaps, and whether crime gap variance is associated with law enforcement presence variance to establish an unlikely hypothesis.
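The check proposed above can be sketched in Python, assuming the theoretical crime rates have already been estimated somehow (that estimation step is not shown). Every number below is invented for illustration and the function names are hypothetical; this is not data from the study or from any real region.

```python
from math import sqrt


def crime_gaps(theoretical, documented):
    """Per-region gap between theoretical and documented crime rates."""
    return [t - d for t, d in zip(theoretical, documented)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Invented per-region values, one entry per region.
theoretical = [10.0, 20.0, 30.0, 40.0]   # assumed estimate from homicide rate etc.
documented = [4.0, 10.0, 18.0, 28.0]     # crime recorded by police
police_presence = [1.0, 2.0, 3.0, 4.0]   # some measure of police activity

gaps = crime_gaps(theoretical, documented)
r = pearson(gaps, police_presence)
print(f"gaps per region: {gaps}")
print(f"correlation between gap and police presence: {r:.3f}")
```

A correlation near zero on real data would count against the idea that police presence drives the gap; a strong one would be worth a closer look, though correlation alone would not settle the direction of the effect.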


carpeson t1_ivl15ep wrote

Great input. I still don't believe that's the whole picture. In most of Europe fewer weapons have a strong positive correlation with fewer homicides, so we can't use this metric to encompass most of the crime spectrum. Car theft is much more common and can definitely be used as another way to approximate g. This still doesn't include drug trade and prostitution. Most of the time such cases of organized crime have their own para-justice systems in place, where allowing one crime doesn't automatically mean you allowed every crime (most notably homicides, which are a big no-go even in communities where crime is normalized).

I also believe we are working with a moderating force, not a mediating one, but that's beside the point and might be quite a high-level criticism.

There is still much to be discovered in this field. Looking forward to some new discoveries.


grahamster00 t1_ivlggqi wrote

I'd imagine this has something to do with sampling bias; the Facebook users tend to be spread out across the country while crime, especially violent crime, in the US tends to be centered in urban areas, so using "Local arrest rates" as opposed to "National arrest rates" when you're examining national crime stories is a bit misleading, no?