Analyzing Racial Disparities in Fatal Police Shootings: A Call for Comprehensive Reform

Gaining insights into this pressing issue requires a comprehensive analysis of data spanning from 2015 to 2023. The information indicates a generally stable pattern in the monthly occurrence of events, albeit with minor fluctuations. This demonstrates the enduring nature of the problem over time.

White individuals, who constitute a significant portion of the U.S. population, are involved in approximately 50.89% of fatal police shootings. The data about the black community is particularly alarming: despite representing only about 13% of the U.S. population, black individuals account for a disproportionately high 27.23% of fatal police shootings. Hispanics follow, making up around 17.98% of such incidents, while Asians, Native Americans, and other racial groups are represented in smaller percentages, at 1.99%, 1.62%, and 0.29%, respectively.

This information underscores the critical need for deeper exploration and potential enhancements in policing practices, especially considering the stark disparities in how different racial groups are affected.
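The disproportion described above can be made concrete by dividing a group's share of shootings by its share of the population. The sketch below is only illustrative: the function name `representation_ratio` is made up for this example, and it uses just the two figures quoted in the text (27.23% and about 13%).

```python
def representation_ratio(share_of_shootings: float, share_of_population: float) -> float:
    """Return how many times over- or under-represented a group is.

    Both arguments are percentages. A result above 1 means the group appears
    in the shootings data more often than its population share would suggest.
    """
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_shootings / share_of_population


# Figures quoted in the text: 27.23% of shootings, about 13% of the population.
black_ratio = representation_ratio(27.23, 13.0)
print(f"Representation ratio: {black_ratio:.2f}")  # prints 2.09
```

A ratio of roughly 2 means the group appears in the data about twice as often as its population share alone would predict.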

Solving some flukes

When I dug further into the data, I tried to answer a question I had from the beginning about the age difference between blacks and whites.

The age difference between blacks and whites is approximately 7 years.

Now the question is: can it really be possible, or is this just a fluke?

Now, when we plot a histogram of the ages of blacks and whites from the data, we can see that the means of the two graphs are different.

 

We can see that the shape of the graph deviates from the normal curve, which means the data is not normally distributed.

Now, if we want to know whether the 7-year difference is a fluke, we can try a t-test. But because the data is not normally distributed, the t-test can produce a suspicious p-value.

So, in a case like this, we can use a Monte Carlo method to find the p-value.

Here, we pool the data from both groups, repeatedly draw random samples from the pool, and use them to estimate a p-value.

As a result, I observed that the probability of getting an age difference of 7 years by chance is nearly 0, which means that statistically there was not a single random sample in which the age difference reached 7 years.

So, for the next step, I will try to find a pattern, such as the statistics on whether a person who died was armed, or something similar.
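The pooling-and-resampling idea above can be sketched as a simple permutation test. This is a minimal sketch and not necessarily the exact procedure used in class: the function name, the number of resamples, and the toy ages are all assumptions made for illustration, and the ages are not taken from the shootings data.

```python
import random
from statistics import fmean


def monte_carlo_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate how often a random split of the pooled data gives a
    difference in means at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pool = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        # Shuffle the pool and split it into two groups of the original sizes.
        rng.shuffle(pool)
        diff = abs(fmean(pool[:n_a]) - fmean(pool[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_resamples


# Hypothetical toy ages, NOT taken from the shootings data set.
ages_a = [22, 25, 27, 30, 31, 33, 35, 38]
ages_b = [29, 34, 36, 40, 41, 45, 47, 52]
p = monte_carlo_p_value(ages_a, ages_b, n_resamples=2_000)
print(f"Estimated p-value: {p:.4f}")
```

If almost no shuffled split reproduces a gap as large as the observed one, the estimated p-value is close to 0 and the difference is unlikely to be a fluke.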

“Uncovering Patterns: Exploring the Relationship Between Gun Violence, Geography, and Gang Influence in the United States”

Today in class we learned some new functions and a whole new library, geopy.

What geopy does is help us locate geographical locations on the globe using third-party geocoders.

This could be helpful for plotting a geomap based on data like this.
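As a rough illustration of how geopy can turn place names into coordinates for a map, here is a minimal sketch. It assumes the Nominatim geocoder, which is only one of the services geopy supports. The helper names `make_nominatim_geocoder` and `locate` and the `user_agent` string are made up for this example.

```python
def make_nominatim_geocoder(user_agent="police-shootings-blog-example"):
    """Build a rate-limited geocoding function backed by Nominatim.

    geopy is imported here, not at module level, so that the helper below can
    be tried with a stand-in geocoder when geopy is not installed.
    """
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # Nominatim's public service asks for at most one request per second.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)


def locate(places, geocode):
    """Map each place name to (latitude, longitude), or None if not found."""
    coordinates = {}
    for place in places:
        location = geocode(place)
        coordinates[place] = (
            (location.latitude, location.longitude) if location else None
        )
    return coordinates
```

With network access, a call such as `locate(["Houston, TX", "Los Angeles, CA"], make_nominatim_geocoder())` would return a dictionary of coordinates that a plotting library can place on a map. The two city names are only examples.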

What we can observe here is that the number of shootings in coastal areas, such as the East Coast and West Coast, is very large.

There can be several reasons for this, such as population, crime rate, etc.

One thing I suspected is that the number of guns in a state could also affect the shootings data, but on this map we can see that Texas does not have as many locations as states on the West Coast. This observation may change with further analysis, because it can be affected by population: in Texas alone there are two hotspots, perhaps because more people live there. I think that digging further into this can answer some of these questions.

The second thing I want to know is whether there is any gang influence in the areas where the density of shootings is high. This would answer two of my questions: first, whether any gang influence is present, and second, if it is, how much it influences the youth, which could explain the early deaths in some races.

Finding answers

So today, to find the answers to the questions from the previous post, I analyzed the data even further.

First, I tried to find some type of pattern: is there a particular community at the top, and is there a significant difference between that community and the others?

Before answering this question, note that the blog in the Washington Post, which can be found here, states that the number of deaths in the black community is significant and that there is some sort of racial discrimination by the police. But when I analyzed the data, I found that the number of white people who died is approximately twice that of the black community, and it also differs significantly from the other communities.

Here two things can be observed:

  1. New data on deaths is continuously being added to the data set, which can alter the findings.
  2. From a different point of view, the number of deaths in black communities is higher when compared with their population.

Now, one more question I face is: are there any negative operations present in this data, or is this just a normal finding?

In the upcoming days, I will try to find the answers to these questions and develop stronger findings.

The second project’s first discussion!

Today we discussed the next project that we must do. For this project, we got some data on police shootings, which records how many deaths there were and the race of each person who died.

So initially if we look at the data, we find that the rate of death by police shooting of black people is significantly higher than that of white people.

Why is that? This is one of the answers we will try to give with our analysis.

So today in class when we were discussing this project I asked the professor a question,

What are the initial questions we are trying to answer here?

To this, he explained that we must find the questions by looking at the data. By this he meant that when we initially don’t know what to do, we can simply perform some basic commands on the data and look at it.

So, after I had done that, I observed the following:

  1. The total number of deaths is highest among whites, compared with all other races.
  2. Although the total number of deaths is higher among whites, if we look at the deaths as a ratio of population, we find that blacks have a significantly higher number.
  3. Deaths at an early age are higher among blacks, followed by Hispanics.
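The kind of basic first look described above can be sketched with nothing more than the Python standard library. The records below are hypothetical toy rows, and the field names `race` and `age` are assumed. The real data set may use different columns and codes.

```python
from collections import Counter, defaultdict
from statistics import fmean

# Hypothetical toy records; the field names "race" and "age" are assumed.
records = [
    {"race": "W", "age": 45},
    {"race": "W", "age": 38},
    {"race": "W", "age": 41},
    {"race": "B", "age": 27},
    {"race": "B", "age": 33},
    {"race": "H", "age": 30},
]

# How many deaths are recorded for each race?
deaths_by_race = Counter(r["race"] for r in records)

# What is the mean age at death for each race?
ages_by_race = defaultdict(list)
for r in records:
    ages_by_race[r["race"]].append(r["age"])
mean_age_by_race = {race: fmean(ages) for race, ages in ages_by_race.items()}

print(deaths_by_race.most_common())
print(mean_age_by_race)
```

Simple counts and group means like these are often enough to surface the first questions worth asking of a data set.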

This raises several questions:

Why is there a significant difference in the deaths between races?

What are the reasons that the age of death for the black community is so low?

Is this just a fluke, or is the recorded data accurate?

We will try to answer these questions in this project.

Report day 1

Today we started to write our report and to do so we must do a series of tasks.

  1. Collecting all the data and findings from all the other colleagues.
  2. Addressing the issue for the report.
  3. Concluding the report based on our findings.
  4. Presenting the results to the person we are addressing in the report.

Today we are focusing more on the first two steps.

The first thing is collecting all the data and findings from others on this project, like the different approaches they have taken and the results of those approaches.

But the most important aspect of this report is the issue.

What is the issue?

Why are we doing this?

What are we trying to find with this project?

So, for today, these are the two things we will tackle. After that, we will move to the next step of this project report.

The article “Navigating the Challenge of Predicting Diabetes: Defining Goals and Measuring Success”

After trying to find the relation between age and diabetes, I found that it is simply not possible, because if we combine the states we end up with even less data for our analysis.

So, I tried other things to get some good results with my model, like adjusting the data to perform weighted least squares (WLS).
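For reference, a weighted least squares fit with a single predictor can be written in a few lines. This is a minimal sketch and not the model used in the project: the function name, the toy numbers, and the choice of weights are all assumptions made for illustration.

```python
def weighted_least_squares(x, y, weights):
    """Fit y ~ intercept + slope * x, minimising sum(w_i * residual_i**2)."""
    total_w = sum(weights)
    # Weighted means of the predictor and the response.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    # Weighted covariance and variance (up to a common factor).
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy values lying exactly on y = 1 + 2x; the weights are arbitrary.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
w = [1.0, 2.0, 3.0, 4.0]
intercept, slope = weighted_least_squares(x, y, w)
print(intercept, slope)
```

The weights let more reliable observations count for more in the fit. With equal weights this reduces to ordinary least squares.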

But even with all this, I am still not able to get a good result out of this model.

This forced me back to square one and left me with a big question.

What exactly are we trying to do?

Are we trying to make a model to predict diabetes from inactivity or obesity? Or are we trying to report back to the CDC on whether it can be predicted or not?

If we are trying to make a model, then we have not achieved any success so far. And how can we decide what accuracy we are looking for?

This is a very stupid but very important question for me because it changes the final report so much.

And on the topic of the final report, we are starting to write it in parallel, because we have to submit it this coming Monday.