Income Inequality and Violent Crime Rates Across U.S. States
How to set up your research design, operationalize your variables, write testable hypotheses, run and interpret bivariate and multiple regressions, and structure each section of a political science research paper on this topic — without getting lost in the data.
The topic sounds clear enough — income inequality and violent crime across U.S. states. But once you sit down with the data and a blank document, questions pile up fast. What variables do you actually use? How do you write a hypothesis that isn’t just a restatement of the obvious? What does a regression output mean in plain English? And how do you connect the numbers back to theory without just describing tables? This guide walks through each of those problems section by section.
Why This Research Question Works
The central question — what factors explain variation in violent crime rates across U.S. states? — is well-formed for a political science research paper. It’s specific enough to be answerable with data, and broad enough to allow competing explanations. That’s the sweet spot.
Crime policy is politically contested. Some camps argue enforcement is the answer. Others say it’s structural — poverty, inequality, lack of opportunity. A paper that tests these competing claims using actual state-level data is doing exactly what quantitative political science research should do: putting theory against evidence.
It Has Multiple Explanations
Income inequality, poverty rate, and police spending each represent a different theoretical tradition. Testing all three lets you compare explanatory power directly — which is analytically interesting, not just descriptive.
The Data Exists and Is Public
FBI UCR, U.S. Census Bureau, Bureau of Justice Statistics — all the data you need is freely accessible. You’re not inventing measures or relying on surveys you designed yourself, which keeps validity high for this type of paper.
It Has Real Policy Stakes
The answer matters. If inequality drives crime more than policing does, that has direct budget implications. A paper that connects empirical findings to policy recommendations shows you understand why political science research matters.
Choosing Your Research Design
This paper uses a cross-sectional ecological research design. That sounds technical, but it just means you have one observation per state at a fixed point in time, and you’re drawing on aggregate data (state-level averages) rather than individual-level survey responses.
A cross-sectional design has two important limits you need to name in your paper. First, you can’t establish causal ordering. Your independent variables (poverty, inequality, police spending) and your dependent variable (violent crime rate) are all measured around the same time period — which means you can observe association, not cause and effect. Say that explicitly. Second, because you’re using aggregate data, your findings apply to states as units, not to individuals within states. This is called the ecological level of analysis, and it means you can’t make claims about why individual people commit crimes.
It’s common in these papers to have the dependent variable (violent crime) measured in one year, say 2020, while independent variables come from other years (poverty estimates from 2024, police spending from 2021). That creates a temporal mismatch: a predictor measured after the outcome cannot logically have produced it, so even a coefficient with the expected sign cannot be read as causal. Address this directly in your methods or limitations section rather than hoping the reader doesn’t notice.
Variables: What to Use and Why
Every variable in your paper needs a conceptual definition (what it means) and an operational definition (how you’re actually measuring it). Don’t skip this. Markers look for it, and it demonstrates that you understand the difference between a theoretical concept and its real-world indicator.
| Variable | Role | Concept | Operational Measure |
|---|---|---|---|
| Violent crime rate | Dependent variable | Serious interpersonal crimes involving force or the threat of force | Average violent crime rate per 100,000 residents, from FBI UCR (murder, rape, robbery, aggravated assault) |
| Income inequality | Independent variable 1 | The degree to which income is distributed unevenly within a state’s population | Gini coefficient (0–1 scale; 0 = perfect equality, 1 = perfect inequality), typically from U.S. Census Bureau |
| Poverty rate | Independent variable 2 | The proportion of the population experiencing material deprivation | Percentage of population below the federal poverty line, from Census Bureau SAIPE estimates |
| Police spending per capita | Independent variable 3 | The level of law enforcement resources available in a state relative to its population | State and local police protection expenditure divided by population, from Bureau of Justice Statistics or Census of Governments |
One thing worth explaining clearly is the Gini coefficient. It’s not immediately intuitive. A value of 0 means every person in the state earns exactly the same income. A value of 1 means one person earns everything. In practice, U.S. states range from around 0.42 to 0.52. The higher the number, the more unequal the distribution. You should explain this in your methods section so your reader doesn’t have to guess what “higher values mean more inequality” looks like in practice.
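To make the 0-to-1 scale concrete, here is a minimal sketch that computes a Gini coefficient from a list of individual incomes. The function name `gini` and the ten-person income lists are hypothetical and purely illustrative. The Census Bureau computes the published state values from ACS data, so in your paper you would use table B19083 rather than calculating anything yourself.

```python
def gini(incomes):
    """Gini coefficient for a list of non-negative incomes."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n,
    # where rank runs from 1 (lowest income) to n (highest income).
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical ten-person populations, for illustration only.
everyone_equal = gini([50_000] * 10)      # all incomes identical
one_earner = gini([0] * 9 + [500_000])    # one person earns everything

print(f"equal incomes:     {everyone_equal:.2f}")
print(f"one earner of ten: {one_earner:.2f}")
```

With only ten people, the one-earner case gives 0.90 rather than exactly 1, because the maximum for a finite group of n people is 1 − 1/n. The value approaches 1 as the population grows.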
Always use the crime rate per 100,000 residents, not the raw count of crimes. California and Wyoming each contribute a single observation to your dataset, but California has nearly 70 times the population. Comparing raw crime totals would be meaningless. The rate normalizes for population size and makes cross-state comparisons valid. The FBI UCR provides the rate directly, so you don’t have to calculate it yourself.
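If you want to see what the normalization does, the arithmetic is a one-liner. The sketch below uses made-up counts and populations for two hypothetical states, and the function name `rate_per_100k` is an assumption for illustration.

```python
def rate_per_100k(crime_count, population):
    """Convert a raw crime count into a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crime_count / population * 100_000


# Hypothetical figures, not real state data.
large_state = rate_per_100k(crime_count=170_000, population=39_000_000)
small_state = rate_per_100k(crime_count=1_300, population=580_000)

print(f"large state: {large_state:.1f} per 100,000")
print(f"small state: {small_state:.1f} per 100,000")
```

The large state has over a hundred times as many recorded crimes, but its rate is only about twice the small state’s. The rate is the figure that belongs in a cross-state regression.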
Where the Data Comes From
Use federal sources. They’re transparent, documented, and methodologically consistent across states. Here’s where each variable lives:
FBI Uniform Crime Reporting Program
Publishes annual state-level violent crime rates. The UCR collects data voluntarily from law enforcement agencies. Note that not every agency reports every year, and the FBI publishes estimates for non-reporting agencies. See ucr.fbi.gov — the 2020 data is under Crime in the U.S. 2020.
U.S. Census Bureau — American Community Survey
The ACS publishes Gini coefficients at the state level annually. Access through data.census.gov using table B19083. These are survey-based estimates, and the ACS publishes margins of error alongside point estimates.
Census Bureau SAIPE Program
Small Area Income and Poverty Estimates combines survey data, administrative records, and population estimates to produce annual state poverty rates. More reliable than single-year ACS estimates at the state level. Available at census.gov/programs-surveys/saipe.
Bureau of Justice Statistics / Census of Governments
State and local government police protection expenditure is published by the BJS in its Justice Expenditure and Employment series. The Census of Governments provides a complementary source. Per capita figures require dividing by Census population estimates.
Document Every Variable’s Source Year
Create a table in your methods section showing the variable name, source, and year for each measure. If your DV is 2020 and your IVs are 2021–2024, say so. Explain why those years are used and acknowledge the temporal mismatch.
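One way to keep the source years straight is to record them in a small structure and let a script flag the gaps. This is only a sketch: the years for crime, poverty, and police spending follow the example used earlier in this guide, while the Gini year is a hypothetical placeholder you would replace with the year you actually downloaded.

```python
# Years for crime, poverty, and police spending follow the guide's example;
# the Gini year (2022) is a hypothetical placeholder.
VARIABLES = [
    {"name": "Violent crime rate", "role": "DV",
     "source": "FBI UCR", "year": 2020},
    {"name": "Gini coefficient", "role": "IV",
     "source": "Census Bureau ACS", "year": 2022},
    {"name": "Poverty rate", "role": "IV",
     "source": "Census Bureau SAIPE", "year": 2024},
    {"name": "Police spending per capita", "role": "IV",
     "source": "BJS / Census of Governments", "year": 2021},
]


def year_gaps(variables):
    """Return each IV's measurement year minus the DV's measurement year."""
    dv_year = next(v["year"] for v in variables if v["role"] == "DV")
    return {v["name"]: v["year"] - dv_year
            for v in variables if v["role"] == "IV"}


def source_table(variables):
    """Build a plain-text variable / source / year table."""
    header = f"{'Variable':<28}{'Source':<30}{'Year':>6}"
    lines = [header, "-" * len(header)]
    for v in variables:
        lines.append(f"{v['name']:<28}{v['source']:<30}{v['year']:>6}")
    return "\n".join(lines)


print(source_table(VARIABLES))
for name, gap in year_gaps(VARIABLES).items():
    if gap != 0:
        print(f"Note: {name} is measured {gap:+d} year(s) relative to the DV")
```

Any predictor with a positive gap was measured after the outcome, which is exactly the mismatch your limitations section needs to name.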
SPSS, Stata, or R
Most undergraduate POSC courses use SPSS. Set up your dataset with one row per state, columns for each variable. Run descriptive statistics first. Then bivariate regressions for each independent variable separately, then a single multiple regression with all three predictors.
Writing Your Hypotheses
A hypothesis is a testable, directional claim about the relationship between two variables. It’s not a question. It’s not a statement of possibility (“might be related to”). It says: when X goes up, Y goes up (or down) — and it should be grounded in a specific theory.
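For this paper, you have three independent variables, so you need three hypotheses, one for each. Each hypothesis should name the independent variable, the dependent variable, the expected direction of the relationship, and the theory that predicts it. The variable-to-theory pairings are as follows.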
Income inequality → relative deprivation theory (Zhuang et al., 2025; also classic work by Runciman and Merton). Poverty → strain theory. Police spending → deterrence theory. You don’t need to write a dissertation on each theory — a sentence or two explaining the causal mechanism is enough. But the theory has to be there. A hypothesis without a theory is just a guess.
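The variable-to-theory pairings can be written down as a small data structure, which also makes it easy to check later whether each regression result matches its prediction. This is a sketch under assumptions: the class name `Hypothesis`, the helper names, and the verdict wording are hypothetical, and the expected signs (positive for inequality and poverty, negative for police spending) are the directions the three theories imply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    label: str
    predictor: str
    theory: str
    expected_sign: int  # +1 = predicts higher crime, -1 = predicts lower crime


HYPOTHESES = [
    Hypothesis("H1", "income inequality", "relative deprivation theory", +1),
    Hypothesis("H2", "poverty rates", "strain theory", +1),
    Hypothesis("H3", "police spending per capita", "deterrence theory", -1),
]


def statement(h):
    """Write the hypothesis as a directional, testable sentence."""
    direction = "higher" if h.expected_sign > 0 else "lower"
    return (f"{h.label}: States with higher {h.predictor} will have "
            f"{direction} violent crime rates ({h.theory}).")


def verdict(h, coefficient, p_value, alpha=0.05):
    """Compare an estimated slope and p-value with the predicted direction."""
    if p_value >= alpha:
        return "not supported"
    same_direction = (coefficient > 0) == (h.expected_sign > 0)
    return "supported" if same_direction else "significant, but opposite to prediction"


for h in HYPOTHESES:
    print(statement(h))
```

Calling `verdict(HYPOTHESES[2], 0.01, 0.898)` returns `"not supported"`, which is a statement about this sample and design, not proof that police spending has no effect.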
How to Structure the Literature Review
The literature review isn’t a summary of articles you found. It’s an argument for why your hypotheses are reasonable predictions. Each section should correspond to one of your three explanations, walk through what the existing research says about that factor, and close with a statement of why you expect the relationship to hold (or not) in your dataset.
Economic Inequality and Crime
This is where you bring in research showing the link between Gini coefficients or income inequality measures and crime rates. Classic work in this area includes studies showing that societies with greater income gaps tend to exhibit higher rates of interpersonal violence — not just poverty, but the distance between the rich and the poor. The relative deprivation framework, developed by researchers like Runciman and extended in criminology, argues that it’s the perception of unfairness — not absolute poverty — that generates frustration and social tension. Cite 2–3 empirical studies, not just theoretical pieces, and note where findings are consistent or contested.
What to avoid: Don’t just list studies. Synthesize. “Multiple studies find a positive relationship between inequality and crime; however, the strength of this relationship varies by crime type and country context.” That’s synthesis. “Smith (2020) found X. Jones (2019) found Y.” That’s a summary, and it won’t earn full marks.
Poverty and Structural Disadvantage
This section covers the evidence that concentrated poverty — especially in geographic clusters within states — predicts higher crime rates. The strain theory tradition (Merton, later Agnew’s general strain theory) argues that crime emerges when people lack legitimate means to achieve socially valued goals. Research by scholars like Ulmer, Harris, and Steffensmeier (2012) in Social Science Quarterly documents the relationship between structural disadvantage and crime across racial and ethnic lines, noting that socioeconomic context is a stronger predictor than enforcement intensity in many models. Connect this directly to your second hypothesis.
Watch the overlap: Inequality and poverty are related but distinct. A state can be relatively equal and still have high poverty (everyone is poor but equally so). Or a state can have a low poverty rate but high inequality (a small wealthy elite and a large middle class). Your multiple regression is what separates their independent effects. Mention this upfront so the reader understands why you’re testing both.
Police Spending and Deterrence
This section covers the deterrence literature. The core argument is that visible enforcement increases the perceived risk of punishment, which should suppress crime. More spending means more officers, faster response times, and higher arrest rates — theoretically. But the empirical literature here is messy. Some studies find a significant crime-reducing effect of police presence; others find weak or null effects when controlling for socioeconomic conditions. Set this up honestly — if the evidence is mixed, say so. That actually strengthens your paper because it sets up your own findings against realistic expectations.
Don’t cherry-pick: If the literature is divided, represent both sides. A literature review that only cites studies supporting your hypothesis looks one-sided, and markers who know the field will notice. Acknowledging contradictory evidence and explaining why you still predict your hypothesized direction is more analytically sophisticated than ignoring it.
Writing the Methods and Data Section
The methods section has one job: tell the reader exactly what you did and why, in enough detail that someone else could replicate your study. For a cross-sectional regression paper, this section should cover your research design, unit of analysis, data sources, variable definitions, and your analytical approach.
State Your Research Design and Unit of Analysis
Name the design explicitly: cross-sectional ecological design. State the unit of analysis: U.S. states and jurisdictions (N = 52 if you include D.C. and Puerto Rico). Explain why this design is appropriate for your research question — comparing state-level outcomes across a single time period is consistent with the question of what factors explain variation across states.
Define Each Variable — Conceptually and Operationally
For each variable, give the concept (what it represents theoretically), the operational measure (how it’s actually quantified in your dataset), the source, and the year. Include a descriptive statistics table: N, mean, median, standard deviation, min, max, and a note on the distribution shape. If a variable is right-skewed, say so — it matters for how you interpret outliers and model sensitivity.
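If you want to check the figures in your descriptive statistics table outside SPSS, the sketch below computes them with Python’s standard library. The poverty rates are invented, with one deliberately extreme value, and the function name `describe` is hypothetical. The skewness here is the simple moment-based version; statistical packages may apply a small-sample adjustment, so their value can differ slightly.

```python
import statistics


def describe(values):
    """N, mean, median, SD, min, max, and moment-based skewness."""
    n = len(values)
    mean = statistics.fmean(values)
    pop_sd = statistics.pstdev(values)  # population SD, used only for skewness
    skew = sum((v - mean) ** 3 for v in values) / n / pop_sd ** 3
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD, as reported in most tables
        "min": min(values),
        "max": max(values),
        "skew": skew,
    }


# Hypothetical poverty rates (%), with one unusually high jurisdiction.
poverty = [9.8, 10.5, 11.2, 12.0, 12.4, 13.1, 14.0, 15.3, 18.7, 40.5]
summary = describe(poverty)

for key, value in summary.items():
    print(f"{key:>6}: {value:.2f}")
```

A mean well above the median together with positive skewness is the signature of a right-skewed variable. That is the cue to mention the outlier in your methods section before it appears as an influential observation in the regression.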
State Your Analytical Approach
Tell the reader you used bivariate linear regression to test each hypothesis individually, and then a multiple regression model to test all three predictors simultaneously. Explain why: bivariate regressions show the raw association between each predictor and the outcome; the multiple regression isolates each variable’s independent effect while holding the others constant. Briefly mention the statistical threshold you’re using — typically p < .05 as the cutoff for statistical significance.
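It can help to write the full model out once in your methods section. One way to express it, with variable names chosen here for readability rather than taken from any particular dataset, is:

```latex
\[
\text{ViolentCrime}_i = \beta_0
  + \beta_1\,\text{Gini}_i
  + \beta_2\,\text{Poverty}_i
  + \beta_3\,\text{PoliceSpending}_i
  + \varepsilon_i,
\qquad i = 1, \dots, N
\]
```

Here $i$ indexes states, $\beta_0$ is the intercept, and $\varepsilon_i$ is the error term. Each bivariate model keeps only one of the three predictors. Under the three theories, the expected signs are $\beta_1 > 0$, $\beta_2 > 0$, and $\beta_3 < 0$.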
Acknowledge the Limitations Upfront
Name the key limitations of your design — temporal mismatch between variable years, inability to establish causal order, ecological fallacy risk (state-level patterns may not reflect individual behavior), voluntary reporting to UCR, and any missing values in your dataset. This doesn’t weaken your paper. Acknowledging limitations shows methodological maturity and protects your claims from being overstated.
Running and Interpreting Regressions
SPSS generates a lot of output. Here’s what actually matters for a political science research paper at the undergraduate or early graduate level — and what to say about it.
For each bivariate regression, you need: the unstandardized slope coefficient (B), its standard error, the t-score, the p-value, and the R-squared. The slope tells you the direction and magnitude of the relationship — for every one-unit increase in X, how much does Y change? The t-score and p-value together tell you whether that relationship is statistically significant. The R-squared tells you what percentage of the variation in violent crime is explained by that single predictor alone.
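To see where those numbers come from, here is a minimal hand-rolled bivariate regression. The eight data points are invented and the function name `simple_ols` is hypothetical. It is a sketch of the textbook least-squares formulas, not a replacement for your SPSS output. It stops at the t-score: turning t into a p-value requires the t distribution, which your statistics package handles (with 52 observations and 50 degrees of freedom, the two-tailed .05 critical value is about 2.01).

```python
import math


def simple_ols(x, y):
    """Least-squares fit of y = intercept + B * x, with SE, t, and R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return {
        "B": slope,
        "intercept": intercept,
        "SE": se_slope,
        "t": slope / se_slope,
        "R2": 1 - sse / sst,
        "df": n - 2,
    }


# Hypothetical data: poverty rate (%) and violent crime rate per 100,000.
poverty = [9.0, 10.5, 11.0, 12.5, 13.0, 14.5, 16.0, 18.0]
crime = [250.0, 310.0, 280.0, 390.0, 360.0, 450.0, 430.0, 520.0]

fit = simple_ols(poverty, crime)
print(f"B = {fit['B']:.2f}, SE = {fit['SE']:.2f}, "
      f"t = {fit['t']:.2f}, R2 = {fit['R2']:.3f}, df = {fit['df']}")
```

Read B as “for each one-point increase in the poverty rate, the violent crime rate changes by B per 100,000 on average in this sample,” and R2 as the share of cross-state variation accounted for by that one predictor.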
In the multiple regression, each coefficient now represents the relationship between that predictor and the outcome controlling for the other two predictors. Watch for two things: which variables remain statistically significant when controls are added, and whether the coefficients shift in size compared to the bivariate models. If a coefficient shrinks substantially, it means that variable was partly capturing variance that actually belongs to one of the other predictors. The adjusted R-squared gives you the model’s overall explanatory power, corrected for the number of predictors.
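The idea of “controlling for the other predictors” can be sketched by solving the least-squares normal equations directly. Everything below is illustrative: the eight hypothetical states, the function names `solve` and `multiple_ols`, and the data values are assumptions. A real analysis would rely on SPSS, Stata, or R, which also report standard errors and p-values.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_ols(predictors, y):
    """Fit y on an intercept plus every predictor; return B's and fit measures."""
    names = list(predictors)
    n, k = len(y), len(names)
    rows = [[1.0] + [predictors[name][i] for name in names] for i in range(n)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k + 1)]
           for a in range(k + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
    return {"coefficients": dict(zip(["intercept"] + names, beta)),
            "r2": r2, "adj_r2": adj_r2}


# Hypothetical values for eight states.
data = {
    "gini": [0.43, 0.45, 0.44, 0.47, 0.46, 0.49, 0.48, 0.51],
    "poverty": [9.0, 10.5, 11.0, 12.5, 13.0, 14.5, 16.0, 18.0],
    "police": [310.0, 280.0, 350.0, 330.0, 400.0, 360.0, 420.0, 390.0],
}
crime = [250.0, 310.0, 280.0, 390.0, 360.0, 450.0, 430.0, 520.0]

model = multiple_ols(data, crime)
for name, b in model["coefficients"].items():
    print(f"{name:>10}: {b:.3f}")
print(f"R2 = {model['r2']:.3f}, adjusted R2 = {model['adj_r2']:.3f}")
```

Comparing each coefficient here with the slope from a one-predictor fit on the same data shows the shift described above. A predictor whose coefficient shrinks once the others enter was partly standing in for them.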
A p-value below .05 means a relationship at least this strong would be unlikely to appear in your sample if no relationship actually existed. That’s statistical significance. It doesn’t tell you the relationship is large or practically important. With 52 observations, even modest relationships can reach significance. Always report the coefficient alongside the p-value so the reader can judge both. “The slope was positive and statistically significant (B = 37.69, p < .001)” is more informative than “the result was significant.”
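If you are assembling many result sentences, a small helper keeps the reporting format consistent. This is a sketch with hypothetical function names, and it assumes the common convention of dropping the leading zero and writing very small values as “p < .001”. Check your course’s style guide before adopting it.

```python
def format_p(p):
    """Format a p-value without a leading zero; very small values as 'p < .001'."""
    if not 0 <= p <= 1:
        raise ValueError("p-value must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    text = f"{p:.3f}"
    if text.startswith("0"):
        text = text[1:]  # "0.898" -> ".898"
    return f"p = {text}"


def report_slope(b, p):
    """Report the coefficient and p-value together, as the guide recommends."""
    return f"B = {b:.2f}, {format_p(p)}"


print(report_slope(37.69, 0.0004))
print(report_slope(0.01, 0.898))
```

The exact p-value of 0.0004 in the first call is invented for illustration; the source example only reports that the value fell below .001.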
If police spending doesn’t reach p < .05 in your multiple regression, that doesn’t mean police spending has no effect on crime. It means you didn’t find a statistically significant relationship in this particular dataset with this research design. That’s a bounded claim. Say exactly that in your results and discussion — don’t overstate the null result in either direction.
The Coefficients table gives you B (unstandardized slope), Std. Error, Beta (standardized slope), t, and Sig. (p-value). For comparing predictors within the same model, look at Beta — the standardized coefficient — because it’s on a common scale. A Beta of .47 on police spending and -.015 on the Gini coefficient in the same model tells you police spending is doing more work in that model than inequality, regardless of the raw scale differences. Report both B and Beta, and explain what each one is.
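The difference between B and Beta is easiest to see by changing the units of a predictor. The sketch below uses the standard conversion Beta = B × (SD of X) / (SD of Y) on invented data; the function names are hypothetical. Rescaling police spending from dollars to thousands of dollars multiplies B by 1,000 but leaves Beta untouched.

```python
import statistics


def slope(x, y):
    """Unstandardized bivariate slope B."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def standardized_beta(b, x, y):
    """Convert B to Beta: the slope in standard-deviation units."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data: police spending per capita and violent crime rate.
spending_dollars = [310.0, 280.0, 350.0, 330.0, 400.0, 360.0, 420.0, 390.0]
spending_thousands = [v / 1000 for v in spending_dollars]
crime_rate = [250.0, 310.0, 280.0, 390.0, 360.0, 450.0, 430.0, 520.0]

b_dollars = slope(spending_dollars, crime_rate)
b_thousands = slope(spending_thousands, crime_rate)
beta_dollars = standardized_beta(b_dollars, spending_dollars, crime_rate)
beta_thousands = standardized_beta(b_thousands, spending_thousands, crime_rate)

print(f"B (dollars):      {b_dollars:.4f}   Beta: {beta_dollars:.3f}")
print(f"B (thousands):    {b_thousands:.4f}   Beta: {beta_thousands:.3f}")
```

This is why Beta is the fairer yardstick for comparing a predictor measured in dollars with one measured on a 0–1 scale, while B remains the number to quote when you describe the effect in real units.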
Writing Up Your Results
The results section reports what you found — not what it means. That comes in the discussion or conclusion. Keep these two things separate. Mixing interpretation into your results section makes the paper harder to read and signals that you’re not fully in control of the structure.
What Belongs in the Results Section
- A brief statement of what analysis you ran
- The key output from each bivariate regression: direction, significance, R-squared
- A summary statement about whether each hypothesis is supported, partially supported, or not supported
- The multiple regression results: which predictors are significant, what the adjusted R-squared is
- Reference to your SPSS output tables — include them as figures or appendices
- A note on any unexpected findings — if the direction was opposite to what you predicted, say so
What Does Not Belong in Results
- Explanations of why the findings make sense — that’s discussion
- Policy recommendations — that’s conclusion
- New literature or theory — that belongs in your lit review or discussion
- Speculative statements about why a variable wasn’t significant
- Causal language (“X causes Y”) — you have a cross-sectional design and cannot establish causality
- Raw SPSS output dumps without explanation — always translate what the table shows into plain English
Walk through your results in the same order as your hypotheses — bivariate regression 1 (inequality), bivariate regression 2 (poverty), bivariate regression 3 (police spending), then the multiple regression model. That structure makes it easy for your reader to track what you found and connect it back to the predictions you made earlier.
The Conclusion: What It Should Actually Do
The conclusion isn’t a summary. Don’t open with “in this paper I examined…” — the reader just read the paper. The conclusion should do three specific things: state which hypothesis received the strongest support and why, connect the findings back to the theoretical frameworks from your literature review, and draw policy implications without overclaiming.
Which Hypothesis Won — and Why That Matters
If income inequality consistently shows the strongest relationship with violent crime — in both bivariate and multiple regression — that’s your lead finding. Say it clearly. Then explain what that tells us theoretically: relative deprivation and the erosion of social cohesion appear to be more consistent predictors of state-level crime than absolute poverty or enforcement resources. That’s a substantive claim backed by your data.
Connect Back to the Theoretical Debate
Your paper opened with a debate — enforcement versus structural factors. Your results say something about that debate. If socioeconomic variables outperform police spending in your models, that’s evidence that structural explanations have more empirical traction in this dataset. Tie this back to the literature you cited. This is how your paper contributes to the broader conversation rather than just describing your 52 observations.
Policy Implications — With Appropriate Hedging
This is where you can say something substantive without overclaiming. “These results suggest that policies addressing income inequality and poverty — through tax policy, social safety net programs, and targeted economic development — may be at least as important as law enforcement investment in reducing violent crime at the state level. However, the associative nature of this design limits causal claims.” That sentence does a lot of work in a short space: it draws a meaningful implication, grounds it in your findings, and acknowledges the design limit.
Mistakes That Cost Marks
These come up repeatedly in papers on this type of topic.
Using Raw Crime Counts Instead of Rates
Comparing states by total number of violent crimes ignores population size. Always use the rate per 100,000. A state with 100 crimes and 1,000 residents has a vastly different situation than a state with 100 crimes and 10 million residents.
Use the UCR Rate Variable Directly
The FBI UCR publishes the violent crime rate per 100,000 — use that as your dependent variable. Document in your methods that you’re using a population-normalized rate and explain why this is appropriate for cross-state comparison.
Calling Associations “Causes”
“Higher inequality causes higher crime” is a causal claim your cross-sectional design cannot support. You can only say the variables are associated, or that higher inequality is linked to / predicts / is positively correlated with higher crime rates.
Use Associative Language Throughout
Replace “causes,” “leads to,” and “results in” with “is associated with,” “predicts,” “is positively correlated with,” or “is linked to.” This is accurate given your design — and markers who know methods will flag causal language as an error.
Treating a Non-Significant p-Value as Proof of No Relationship
If police spending isn’t significant (p > .05), that doesn’t prove spending has no effect on crime. It means you didn’t find a significant relationship in this sample with this design. These are very different statements.
Frame Non-Significant Results Accurately
“The relationship between police spending and violent crime was not statistically significant in this model (p = .898), suggesting that policing resources alone may not be the primary driver of cross-state variation when socioeconomic factors are controlled.” That’s accurate and analytical.
Skipping Descriptive Statistics
Jumping straight to regression without first describing your data is a structural error. Descriptive stats show you understand your variables before testing them — distribution shape, outliers, and range all matter for interpreting regression results.
Include a Descriptive Statistics Table
N, mean, median, standard deviation, min, max, and a note on skewness for each variable. Note any missing values. This section also lets you flag potential outliers — like a jurisdiction with an unusually high poverty rate — before they show up as influential observations in your regression.
Political science research papers are heavily scrutinized for citation integrity. Every empirical claim borrowed from a published source — a statistic, a finding, a theoretical argument — needs a citation. Paraphrasing without attribution is still plagiarism. And if you’re working from existing datasets or published results sections, make absolutely sure your own writing and analysis are original. For a full guide to citation practice and avoiding plagiarism in academic papers, see Citing Sources and Avoiding Plagiarism: What Every Student Needs to Know.
The Bigger Picture
A paper on income inequality and violent crime is ultimately a paper about whether the way a society distributes its resources shapes whether people in it harm each other. That’s not a dry statistics exercise — it’s a real question with real stakes for how governments spend money, design policy, and prioritize between enforcement and welfare.
Your job isn’t to solve the debate. It’s to test three explanations against state-level data, report honestly what you found, connect it back to theory, and be clear about what your design can and can’t tell you. That’s what political science research at this level is supposed to do.
Get the structure right, cite everything properly, and don’t overclaim your findings. A paper that honestly reports a weak result for police spending is more valuable than one that shoehorns an expected outcome out of data that doesn’t support it.