Gentrification and demographic analysis — 2000 to 2017

This repository contains data, analytic code, and findings that support portions of the BuzzFeed News article, “These 11 Maps Show How Black People Have Been Driven Out Of Neighborhoods In Five Of The Most Gentrified US Cities,” published February 27, 2020. Please read that article, which contains important context and details, before proceeding.

Data

The data used in this analysis come from three sources: US Census Bureau, censusreporter.org, and Logan et al.’s Longitudinal Tract Data Base.

Data from the US Census Bureau

The analysis uses two Census datasets, described below.

American Community Survey results for 2013–2017

The analysis uses data from the American Community Survey’s 2013–2017 estimates, the most recent five-year demographic estimates available from the Census Bureau.

BuzzFeed News downloaded this data from the Census Bureau’s API. (The Python code used to do so can be found in this repository’s 01-download-census-data.ipynb notebook.)
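As a rough illustration of what a tract-level request to the Census API looks like, here is a minimal sketch. It is not the code from the notebook. The helper names (`build_tract_query`, `rows_to_records`, `fetch_tracts`) are hypothetical, and the variable `B19013_001E` (median household income) and the San Francisco County FIPS codes are only examples, not necessarily what the analysis requested.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL for the 2013-2017 ACS five-year estimates.
ACS5_2017_URL = "https://api.census.gov/data/2017/acs/acs5"


def build_tract_query(variables, state_fips, county_fips, key=None):
    """Build a URL requesting `variables` for every tract in one county."""
    params = {
        "get": ",".join(["NAME"] + list(variables)),
        "for": "tract:*",
        "in": f"state:{state_fips} county:{county_fips}",
    }
    if key:  # an API key is optional for low request volumes
        params["key"] = key
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return f"{ACS5_2017_URL}?{urlencode(params, safe=':*,')}"


def rows_to_records(payload):
    """Convert the API's list-of-lists response (header row first) to dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def fetch_tracts(variables, state_fips, county_fips, key=None):
    """Download and parse one county's tracts (requires network access)."""
    url = build_tract_query(variables, state_fips, county_fips, key)
    with urlopen(url) as response:
        return rows_to_records(json.load(response))
```

Calling `fetch_tracts(["B19013_001E"], "06", "075")` would request one row per tract in San Francisco County, California. Values come back as strings, so numeric columns need converting before analysis.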

BuzzFeed News downloaded the data for every tract in every county in the following Metropolitan Statistical Areas (MSA):

The results can be found in output/census_tracts.csv. For each tract in the MSA, that dataset includes the following variables:

Census tract shapefiles

BuzzFeed News also downloaded shapefiles detailing the geographic boundaries of Census tracts in California, Georgia, Maryland, New York, and the District of Columbia from the Census Bureau’s website. These files have been saved in data/censusTracts/states/.

Data from censusreporter.org

BuzzFeed News used Census Reporter to obtain a list of Census tracts that intersect with the official Census boundaries of the five cities to be analyzed:

These files have been saved in data/city_tracts/.

Data from Logan et al.’s Longitudinal Tract Data Base

Every decade, the Census Bureau updates some of its tract boundaries, based on population increases and decreases. To make tract-level Census data from the 2000s and 2010s comparable, Logan et al. created the Longitudinal Tract Data Base (LTDB). BuzzFeed News used this dataset to obtain demographic estimates for the year 2000, and to link them to the tract-level data from the 2013–2017 American Community Survey.
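The linking step amounts to a join on a shared tract identifier. The sketch below shows the idea with plain dictionaries. It is an illustration, not the notebook’s code: `link_tracts`, the `tract_id` key, and the `_2000` / `_2017` suffixes are all hypothetical names.

```python
def link_tracts(ltdb_2000, acs_2017, key="tract_id"):
    """Inner-join two lists of row dicts on a shared tract identifier.

    Columns other than the key are suffixed with the period they come from,
    so the 2000 and 2013-2017 values can sit side by side in one row.
    """
    acs_by_id = {row[key]: row for row in acs_2017}
    linked = []
    for row in ltdb_2000:
        match = acs_by_id.get(row[key])
        if match is None:
            continue  # tract has no counterpart in the ACS data; skip it
        merged = {key: row[key]}
        merged.update({f"{k}_2000": v for k, v in row.items() if k != key})
        merged.update({f"{k}_2017": v for k, v in match.items() if k != key})
        linked.append(merged)
    return linked


# Tract IDs are kept as strings so that leading zeros are preserved.
ltdb_rows = [{"tract_id": "06075010100", "population": 3000}]
acs_rows = [{"tract_id": "06075010100", "population": 3500}]
linked_rows = link_tracts(ltdb_rows, acs_rows)
```

Keeping identifiers as strings matters here: reading a tract ID such as `06075010100` as an integer drops the leading zero and the join silently fails to match.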

Due to republishing limitations, the LTDB files are not included in this repository, but can be downloaded in full from the project’s website. To replicate the analysis, follow these steps:

The LTDB’s data dictionary can be found here.

Gentrification Methodology

BuzzFeed News’ analysis uses a methodology devised by Governing Magazine (which in turn is similar to the definition from a Columbia University study). The methodology focuses on median income, median home value, and educational attainment metrics.

The methodology consists of the following two tests, as described by Governing Magazine:

Test 1: Does the tract qualify for gentrification?

A tract qualifies for potential gentrification if it meets all three of the following criteria at the beginning of the study period (in this case, the year 2000):

Test 2: Has it gentrified?

A tract is considered to have gentrified if it passes the test above and also meets these three additional criteria at the end of the study period (in this case, in the 2013–2017 ACS results):
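The two-stage structure of the methodology can be sketched as follows. This is only an outline of the logic, not the notebook’s implementation: `classify_tract` and its labels are hypothetical, and each criterion is assumed to have been evaluated already and passed in as a boolean, so no thresholds are encoded here.

```python
def classify_tract(start_criteria, end_criteria):
    """Apply the two tests in order.

    start_criteria: booleans for the Test 1 criteria (start of study period).
    end_criteria:   booleans for the Test 2 criteria (end of study period).
    """
    if not all(start_criteria):
        return "not eligible"  # fails Test 1, so Test 2 is never applied
    if all(end_criteria):
        return "gentrified"  # passes both tests
    return "eligible, did not gentrify"


# A tract that met every criterion in 2000 but only two of three by 2013-2017.
label = classify_tract([True, True, True], [True, False, True])
```

Returning three labels rather than a single yes/no keeps tracts that were never eligible separate from tracts that were eligible but did not gentrify.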

Data analysis

The data analysis was performed in the following Jupyter notebook, using the Python programming language.

02-gentrification_measure_and_race_changes_2017.ipynb

The Python code for BuzzFeed News’ analysis, implementing the methodology above, can be found in the 02-analyze-gentrification-and-demographic-changes.ipynb notebook. The notebook additionally calculates percentage-point changes for six non-overlapping race/ethnicity groups.
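A percentage-point change is the difference between a group’s share of the tract population at the end of the period and its share at the start. A minimal sketch, with hypothetical function names and made-up counts:

```python
def share(group_count, total):
    """Return the group's share of the total as a percentage, or None if total is 0."""
    if not total:
        return None
    return 100.0 * group_count / total


def percentage_point_change(count_start, total_start, count_end, total_end):
    """Difference between end-of-period and start-of-period shares."""
    start = share(count_start, total_start)
    end = share(count_end, total_end)
    if start is None or end is None:
        return None  # share is undefined for a tract with no population
    return end - start


# Made-up example: a group goes from 600 of 2,000 residents (30%)
# to 450 of 3,000 residents (15%).
change = percentage_point_change(600, 2000, 450, 3000)  # -15.0
```

In this example the share falls by 15 percentage points. That is not the same as a 15 percent decline: relative to the starting share of 30%, the share was cut in half.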

The notebook produces the following files:

Licensing

All code in this repository is available under the MIT License. All data files in the output/ directory are available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. All data files in the data/ directory are available, under their own terms, from the sources described above.

Feedback / Questions?

Contact Lam Thuy Vo at lam.vo@buzzfeed.com and Lo Bénichou from Mapbox at lo.benichou@mapbox.com.

Looking for more from BuzzFeed News? Click here for a list of our open-sourced projects, data, and code.