Analysis of Harvey-related TCEQ emissions reports

This repository contains data and Python code used to analyze emissions reports submitted by industrial facilities to the Texas Commission on Environmental Quality's Air Emission Event Reporting Database.

Please see the related article for additional context.

Data

Inputs

The main inputs are the TCEQ emissions reports, scraped from the commission's database. We started with report number 265500 and incremented the report number until we could find no more reports. The raw report pages, as HTML files, are available in the inputs/scraped-reports folder.
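The incrementing approach described above can be sketched roughly as follows. This is not the repository's actual scraper: the `fetch` callable is a hypothetical stand-in for whatever requests a report page, and stopping after a fixed number of consecutive missing reports is an assumed way of deciding that there are "no more reports".

```python
from typing import Callable, Dict, Optional


def scrape_reports(
    fetch: Callable[[int], Optional[str]],
    start: int = 265500,
    max_consecutive_misses: int = 50,
) -> Dict[int, str]:
    """Collect report pages by counting up from `start`.

    `fetch` is a hypothetical callable that returns the HTML for a report
    number, or None when no such report exists. The stopping rule (a run of
    `max_consecutive_misses` missing reports) is an assumption, not the
    repository's documented behavior.
    """
    pages: Dict[int, str] = {}
    misses = 0
    number = start
    while misses < max_consecutive_misses:
        html = fetch(number)
        if html is None:
            # No report at this number; count it toward the stopping rule.
            misses += 1
        else:
            pages[number] = html
            misses = 0  # a hit resets the run of misses
        number += 1
    return pages
```

In practice, `fetch` would make an HTTP request to the commission's database and the resulting pages would be written to disk as HTML files; both steps are left out here.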

We also created a text file, disaster-declaration-counties.txt, listing the 54 counties that Gov. Greg Abbott included on the State Disaster Declaration through Aug. 27.

Finally, the file reports-to-ignore.txt lists five emissions reports that either (a) predated Harvey's landfall and didn't clearly indicate a connection to the storm, or (b) appeared to duplicate previous reports.
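Both text files can be read into sets for quick membership checks. The sketch below assumes each file holds one entry per line; that layout, and the helper name `load_entries`, are assumptions rather than something this README specifies.

```python
from pathlib import Path
from typing import Set


def load_entries(path: str) -> Set[str]:
    """Read one entry per line, skipping blank lines (assumed file layout)."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}
```

For example, `load_entries("disaster-declaration-counties.txt")` would return the county names as a set, provided the file follows the assumed layout.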

Outputs

In the 00-parse-reports notebook, we extract structured data from the raw HTML reports, and save it to two files:

In the 01-analyze-reports notebook, we analyze the data extracted from the reports, limiting the findings to reports (a) in the 54 counties above, (b) indicating an event-beginning date of August 23 or later, and (c) of the type "AIR SHUTDOWN" or "EMISSIONS EVENT". The main results can be found in these two files:
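As an illustration of the three filters (a) through (c) above, here is a minimal sketch over plain dictionaries. The field names (`report_id`, `county`, `event_began`, `event_type`) are hypothetical and may not match the parsed data, the year 2017 is assumed from the timing of Harvey, and skipping the reports listed in reports-to-ignore.txt at this step is also an assumption.

```python
from datetime import date
from typing import Dict, FrozenSet, Iterable, List, Set, Union

# Report types named in the README.
HARVEY_TYPES = {"AIR SHUTDOWN", "EMISSIONS EVENT"}

# "August 23 or later"; the year is an assumption based on Harvey's timing.
EARLIEST_START = date(2017, 8, 23)


def filter_reports(
    reports: Iterable[Dict],
    counties: Set[str],
    ignored: Union[Set[int], FrozenSet[int]] = frozenset(),
) -> List[Dict]:
    """Keep reports that pass the county, date, and type filters.

    Each report is a dict with hypothetical keys: report_id, county,
    event_began (a datetime.date), and event_type.
    """
    return [
        r
        for r in reports
        if r["county"] in counties              # (a) disaster-declaration county
        and r["event_began"] >= EARLIEST_START  # (b) began Aug. 23 or later
        and r["event_type"] in HARVEY_TYPES     # (c) shutdown or emissions event
        and r["report_id"] not in ignored       # assumed: drop ignored reports
    ]
```

County names must match the spelling and capitalization used in the counties file for the membership test to work.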

Reproducibility

To reproduce the findings, you'll need to do the following:

Feedback / Questions?

Contact Jeremy Singer-Vine at jeremy.singer-vine@buzzfeed.com.

Looking for more from BuzzFeed News? Click here for a list of our open-sourced projects, data, and code.