chicago-crimes
Exploring the Chicago crimes dataset with DuckDB, Malloy Data, and, soon, new Panel/PyScript data and dashboard tools.
Data Source
Data from: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2
Note: The Chicago crimes data is too large for a GitHub repository. You can download it from the link above.
Raw Data Views
In VSCode
Raw dataset view in VSCode with Tabular Data Viewer and Rainbow CSV:
In DBeaver
Crimes CSV data imported into DBeaver for comparison:
In Tad Viewer
Crimes CSV data imported into Tad Viewer:
With Polars
Quick Chicago crimes CSV data scan and Arrests query with Polars in a single notebook code cell:
With Polars Parquet
Loading Chicago crimes .parquet data file with polars.read_parquet():
With PyArrow
Loading Chicago crimes raw CSV data with PyArrow CSV:
With PyArrow Feather and Parquet
Writing and reading Chicago crimes PyArrow Table data in Feather and Parquet data file formats:
With DuckDB
Loading Chicago crimes CSV data into a blank in-memory DuckDB instance:
With DuckDB Parquet
Loading Chicago crimes .parquet data via DuckDB read_parquet():
With DuckDB SQLMagic
Loading Chicago crimes CSV data into a blank in-memory DuckDB with ipython-sql SQLMagic in VSCode Jupyter Notebook:
With Malloy Data
Loading Chicago crimes 2022 parquet data with Malloy Data tools via a DuckDB parquet table source. The queries, data schema, Malloy queries outline, data preview, and query results are displayed in the VSCode Malloy extension query editor and views:
With Pandas
Loading Chicago crimes CSV data with Pandas:
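A minimal sketch of the Pandas load followed by a per-type arrest count. The sample rows are made up and the column subset is an assumption; point `csv_path` at the downloaded CSV for the real data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample rows standing in for the downloaded crimes CSV (assumption).
sample_csv = (
    "ID,Primary Type,Arrest,Year\n"
    "1,THEFT,true,2022\n"
    "2,BATTERY,false,2022\n"
    "3,THEFT,false,2021\n"
    "4,NARCOTICS,true,2022\n"
)
csv_path = Path(tempfile.mkdtemp()) / "crimes_sample.csv"
csv_path.write_text(sample_csv)

# usecols keeps memory use down by loading only the columns needed.
crimes = pd.read_csv(csv_path, usecols=["Primary Type", "Arrest", "Year"])

# Summing the boolean Arrest column counts the arrests per primary type.
arrests_by_type = crimes.groupby("Primary Type")["Arrest"].sum()
print(arrests_by_type)
```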
In R Studio
Loading Chicago crimes CSV data with DBI R library and DuckDB R API in R Studio:
With Julia REPL
Reading Chicago crimes CSV data with DuckDB Julia Package in VSCode Julia lang, and running it in Julia REPL:
With Julia CSVFiles and DataFrame
Loading Chicago crimes CSV data via Julia CSVFiles into native Julia DataFrames:
In Emacs
Reading Chicago crimes CSV data with SBCL + cl-duckdb in Emacs + SLY:
Visualizations
A collection of Jupyter notebooks and data apps visualizing the Chicago crimes data described above.
With Matplotlib
Visualizing Chicago crimes data loaded with Pandas using Matplotlib:
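A rough sketch of a bar chart of reports per primary type, assuming the data is already in a Pandas DataFrame. The rows, chart choice, and output file name are made up for illustration and are not taken from the notebooks.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend, so this runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the crimes DataFrame (assumption).
crimes = pd.DataFrame(
    {"Primary Type": ["THEFT", "BATTERY", "THEFT", "NARCOTICS"]}
)

# Number of reports per primary type, most frequent first.
counts = crimes["Primary Type"].value_counts()

fig, ax = plt.subplots(figsize=(6, 4))
counts.plot.bar(ax=ax)
ax.set_xlabel("Primary Type")
ax.set_ylabel("Reports")
ax.set_title("Chicago crime reports by primary type (sample)")
fig.tight_layout()

# Save the chart to a temporary PNG file.
chart_path = Path(tempfile.mkdtemp()) / "crimes_by_type.png"
fig.savefig(chart_path)
```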
With Altair Charts
2001-2022 Chicago crimes data loaded from a parquet file and summarized with Pandas and Altair charts:
With PyScript
2022 Chicago crimes data loaded from a CSV file with data summary Altair charts in a browser, using Pyodide runtime and Pandas:
https://randomfractals.github.io/chicago-crimes/apps/pyscript/
With Malloy Charts
Displaying Chicago crimes 2022 parquet data with Malloy Charts using Malloy Import with table source, measures, and data queries defined in the Malloy Data section above:
With Malloy Composer
View and query 2022 Chicago crime reports data loaded from a parquet file with Malloy Composer app in your browser:
https://randomfractals.github.io/chicago-crimes/apps/malloy-composer/
With Malloy Fiddle
View 2022 Chicago crime reports data, schema, Malloy model, and queries with Malloy Fiddle app in your browser:
https://randomfractals.github.io/chicago-crimes/apps/malloy-fiddle
With DuckDB Sql Tools
Loading and querying 7,687,725 Chicago crime reports, recorded from 2001 through the end of November 2022, from a large 1.68 GB CSV data file with the new VSCode DuckDB Sql Tools extension:
Exporting in-memory DuckDB instance with DuckDB Sql Tools:
Exporting the DuckDB instance in Parquet data format and importing it into a new in-memory test DuckDB instance:
Prior Works
Links to our prior works on Chicago Crimes EDA circa 2017/2018:
🔗 Chicago Crimes EDA Summary on LinkedIn
📚 Chicago Crimes EDA 2017 Jupyter Notebooks
📚 Observable JS Chicago Crimes Notebook Collection 2018
📚 Chicago Homicides Observable JS Notebook Collection 2018
🐦 Chicago Crimes EDA, Data Preview and Tabular Data Viewer tweets