<h1 align="center"><p> <img src="https://github.com/GeostatsGuy/GeostatsPy/blob/master/geostatspy_logo.png?raw=true" width="200" height="200" /> </p></h1>

<h1 align="center">GeostatsPyDemos: GeostatsPy Python Package for Spatial Data Analytics and Geostatistics Demonstration Workflows Repository (0.0.1)</h1> <h3 align="center">Approximately 40 Well-Documented Spatial Data Analytics and Geostatistics Workflows with the GeostatsPy Package!</h3>

It is challenging to learn a new Python package. For me, great examples of common workflows are critical. So I built over 40 well-documented demonstration workflows that apply GeostatsPy to common spatial modeling tasks, to support my students in my Data Analytics and Geostatistics, Spatial Data Analytics and Machine Learning courses and anyone else learning data analytics and machine learning.

Michael Pyrcz, Professor, The University of Texas at Austin, Data Analytics, Geostatistics and Machine Learning

Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn


Cite As:

Pyrcz, M.J., 2024, GeostatsPyDemos: GeostatsPy Python Package for Spatial Data Analytics and Geostatistics Demonstration Workflows Repository (0.0.1). Zenodo. https://zenodo.org/doi/10.5281/zenodo.12667035


Recent Updates

Here are some highlights from recent updates:

What's New with Version 0.0.1

I spent quite a bit of time checking, updating and improving all of the workflows.

I'm quite happy with the current state. I feel that this set of well-documented workflows for spatial data analytics and geostatistics in Python with GeostatsPy now lives up to its goal - to launch anyone into building spatial data analytics and geostatistics workflows with GeostatsPy! I'm stoked to help out, Michael


Setup

A minimum environment includes:

The required datasets are available in the GeoDataSets repository and linked in the workflows.

Repository Summary

More than 40 well-documented demonstration workflows for common geostatistical tasks with GeostatsPy.

Common geostatistical workflows that are included:


Installing GeostatsPy

First, if you haven't installed GeostatsPy, here's the GitHub repository: GeostatsPy GitHub. GeostatsPy is also available on the Python Package Index (PyPI): GeostatsPy PyPI.

To install GeostatsPy, use pip:

pip install geostatspy

GeostatsPy Package Dependencies

The functions rely on the following packages:

  1. numpy - for ndarrays
  2. pandas - for DataFrames
  3. numpy.linalg - for linear algebra
  4. numba - for numerical speed up
  5. scipy - for fast nearest neighbor search
  6. matplotlib.pyplot - for plotting
  7. tqdm - for progress bar
  8. statsmodels - for weighted (debiased) statistics

These packages should be available with any modern Python distribution (e.g. https://www.anaconda.com/download/).

If you get a package import error, you may have to first install some of these packages. This can usually be accomplished by opening up a command window on Windows and then typing 'python -m pip install [package-name]'. More assistance is available with the respective package docs.
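Before running a workflow, it can help to confirm that the packages listed above are present. The following is a minimal sketch that uses only the Python standard library; the `REQUIRED` list and the `missing_packages` helper are hypothetical names introduced here for illustration, not part of GeostatsPy.

```python
import importlib.util

# Top-level import names taken from the dependency list above
# (numpy.linalg and matplotlib.pyplot ship with numpy and matplotlib).
REQUIRED = ["numpy", "pandas", "numba", "scipy", "matplotlib", "tqdm", "statsmodels"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages: " + ", ".join(missing))
    print("Try: python -m pip install " + " ".join(missing))
else:
    print("All GeostatsPy dependencies were found.")
```

The check only looks packages up and does not import them, so it runs quickly even when numba or statsmodels are installed.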

GeostatsPyDemos Repository Author:

Michael Pyrcz, Professor, The University of Texas at Austin

Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions

With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development.

For more about Michael check out these links:

Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn

Want to Work Together?

I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.

I'm always happy to discuss,

Michael

Michael Pyrcz, Ph.D., P.Eng. Professor, Cockrell School of Engineering and The Jackson School of Geosciences, The University of Texas at Austin

More Resources Available at: Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn


Information about the GeostatsPy Python Package for Spatial Data Analytics and Geostatistics

The GeostatsPy Package brings GSLIB: Geostatistical Library (Deutsch and Journel, 1998) functions to Python. GSLIB is a practical and extremely robust set of code for building spatial modeling workflows.

I created the GeostatsPy Package to support my students in my Data Analytics, Geostatistics and Machine Learning courses. I find my students benefit from hands-on opportunities. In fact, it is hard to imagine teaching these topics without providing the opportunity to handle the numerical methods and build workflows. Last year, I tried to have them use the original FORTRAN executables, and even with support and worked-out examples it was an uphill battle. In addition, all my students and I are now working in Python for our research. Thus, having access to geostatistical methods in Python directly impacts and facilitates the research of my group. This package retains the spirit of GSLIB:

This package contains 2 parts:

  1. geostatspy.geostats includes GSLIB functions rewritten in Python. This currently includes all the variogram, distribution transformation, and spatial estimation and simulation methods. I will continue adding functions to support modeling operations for practical subsurface model construction.

  2. geostatspy.GSLIB includes reimplementations of the GSLIB visualizations and low-tech wrappers of the numerical methods (note: the low-tech wrappers require access to GSLIB executables).
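The two parts described above are imported as separate modules. The sketch below assumes GeostatsPy was installed with pip; the `geostatspy_ready` flag is a hypothetical name used here so the snippet also runs (and reports the problem) when the package is not installed.

```python
# Guarded import of the two GeostatsPy modules described above.
try:
    import geostatspy.GSLIB as GSLIB        # visualizations, utilities and GSLIB wrappers
    import geostatspy.geostats as geostats  # GSLIB methods rewritten in Python
    geostatspy_ready = True
except ImportError as err:
    geostatspy_ready = False
    print(f"GeostatsPy could not be imported: {err}")

print("GeostatsPy available:", geostatspy_ready)
```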

The GeostatsPy Authors

The GeostatsPy package is being developed at The University of Texas in the Texas Center for Geostatistics.

Package Inventory

Here's a list and some details on each of the functions available.

geostatspy.GSLIB Functions

Utilities to support moving between Python DataFrames and ndarrays, Data Tables, Gridded Data and Models in Geo-EAS file format (standard to GSLIB):

  1. ndarray2GSLIB - utility to convert 1D or 2D numpy ndarray to a GSLIB Geo-EAS file for use with GSLIB methods
  2. GSLIB2ndarray - utility to convert GSLIB Geo-EAS files to a 1D or 2D numpy ndarray for use with Python methods
  3. Dataframe2GSLIB(data_file,df) - utility to convert pandas DataFrame to a GSLIB Geo-EAS file for use with GSLIB methods
  4. GSLIB2Dataframe - utility to convert GSLIB Geo-EAS files to a pandas DataFrame for use with Python methods
  5. DataFrame2ndarray - take spatial data from a DataFrame and make a sparse 2D ndarray (NaN where no data in cell)
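For readers new to the Geo-EAS layout that these utilities read and write, a data table file has a title line, a line with the number of columns, one line per column name, and then whitespace-separated data rows. The following is a minimal standard-library sketch of that layout; `write_geoeas` and `read_geoeas` are hypothetical helpers for illustration and are not the GeostatsPy functions listed above, and the sample values are made up.

```python
import io


def write_geoeas(handle, title, columns, rows):
    """Write a table in the Geo-EAS layout: title, column count, names, data."""
    handle.write(f"{title}\n{len(columns)}\n")
    for name in columns:
        handle.write(f"{name}\n")
    for row in rows:
        handle.write(" ".join(str(value) for value in row) + "\n")


def read_geoeas(handle):
    """Read a Geo-EAS table back into (title, column names, rows of floats)."""
    title = handle.readline().strip()
    ncol = int(handle.readline().split()[0])
    columns = [handle.readline().strip() for _ in range(ncol)]
    rows = [[float(v) for v in line.split()] for line in handle if line.strip()]
    return title, columns, rows


# Hypothetical example data: three samples with X, Y and porosity.
buffer = io.StringIO()
write_geoeas(buffer, "example_data", ["X", "Y", "Porosity"],
             [(100.0, 900.0, 0.12), (250.0, 400.0, 0.18), (600.0, 150.0, 0.09)])
buffer.seek(0)
title, columns, rows = read_geoeas(buffer)
print(title, columns, rows[0])
```

An in-memory buffer is used so the example leaves no files behind; an open text file would work the same way.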

Visualization functions with the same parameterization as GSLIB using matplotlib:

  1. pixelplt - reimplementation in Python of GSLIB pixelplt with matplotlib methods
  2. pixelplt_st - reimplementation in Python of GSLIB pixelplt with matplotlib methods with support for subplots
  3. pixelplt_log_st - reimplementation in Python of GSLIB pixelplt with matplotlib methods with support for subplots and log color bar
  4. locpix - pixel plot and location map, reimplementation in Python of a GSLIB MOD with matplotlib methods
  5. locpix_st - pixel plot and location map, reimplementation in Python of a GSLIB MOD with matplotlib methods with support for subplots
  6. locpix_log_st - pixel plot and location map, reimplementation in Python of a GSLIB MOD with matplotlib methods with support for subplots and log color bar
  7. hist - histogram, reimplementation in Python of GSLIB hist with matplotlib methods
  8. hist_st - histogram, reimplementation in Python of GSLIB hist with matplotlib methods with support for subplots

Data transformations

  1. affine - affine distribution transformation to correct feature mean and standard deviation
  2. nscore - normal score transform, wrapper for nscore from GSLIB (GSLIB's nscore.exe must be in working directory)
  3. declus - cell-based declustering, 2D wrapper for declus from GSLIB (GSLIB's declus.exe must be in working directory)
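The affine correction listed above shifts and scales a feature so that it has a target mean and standard deviation while keeping the shape of its distribution: y = (s_target / s) (x - m) + m_target. The sketch below shows the idea with NumPy; `affine_sketch` and its parameter names are assumptions for illustration and may not match the signature of GeostatsPy's `affine`, and the porosity values are made up.

```python
import numpy as np


def affine_sketch(values, target_mean, target_std):
    """Shift and scale values to a target mean and (population) standard deviation."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    return (target_std / std) * (values - mean) + target_mean


porosity = np.array([0.08, 0.10, 0.13, 0.15, 0.19])  # hypothetical data
corrected = affine_sketch(porosity, target_mean=0.12, target_std=0.02)
print(corrected.mean(), corrected.std())
```

Because the transform is linear with a positive slope, the rank order of the data is unchanged.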

Spatial Continuity

  1. make_variogram - make a dictionary of variogram parameters for application with spatial estimation and simulation
  2. gamv - irregularly sampled variogram, 2D wrapper for gamv from GSLIB (.exe must be in working directory)
  3. varmap - regular spaced data, 2D wrapper for varmap from GSLIB (.exe must be in working directory)
  4. varmapv - irregular spaced data, 2D wrapper for varmap from GSLIB (.exe must be in working directory)
  5. vmodel - variogram model, 2D wrapper for vmodel from GSLIB (.exe must be in working directory)
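To make the quantity behind the variogram functions above concrete, the experimental semivariogram at lag h is half the average squared difference over all data pairs separated by approximately h. The following is a simplified, omnidirectional NumPy sketch with a lag tolerance of half the lag distance; it is not the GSLIB gamv algorithm (no azimuth, bandwidth, or standardization options), and `semivariogram_sketch` with its parameters is a hypothetical name. The four data points are made up.

```python
import numpy as np


def semivariogram_sketch(x, y, z, lag_dist, nlag):
    """Omnidirectional experimental semivariogram for irregularly spaced 2D data."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    i, j = np.triu_indices(len(z), k=1)           # every unique data pair
    h = np.hypot(x[i] - x[j], y[i] - y[j])        # pair separation distance
    sq_diff = (z[i] - z[j]) ** 2                  # squared value difference
    bins = np.rint(h / lag_dist).astype(int)      # nearest lag index for each pair
    lags, gamma, npairs = [], [], []
    for k in range(1, nlag + 1):
        in_bin = bins == k
        n = int(in_bin.sum())
        npairs.append(n)
        lags.append(h[in_bin].mean() if n else np.nan)
        gamma.append(0.5 * sq_diff[in_bin].mean() if n else np.nan)
    return np.array(lags), np.array(gamma), np.array(npairs)


# Hypothetical data: four samples along a line, one unit apart.
lags, gamma, npairs = semivariogram_sketch(
    x=[0.0, 1.0, 2.0, 3.0], y=[0.0, 0.0, 0.0, 0.0], z=[1.0, 2.0, 4.0, 3.0],
    lag_dist=1.0, nlag=3)
print(lags, gamma, npairs)
```

Lags with no pairs are returned as NaN, so they can be dropped before plotting or model fitting.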

Spatial Modeling

  1. kb2d - kriging estimation, 2D wrapper for kb2d from GSLIB (GSLIB's kb2d.exe must be in working directory)
  2. sgsim_uncond - sequential Gaussian simulation, 2D unconditional wrapper for sgsim from GSLIB (GSLIB's sgsim.exe must be in working directory)
  3. sgsim - sequential Gaussian simulation, 2D and 3D wrapper for sgsim from GSLIB (GSLIB's sgsim.exe must be in working directory)
  4. cosgsim_uncond - sequential Gaussian simulation, 2D unconditional wrapper for sgsim from GSLIB (GSLIB's sgsim.exe must be in working directory)

Spatial Model Resampling

  1. sample - sample 2D model with provided X and Y and append to DataFrame
  2. gkern - make a Gaussian kernel for convolution, moving window averaging (from Teddy Hartano, Stack Overflow)
  3. regular_sample - extract regular spaced samples from a 2D spatial model
  4. random_sample - extract random samples from a 2D spatial model
  5. DataFrame2ndarray - convert spatial point data in a DataFrame to a sparse ndarray grid

geostatspy.geostats Functions

Numerical methods in GSLIB (Deutsch and Journel, 1998) translated to Python:

  1. correct_trend - correct the order relations of an indicator-based trend model
  2. backtr - GSLIB's backtr function to transform a distribution
  3. declus - GSLIB's DECLUS program reimplemented for cell-based declustering in 2D
  4. gam - GSLIB's GAM program reimplemented for variogram calculation with regular data in 2D
  5. gamv - GSLIB's GAMV program reimplemented for variogram calculation with irregular data in 2D
  6. varmapv - GSLIB's VARMAP program reimplemented for irregularly spaced spatial data in 2D
  7. vmodel - GSLIB's VMODEL program reimplemented for visualization of nested variogram models in 2D
  8. nscore - GSLIB's NSCORE program reimplemented for normal score distribution transformation
  9. kb2d - GSLIB's KB2D program reimplemented for 2D kriging-based spatial estimation
  10. ik2d - GSLIB's IK3D program reimplemented for 2D indicator-based kriging estimation
  11. kb3d - GSLIB's kt3d program reimplemented for 3D kriging-based spatial estimation
  12. sgsim - GSLIB's sgsim program reimplemented for 2D spatial simulation
  13. postsim - GSLIB's postsim program reimplemented for summarizing over multiple realizations
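As an illustration of one of the transforms in the list above, a normal score transform maps each value to the standard normal quantile of its cumulative probability, so the transformed data follow a Gaussian distribution with mean 0 and variance 1. The sketch below assumes equal data weights and the midpoint plotting position (rank + 0.5) / n; `nscore_sketch` is a hypothetical helper, not the GeostatsPy `nscore` function, and it does not handle declustering weights or tied values. The sample values are made up.

```python
from statistics import NormalDist

import numpy as np


def nscore_sketch(values):
    """Map values to standard normal scores by rank (ties are not averaged)."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    ranks = np.argsort(np.argsort(values))   # 0-based rank of each value
    cum_prob = (ranks + 0.5) / n             # cumulative probability in (0, 1)
    normal = NormalDist()                    # mean 0, standard deviation 1
    return np.array([normal.inv_cdf(float(p)) for p in cum_prob])


values = np.array([0.05, 0.30, 0.12, 0.80, 0.21])  # hypothetical data
scores = nscore_sketch(values)
print(scores)
```

The transform preserves rank order, and the median of the data maps to a normal score of zero.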

More functionality will be added soon.