Introduction

A Rust library, with FFI bindings, for fast conversion between WGS84 longitude and latitude and British National Grid (EPSG:27700) coordinates. Conversions use a standard 7-element Helmert transform, with OSTN15 corrections added for accuracy.

Motivation

Python and similar dynamic languages are relatively slow for this kind of work. These conversions are usually carried out in bulk, so an order-of-magnitude improvement via FFI saves both time and energy.
convertbng is an example Python wheel which uses this binary via ctypes and Cython.

Accuracy

Conversions that rely solely on a Helmert transform are accurate only to within around 5 metres, which makes them unsuitable for calculations or conversions used in e.g. surveying. This library therefore applies the OSTN15 transform, which adjusts for local variation within the Terrestrial Reference Frame by incorporating OSTN15 data. See here for more information.
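To make the Helmert step concrete, here is a minimal sketch of a 7-parameter transform applied to Cartesian (x, y, z) coordinates. It is not taken from this library's source: the struct, constant, and function names are hypothetical, and the parameter values are the WGS84/ETRS89 to OSGB36 figures as commonly published by Ordnance Survey, so check them against the OS guide before relying on them. The OSTN15 correction is not shown.

```rust
/// Seven Helmert parameters: three translations (metres), one scale
/// change (parts per million) and three rotations (arc-seconds).
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug)]
pub struct Helmert {
    pub tx: f64,
    pub ty: f64,
    pub tz: f64,
    pub s_ppm: f64,
    pub rx_sec: f64,
    pub ry_sec: f64,
    pub rz_sec: f64,
}

/// WGS84/ETRS89 -> OSGB36 parameters (assumed from the OS guide,
/// not copied from lonlat_bng).
pub const WGS84_TO_OSGB36: Helmert = Helmert {
    tx: -446.448,
    ty: 125.157,
    tz: -542.060,
    s_ppm: 20.4894,
    rx_sec: -0.1502,
    ry_sec: -0.2470,
    rz_sec: -0.8421,
};

/// Apply the small-angle Helmert transform to one Cartesian point.
pub fn helmert(p: &Helmert, (x, y, z): (f64, f64, f64)) -> (f64, f64, f64) {
    // Rotations are given in arc-seconds; convert them to radians.
    let sec_to_rad = std::f64::consts::PI / (180.0 * 3600.0);
    let rx = p.rx_sec * sec_to_rad;
    let ry = p.ry_sec * sec_to_rad;
    let rz = p.rz_sec * sec_to_rad;
    // Scale is given in ppm; 1 + s is the diagonal of the matrix.
    let s1 = 1.0 + p.s_ppm * 1e-6;
    (
        p.tx + s1 * x - rz * y + ry * z,
        p.ty + rz * x + s1 * y - rx * z,
        p.tz - ry * x + rx * y + s1 * z,
    )
}
```

A full conversion would first turn longitude and latitude into Cartesian coordinates on the source ellipsoid, apply this transform, and then project the result; only the middle step is sketched here.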

Library Use

As a Rust Library

Add the following to your Cargo.toml (the latest version is displayed on the fourth badge at the top of this page):

lonlat_bng = "x.x.x"

Full library documentation is available here

Note that lon, lat coordinates outside the UK bounding box will be transformed to (NAN, NAN), which cannot be mapped.
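The out-of-bounds behaviour can be sketched as a guard around the conversion, plus a filter on the caller's side. Everything here is an assumption made for illustration: the bounds are approximate figures for the EPSG:27700 area of use rather than values read from the library, and `guard`, `in_uk_bounds` and `drop_unmapped` are hypothetical helpers, not part of the crate's API.

```rust
// Approximate UK bounding box (assumed, illustrative values).
const MIN_LON: f64 = -7.56;
const MAX_LON: f64 = 1.78;
const MIN_LAT: f64 = 49.96;
const MAX_LAT: f64 = 60.84;

/// True if the point lies inside the assumed bounding box.
/// NaN inputs fail both range checks and so count as outside.
pub fn in_uk_bounds(lon: f64, lat: f64) -> bool {
    (MIN_LON..=MAX_LON).contains(&lon) && (MIN_LAT..=MAX_LAT).contains(&lat)
}

/// Run `convert` only for in-bounds points; otherwise return (NAN, NAN).
pub fn guard<F>(lon: f64, lat: f64, convert: F) -> (f64, f64)
where
    F: Fn(f64, f64) -> (f64, f64),
{
    if in_uk_bounds(lon, lat) {
        convert(lon, lat)
    } else {
        (f64::NAN, f64::NAN)
    }
}

/// Caller-side clean-up: keep only the pairs that were actually mapped.
pub fn drop_unmapped(points: &[(f64, f64)]) -> Vec<(f64, f64)> {
    points
        .iter()
        .copied()
        .filter(|(e, n)| !e.is_nan() && !n.is_nan())
        .collect()
}
```

Because NaN never compares equal to itself, callers should test results with `is_nan()` rather than with `==`.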

The functions exposed by the library can be found here

FFI

The FFI C-compatible functions exposed by the library are:
convert_to_bng_threaded(Array, Array) -> Array
convert_to_lonlat_threaded(Array, Array) -> Array

convert_to_osgb36_threaded(Array, Array) -> Array
convert_to_etrs89_threaded(Array, Array) -> Array
convert_osgb36_to_ll_threaded(Array, Array) -> Array
convert_etrs89_to_ll_threaded(Array, Array) -> Array

convert_etrs89_to_osgb36_threaded(Array, Array) -> Array
convert_osgb36_to_etrs89_threaded(Array, Array) -> Array

convert_epsg3857_to_wgs84_threaded(Array, Array) -> Array

FFI and Memory Management

The library does not allocate memory using new vectors or arrays; the longitude and latitude arrays you pass to it via FFI are converted into mutable slices (an inherently unsafe operation), then mutated in-place before being passed back across the FFI boundary as C-compatible arrays. Thus, the calling code retains ownership of the allocated memory at all times – it is up to the calling program to ensure that the data passed to lonlat_bng live long enough, and are correctly freed (in practice, they will be freed automatically if using a dynamic language).
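The in-place pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the `Array` layout (a data pointer plus a length), the `ArrayPair` return type, and the function `offset_in_place_demo` are all hypothetical, and the arithmetic is a stand-in for a real coordinate conversion. A real exported function would also need a no-mangle attribute so that C callers can find the symbol.

```rust
use std::os::raw::c_void;
use std::slice;

/// Assumed C-compatible array: a pointer to f64 data and its length.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Array {
    pub data: *mut c_void,
    pub len: usize,
}

/// Hypothetical return type handing both arrays back to the caller.
#[repr(C)]
pub struct ArrayPair {
    pub first: Array,
    pub second: Array,
}

/// Borrow a caller-owned buffer as an `Array` without copying it.
pub fn as_array(values: &mut [f64]) -> Array {
    Array {
        data: values.as_mut_ptr() as *mut c_void,
        len: values.len(),
    }
}

/// Mutate both buffers in place and return them.
///
/// # Safety
/// Each `data` pointer must be non-null, aligned, and valid for `len`
/// f64 values for the whole call, and the two buffers must not overlap.
pub unsafe extern "C" fn offset_in_place_demo(xs: Array, ys: Array) -> ArrayPair {
    // Reinterpret the raw pointers as mutable slices: no allocation,
    // no copy; the caller still owns the memory.
    let x = unsafe { slice::from_raw_parts_mut(xs.data as *mut f64, xs.len) };
    let y = unsafe { slice::from_raw_parts_mut(ys.data as *mut f64, ys.len) };
    for (a, b) in x.iter_mut().zip(y.iter_mut()) {
        // Stand-in for the real conversion.
        *a += 1.0;
        *b -= 1.0;
    }
    // Hand back the same pointers; nothing needs freeing on this side.
    ArrayPair { first: xs, second: ys }
}
```

Since the returned pointers are the ones the caller passed in, the caller frees them in the usual way for its own language and must not free them twice.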

Building the Shared Library

Running cargo build --release will build an artefact called liblonlat_bng.dylib on OSX, and liblonlat_bng.a on *nix systems. Note that you'll have to generate liblonlat_bng.so for *nix hosts using the following steps:

ar -x target/release/liblonlat_bng.a
gcc -shared *.o -o target/release/liblonlat_bng.so -lrt

As a Python Package

convertbng is available from PyPI for OSX, Windows, and *nix:
pip install convertbng
More information is available in its repository

Benchmark

A cProfile benchmark was run, comparing 50 runs of converting 1 million random (lon, lat) pairs held in NumPy arrays.

Methodology

Four Amazon EC2 C4 (compute-optimised) systems were tested.
Each system was first calibrated by taking the mean of five calibration runs of 100,000 repeats.
A benchmark program was then run for each of the three configurations (Rust via ctypes, Rust via Cython, and Pyproj). See the benches directory for details.
The five slowest function calls for each benchmark were then displayed.

Results

EC2 Instance Type	Processors (vCPU)	Rust Ctypes (s)	Rust Cython (s)	Pyproj (s)	Ctypes vs Pyproj	Cython vs Pyproj
c4.xlarge	4	14.782	11.714	9.379	58.36%	24.97%
c4.2xlarge	8	8.647	6.421	9.256	-6.57%	-30.62%
c4.4xlarge	16	6.470	3.716	9.398	-31.49%	-60.25%
c4.8xlarge	36	4.913	2.501	9.308	-48.05%	-73.35%

Conclusion

Rust is faster than PROJ.4 on an 8-CPU system – even using ctypes – and outperforms it by greater margins as the number of CPUs increases: at 36 CPUs, Rust + Cython is over 3.7x faster.

Comparing Crossbeam and Rayon

This comparison uses cargo bench to measure how varying the number of threads and the weight affects overall speed.
On both 2- and 8-core i7 machines, running convert_bng_threaded_vec with one thread per core gives optimum performance, whereas Rayon does a good job of choosing its own optimum weight.
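The "one thread per core" strategy can be sketched with the standard library's scoped threads. This is an illustration under assumptions, not the library's implementation: it uses `std::thread::scope` in place of Crossbeam or Rayon, and `convert_chunked` and `core_count` are hypothetical names. The closure `f` stands in for whatever per-point conversion is being benchmarked.

```rust
use std::thread;

/// Number of threads to use: one per available core, or 1 if unknown.
pub fn core_count() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Split both slices into `threads` roughly equal chunks and convert
/// each chunk on its own thread, writing the results back in place.
pub fn convert_chunked<F>(xs: &mut [f64], ys: &mut [f64], threads: usize, f: F)
where
    F: Fn(f64, f64) -> (f64, f64) + Sync,
{
    assert_eq!(xs.len(), ys.len(), "coordinate slices must match in length");
    if xs.is_empty() {
        return;
    }
    let threads = threads.max(1);
    // Ceiling division so that at most `threads` chunks are produced.
    let chunk = (xs.len() + threads - 1) / threads;
    let f = &f;
    thread::scope(|scope| {
        for (cx, cy) in xs.chunks_mut(chunk).zip(ys.chunks_mut(chunk)) {
            // Each thread gets exclusive access to its own chunk pair.
            scope.spawn(move || {
                for (x, y) in cx.iter_mut().zip(cy.iter_mut()) {
                    let (nx, ny) = f(*x, *y);
                    *x = nx;
                    *y = ny;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

Calling `convert_chunked(&mut lons, &mut lats, core_count(), convert)` corresponds to the one-thread-per-core setting; passing other thread counts is a simple way to reproduce the kind of comparison described above.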

License

MIT