Awesome
lasr
Donny Keighley 6/8/2022
lasr is a package designed for reading and writing Log Ascii Standard (LAS) files in R. Currently it is in the beta testing stages. As such, it is subject to significant ongoing changes and is not complete. For instance it can’t write LAS files yet…
Goals
lasr is primarily designed to import LAS files at high speed and in large batches. To accomplish this, most of it is written in C++ and connected to R with Rcpp. It stores the data in lists of data.table’s for fast manipulation.
Currently, the focus is on supporting reading LAS 3.0 as fully as possible. There will be some effort to handle non-standard LAS files but nothing too cute. lasr is being written to load files using as little information from the header as possible which should alleviate many common issues. Beyond that, the aim is to output helpful error messages so that non-standard files can be fixed. Afterall, one of the great things about LAS files is that they are human readable and can often be fixed using a simple text editor.
lasr is not designed to accomplish traditional petrophysical workflows and there are no plans to do so. There’s plenty of industry software for that. lasr is intended as a building block to facilitate people doing new and creative things with large volumes of log data. It’s also intended to help build up the library of geoscience packages for R.
For a more traditional approach, or if you’d prefer to use Python, you should check out the excellent lasio package.
Installation
You can install lasr from
github using the
install_github
function from the devtools package. You
will need to install
RTools first since the
package needs compilation.
if (!require('devtools')) install.packages('devtools')
library(devtools)
install_github('https://github.com/donald-keighley/lasr')
Currently, the only function is read.las
which will import a vector of
LAS file paths into a multi-part list. Each section of the file is
stored as a separate element. In order to accomodate LAS 3.0 files which
may have multiple log data sections, the log parameter, log definition,
and log data are combined into numbered log elements. If your vector of
paths contains more than one file, the output list will have an element
for each file.
Here is an example reading a single LAS file that is included with the package:
library(lasr)
las = read.las(system.file("extdata", "las_3_cwls.las", package = "lasr"))
#Display the WELL section
head(las$well, 10)
## mnemonic unit value comment format
## 1: STRT M 1660.1250 First Index Value
## 2: STOP M 1660.8750 Last Index Value
## 3: STEP M 0.1250 STEP
## 4: NULL -999.25 NULL VALUE
## 5: COMP ANY OIL COMPANY INC. COMPANY
## 6: WELL ANY ET AL 01-02-03-04 WELL
## 7: FLD WILDCAT FIELD
## 8: LOC 1-2-3-4W5M LOCATION
## 9: SRVC ANY LOGGING COMPANY INC. SERVICE COMPANY
## 10: DATE 13/12/1986 Service DATE DD/MM/YYYY
## association
## 1:
## 2:
## 3:
## 4:
## 5:
## 6:
## 7:
## 8:
## 9:
## 10:
#Display the log curves
head(las$log$log.1$data, 10)
## DEPT DT DPHI NPHI YME CDES NMR[1] NMR[2]
## 1: 1660.125 123.45 0.11 0.37 1.45E+12 DOLOMITE W/VUGS 10 12
## 2: 1660.250 123.45 0.12 0.36 1.47E+12 LIMESTONE 12 15
## 3: 1660.375 123.45 0.13 0.35 2.85E+12 LOST INTERVAL 18 25
## 4: 1660.500 123.45 0.14 0.34 2.85E+12 LOST INTERVAL 18 25
## 5: 1660.625 123.45 0.15 0.33 2.85E+12 LOST INTERVAL 18 25
## 6: 1660.750 123.45 0.16 0.32 2.85E+12 SANDSTONE, SHALE STREAKS 18 25
## 7: 1660.875 123.45 0.17 0.31 2.85E+12 LOST INTERVAL 18 25
## NMR[3] NMR[4] NMR[5]
## 1: 14 18 13
## 2: 21 35 25
## 3: 10 8 17
## 4: 10 8 17
## 5: 10 8 17
## 6: 10 8 17
## 7: 10 8 17
Most LAS files are version 2 and only have one log data section. If you
know this is the case you can set flatten = TRUE
and only the first
log section will be returned. This makes referencing the log data
quicker.
las = read.las(system.file("extdata", "las_3_cwls.las", package = "lasr"), flatten=TRUE)
head(las$log$data, 10)
## DEPT DT DPHI NPHI YME CDES NMR[1] NMR[2]
## 1: 1660.125 123.45 0.11 0.37 1.45E+12 DOLOMITE W/VUGS 10 12
## 2: 1660.250 123.45 0.12 0.36 1.47E+12 LIMESTONE 12 15
## 3: 1660.375 123.45 0.13 0.35 2.85E+12 LOST INTERVAL 18 25
## 4: 1660.500 123.45 0.14 0.34 2.85E+12 LOST INTERVAL 18 25
## 5: 1660.625 123.45 0.15 0.33 2.85E+12 LOST INTERVAL 18 25
## 6: 1660.750 123.45 0.16 0.32 2.85E+12 SANDSTONE, SHALE STREAKS 18 25
## 7: 1660.875 123.45 0.17 0.31 2.85E+12 LOST INTERVAL 18 25
## NMR[3] NMR[4] NMR[5]
## 1: 14 18 13
## 2: 21 35 25
## 3: 10 8 17
## 4: 10 8 17
## 5: 10 8 17
## 6: 10 8 17
## 7: 10 8 17
Speed Test
Since the purpose of this package is to load LAS files as quickly as possible, a speed test is included here with a comparison to python’s lasio. First, download a test dataset from the KGS website. In this case we’re using the 2016 logs data. Download and unzip them into a folder called “C:/temp/logs”, or modify the code for wherever you put it.
Next, import the first 500 files. We’ll use 4 threads for this comparison, although if you have more cores you can increase the number of threads to speed it up further. Only use this option if you are importing more than a handful of files, otherwise the parallel overhead will slow it down.
files = list.files('C:/temp/logs', pattern = '.las?', full.names=TRUE)
start.time = Sys.time()
las = read.las(files[1:500],nthreads=4)
end.time = Sys.time()
time.taken = end.time - start.time
time.taken
## Time difference of 25.96407 secs
Now in Python in parallel using 4 cores:
import lasio, glob, datetime, multiprocessing
from joblib import Parallel, delayed
num_cores = 4
files = glob.glob('C:/temp/logs/*.las')
start_time = datetime.datetime.now()
if __name__ == "__main__":
las = Parallel(n_jobs=num_cores)(delayed(lasio.read)(file) for file in files[0:499])
end_time = datetime.datetime.now()
print('Duration: {}'.format(end_time - start_time))
## Duration: 0:04:45.961226
Clearly, lasr is faster, however, please don’t take this as a shot at lasio. The primary goal of this package is speed, and as such countless hours have been put into speed testing, de-bottlenecking, and enduring the pain of writing in C++. As with anything, there are tradeoffs, and lasr errs toward speed where lasio tends more toward user convenience. They are simply different products.
Good luck, and if you have any suggestions reach out!