Awesome
TileReduce
TileReduce is a geoprocessing library that implements MapReduce to let you run scalable distributed spatial analysis using JavaScript and Mapbox Vector Tiles. TileReduce coordinates tasks across all available processors on a machine, so your analysis runs lightning fast.
Install
npm install @mapbox/tile-reduce
Usage
A TileReduce processor is composed of two parts; the "map" script and the "reduce" script. The "map" portion comprises the expensive processing you want to distribute, while the "reduce" script comprises the quick aggregation step.
'map' script
The map script operates on each individual tile. It's purpose is to receive one tile at a time, do analysis or processing on the tile, and write data and send results to the reduce script.
See the count example processor's map script
'reduce' script
The reduce script serves both to initialize TileReduce with job options, and to handle reducing results returned by the map script for each tile.
See the count example processor's reduce script
Options
Basic Options
zoom (required)
zoom
specifies the zoom level of tiles to retrieve from each source.
tilereduce({
zoom: 15,
// ...
})
map (required)
Path to the map script, which will be executed against each tile
tilereduce({
map: path.join(__dirname, 'map.js')
// ...
})
maxWorkers
By default, TileReduce creates one worker process per CPU. maxWorkers
may be used to limit the number of workers created
tilereduce({
maxWorkers: 3,
// ...
})
output
By default, any data written from workers is piped to process.stdout
on the main process. You can pipe to an alternative writable stream using the output
option.
tilereduce({
output: fs.createWriteStream('output-file'),
// ...
})
log
Disables logging and progress output
tilereduce({
log: false,
// ...
})
mapOptions
Passes through arbitrary options to workers. Options are made available to map scripts as global.mapOptions
tilereduce({
mapOptions: {
bufferSize: 4
}
// ...
})
// map.js
module.exports = function (sources, tile, write, done) {
global.mapOptions.bufferSize; // = 4
};
Specifying Sources (required)
Sources are specified as an array in the sources
option:
tilereduce({
sources: [
/* source objects */
],
// ...
})
MBTiles sources:
tilereduce({
sources: [
{
name: 'osmdata',
mbtiles: __dirname+'/latest.planet.mbtiles',
layers: ['osm']
}
]
})
MBTiles work well for optimizing tasks that request many tiles, since the data is stored on disk. Create your own MBTiles from vector data using tippecanoe, or use OSM QA Tiles, a continuously updated MBTiles representation of OpenStreetMap.
URL
Remote Vector Tile sources accessible over HTTP work well for mashups of datasets and datasets that would not be practical to fit on a single machine. Be aware that HTTP requests are slower than mbtiles, and throttling is typically required to avoid disrupting servers at high tile volumes. maxrate
dictates how many requests per second will be made to each remote source.
sources: [
{
name: 'streets',
url: 'https://b.tiles.mapbox.com/v4/mapbox.mapbox-streets-v6/{z}/{x}/{y}.vector.pbf',
layers: ['roads'],
maxrate: 10
}
]
raw
By default, sources will be automatically converted from their raw Vector Tile representation to GeoJSON. If you set raw: true
in an MBTiles or URL source, the raw Vector Tile data will be provided, allowing you to lazily parse features as needed. This is useful in some situations for maximizing performance.
sources: [
{
name: 'streets',
url: 'https://b.tiles.mapbox.com/v4/mapbox.mapbox-streets-v6/{z}/{x}/{y}.vector.pbf',
raw: true
}
]
Specifying Job Area
Jobs run over a geographic region represented by a set of tiles. TileReduce also accepts several area definitions that will be automatically converted into tiles.
BBOX
A valid bounding box array.
tilereduce({
bbox: [w, s, e, n],
// ...
})
GeoJSON
A valid GeoJSON geometry of any type.
tilereduce({
geojson: {"type": "Polygon", "coordinates": [/* coordinates */]},
// ...
})
Tile Array
An array of quadtiles represented as xyz arrays.
tilereduce({
tiles: [
[x, y, z]
],
// ...
})
Tile Stream
Tiles can be read from an object mode node stream. Each object in the stream should be either a string in the format x y z
or an array in the format [x, y, z]
.
tilereduce({
tileStream: /* an object mode node stream */,
// ...
})
Line separated tile list files can easily be converted into the appropriate object mode streams using binary-split:
var split = require('binary-split'),
fs = require('fs');
tilereduce({
tileStream: fs.createReadStream('/path/to/tile-file').pipe(split()),
// ...
})
Source Cover
When using MBTiles sources, a list of tiles to process can be automatically retrieved from the source metadata
tilereduce({
sourceCover: 'osmdata',
sources: [
{
name: 'osmdata',
mbtiles: __dirname+'/latest.planet.mbtiles'
}
]
// ...
})
Events
TileReduce returns an EventEmitter.
start
Fired once all workers are initialized and before the first tiles are sent for processing
tilereduce({/* ... */})
.on('start', function () {
console.log('starting');
});
map
Fired just before a tile is sent to a worker. Receives the tile and worker number assigned to process the tile.
tilereduce({/* ... */})
.on('map', function (tile, workerId) {
console.log('about to process ' + JSON.stringify(tile) +' on worker '+workerId);
});
reduce
Fired when a tile has finished processing. Receives data returned in the map function's done
callback (if any), and the tile.
var count = 0;
tilereduce({/* ... */})
.on('reduce', function (result, tile) {
console.log('got a count of ' + result + ' from ' + JSON.stringify(tile));
count++;
});
end
Fired when all queued tiles have been processed. Use this event to output final reduce results.
var count = 0;
tilereduce({/* ... */})
.on('end', function () {
console.log('Total count was: ' + count);
});
Processor Examples
-
osm-coverage - a processor for computing statistics about OpenStreetMap coverage across countries.
-
osm-sidewalker - a processor for detecting potentially untagged sidewalks in OpenStreetMap.
Development
Testing
npm test
Linting
npm run lint
Test Coverage
npm run cover