Awesome
Okay, so you heard about the Extract / Transform / Load pattern, and you're looking for a PHP library to do the stuff. Alright, let's go!
bentools/etl
is a versatile PHP library for implementing the Extract, Transform, Load (ETL) pattern, designed to streamline data processing tasks.
Table of Contents
Concepts
Let's cover the basic concepts:
- Extract: you have a source of data (a database, a CSV file, whatever) - an extractor is able to read that data and provide an iterator of items
- Transform: apply transformation to each item. A transformer may generate 0, 1 or several items to load (for example, 1 item may generate multiple SQL queries)
- Load: load transformed item to the destination. For example, extracted items have been transformed to SQL queries, and your loader will run those queries against your database.
Installation
composer require bentools/etl
[!WARNING] Current version (4.0) is a complete redesign and introduces significant BC (backward compatibility) breaks. Avoid upgrading from
^2.0
or^3.0
unless you're fully aware of the changes.
Usage
Now let's have a look on how simple it is:
use BenTools\ETL\EtlExecutor;
// Given
$singers = ['Bob Marley', 'Amy Winehouse'];
// Transform each singer's name to uppercase and process the array
$etl = (new EtlExecutor())
->transformWith(fn (string $name) => strtoupper($name));
// When
$report = $etl->process($singers);
// Then
var_dump($report->output); // ["BOB MARLEY", "AMY WINEHOUSE"]
OK, that wasn't really hard, here we basically don't have to extract anything (we can already iterate on $singers
),
and we're not loading anywhere, except into PHP's memory.
You may ask, "why don't you just array_map('strtoupper', $singers)
?" and you're totally right.
But sometimes, extracting, transforming and / or loading get a little more complex. You may want to extract from a file, a crawled content on the web, perform one to many transformations, maybe skip some items, or reuse some extraction, transformation or loading logic.
Here's another example of what you can do:
use BenTools\ETL\EventDispatcher\Event\TransformEvent;
use BenTools\ETL\Loader\JSONLoader;
use function BenTools\ETL\extractFrom;
$executor = extractFrom(function () {
yield ['firstName' => 'Barack', 'lastName' => 'Obama'];
yield ['firstName' => 'Donald', 'lastName' => 'Trump'];
yield ['firstName' => 'Joe', 'lastName' => 'Biden'];
})
->transformWith(fn (array $item) => implode(' ', array_values($item)))
->loadInto(new JSONLoader())
->onTransform(function (TransformEvent $event) {
if ('Donald Trump' === $event->transformResult->value) {
$event->state->skip();
}
});
$report = $executor->process();
dump($report->output); // string '["Barack Obama", "Joe Biden"]'
Or:
$report = $executor->process(destination: 'file:///tmp/presidents.json');
var_dump($report->output); // string 'file:///tmp/presidents.json' - content has been written here
You get the point. Now you're up to write your own workflows!
Continue reading the Getting Started Guide.
Contribute
Contributions are welcome! Don't hesitate to suggest recipes.
This library is 100% covered with Pest tests.
Please ensure to run tests using the command below and maintain code coverage before submitting PRs.
composer ci:check
License
MIT.