Awesome
odbc2parquet-PSscripts
PowerShell script(s) to programatically run the ODBC to Parquet tool written by pacman82 (https://github.com/pacman82/odbc2parquet)
Overview
The scripts will download the latest version of ODBC to Parquet and proceed to extract database tables to a subfolder of the location where the script is located.
Usage
This has been designed to be easy to run with only three key steps.
- Where you run the script is where it will save the extracts. So pick a suitable location.
- The script should be reviewed and edited to suite your situation. In windows 10 you just need to right click and select 'Open' to open the script in Notepad.
- Then in Windows 10 all you need to do right click the script and select 'Run with PowerShell'.
Customisation
The script has a few areas that can be edited to make it work in your situation.
Database Selection
The below parameters are the core options. Server, database and bitness.
param (
[Parameter()]
[string]$winX = "win64", # win64 or win32 (version of odbc2parquet to use)
[Parameter()]
[string]$server = "localhost", # server\instance
[Parameter()]
[string]$database = "ContosoRetailDW" # database
)
Extract Selection
Limiting the exported rows can be achieved by changing this block SQL code.
"SELECT * FROM [$($row[0])].[$($row[1])]"
i.e. this would limit to the first 10 rows of each table.
"SELECT TOP 10 * FROM [$($row[0])].[$($row[1])]"
Table Selection
The table selection can also be changed by changing this block of SQL code.
"SELECT s.name AS schemaName, t.name AS tableName FROM sys.tables t INNER JOIN sys.schemas s ON t.schema_id = s.schema_id WHERE t.name <> 'sysdiagrams' ORDER BY 1, 2"
i.e. this would limit to tables in the database schema 'sales'.
"SELECT s.name AS schemaName, t.name AS tableName FROM sys.tables t INNER JOIN sys.schemas s ON t.schema_id = s.schema_id WHERE t.name <> 'sysdiagrams' AND s.name = 'sales' ORDER BY 1, 2"