Colombia Covid19 Pipeline

A pipeline that retrieves the daily Covid19 case reports published by Instituto Nacional de Salud - INS in Colombia and builds datasets from them.

<a href="https://www.buymeacoffee.com/sebaxtian" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" width="100" height="23" style="height: 23px !important; width: 100px !important;" ></a>


Context

The number of new cases is increasing day by day around the world. This dataset has information about reported cases from the 32 departments of Colombia.

This dataset is a result of my self-directed learning in data science. It contains the daily report from Instituto Nacional de Salud - INS on Covid19 cases reported in Colombia, as well as the historical report from Instituto Nacional de Salud - INS on Covid19 samples processed in Colombia.

Content

This dataset is built from the INS Covid19 report data source. I cleaned the source data and filled the NaN values, then added attributes such as day of the week, year, and month of the year.
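The fill step described above could look roughly like the following sketch. It uses only the Python standard library. The column names (`city`, `sex`), the sample rows, the `-` placeholder, and the `fill_missing` helper are all hypothetical and are not taken from the pipeline source, which may well take a different approach.

```python
import csv
import io


def fill_missing(rows, placeholder="-"):
    """Return copies of the rows with empty or NaN-like cells replaced."""
    cleaned = []
    for row in rows:
        cleaned.append({
            key: placeholder
            if value is None or value.strip() in ("", "NaN", "nan")
            else value
            for key, value in row.items()
        })
    return cleaned


# Hypothetical sample; the real INS columns and values may differ.
sample = io.StringIO("city,sex\nCali,F\n,M\nBogota,NaN\n")
rows = fill_missing(csv.DictReader(sample))
print(rows)
```

Returning new dictionaries instead of editing rows in place keeps the raw input available for comparison while debugging.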

covid19co.csv -> Daily report, Cases reported in Colombia

covid19co_samples_processed.csv -> Daily report, Samples processed in Colombia

covid19co_time_line.csv -> Timeline of cases reported, recovered, and deceased in Colombia.

covid19co_samples_time_line.csv -> Timeline of samples processed in Colombia.

Dates use the format DD/MM/YYYY, for instance: 11/09/2001

This dataset is updated by an automatic pipeline. You can find the GitHub code repository here: Colombia Covid19 Pipeline
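As a minimal sketch of reading this date format, the snippet below parses a DD/MM/YYYY string and derives the day of the week, month, and year. The `date_attributes` helper and the keys of the returned dictionary are hypothetical names chosen for illustration, not the column names used in the datasets.

```python
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def date_attributes(text):
    """Parse a DD/MM/YYYY date and derive day of week, month, and year."""
    parsed = datetime.strptime(text, "%d/%m/%Y")
    return {
        "day_of_week": DAY_NAMES[parsed.weekday()],
        "month": parsed.month,
        "year": parsed.year,
    }


print(date_attributes("11/09/2001"))
# → {'day_of_week': 'Tuesday', 'month': 9, 'year': 2001}
```

Passing the format string explicitly matters here: a parser that guesses the format could read 11/09/2001 as November 9 instead of September 11.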

Acknowledgements

The dataset is obtained from the Instituto Nacional de Salud - INS daily report on Covid19 in Colombia. You can get the official dataset here: INS - Official Report

Inspiration

What questions do you want to see answered?

You can view and collaborate on the analysis here: colombia_covid_19_analysis Kaggle Notebook Kernel.


Colombia Covid19 Time Line


Requirements

Source Code

See ./src directory.

Datasets

See ./output directory.

Documentation

Directory    Readme
./chart      README.md
./doc        README.md
./input      README.md
./output     README.md
./src        README.md

How to use

Please read and execute each step below:

Step 1

Create and use Python virtual environment:

$prompt> python3 -m venv .venv
$prompt> source .venv/bin/activate

Step 2

Install all Python requirements:

$prompt> pip3 install -r requirements.txt

Step 3

Run Pipeline script:

$prompt> ./run.sh

The Pipeline output is generated within the ./output directory.
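After a run, a quick check like the sketch below can confirm that the datasets were written. It assumes the four CSV files named in the Content section sit directly inside ./output, which this README does not state explicitly. The `missing_outputs` helper is a hypothetical name and is not part of the pipeline.

```python
from pathlib import Path

# File names taken from the Content section of this README.
EXPECTED_FILES = [
    "covid19co.csv",
    "covid19co_samples_processed.csv",
    "covid19co_time_line.csv",
    "covid19co_samples_time_line.csv",
]


def missing_outputs(output_dir="./output"):
    """Return the expected dataset files that are absent from output_dir."""
    base = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("All datasets present." if not missing else f"Missing: {missing}")
```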

Step N

Work in progress ...


Jupyter Notebook to Python Script

$prompt> jupyter nbconvert --to script ./src/covid19co_pipe.ipynb --output covid19co_pipe

That's all for now ...


Would you like to contribute?

Get in touch with @sebaxtianbach


License

MIT License

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

About me

https://about.me/sebaxtian