
This is a presentation about data privacy and anonymization. It is pitched mostly at the practitioner level: people who work with data, and people who work alongside them. The insurance dataset used in the demo is simulated, so you can generate it yourself. See the folder layout to learn how to do it.

Folders:
.
├── codebook - this folder describes the simulated dataset, in particular what the columns of the dataframe mean.
│   ├── Insurance_data_ke.txt - created with csvkit's csvstat command.
│   └── insurance_report.html - generated by the pandas-profiling library, a shortcut for fast exploratory data analysis.
├── data - directory where the simulated data should be placed. Run utils/dataloader.py to generate it.
│   ├── feature_engineered_insurance2.csv - data which has undergone feature engineering used in the demo.
│   ├── feature_engineered_insurance.csv - data which was created for the same problem but has issues. Create a new one.
│   ├── Insurance_data_ke.csv - The insurance dataset created by running python utils/dataloader.py
│   ├── Insurance_data_ke_featureeng.csv - Insurance dataset created as an intermediate step for feature engineering.
│   └── Organs.csv - data from a single patient recovering from surgery for heart disease. Contains only vitals recorded with a thermometer and a pulse oximeter.
├── Dockerfile - a blueprint to run the project in a reproducible way; see "How to run the docker image".
├── data-privacy-env.yml - a conda virtual environment file.
├── Kenya Data Protection Act - Quick Guide 2021.pdf - a demo for privacy engineering strategy at Deloitte.
├── Makefile - workflow orchestrator. Helps automate code formatting and repetitive tasks.
├── presentation - this directory has the presentations that were used live.
│   ├── presentation.pdf - HTML to PDF using LaTeX.
│   ├── presentation.slides.html - reveal.js presentation. Open with your browser.
│   ├── presentation2.html - Quarto version of the presentation.
│   └── presentation2.pdf - PDF version of the presentation.
├── presentation.ipynb - jupyter notebook with jupyter notebook extensions and reveal.js extension.
├── README.md - the file you are reading.
├── requirements.txt - what packages were used.
├── Screenshot from 2022-09-10 07-03-38.png - demo of PCA using the iris dataset.
└── utils - Scripts used to generate the simulated data
    ├── codebook.sh - the bash script used to create the codebook Insurance_data_ke.txt.
    ├── dataloader.py - data generator that uses methods from the faker library and numpy.
    └── Feature_engineering.ipynb - a feature engineering workflow that I use to prepare the insurance dataset for statistical modeling, aka machine learning.
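
The layout above says utils/dataloader.py builds the simulated dataset with faker and numpy. As a rough illustration of the idea only, the sketch below generates a few fake policy rows using the Python standard library instead of those packages. The column names (`policy_id`, `age`, `county`, `annual_premium`), the value ranges, and the premium rule are assumptions made up for this example; they are not the repository's actual schema, which is described in the codebook folder.

```python
import csv
import io
import random

# Hypothetical columns chosen for illustration only.
FIELDS = ["policy_id", "age", "county", "annual_premium"]
COUNTIES = ["Nairobi", "Mombasa", "Kisumu", "Nakuru"]


def simulate_rows(n_rows, seed=0):
    """Return n_rows fake policy records as dictionaries."""
    rng = random.Random(seed)  # a fixed seed keeps the output reproducible
    rows = []
    for i in range(n_rows):
        age = rng.randint(18, 80)
        # Illustrative rule only: premium grows with age plus random noise.
        premium = round(5000 + 120 * age + rng.uniform(-500, 500), 2)
        rows.append(
            {
                "policy_id": f"P{i:05d}",
                "age": age,
                "county": rng.choice(COUNTIES),
                "annual_premium": premium,
            }
        )
    return rows


def rows_to_csv_text(rows):
    """Serialize the simulated rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv_text(simulate_rows(100))` returns CSV text that could be written to a file in the data directory. No real person's data is involved, so a dataset made this way can be shared freely in a demo.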

How to make the conda environment locally

If you have Anaconda or Miniconda installed, complete the following steps in the data-privacy-pres directory.

Create the virtual environment

conda env create -f data-privacy-env.yml

This will create an environment called data-privacy-env. You can activate it like this.

conda activate data-privacy-env
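
To confirm that activation worked, one option is to check from Python. This is a minimal sketch that relies on conda setting the `CONDA_DEFAULT_ENV` environment variable when an environment is activated. The helper names `active_conda_env` and `is_expected_env` are made up for this example.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


def is_expected_env(expected="data-privacy-env", environ=None):
    """Return True when the active environment matches the expected name."""
    return active_conda_env(environ) == expected
```

Running `is_expected_env()` inside the activated environment should return `True`; in a shell where no conda environment is active it returns `False`.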

How to run the docker image

Build docker image

sudo docker build -t data-privacy-env:v1 .

Run the docker image

sudo docker run -p 9999:9999 data-privacy-env:v1
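
The `-p 9999:9999` flag publishes container port 9999 on host port 9999. This README does not say which service listens there, so the sketch below only checks whether something accepts TCP connections on that port. The helper name `port_is_open` and the default host `127.0.0.1` are assumptions for illustration.

```python
import socket


def port_is_open(host="127.0.0.1", port=9999, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If `port_is_open()` returns `False` while the container is running, the port mapping or the service inside the container is worth checking first.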

Shuyib/data-privacy-pres
