pehls/README.md

Hi there 👋

I am a Senior Data Scientist with a proven track record of driving innovative data science initiatives, leading the renewal of Machine Learning structures, and implementing MLOps practices at various organizations, with achievements in drug replacement algorithms, Machine Learning models for the authorization of medical claims, and NLP. Previously, I played a key role in establishing and leading a Data Science/BI area, overseeing the creation of a Data Warehouse using Python + Amazon Redshift + Power BI, and implementing BI processes. I have demonstrated expertise in ETL, data visualization, and ML model applications across various sectors. Explore more about my journey at my Profile!

Platforms & Tools

Experienced with Python, R and Java;

Performed data transformations in PySpark, Pandas, Dask, Oracle Data Integration/Data Flow, and AWS Glue;

Created data engineering pipelines in Databricks, Alteryx, and AWS Glue + Athena;

Started a Data Science area inside a software development company, beginning as its only member and leaving with a team of 1 BI Developer, 2 Data Engineers, 1 Product Owner and 1 UI/UX Developer, building a Data Warehouse for Power BI tasks and, once the data had matured, a Data Lakehouse structure, and deploying multiple models to production with strong performance in a streaming process with PySpark;

Formulated AI systems using Machine Learning and Deep Learning, such as recommendation systems, healthcare audit, forecasts at different granularities, development and analysis of pricing elasticities, key driver analysis using Machine Learning models and model interpretability, key driver analysis using Structural Equation Modelling, finding similarity between groups of data using clustering, and using LLMs with vector databases (RAG) to deliver faster results from internal processes and documents with LangChain + different LLM models and Multi-Stage Reasoning for automated processes inside a pipeline, among other applications of Data Science, Machine Learning and Deep Learning (including LLMs!);

Delivered models to a model store inside Amazon SageMaker, Databricks, Azure Machine Learning Studio and Oracle Data Science / Model Catalog;

Visualized data with Power BI, Tableau, Plotly/Seaborn in Python, and ggplot2 in R;

Followed DevOps and MLOps practices along the way, helping other developers as a Tech Lead / in a senior position, leading different projects, delivering Data Science and Machine Learning / AI systems with high quality and adherence to business objectives, and leading discoveries with different companies in different areas.

Studies

Currently studying the intersection of MLOps, LLMOps and how large language models are made: looking inside the black box of "binarized models" and APIs, at how transformers, attention and other Deep Learning and Feature Engineering structures transform text into numbers, inside matrices, and back into text, images, sounds, videos, code, etc.

Contact

Pinned

  1. descritor-de-ativos Public

    Project for the AI in Financial Market course, from I2A2 - Data H, providing a report-style description of the current snapshot of an asset, whether via yfinance (traditional stock exchange) or crypto…

    Jupyter Notebook 4

  2. i2a2NaiveBayes Public

    Challenge of applying naive Bayes to a model that buys and sells stocks using PETR4 as the base, together with a risk management model based on the performance of indicators on the asset.

    Jupyter Notebook 1

  3. gp27_techchallenge_4 Public

    Tech Challenge of the Postgraduate in Data Analytics, from FIAP, analyzing Brent Oil price data, in comparison with historical, economic and societal data, integrating correlation and causality ana…

    Jupyter Notebook

  4. mlops_structure Public

    Full MLOps structure for LLM/ML, built in Python as part of my MLOps studies.

    Python

  5. llm_studies Public

    A repository for knowledge about Large Language Models, with examples and activities from my studies on the topic

    Jupyter Notebook 3

  6. gp27_techchallenge_5 Public

    Tech Challenge of the Postgraduate in Data Analytics, from FIAP, developing an analysis of student churn at an NGO called "Passos Mágicos", in Brazil, with a churn model to guide the analysis and…

    Jupyter Notebook