mouradap/README.md

Welcome to my GitHub Profile!


👋 About Me

Hello! I'm Denis Moura, a Senior Data Engineer with over 4 years of experience building scalable, resilient data pipelines and data platforms. My passion lies in working with data and using modern technologies to solve complex data problems. I have extensive experience with data models, SQL, and AWS, and a deep love for Python. You can find me on LinkedIn and explore my projects here on GitHub, although I haven't been very active in my public projects lately.

🚀 Skills and Technologies

  • Programming Languages: Python, SQL, JavaScript
  • Big Data Technologies: Hive, S3, Google Storage, Presto, Athena, BigQuery, Spark
  • Data Warehousing & ETL: Snowflake, Airflow, AWS Glue, AWS Step Functions, Lambda Functions, Kafka
  • Cloud Platforms: AWS, GCP
  • Data Modeling & Quality: Data Lakes, Delta Lakes, Data Governance, Data Validations
  • Tools: Terraform, Docker, Kubernetes, Git, GitHub Actions
  • Data Visualization: Sigma Computing, PowerBI, Looker Studio, Metabase
  • Methodologies: Agile/Scrum

💼 Professional Experience

Lead Data Engineer @ Dexian Disys (2022 – Present)

  • Developed a data migration platform with custom, reusable operators in Airflow for large-scale ETL batch processes, reducing AWS costs significantly.
  • Managed multiple data projects migrating data from on-premises solutions and third-party APIs to Snowflake, handling up to 10 million records per day using Kafka for batch and streaming pipelines.
  • Created various reports and dashboards using Spotfire, Sigma Computing, and PowerBI.

Technologies: Python, Airflow, Snowflake, AWS, Git, GitHub Actions, Terraform, Docker, Kubernetes, Kafka

Data Engineer @ Varsomics – Hospital Israelita Albert Einstein (2021 – 2022)

  • Led a data migration project to create a data lake for 70 terabytes of genomic data on AWS, employing S3, Glue, Athena, Lake Formation, and EMR to build a custom Delta Lake structure.
  • Developed and maintained numerous reports and dashboards for internal clients, streamlining genomics pipeline monitoring and the analysis of end-user results.

Technologies: Python, AWS Glue, AWS Step Functions, AWS Athena, Terraform, Docker, Git, PySpark

Software Engineer @ PickCells (2020 – 2021)

  • Developed and maintained a microscopy solution, automating robot movements and enhancing camera focus using Python and C libraries.
  • Spearheaded an international data science project using Python for COVID-19 network analysis and led an on-premises-to-cloud data lake migration project using Airflow and AWS.

Technologies: Python, Airflow, AWS, Network Science, Kubernetes, Deep Learning, Computer Vision

🎓 Education

  • Ph.D. in Applied Biology (Bioinformatics), Universidade Federal de Pernambuco, 2022
  • M.Sc. in Applied Biology (Neuroscience & Bioinformatics), Universidade Federal de Pernambuco, 2018
  • B.Sc. in Biology, Universidade Federal de Pernambuco, 2015

🌟 Projects and Achievements

  • Data Migration Platform: Developed a reusable data migration platform in Airflow, optimizing cost and performance for a global client.
  • Genomic Data Lake: Led the creation of a genomic data lake, enhancing data governance and compliance with legislation.
  • On-premises to AWS Data Lake: Led the creation of a data lake in AWS, moving daily and near-real-time data from on-premises systems to AWS using Airflow.
  • Microscopy Automation: Built and maintained an automated microscopy solution, contributing to advanced research capabilities.

πŸ“« Get in Touch

Feel free to reach out via email or connect with me on LinkedIn.


Thanks for visiting my GitHub profile! Explore my repositories and feel free to contribute or reach out if you have any questions or collaboration ideas.

Pinned

  1. blood_cell_classification_keras_fastapi_react Public

    A complete blood cell image classification service using a Convolutional Neural Network, with a FastAPI backend and a ReactJS web app.

    Jupyter Notebook · 5 stars · 2 forks

  2. covid19stats Public

    A COVID-19 statistics tracking app distributed through Docker.

    Python

  3. MIT-hackathon Public

    Forked from iaradsouza1/MIT-hackathon

    Jupyter Notebook

  4. python_algorithms Public

    This repo stores Python implementations of shortest path algorithms, such as Dijkstra and A*.

    Python