Flávio Teixeira

Data Engineering Specialist - São Paulo, Brazil

About

Data engineer with 8+ years of experience designing and implementing data pipelines, optimizing data storage and retrieval, and ensuring data quality and reliability. My experience with DataOps and Data Mesh, together with Python fluency, has helped me build tools and frameworks that enable faster, more standardized work.
I also take pride in mentoring other data engineers to help them grow and develop their technical skills.

Technical Skills

• Python, Java, Scala
• AWS, GCP
• Docker, Kubernetes
• Hadoop, Hive, Spark, Pandas, DuckDB, Polars
• Databricks, Unity Catalog
• Delta Lake, Hudi, Apache Iceberg
• Airflow, Airbyte, dbt, dlt, Data Wrangler
• DataOps, Data Mesh, Data Products
• PostgreSQL, MySQL, Cassandra, DynamoDB, MariaDB, Neo4j, Elasticsearch
• Git, Jira, Agile Methodologies

Education

2013-2016

Technologist in Systems Analysis and Development, FATEC São Caetano do Sul

Work Experience

2023 - Today

Data Engineer, Riot Games

Data Engineer working on the Data Platform Team.

Working as a contractor via X-Team.

2023 - Today

Data Engineer, X-Team

Working as a contractor for Riot Games.

2020 - 2023

Data Engineer, iFood

Built and maintained reliable, quality-checked ETL pipelines for ingesting data from various sources such as databases, queues and APIs.

Led the development of a multi-purpose platform based on Data Mesh concepts, covering data from ingestion through transformation and publishing. Beyond the data itself, the platform is also responsible for all data lake observability and support, from orchestration to access management, quality and contracts, and is used by the company’s entire data area (roughly 3,000 people).

Led the development of a series of Python libraries with different purposes, ranging from “commons” libraries for code reuse to more complex ones that serve as the official way of reading from and writing to the Data Lake itself.

Mentored other data engineers, from juniors to seniors, looking to improve their technical skills.

Main technologies: Python, AWS Cloud (S3, EC2, EKS, Lambda), Docker, Kubernetes, Terraform, GitLab CI, Hadoop, Hive, PySpark, Airflow and Databricks.

2019 - 2020

Data Engineer, Social Miner

Built and maintained reliable, quality-checked ETL pipelines for ingesting data from various sources such as databases, queues and APIs. Also structured and maintained the company’s Data Lake.

Designed and developed data models and schemas to support business requirements and ensure data quality and consistency.

Led the development of a dynamic marketing campaign tool responsible for identifying target customers based on any number of parameters, evaluated over the entire website-usage event base captured by the platform. It involved several orchestration DAGs and processes, built with highly modular code.

Led the development of a telemetry and quality stack built on Elasticsearch and Kibana, providing real-time monitoring of the health and status of the Data Lake and ingestion flows.

Main technologies: Python, AWS Cloud (S3, Lambda, EMR, Athena), Terraform, Hadoop, Hive, PySpark, Airflow, Elasticsearch and Kibana.

2018 - 2019

Data Engineer, Yandeh

Built and maintained ETL pipelines for ingesting data from various sources such as databases and APIs. Also structured and maintained the company’s Data Lake.

Designed and developed data models and schemas to support business requirements and ensure data quality and consistency.

Worked with business stakeholders to gather and define requirements for new data products and features.

Led the development of the monitoring and telemetry structure for the Big Data area, using Python, Elasticsearch and Kibana along with other AWS managed services.

Main technologies: Python, AWS Cloud (S3, Lambda, Glue, Athena), Hadoop, Hive, PySpark, Elasticsearch and Kibana.

2017 - 2018

Data Engineer, Rivendel

Built and managed data lakes and systems that collect and process large-scale data (Big Data ecosystem), while working with DevOps and Cloud Computing components.

Developed and managed data ingestion and transformation pipelines.

Served as tech lead in the development and management of data lakes.

Worked on projects for different companies: CVC, Ciclic, Dotz and MaxMilhas.

Main technologies: Java, Spark, AWS Cloud (S3, Lambda, EMR, Glue, Athena) and Docker.

Accomplishments

Contributed to Great Expectations, a data quality and profiling tool.

Had a Medium post reposted by the Towards AWS blog.

Certificates

Nanodegree in Data Engineering, Udacity

Languages

Portuguese, Native

English, Fluent