Flávio Teixeira
Data Engineering Specialist - São Paulo, Brazil
About
Experienced data engineer with 8+ years of expertise in designing and implementing data pipelines, optimizing data storage and retrieval, and ensuring data quality and reliability. My experience in DataOps and Data Mesh, together with Python fluency, has helped me develop tools and frameworks that enable faster and more standardized work.
I also take pride in mentoring other data engineers to help them grow and develop their technical skills.
Technical Skills
• Python, Java, Scala
• AWS, GCP
• Docker, Kubernetes
• Hadoop, Hive, Spark, Pandas, DuckDB, Polars
• Databricks, Unity Catalog
• Delta Lake, Hudi, Apache Iceberg
• Airflow, Airbyte, dbt, dlt, Data Wrangler
• DataOps, Data Mesh, Data Products
• PostgreSQL, MySQL, Cassandra, DynamoDB, MariaDB, Neo4j, Elasticsearch
• Git, Jira, Agile Methodologies
Education
2013-2016
Technologist in Systems Analysis and Development, FATEC São Caetano do Sul
Work Experience
2023 - Today
Data Engineer, Riot Games
Data Engineer working on the Data Platform team.
Working as a contractor via X-Team.
2023 - Today
Data Engineer, X-Team
Working as a contractor for Riot Games.
2020 - 2023
Data Engineer, iFood
Built and maintained high-quality, reliable ETL pipelines for ingesting data from various sources such as databases, queues and APIs.
Led the development of a multi-purpose platform based on Data Mesh concepts, which handles data from ingestion through transformation and publishing. Beyond the data itself, the platform is also responsible for all data lake observability and support, from orchestration to access management, quality and contracts, and is used by the company’s entire data area (roughly 3,000 people).
Led the development of a series of Python libraries with different purposes, ranging from “commons” libraries for code reuse to more complex ones that serve as the official way of reading from and writing to the Data Lake itself.
Mentored other data engineers, from junior to senior, looking to improve their technical skills.
Main technologies: Python, AWS Cloud (S3, EC2, EKS, Lambda), Docker, Kubernetes, Terraform, GitLab CI, Hadoop, Hive, PySpark, Airflow and Databricks.
2019 - 2020
Data Engineer, Social Miner
Built and maintained high-quality, reliable ETL pipelines for ingesting data from various sources such as databases, queues and APIs. Also structured and maintained the company’s Data Lake.
Designed and developed data models and schemas in order to support business requirements, and to ensure data quality and consistency.
Led the development of a dynamic marketing campaign tool that identified target customers based on any number of parameters across the entire base of website usage events captured by the platform. It involved a number of different orchestration DAGs and processes, built with highly modular code.
Led the development of a telemetry and quality stack built on Elasticsearch and Kibana, which provided real-time monitoring of the health and status of the Data Lake and ingestion flows.
Main technologies: Python, AWS Cloud (S3, Lambda, EMR, Athena), Terraform, Hadoop, Hive, PySpark, Airflow, Elasticsearch and Kibana.
2018 - 2019
Data Engineer, Yandeh
Built and maintained ETL pipelines for ingesting data from various sources such as databases and APIs. Also structured and maintained the company’s Data Lake.
Designed and developed data models and schemas in order to support business requirements, and to ensure data quality and consistency.
Worked with business stakeholders to gather and define requirements for new data products and features.
Led the development of monitoring and telemetry for the Big Data area's structure, using Python, Elasticsearch and Kibana along with other AWS managed services.
Main technologies: Python, AWS Cloud (S3, Lambda, Glue, Athena), Hadoop, Hive, PySpark, Elasticsearch and Kibana.
2017 - 2018
Data Engineer, Rivendel
Built and managed data lakes and systems that collect and process large-scale data (Big Data ecosystem), while working with DevOps and cloud computing components.
Developed and managed data ingestion and transformation pipelines.
Served as tech lead in the development and management of data lakes.
Worked on projects for different companies: CVC, Ciclic, Dotz and MaxMilhas.
Main technologies: Java, Spark, AWS Cloud (S3, Lambda, EMR, Glue, Athena) and Docker.
Accomplishments
Contributed to Great Expectations, a data quality and profiling tool.
Had a Medium post reposted by the TowardsAWS blog.
Certificates
Nanodegree in Data Engineering, Udacity
Languages
Portuguese, Native
English, Fluent