👋 Hi, I'm

Florian Abgrall

Data Engineer

I build scalable data pipelines and turn complex data into actionable insights. I specialize in cloud infrastructure, ETL optimization, and modern data stack technologies.

Technical Skills

Technologies and tools I work with

Data Engineering

  • Apache Spark
  • Apache Kafka
  • Airflow
  • dbt

Cloud Platforms

  • AWS (S3, Redshift, EMR)
  • GCP (BigQuery, Dataflow)
  • Snowflake
  • Azure

Programming

  • Python
  • SQL
  • PySpark
  • Scala

Analytics & BI

  • Tableau
  • Power BI
  • Looker
  • Jupyter

Featured Projects

Real-world data engineering solutions

Real-time Streaming Pipeline

Production-ready real-time data streaming pipeline using Apache Kafka, Python, and PostgreSQL. Includes ML-powered anomaly detection, FastAPI WebSocket streaming, Schema Registry with Avro, dead letter queue handling, and comprehensive monitoring with Prometheus and Grafana.

Apache Kafka, FastAPI, PostgreSQL, ML/scikit-learn, Prometheus
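
To illustrate the dead letter queue handling mentioned in this project, here is a minimal sketch in plain Python. It is not the project's code: `process_batch`, `to_celsius`, and the in-memory lists standing in for Kafka topics are all assumptions made for the example.

```python
import json
from typing import Callable


def process_batch(
    raw_messages: list[bytes],
    handler: Callable[[dict], dict],
) -> tuple[list[dict], list[dict]]:
    """Route each message to the processed list or to the dead letter queue.

    Hypothetical helper: in-memory lists stand in for Kafka topics.
    """
    processed: list[dict] = []
    dead_letters: list[dict] = []
    for raw in raw_messages:
        try:
            record = json.loads(raw)
            processed.append(handler(record))
        except Exception as exc:
            # Park the bad message with its error instead of stopping the batch.
            dead_letters.append({
                "payload": raw.decode("utf-8", errors="replace"),
                "error": type(exc).__name__,
            })
    return processed, dead_letters


def to_celsius(record: dict) -> dict:
    # Hypothetical handler: raises KeyError when a field is missing.
    return {
        "sensor": record["sensor"],
        "celsius": (record["fahrenheit"] - 32) * 5 / 9,
    }
```

The point of the pattern is that a malformed or incomplete message is set aside with enough context to replay it later, while valid messages keep flowing.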

Data Quality Framework

Comprehensive production-ready framework for automated data quality validation, profiling, and monitoring. Features real-time anomaly detection, customizable validation rules, and an interactive Streamlit dashboard. Built with modern Python tooling: uv and Ruff make dependency installs and linting 10-100x faster than older tools.

Python, Pandas, Streamlit, Great Expectations, UV/Ruff
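
As a rough picture of what customizable validation rules can look like, here is a minimal sketch in plain Python. It does not use the Great Expectations API and is not the framework's actual code: the `Rule` class, the `validate` function, and the sample columns are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a named check applied to one column."""
    name: str
    column: str
    check: Callable[[Any], bool]


def validate(rows: list[dict], rules: list[Rule]) -> dict[str, int]:
    """Count failing rows per rule; a missing or None value counts as a failure."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            value = row.get(rule.column)
            if value is None or not rule.check(value):
                failures[rule.name] += 1
    return failures


rules = [
    Rule("age_in_range", "age", lambda v: 0 <= v <= 120),
    Rule("email_has_at", "email", lambda v: "@" in v),
]
rows = [
    {"age": 34, "email": "a@example.com"},
    {"age": -1, "email": "nope"},
    {"email": "b@example.com"},
]
report = validate(rows, rules)
```

Keeping each rule as data (a name, a column, a predicate) is one way to let users add checks without touching the validation loop.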

Wind Power Analytics Pipeline

End-to-end data pipeline on Microsoft Fabric implementing Medallion architecture (Bronze/Silver/Gold). Features automated ingestion from GitHub, PySpark transformations, star schema modeling, and Power BI visualization. The whole workflow is orchestrated with a Fabric Data Pipeline for production-ready analytics.

Microsoft Fabric, PySpark, Delta Lake, Power BI, SQL

BrandSeeker

CLI tool powered by YOLOv5 for brand detection in videos. Processes video frames to identify and track brand logos with configurable framerate and confidence parameters. Outputs structured predictions for video analytics. Supports both local videos and YouTube URLs.

YOLOv5, Python, Computer Vision, PyTorch, OpenCV
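
The configurable framerate and confidence parameters can be pictured with a small sketch. This is an assumption about how such parameters might work, not BrandSeeker's actual code: both function names and the detection dictionary layout are hypothetical, and no YOLOv5 or OpenCV calls are involved.

```python
def sampled_frame_indices(
    total_frames: int, video_fps: float, target_fps: float
) -> list[int]:
    """Pick which frame indices to analyze for a requested sampling rate."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never skip less than one frame, even if target_fps exceeds video_fps.
    step = max(1, round(video_fps / target_fps))
    return list(range(0, total_frames, step))


def filter_detections(
    detections: list[dict], min_confidence: float
) -> list[dict]:
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]


# Hypothetical detections in a made-up layout.
detections = [
    {"brand": "acme", "confidence": 0.91},
    {"brand": "globex", "confidence": 0.42},
]
kept = filter_detections(detections, min_confidence=0.5)
```

Sampling fewer frames trades temporal resolution for speed, while the confidence threshold trades recall for precision.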

Chatbot API

Production-ready chatbot API backend with database migrations using Alembic. Includes comprehensive notebooks for model exploration, training scripts, and structured source code for scalable deployment. Features pre-start scripts and test automation for reliability.

Python, FastAPI, NLP, SQLAlchemy, Alembic

Feelingz - Emotion Analysis Platform

Full-stack application with FastAPI backend and Streamlit frontend for emotion detection. Uses NLP algorithms trained on Kaggle sentiment analysis and Twitter datasets to predict emotions from text. Provides data analysis reports and daily journal functionality with Docker deployment.

FastAPI, Streamlit, NLP, Docker, Python

Get In Touch

Let's discuss your next data project

Location

Lille, France

LinkedIn

Connect with me

GitHub

@Flockyy