👋 Hi, I'm
I build scalable data pipelines that turn complex data into actionable insights, specializing in cloud infrastructure, ETL optimization, and the modern data stack.
Technologies and tools I work with
Real-world data engineering solutions
Production-ready real-time data streaming pipeline using Apache Kafka, Python, and PostgreSQL. Includes ML-powered anomaly detection, FastAPI WebSocket streaming, Schema Registry with Avro, dead letter queue handling, and comprehensive monitoring with Prometheus and Grafana.
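As a small illustration of the kind of check such a pipeline applies (a sketch, not the project's actual detector), a rolling z-score flags readings that deviate sharply from recent history:

```python
from collections import deque
from statistics import mean, stdev

def make_anomaly_detector(window: int = 50, threshold: float = 3.0):
    """Flag values more than `threshold` standard deviations from a rolling window."""
    history: deque[float] = deque(maxlen=window)

    def is_anomaly(value: float) -> bool:
        anomalous = (
            len(history) >= 2
            and stdev(history) > 0
            and abs(value - mean(history)) / stdev(history) > threshold
        )
        history.append(value)  # update the window after scoring
        return anomalous

    return is_anomaly

detect = make_anomaly_detector(window=20, threshold=3.0)
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 10.1, 55.0]
flags = [detect(r) for r in readings]
# only the final spike (55.0) is flagged against the stable window
```

In the real pipeline this kind of scoring runs per Kafka message, with flagged events routed to the dead letter queue or surfaced in Grafana.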
Comprehensive production-ready framework for automated data quality validation, profiling, and monitoring. Features real-time anomaly detection, customizable validation rules, an interactive Streamlit dashboard, and modern Python tooling with uv and Ruff for 10-100x faster dependency resolution and linting.
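Customizable validation rules in such a framework boil down to named predicates applied per record. A minimal sketch (the `Rule`/`validate` names are illustrative, not the framework's real API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes

def validate(record: dict, rules: list[Rule]) -> list[str]:
    """Return the names of the rules a record violates."""
    return [r.name for r in rules if not r.check(record)]

rules = [
    Rule("id_present", lambda r: r.get("id") is not None),
    Rule("amount_non_negative", lambda r: r.get("amount", 0) >= 0),
    Rule("currency_known", lambda r: r.get("currency") in {"USD", "EUR", "GBP"}),
]

violations = validate({"id": 1, "amount": -5, "currency": "USD"}, rules)
# → ["amount_non_negative"]
```

Keeping rules as data makes them easy to load from config, count for monitoring, and render in a dashboard.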
End-to-end data pipeline on Microsoft Fabric implementing the Medallion architecture (Bronze/Silver/Gold). Features automated ingestion from GitHub, PySpark transformations, star schema modeling, and Power BI visualization, fully orchestrated with Fabric Data Pipelines for production-ready analytics.
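The Bronze/Silver/Gold flow can be sketched in a few lines (plain Python standing in for the project's PySpark transformations; the sample rows are invented for illustration):

```python
# Bronze: raw rows exactly as ingested (duplicates and bad values kept)
bronze = [
    {"order_id": "1", "region": "EU", "amount": "120.5"},
    {"order_id": "1", "region": "EU", "amount": "120.5"},  # duplicate
    {"order_id": "2", "region": "US", "amount": None},     # invalid row
    {"order_id": "3", "region": "US", "amount": "80.0"},
]

# Silver: deduplicated, typed, invalid rows dropped
seen: set[str] = set()
silver = []
for row in bronze:
    if row["amount"] is None or row["order_id"] in seen:
        continue
    seen.add(row["order_id"])
    silver.append({"order_id": row["order_id"], "region": row["region"],
                   "amount": float(row["amount"])})

# Gold: aggregated, analytics-ready (e.g. revenue per region for Power BI)
gold: dict[str, float] = {}
for row in silver:
    gold[row["region"]] = gold.get(row["region"], 0.0) + row["amount"]
# gold == {"EU": 120.5, "US": 80.0}
```

Each layer only ever reads from the one below it, which is what makes the lineage auditable end to end.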
CLI tool powered by YOLOv5 for brand detection in videos. Processes video frames to identify and track brand logos with configurable framerate and confidence parameters. Outputs structured predictions for video analytics. Supports both local videos and YouTube URLs.
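The configurable framerate works by sampling a subset of frames rather than running detection on every one. A minimal sketch of that sampling logic (function name and signature are illustrative, not the tool's actual CLI):

```python
def frames_to_sample(total_frames: int, video_fps: float, target_fps: float) -> list[int]:
    """Indices of the frames to run detection on, sampled at roughly target_fps."""
    step = max(1, round(video_fps / target_fps))  # e.g. 30 fps / 2 fps -> every 15th frame
    return list(range(0, total_frames, step))

sampled = frames_to_sample(total_frames=300, video_fps=30.0, target_fps=2.0)
# → [0, 15, 30, ..., 285]: 20 frames instead of 300
```

Only the sampled frames are passed to YOLOv5, which is why lowering the framerate parameter trades detection granularity for throughput.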
Production-ready chatbot API backend with database migrations using Alembic. Includes comprehensive notebooks for model exploration, training scripts, and structured source code for scalable deployment. Features pre-start scripts and test automation for reliability.
Full-stack application with FastAPI backend and Streamlit frontend for emotion detection. Uses NLP algorithms trained on Kaggle sentiment analysis and Twitter datasets to predict emotions from text. Provides data analysis reports and daily journal functionality with Docker deployment.
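For intuition only, here is a toy keyword-based stand-in for the text-to-emotion step (the real project uses NLP models trained on the Kaggle and Twitter datasets, not this lookup):

```python
import re

# Toy keyword table, invented for illustration -- not the trained model's vocabulary.
EMOTION_KEYWORDS = {
    "joy": {"happy", "great", "love", "wonderful"},
    "sadness": {"sad", "lonely", "cry"},
    "anger": {"angry", "hate", "furious"},
}

def predict_emotion(text: str) -> str:
    """Score each emotion by keyword overlap; fall back to 'neutral'."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {emotion: len(words & kw) for emotion, kw in EMOTION_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

predict_emotion("I am so happy, this is great")  # → "joy"
```

In the actual app, the FastAPI backend exposes this prediction step to the Streamlit frontend, which also renders the daily journal and analysis reports.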
Let's discuss your next data project