leonidlupko

Leonid Lupko

@leonidlupko

Cloud Data Engineer, BigQuery, Snowflake, dbt, Python, ETL

Ukraine
English, Ukrainian
About me
Senior Data Engineer specializing in scalable cloud data platforms and production-ready ETL/ELT pipelines. I help businesses build reliable data solutions using AWS, BigQuery, Snowflake, dbt, Python, and SQL.

Experience includes:
- API integrations
- automated data pipelines
- large-scale web scraping
- data warehouse design
- Power BI backend optimization
- serverless architectures
- incremental processing
- data quality and monitoring

Focused on reliability, scalability, cost optimization, and clean architecture for long-term maintainability....

Skills

Average response time: 1 hour

View my services

Data engineering
I will automate API ingestion into BigQuery with Python
Data ETL
I will fix and optimize your dbt pipelines and SQL models

Portfolio

Work experience

Self-Employed

High-Load Web Scraping Platform (AWS)

Self-Employed • Freelance

Jan 2025 - Present • 1 yr 4 mos

Designed and implemented a scalable web scraping platform on serverless AWS infrastructure. It currently runs in production for price monitoring of 120,000 SKUs on each of 6 websites (0.72M SKUs in total), with reliable change tracking and stable daily execution. The system uses curl_cffi for high-performance requests and integrates with Bright Data to bypass anti-bot protections (Cloudflare, Akamai, DataDome).

Architecture:
- Distributed workers (AWS Lambda / ECS) with SQS queues
- S3-based data lake (raw → normalized → curated)
- Parquet + partitioning
- SQL analytics via Amazon Athena

Scalability:
- 🚀 Designed to scale up to 5M+ pages/day
- Horizontal scaling via queue-based architecture
- Ready for TB-scale datasets

Results:
- ⚡ 200–800 ms average request latency
- 💰 60–85% cost reduction vs browser-based scraping
- 📦 Efficient data pipeline with optimized storage
- 🔍 Athena queries in 2–10 seconds
- 📉 $0.01–$0.20 per query
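As an illustration only, the change tracking and partitioned data lake layout mentioned above might look roughly like the sketch below. It is not the production code: the function names (`content_hash`, `detect_changes`, `partition_key`), the record fields (`site`, `sku`, `price`, `in_stock`), and the `site=`/`dt=` key layout are all assumptions chosen for the example. The Hive-style `key=value` path segments are one common way to let Athena prune partitions.

```python
import hashlib
import json
from datetime import date


def content_hash(record: dict) -> str:
    """Stable fingerprint of the tracked fields of a scraped record."""
    # sort_keys makes the hash independent of dict insertion order
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, scraped: list) -> list:
    """Return records whose fingerprint differs from the last stored one.

    `previous` maps (site, sku) -> hash and is updated in place.
    Field names are hypothetical, not taken from the real platform.
    """
    changed = []
    for rec in scraped:
        key = (rec["site"], rec["sku"])
        digest = content_hash({"price": rec["price"], "in_stock": rec["in_stock"]})
        if previous.get(key) != digest:
            previous[key] = digest
            changed.append(rec)
    return changed


def partition_key(layer: str, site: str, day: date, part: int) -> str:
    """Build a Hive-style object key for one Parquet file (assumed layout)."""
    if layer not in ("raw", "normalized", "curated"):
        raise ValueError(f"unknown layer: {layer}")
    return f"{layer}/site={site}/dt={day.isoformat()}/part-{part:05d}.parquet"
```

With this approach, a daily run only writes rows for SKUs whose price or stock status changed since the previous run, and each file lands under a per-site, per-day prefix.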