leonidlupko

Leonid L

@leonidlupko

Data Engineer, Web Scraping, AWS, GCP ETL

Ukraine
English, Ukrainian
About me
Data Engineer specializing in web scraping, data pipelines, and cloud platforms. I build scalable systems for extracting, processing, and analyzing large datasets with a focus on performance and cost-efficiency.

Expertise:
- Web scraping (Cloudflare, Akamai, DataDome bypass)
- AWS & GCP serverless pipelines
- BigQuery, Athena data warehouses
- API integrations & automation

You get clean data, scalable solutions, and fast, reliable delivery. Let’s build your data solution.

Skills

Average response time: 1 hour

My services

Data engineering
I will automate API ingestion into BigQuery with Python
Data ETL
I will fix and optimize your dbt pipelines and SQL models

Portfolio

Work experience

Self-Employed

High-Load Web Scraping Platform (AWS)

Self-Employed • Freelance

Jan 2025 - Present • 1 yr 4 mos

Designed and implemented a scalable web scraping platform using serverless AWS infrastructure. Currently running in production for price monitoring across 120,000 SKUs on each of 6 websites (0.72M SKUs in total), with reliable change tracking and stable daily execution. The system uses curl_cffi for high-performance requests and integrates with Bright Data to bypass anti-bot protections (Cloudflare, Akamai, DataDome).

Architecture:
- Distributed workers (AWS Lambda / ECS) with SQS queues
- S3-based data lake (raw → normalized → curated)
- Parquet + partitioning
- SQL analytics via Amazon Athena

Scalability:
- 🚀 Designed to scale up to 5M+ pages/day
- Horizontal scaling via queue-based architecture
- Ready for TB-scale datasets

Results:
- ⚡ 200–800 ms average request latency
- 💰 60–85% cost reduction vs browser-based scraping
- 📦 Efficient data pipeline with optimized storage
- 🔍 Athena queries in 2–10 seconds
- 📉 $0.01–$0.20 per query
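
The layered, partitioned S3 layout and the change tracking mentioned above can be illustrated with a minimal Python sketch. It is not the platform's actual code: the `site=` / `dt=` partition scheme, the `PriceRecord` type, and the functions `partition_key` and `detect_changes` are hypothetical names chosen for illustration, and the sketch works on in-memory data only (no AWS calls).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PriceRecord:
    """One scraped price observation (hypothetical schema)."""
    site: str
    sku: str
    price: float


def partition_key(layer: str, site: str, day: date, filename: str) -> str:
    """Build a Hive-style partitioned object key for one data lake layer.

    The layer names follow the raw -> normalized -> curated flow from the
    description; the site=/dt= partition columns are an assumption.
    """
    if layer not in ("raw", "normalized", "curated"):
        raise ValueError(f"unknown layer: {layer}")
    return f"{layer}/site={site}/dt={day.isoformat()}/{filename}"


def detect_changes(previous, current):
    """Return (site, sku, old_price, new_price) for SKUs whose price changed.

    SKUs that appear only in the current snapshot are treated as new
    listings and are not reported as changes.
    """
    old_prices = {(r.site, r.sku): r.price for r in previous}
    changes = []
    for r in current:
        old = old_prices.get((r.site, r.sku))
        if old is not None and old != r.price:
            changes.append((r.site, r.sku, old, r.price))
    return changes
```

Partitioning object keys by site and date lets a query engine such as Athena skip partitions a query does not need, which is one common way to keep per-query scan cost and latency low.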