Ateet Gupta

@ateetjss

Data Engineering, PySpark, Azure, ETL Pipelines, SQL, Python

India
English
About me
I am a qualified data engineer with 12 years of experience in data engineering. My work includes designing and implementing ETL (Extract, Transform, Load) processes to move data between systems, migrating on-premises SQL databases to cloud Delta formats, designing robust ETL pipelines with Azure Data Factory, and writing Technical Design Documents that turn requirements into working solutions within the cloud infrastructure...

Skills

Data engineering consulting
I will integrate all your data and engineer your data pipelines

Portfolio

Work experience

Manager

Capgemini • Full-time

Jul 2024 - Jan 2025 • 6 mos

• Conducted data transformations and aggregations using SQL and Spark to derive actionable insights for business stakeholders.
• Implemented Delta Lake for efficient data storage, enabling ACID transactions and version control for improved data governance.
• Analyzed and optimized data processing jobs to improve performance and reduce execution times through effective resource management and query optimization techniques.
• Improved query performance and data processing efficiency through Spark optimizations, reducing processing time from approximately 5 hours to about 2 hours and increasing throughput.

Senior Manager - Data Science

American Express • Full-time

Jan 2022 - Apr 2024 • 2 yrs 3 mos

• Led a team of data engineers and data analysts in developing and maintaining scalable data pipelines for processing and analyzing large datasets, reducing data processing time by 25%.
• Guided and designed the architecture for PySpark ETL processes that extract data from various sources and load it into the Cornerstone data warehouse with 100% consistency.
• Accelerated the migration of ETL code from Hive to PySpark for better optimization and throughput, reducing the time taken by 30%.
• Collaborated with the Finance Business Team to automate the financial reconciliation process using PySpark and created an on-demand Power BI dashboard, helping the business team reconcile credit card spending data between the Cornerstone database and the IBM TM1 account book.

Senior Data Scientist

Optum • Full-time

Feb 2015 - Nov 2021 • 6 yrs 9 mos

• Collaborated with the Optum Risk team to design and implement solutions for tracking and improving performance on various HEDIS (Healthcare Effectiveness Data and Information Set) measures using PySpark, Azure Databricks, and Azure Data Factory.
• Developed scalable data pipelines using PySpark on Azure Databricks, ensuring efficient processing and transformation of large healthcare datasets.
• Used Azure Data Factory to schedule and automate data pipelines, ensuring seamless data flow from survey responses and healthcare data to the target systems.
• Used Python and Natural Language Processing (NLP) techniques to analyze UHG member survey data, extracting valuable insights from free-text responses.