Sandeep Singh, Senior AI Engineer

TARGET

Location
India - New Delhi
Education
Bachelor's degree, Physics (Hons.)
Experience
10 years, 5 Months


Work Experience

Total years of experience: 10 years, 5 months

Senior AI Engineer at TARGET
  • United Arab Emirates
  • My current job since November 2019

• Led a team of 4 engineers in developing and maintaining a high-performance, distributed Feature Dataset with more than 200 features
• Created and managed data pipelines, involving extracting data from various sources, transforming it into a usable format, and loading it into data storage or analytics platforms
• Designed and maintained a framework for automated ETL processes, ensuring reliable execution of data integration and transformation tasks while reducing the need for manual involvement
• Migrated on-premise data systems to Google Cloud Platform (GCP), ensuring minimal disruption and maximizing efficiency in data storage, processing, and management
• Developed a library of common PySpark functions and deployed it in a virtual environment; it is now used across multiple teams, reducing development time
• Executed data quality measures, data governance protocols, and data validation checks, resulting in a 60% decrease in data errors, thereby enhancing the precision of analyses and decision-making processes
• Designed and maintained data warehouses/data lakes to store structured and unstructured data efficiently, setting up schemas, and optimizing for query performance
• Migrated several CPU-based AI/ML models to GPU using Docker, Kubernetes, and GPU Array, improving efficiency across multiple models and reducing runtime
• Provided technical mentorship to junior team members, conducting code and design reviews, and enforcing coding standards and best practices

Big Data Consultant at XEBIA
  • United Arab Emirates
  • August 2018 to November 2019

• Led the migration of petabytes of unstructured/semi-structured data from legacy systems (Teradata, CR, and Informatica) to AWS
• Created and maintained a data lake housing more than 1 PB of data, facilitating data-informed decision-making for critical business endeavors
• Developed an efficient framework for staging, cleansing, transforming, and loading data using HDP, HDFS, Spark, Hive, and Sqoop
• Optimized multiple batch and stream processing workflows for increased performance and reliability
• Worked closely with the data science team to comprehend their needs and convert data into the necessary formats

Senior Associate at Innodata
  • India - Noida
  • November 2013 to June 2018

• Devised and executed a real-time data pipeline for processing semi-structured data, combining 150 million raw records from over 30 data sources through Kafka and PySpark
• Developed an in-house Python library for parsing and reformatting data obtained from external vendors, resulting in a 7% decrease in the error rate within the data pipeline
• Created various lambda functions for data cleansing and transformation using Scala and the Spark API

Education

Bachelor's degree, Physics (Hons.)
  • at Delhi University
  • April 2012

Specialties & Skills

Data Processing
PYSPARK
AMAZON WEB SERVICES
DATA QUALITY
DATA SCIENCE
GRAPHICS PROCESSING UNIT (GPU)
HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Languages

English
Expert
Hindi
Native Speaker