POOJA SHINDE

Senior Consultant- Data Engineer · Thoughtworks

India

Master's degree, Computer Engineering

Work experience

Total years of experience: 8 years, 4 months

Senior Consultant- Data Engineer

December 2024 - Present

Thoughtworks

Pune, India

• Designed and optimized scalable data pipelines using Databricks to support Gilead's biopharma data needs, enhancing data
accessibility for research and analytics.
• Developed AI-driven search capabilities leveraging Graph Databases, enabling complex relationship mapping and accelerating data
discovery across clinical and research datasets.
• Led critical data migration initiatives from diverse sources to AWS and Databricks, ensuring high data integrity, security, and
compliance with healthcare regulations.
• Architected Delta Lake solutions within Databricks to facilitate real-time analytics and seamless data versioning, supporting research
and operational decision-making.
• Managed and mentored a team of 4 data engineers, driving best practices, optimizing workflows, and ensuring successful project
delivery within Agile frameworks.
• Implemented real-time data streaming using Spark Structured Streaming and AWS Kinesis, enabling faster insights into clinical and
operational data.
• Built secure data solutions by incorporating data encryption, masking, and strict access controls, ensuring compliance with HIPAA
and other data privacy standards.
• Integrated AWS services like Glue, Redshift, S3, and Lambda with Databricks to create seamless ETL workflows supporting both
research and commercial teams.
• Utilized Terraform for infrastructure as code (IaC) to streamline resource provisioning, ensuring scalability and consistency across
environments.
• Optimized Spark jobs and Databricks clusters for performance and cost efficiency, reducing data processing times and cloud
expenses.
• Collaborated with cross-functional teams, including data scientists and healthcare analysts, to align data solutions with Gilead's
evolving biopharma goals.
• Enhanced data lineage and governance using Databricks and AWS Glue, ensuring traceability and accuracy in regulatory reporting.
• Client: Gilead, a leading biopharma company.

Company industry:
IT Services

Sr Solution Engineer

October 2021 - December 2024

Aligned Automation

Pune, India

• Orchestrated Agile ceremonies, streamlined sprint boards, and tactically prioritized tasks to amplify team productivity and ensure
punctual project delivery.
• Directed end-to-end project delivery, meticulously managing stakeholder communication and orchestrating seamless collaboration
between business requirements and technical solutions.
• Oversaw the intricate migration of data from over 35 diverse sources to AWS Data Lake utilizing cutting-edge technologies like AWS
Database Migration Service (DMS) and PySpark, handling various file formats such as Parquet, JSON, and more.
• Architected and automated cloud-based data pipelines, revolutionizing data ingestion and integration processes to support real-time
reporting needs with the utilization of advanced visualization tools like Power BI.
• Engineered secure and efficient data migration solutions leveraging AWS Redshift, Python, and PySpark, while enhancing operational
workflows using Apache Airflow orchestration.
• Guided and mentored a dynamic team of developers, steering them towards project success through effective collaboration,
mentorship, and the implementation of industry best practices.
• Configured and fine-tuned test environments for optimal performance, meticulously identifying and resolving system defects.
• Designed API Gateway pipelines to share data with PowerApps, improving data accessibility and usability for end-users.
• Implemented notification mechanisms using Amazon Simple Notification Service (SNS) and Simple Email Service (SES) to alert
stakeholders about pipeline statuses, enhancing real-time communication and decision-making.
• Developed a decoupled architecture for enhanced scalability and flexibility, ensuring seamless integration and maintenance of data
pipelines, leveraging AWS Virtual Private Cloud (VPC) for secure networking.
• Utilized advanced design techniques and optimization strategies to enhance job performance in Amazon EMR (Elastic
MapReduce), leveraging OOP principles for efficient data processing and analysis.
• Employed data modelling techniques to structure and optimize data storage, retrieval, and analysis processes, ensuring data
accuracy and efficiency in decision-making processes, while adhering to AWS Identity and Access Management (IAM) best practices.
• Implemented advanced data security measures by handling PII data using AWS services and Advanced Encryption Standard (AES)
algorithms, ensuring compliance with data protection regulations and enhancing data confidentiality and integrity.
• Worked on both high-level and low-level design documents, detailing system architecture, data flow, and technical specifications to
ensure clarity and alignment across development teams and stakeholders.
• Managed the transfer of data from AWS data pipelines to Google BigQuery for large-scale data analysis and querying.
This migration optimized data processing and extraction, enabling the derivation of actionable insights
and driving strategic decision-making.
• Spearheaded transformative projects including GT Connect, LRL (Learning and Development), Payer Insurance Provider, and CMS
Middleware, showcasing a proven track record of delivering high-impact solutions in complex and challenging environments.

Company industry:
Software Development

AWS Cloud Engineer - Consultant

October 2019 - January 2021

Saksoft Ltd

Pune, India

• Developed a Data Lake pipeline for Aegonlife on AWS, integrating and processing data from upstream systems like Headless
Manufacture to make it available for downstream systems.
• Conducted data cleaning and transformations on information sourced from various channels and generated over 100 reports tailored
to suit diverse business requirements.
• Handled event-based data in JSON format within our ingestion pipelines, merging and converting it into Parquet format in the staging
pipeline for each entity.
• Utilized Step Functions for workflow management to streamline processes effectively.
• Provided APIs for testing teams to fetch entity status details and generate ad-hoc reports as required.
• Implemented local development endpoints using Docker for querying Parquet data efficiently.
• Developed email solutions for distributing reports to various clients through a Glue job that generates publication reports.
• Restructured existing jobs without altering their external behaviour, leveraging design patterns and object-oriented programming
principles.
• Leveraged AWS Textract to process images, PDFs, and other files provided by clients for report generation.
• Created reports on Quicksight for specific systems using data from varied sources.
• Conducted reconciliation and generated reports for systems to identify and flag incorrect data shared in daily reports.
• Managed Aegonlife group products independently and collaborated directly with the client on requirements gathering,
new development, and issue resolution.
• Created a Kinesis pipeline for logging the entire data flow within the pipeline to validate data relevancy from source to target, ensuring
data integrity and accuracy throughout the process.
• Fuelled a 20% surge in sales of Policies for Aegonlife, igniting team enthusiasm and sparking a new era of customer engagement.
• Leveraged AWS services such as Lambda, Glue, SQS, SNS, Kinesis, DynamoDB, Aurora, S3, and API for efficient data processing,
integration, and management tailored specifically for Aegonlife's business needs.

Company industry:
IT Services

AWS Cloud Engineer

September 2018 - October 2019

Blazeclan

Pune, India

• Designed and developed a cloud-based data lake for reporting and analytics, showcasing expertise in architecting scalable and efficient
data storage solutions on AWS.
• Developed a serverless solution for real-time data flow to enhance sales performance, demonstrating proficiency in leveraging AWS
services for real-time data processing.
• Implemented Python-based Lambda functions for data processing, highlighting skills in utilizing AWS Lambda for serverless computing.
• Executed PySpark jobs on EMR for historical loads, illustrating proficiency in big data processing and analysis on Amazon EMR.
• Implemented Elasticsearch for audit logging with reporting in Kibana, showcasing expertise in utilizing Amazon Elasticsearch Service for
log analytics and visualization.
• Created a UI-based testing automation solution for Data Lake projects, showcasing skills in developing automated testing frameworks
on AWS.
• Spearheaded the development of an internal UI-based testing automation solution at Blazeclan, reducing testing timelines by 50% for
client projects and enhancing customer satisfaction.
• Leveraged Python libraries like Pandas, NumPy, and Boto3 for data processing, demonstrating expertise in utilizing Python for data
manipulation and interaction with AWS services.
• Led transformative projects for clients such as Yoodo and HDFCLife, displaying a track record of delivering high-impact solutions in
complex environments.

Company industry:
IT Services

Software Engineer

November 2014 - January 2016

Persistent Systems Ltd.

Pune, India

• Led the development of a Drupal and Apache Solr-based website for ESakal, an Indian client, incorporating advanced search
functionality with faceting, highlighting, and stop-word removal. Managed Apache Solr configurations in a clustered
environment with ZooKeeper and integrated a blog module in Drupal.
• Engineered an innovative recruitment portal for Persistent that significantly enhanced the hiring experience for both
interviewers and interviewees. The portal included advanced features to track fraudulent activities during interviews and
provided video monitoring capabilities for assessing candidates.
• Collaborated with cross-functional teams, including big data and frontend developers, to ensure seamless integration and alignment
of technologies for efficient project execution.
• Had the opportunity to work closely with experienced leaders in the industry, gaining insights into the future of software
development and honing skills in staying abreast of technological advancements.
• Demonstrated comprehensive knowledge of Apache Solr through hands-on experience in configuring and optimizing search
functionalities for enhanced user experience.
• Completed full-stack training, transitioning from a front-end role to backend development, showcasing versatility in technology
stack implementation.
• Leveraged expertise in technologies like Apache Solr, AWS, Drupal, Java, JavaScript, and ZooKeeper to deliver end-to-end solutions
and drive innovation in projects.

Company industry:
IT Services

Education

Malaviya National Institute of Technology

January 2018

Master's degree, Computer Engineering

India

University of Pune, D.Y. Patil College of Engineering

January 2014

Bachelor's degree, Computer Engineering

India

Skills

DATA ENGINEERING
Intermediate
COMPONENT DESIGN
Intermediate
DATA PIPELINES
Intermediate
AMAZON WEB SERVICES
Intermediate
GLUE LOGIC
Intermediate
LAMBDA CALCULUS
Intermediate
AMAZON S3
Intermediate
AMAZON REDSHIFT
Intermediate
AWS KINESIS
Intermediate
PYTHON
Expert
SPARK
Intermediate
SQL
Expert