Big Data Engineer
NXN Kuwait
Total experience: 10 years, 2 months
Customer Experience Management (CEM):
The business wished to understand sentiment on Zain's social media channels (Facebook, Twitter, Instagram) and the interactions, usage, and habits of its customers.
Responsibility:
• Extracted data from different sources and stored it in HDFS.
• Built a real-time streaming pipeline using Kafka and Spark Streaming.
• Developed ETL logic to aggregate the data on an hourly and daily basis.
• Developed ETL mappings using Informatica BDM to run on the Hadoop cluster.
• Supported machine learning model deployment.
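The hourly aggregation step above can be sketched in plain Python (a minimal sketch of the grouping logic only; in the actual pipeline this ran as Spark SQL on the cluster, and the event records and field names here are hypothetical):

```python
from collections import defaultdict
from datetime import datetime

def aggregate_hourly(events):
    """Group usage events by (hour, event_type) and sum their counts.

    Each event is a dict with a 'timestamp' (ISO-8601 string), an
    'event_type', and a 'count' -- hypothetical field names standing
    in for the real social-media interaction records.
    """
    buckets = defaultdict(int)
    for e in events:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(e["timestamp"]).strftime("%Y-%m-%d %H:00")
        buckets[(hour, e["event_type"])] += e["count"]
    return dict(buckets)

events = [
    {"timestamp": "2019-05-01T10:15:00", "event_type": "post", "count": 2},
    {"timestamp": "2019-05-01T10:45:00", "event_type": "post", "count": 3},
    {"timestamp": "2019-05-01T11:05:00", "event_type": "like", "count": 1},
]
hourly = aggregate_hourly(events)
```

The same grouping, expressed in Spark SQL, lets daily rollups reuse the hourly output instead of rescanning raw events.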
Tool Sets:
Informatica BDM, Spark Streaming, Kafka, Cloudera, Spark, Spark SQL, Hive, Shell Script
Customer Churn Model:
The aim of this project was to build a machine learning model that accurately identifies customers likely to churn in the subsequent year, so that appropriate measures can be taken to retain them.
Responsibility:
• Extracted data from different sources and stored it in HDFS.
• Loaded all required data into Hive tables.
• Developed an Alteryx workflow to create the analytical dataset used as the input data source for the machine learning model.
• Performed exploratory data analysis and feature engineering.
• Supported model building and validation.
• Supported model deployment.
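The feature-engineering step can be sketched as follows (a minimal sketch in plain Python; the real analytical dataset was built in Alteryx, and the column names and derived features here are hypothetical):

```python
from datetime import date

def build_features(usage_rows, as_of):
    """Derive simple per-customer churn features from raw usage rows.

    Each row is (customer_id, usage_date, minutes) -- hypothetical
    columns standing in for the real usage data.
    """
    feats = {}
    for cust, d, minutes in usage_rows:
        f = feats.setdefault(cust, {"total_minutes": 0, "last_seen": None})
        f["total_minutes"] += minutes
        if f["last_seen"] is None or d > f["last_seen"]:
            f["last_seen"] = d
    # Recency is a common churn signal: days since last activity.
    for f in feats.values():
        f["days_inactive"] = (as_of - f["last_seen"]).days
    return feats

rows = [
    ("C1", date(2019, 1, 5), 30),
    ("C1", date(2019, 2, 1), 10),
    ("C2", date(2018, 11, 20), 5),
]
features = build_features(rows, as_of=date(2019, 3, 1))
```

A table in this shape (one row per customer, one column per feature) is the standard input for the downstream model training and validation.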
Tool Sets:
Machine Learning, Python, Alteryx Designer, Alteryx Gallery, MapR, Spark, Hive
Adobe Analytics:
MHHE.com is the website used for purchasing MHE products online. MHE used Adobe products (Adobe Analytics, Adobe Target, and Adobe Experience Manager) to track online activity on MHHE.com. Adobe provides a Data Feed feature that exports data to FTP, so we needed to extract all user activity data from the FTP site into Hadoop and identify how online activity affects sales.
Responsibility:
• Developed a shell script to extract all historical data from the FTP site into the MapR environment.
• Developed Alteryx workflows for data transformation, lookup, and processing.
• Developed an Alteryx workflow to write the transformed data into HDFS.
• Created a shell script to load data from the Hive staging table into the final Hive ORC reporting table.
• Created the final scripts to schedule the workflow in production.
• Set up daily jobs to load data from the FTP site into the Hive reporting table.
• Investigated and resolved data-related issues in the daily jobs.
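The transformation stage above consumes Adobe's tab-delimited data-feed files, whose column names ship in a separate header file. A minimal parsing sketch in Python (the column names and sample row here are hypothetical stand-ins):

```python
import csv
import io

def parse_feed(header_line, data):
    """Parse tab-delimited data-feed rows into dicts keyed by column name.

    In Adobe Analytics Data Feeds the columns are listed in a separate
    header file; the names used below are hypothetical.
    """
    columns = header_line.strip().split("\t")
    reader = csv.reader(io.StringIO(data), delimiter="\t")
    return [dict(zip(columns, row)) for row in reader]

header = "date_time\tpage_url\tvisitor_id"
data = "2019-05-01 10:15:00\thttps://www.mhhe.com/cart\tV123\n"
rows = parse_feed(header, data)
```

Keeping the header external, as Adobe does, means new columns can appear in the feed without breaking existing parsing code.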
Tool Sets:
MapR, MapR-FS, Alteryx Designer, Alteryx Gallery, Hive, Shell Script, Tableau, AppWorx
Customer Complaint Analysis (Consumer Analysis):
McGraw Hill Education is a learning science company; customer complaints about its digital products are received through JIRA. Customer complaint analysis addresses customer dissatisfaction, which can be a critical issue for the customer. In this project we investigated the current sources and causes of customer complaints and sought effective ways of handling them by examining different types of products and issues. A big data platform was used to store the large volume of complaint data.
Responsibilities:
• Developed a Java application to call the JIRA API and retrieve complaint data.
• Developed a Java application to parse the JIRA API response from JSON into CSV files.
• Created a shell script to move the data from the local system to HDFS, and performed data transformations using HQL.
• Moved the data from the Hive staging table to the final ORC reporting table.
• Set up a daily job to extract data from the JIRA API and load it into the Hive reporting layer.
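The JSON-to-CSV parsing step can be sketched as follows (shown in Python for brevity, though the project used Java; the field paths follow JIRA's standard search-API response shape, and the sample issue is hypothetical):

```python
import csv
import io
import json

def jira_issues_to_csv(api_response_json):
    """Flatten a JIRA search-API JSON response into CSV text.

    Reads issues[].key, issues[].fields.summary, and
    issues[].fields.status.name from the response.
    """
    payload = json.loads(api_response_json)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["key", "summary", "status"])
    for issue in payload.get("issues", []):
        fields = issue.get("fields", {})
        writer.writerow([
            issue.get("key", ""),
            fields.get("summary", ""),
            fields.get("status", {}).get("name", ""),
        ])
    return out.getvalue()

sample = json.dumps({
    "issues": [
        {"key": "SUP-1",
         "fields": {"summary": "Login fails", "status": {"name": "Open"}}}
    ]
})
csv_text = jira_issues_to_csv(sample)
```

Using the `csv` module (rather than joining strings by hand) keeps summaries containing commas or quotes correctly escaped before the files land in HDFS.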
TCS - Noida, UP
Certifications:
• ITIL Foundation