Sr Data Engineer - Consultant
Hulu
Total years of experience: 11 years, 9 months
• Working as a Senior Data Engineer; current tasks include resolving data quality issues. The fact source updated subscriber and subscription data, and these changes are being migrated to the Snowflake and Hive databases.
• Campaign 360 project, Phase I: ingest data for customers targeted by email campaigns, including customer behavior, viewing history, profile, and activities.
Phase II: ingest data for customers not targeted by emails (the holdout group) to study their behavior and compare it to that of the targeted customers.
• Campaign 360 project: helped with QA, data profiling, and UAT issues
• Supported the Subscription team as needed
Environment: Hadoop, Snowflake, Talend (data profiling), SQL Server EDW, PySpark
HGST, a Western Digital company San Jose, CA
12/2014 to 1/2019
Worked on a Big Data project: loaded data from Hadoop into Redshift using Redshift commands and the SnapLogic ETL tool.
• Extracted files from Hadoop and dropped them into S3 on a daily/hourly basis
• Wrote code and ETL processes in SnapLogic to load data into Redshift tables
• Created code to bulk load and upsert (insert/update) data using Redshift and SnapLogic
• Built the data model to load the flat files (stage)
• QDN project: built the data model and ETL to move data from DB2 (DW) to Redshift, including initial and incremental loads; implemented logic to check RD and load any new records or update existing ones
• Moved data from MySQL to Redshift, including building the data model and ETL
• Worked on creating the production Data Warehouse in Redshift to load QDN data
• Worked on HICAP to bring data directly from the source instead of S3
Environment: AWS (Amazon Web Services), S3, SnapLogic, Redshift, Linux
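The bulk-load and upsert work described above can be sketched in Python. This is a minimal illustration, not project code: the table names, key column, S3 path, and IAM role are hypothetical placeholders, and the SQL follows the common Redshift staged-upsert recipe (COPY into a staging table, DELETE matching keys from the target, INSERT the staged batch, then TRUNCATE staging).

```python
# Sketch of a Redshift staged upsert: the ordered SQL statements an ETL job
# (e.g. a SnapLogic pipeline or a script) would run for each S3 drop.
# All identifiers below are hypothetical examples.

def build_upsert_sql(target, staging, key, s3_path, iam_role):
    """Return the ordered list of SQL statements for one staged-upsert cycle."""
    return [
        # 1. Bulk-load the daily/hourly extract from S3 into the staging table.
        f"COPY {staging} FROM '{s3_path}' IAM_ROLE '{iam_role}' FORMAT AS CSV;",
        # 2. Remove target rows that will be replaced by newer staged rows.
        f"DELETE FROM {target} USING {staging} "
        f"WHERE {target}.{key} = {staging}.{key};",
        # 3. Insert the full staged batch (new rows plus updated rows).
        f"INSERT INTO {target} SELECT * FROM {staging};",
        # 4. Clear staging for the next load cycle.
        f"TRUNCATE {staging};",
    ]

if __name__ == "__main__":
    for stmt in build_upsert_sql(
        target="dw.subscriptions",
        staging="stage.subscriptions",
        key="subscription_id",
        s3_path="s3://example-bucket/hourly/",
        iam_role="arn:aws:iam::123456789012:role/redshift-load",
    ):
        print(stmt)
```

Running the DELETE before the INSERT makes the whole batch idempotent for a given key: replaying a file updates existing rows instead of duplicating them.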
Working part time with Habitat
Responsibilities
•Understood the new requirements and provided feedback on the data model for the new project
•Provided guidance for the development of the new project
•Supported the ETL team as needed
Environment: Informatica PowerCenter 9.x, Teradata (13, 13.10), SQL Server 2008, Oracle 11g. Modeling tool: ERwin
Built the first DW for Catasys and created all the ETL processes
Responsibilities
•Requirement Document: understood the business rules, gathered the business requirements for the project, and created the project requirements document.
•System Architecture: created the system architecture document for the project.
•Data Warehouse Architecture: created the DW architecture diagram.
•Logical and Physical Data Models: created the logical and physical models for the Data Warehouse.
•ETL Packages: created ETL packages to load the data to stage and then to the DW using SSIS; optimized the SQL Server DB for best performance; performed unit testing and QA.
•Production: copied the DW to create a production copy, then migrated the data and ETL.
•Data Marts: created Data Marts and aggregate tables.
Environment: SSIS and SQL Server DB (2005 and 2008). Modeling tool: MS Visio
Habitat for Humanity International
Responsibilities
•Implemented DW and Informatica best practices
•Developed and unit tested the 1.1 release of the EDW; worked on the GL_ENTRY process and GL sources
•Analyzed requirements and created the ETL Design Document for EDW Release 1.2
•Architected and re-developed the GL_ENTRY process (improved performance and usability)
•Worked on other components for Release 1.2 (GL_ACCT, GL_BUDGET, and others)
•Provided feedback on the DQC process from the DW and ETL standpoint
•Supported other ETL developers
•Performed QA and set up automated processes to provide DW load quality checks
Environment: Informatica PowerCenter 9.x, Teradata (13, 13.10), SQL Server 2008, Oracle 11g. Modeling tool: ERwin
Joined the HealthNet Pharmacy team in May 2005 to architect and develop the Medicare Part D project and other projects assigned to the team.
Defined functional and detailed designs, analyzed systems, and developed solutions for highly complex problems that required extensive investigation.
•Systems Support Analyst from 1998 - 2000: Administered segmentation, aggregation and campaign management tools.
Training courses: Informatica PowerCenter 7.x Advanced Development