Jassim Moideen, Senior Data Scientist & Big Data Program Manager

Vodafone

Location
Qatar
Education
Bachelor's degree, Certificate in Data Mining and Business Analytics
Experience
16 years, 9 Months



Work Experience

Total years of experience: 16 years, 9 months

Senior Data Scientist & Big Data Program Manager at Vodafone
  • Qatar - Doha
  • My current job since May 2015

Currently working as the Senior Data Scientist and Manager responsible for big data activities at Vodafone Qatar, setting up an end-to-end solution for the first time as a greenfield project within the OpCo. Key person for communicating and bridging requirements between the Business and Technology domains: understanding the requirements, providing feasibility studies and budget scoping, and holding steering committee meetings with key CxO board members to track progress and present the roadmap ahead.
The responsibilities include design, specification, evaluation, procurement, outsourcing, governance and support. Primary duties involve defining the scope, specification, technical implementation and delivery of a lambda-based Big Data platform and a bespoke Data Analytics solution and dashboard for business units such as Marketing, CVM, Sales, Brand and Media.

The Big Data platform was designed and implemented using EMC’s Isilon storage solution, VCE’s Vblock compute solution, the Cloudera suite and the Elastic ELK tool suite. The platform ingests data from 30 different source systems as batch and real-time elements and sends feeds for campaigns, as it is integrated with Unica, IBM’s campaign management solution. It now provides the following capabilities:

• Lambda-based batch and real-time processing of streaming data, handling 1 petabyte a day.
• Model deployment for predictive and prescriptive analytics to provide a 360-degree customer view.
• Visualization dashboard for Insights, Locations, Revenue, User Experience, Customer Care, Networks and Telco Immunity, built in-house and showing KPIs on maps using Liferay, D3 libraries and other visualisation elements.
• Integration with IBM Unica Campaign Suite for real time campaigns based on DPI and Location.
• Advanced customer analytics modelling for churn management using ML models via R, Python and SQL (Cloudera Impala).
• Semantic model to provide Social Network Analysis for customer and product affinity.
• Geofencing based on customer location, enabling location-based services and event- and location-based precision marketing.
• DPI-based customer segmentation by Internet usage for smart marketing.
• Financial intelligence that provides a self-service reporting environment and an early warning system, with real-time revenue calculation and deviation alerts via the Telco Immunity Dashboard.
• Care KPI platform to provide analytics to the Customer Care Management for better decision making.
• Retail and Sales Channel Automation enabling Sales & Distribution Management to track the end-to-end movement of inventory and sales analytics information.
• Location Intelligence dashboard providing analytics and insights, displaying important KPIs (churn, sales, revenue, etc.) geographically on the map using standard map APIs and a mapping database.
• Telecom CVM campaign modelling for up-sell, cross-sell and deep-sell based on propensity scoring with IBM SPSS Modeler.
• Social Network Analysis for the telecom customer base using IBM SPSS Modeler, with dynamic visualisation using Gephi.

Guest Faculty - Course : Data Science & Big Data at Indian Institute of Management, Kashipur
  • India - Delhi
  • February 2015 to April 2015

Worked as guest faculty at the Indian Institute of Management Kashipur, teaching core topics of Big Data and methods in Data Science to postgraduate management students as per the academic curriculum. Also aided and consulted on project activities related to the application of big data technology for business.

Principal Data Scientist at EdGE Networks Pvt. Ltd.
  • India - Bengaluru
  • August 2014 to January 2015

Worked as the Principal Data Scientist with a start-up in Bangalore that specializes in semantic recruitment technologies and HR analytics. Led the design and development efforts towards constructing a generic multilingual workforce skills topological framework, based on standardized occupation-specific descriptors from semantic information sources that provide occupational information across industrial domains using semantic networks and ontologies. The framework was intended to rank, score, predict and recommend the right JDs for mathematically matched resumes using the company's proprietary semantic software platform ontology.

Responsible for hiring and mentoring a highly skilled set of development engineers, managing them thoughtfully to deliver and fostering an environment of growth. Led the teams in the design, development, unit testing, implementation and operations of products.

Explored an ensemble of open-source software options to understand and design the graph mining and network analysis architecture, using Apache UIMA, Gephi, graph databases such as Neo4j / OrientDB, Apache Jena, D3.js, Apache Clerezza, Apache Giraph and the Okapi ML Library.

Big Data Architect - Machine Learning & Data Analytics at Positive Bioscience
  • India - Mumbai
  • April 2013 to July 2014

Worked as a technical architect and head of the algorithm team with a cancer genomics start-up in Mumbai, analyzing human genetic data using Big Data and ML.

• Designed and implemented algorithms for NGS pipelines for Cancer Genomics using Hadoop, Spark, R and Python on Cloud platforms like Amazon S3 and EC2. Explored and evaluated the options of data acceleration for the pipeline on a local Hadoop based HPC cluster by using in-memory technologies like GridGain's In-Memory Accelerator for Hadoop stack.
• Designed and modeled a Support Vector Machine (SVM) for cancer SNP detection, trained on various ‘Variant Call Format’ (version 4.0) parameters obtained from the output files of the NGS pipelines, which classifies the detected SNPs into true and false positives with a much higher prediction accuracy in detecting cancer SNPs.
• Applied text mining and machine learning techniques to drive near real-time clinical decision support, using Big Data technologies to manage volumes of unstructured clinical documentation in various formats and employing predictive algorithms to implement collaborative filtering (recommenders), clustering and classification, which in turn would help understand what might indicate an adverse medical factor before it occurs.
• The application set features automated clinical data curation, classifying data from different scientific literature sources as biomedical named entities and creating a weighted network model by employing an ensemble of open-source NLP software stacks (see the toolset listed below).
• Designed and developed algorithms for Bioinformatics in Next Generation Sequencing (NGS) to analyse large sets of whole-genome sequenced genetic data. The algorithms identify, catalog and interpret small genetic variations against a standard reference human genome and assist medical professionals in targeting medication based on preventive medicine methodologies, thereby leading to more specialized and effective medical treatments. The algorithms were developed to run on an HPC-based next generation sequencing platform using the Hadoop MapReduce framework, MongoDB, Amazon S3, Amazon EC2 and Amazon Elastic MapReduce (EMR) for personal and cancer genomics.
• Developed a Hadoop MapReduce program to perform a custom quality check on genomic data. Novel features of the program included the capability to handle file-format and sequencing-machine errors, automatic detection of the baseline PHRED score, and being platform agnostic (Illumina, 454 Roche, Complete Genomics, ABI SOLiD input formats).
• Developed a Hadoop MapReduce program to perform sequence alignment on sequencing data. The program implements a compressed full-text substring index and employs similar advanced succinct data structures such as compressed suffix arrays, for algorithms such as the Burrows-Wheeler Transform (BWT), the Ferragina-Manzini Index (FMI) and the Smith-Waterman dynamic programming algorithm. It used the Hadoop distributed cache, and all MapReduce programs were configured to run on a node cluster (spot instances on Amazon EC2) with Apache Hadoop-1.4.0 to handle the NGS genomics data.
• Developed web-based cloud applications for tracking and data analysis.

Toolset: Apache Tika/RUTA, MAchine Learning for LanguagE Toolkit (Mallet), Deeplearning4j, Stanford CoreNLP (CoreNLP), OpenNLP, NLTK, GATE, Apache UIMA based ytex / clinical Text and Knowledge Extraction System (cTAKES), Maui Indexer, Keyphrase Extraction Algorithm, Gensim, LingPipe, Wikipedia Miner, Illinois NLP, NIF 2.0, SecTag, UMLS/MetaMap, ClearTK, PDFBox, Apache POI, BoilerPipe, LibShortText, Apache Lucene, Lemur Indri.

HPC Analyst Programmer - High Performance Computing- Product Engineering at Wipro Technologies
  • India - Bengaluru
  • May 2012 to March 2013

OS: HP/Tandem NonStop™ Kernel, Windows, Linux.
Programming: C, TACL.
Domains: Fault Tolerant Operating System Internals, NSK Standard Millicode - Itanium/x86 processor software and hardware abstraction layer, Switching Fabrics - ServerNet / Infiniband.

Part of the offshore team contributing to the Hewlett-Packard Consortium for Advanced Scientific and Technical (HP-CAST) computing users group, which works to increase the capabilities of HP solutions for large-scale scientific and technical computing by providing technical solutions and cluster install guides for the most popular Linux solutions, including DBMS products and commercial applications across a range of Linux distribution offerings. Responsibilities included design and implementation of enterprise server and cluster storage systems software. Feature enhancements included porting from the HP-Intel IA-64 to the x86 architecture and the move to an open Infiniband based RDMA switching fabric from the existing ServerNet switching fabric. Ensured 24x7 availability for these applications with no downtime.

Client: Hewlett-Packard.

HPC Systems Engineer - High Performance Computing at Wipro Technologies
  • India - Bengaluru
  • June 2010 to April 2012

OS: Windows, Linux - CentOS
Programming: C, R
Domains: Map Reduce, Big Data, HFT, Low Latency Market data systems, Infiniband RDMA, Clustering communications architecture and implementation, Performance Benchmarking, GPGPU.
Tools: Cloudera Hadoop, MS Dryad, TortoiseSVN, Puppet

Prototyped and set up high-performance Linux computing clusters from bare metal units. Setup, configuration and maintenance covered configuring the node and server hardware over the network using Puppet, operating system installation (CentOS), disk, network and package configuration, installing parallel processing packages, configuring the Hadoop suite and the HDFS filesystem, and performance monitoring using Ganglia. Wrote map-reduce jobs in R on Hadoop via RHIPE for large data sets. Ran rough benchmarks to evaluate cluster efficiency. Findings were presented as an internal client whitepaper.

Part of the analytic team that evaluated a live market data low latency client cluster that required hardware changes and scalability with increased performance requirements. Suggestions involved changes from the use of PCI x based SSD HDD, IA-64 processors to x86, design of Infiniband RDMA verbs.

Clients: Microsoft, SIX Telekurs.

GSM/3GPP Project Engineer - Wireless Telecom Development at Wipro Technologies
  • India
  • April 2008 to May 2010

OS: Windows, RTOS, DMXSEE
Programming: C, SDL (ITU-T Z.100), ASN.1.
Scripting: Shell
Domains: 3GPP based wireless technologies like 2G/3G.
Tools: gcc, make, gdb, emacs, vi, CVS, ClearCase, MS Office, Visio, UltraEdit, DMXSEE Suite
Protocols: SS7, 3GPP - GSM/GPRS/LTE.

Designed and developed various key features for voice calls on TCH and location-based cell broadcast, both on the Abis side of the NSN S15-Flexi BSC and the NSN McBSC. Both are based on the DX200 platform, a popular OS from NSN for the core telecom network that runs the BSC towers for mobile communication.

Client: Nokia Siemens Networks

Apple Certified Engineer - AppleCare Technical at Transworks Information Services Ltd
  • India - Bengaluru
  • August 2007 to March 2008

OS: MacOS Tiger
Programming: C, iLog suite

End-to-end product engineering support covering firmware updates, resolution of post-launch issues of the first iPhone, and field bug fixes to resolve known software and hardware issues for other Apple products such as iPod, MacOS and their applications.


Client: Apple

Education

Bachelor's degree, Certificate in Data Mining and Business Analytics
  • at Indian Statistical Institute
  • July 2014

Data Mining and Business Analytics using R and Minitab

Bachelor's degree, Computer Science and Engineering
  • at University of Calicut
  • January 2007

Computer science engineering with a focus on systems engineering, high performance computing and machine learning

High school or equivalent, Physics, Chemistry, Maths, Electronics, English and French
  • at Christ College - Bangalore University
  • January 2002

Pre-University Course for 2 years. Physics, Chemistry, Maths, Electronics, English and French

High school or equivalent, Secondary School Leaving Certificate (commonly referred to as SSLC)
  • at A.V.Education Society
  • January 2000

Secondary School Leaving Certificate (commonly referred to as SSLC)

Specialties & Skills

Operating System Internals
Pre sales
Software Solutions
Large Scale Systems
Linux Internals
Clustering
Infiniband
Linux server administration
Benchmarking
Machine Learning
Project Management
Agile Scrum Delivery
Data Science
Big Data
Solution Architect

Languages

French
Beginner
German
Beginner
English
Expert

Memberships

IEEE
  • Professional member
  • June 2004
ACM SIGHPC
  • Professional member and owner of the SIGHPC group in LinkedIn Groups
  • November 2011

Training and Certifications

Wipro Hallmark Certification in GSM/ GPRS (Certificate)
Date Attended:
September 2009
Valid Until:
September 2009

Hobbies

  • DIY Open Source Projects
    Raspberry Pi, 3D Printer, Aero-modeling
  • Business Quiz
    Top-ranked in Qatar on various quizzing apps such as QuizUp, and in other university and business quizzes.