(1. Data Scientist: Utilities)
(2. Data Scientist: Marketing)
(3. Data Scientist: Finance)
Our client, a governmental organisation in the local government sector, seeks the services of three Data Scientists to be based at its national office in Pretoria. Each will focus on one of the functions above.
Job Purpose:
The Data Scientist will be responsible for applying data science and advanced quantitative methods, including machine learning, deep learning, artificial intelligence, and predictive analytics, to enable key strategic, tactical, and operational use cases within the domain of finance and human capital, as per The Company’s Digital Strategic Framework. The role is responsible for the end-to-end data lifecycle, from strategic planning and data collection through monitoring, evaluation, and incremental improvement.
In this role, you will enhance The Company’s current and future analytic environment and data platforms, and develop associated data products and services for consumption by both The Company and municipal teams, as well as enhance and develop The Company’s Data-as-a-Service offering. The role further includes identifying key organisational problem statements, opportunities, and user data-experience and journey requirements through data collection, and using a wide range of statistical, machine learning, and applied mathematical techniques to deliver insights and predictive analytics to decision-makers. In addition, you will provide technical advisory and support to senior data teams and decision-makers in municipalities within the domain of water, electricity, and infrastructure management; marketing, communication, strategy, and Natural Language Processing; or finance and human capital, depending on the focus elected.
Responsibilities:
- Support the Chief Digital Officer and Senior Data Architect using a variety of state-of-the-art cloud-based technologies to solve data analysis and prediction problems.
- Identify and act on new opportunities for data-driven business in data science and analytics.
- Recognise when existing solutions can be generalised to solve new problems and address new data-as-a-service verticals.
- Work in a collaborative environment developing data science methods, tools, and algorithms to solve problems.
- Become fluent in analytical modelling using THE COMPANY’s internal data modelling platforms and tools.
- Continuously learn and apply the latest fit-for-purpose open-source and proprietary tools and technologies to achieve results, including some or all of the following:
Cloud
- Microsoft Azure (must)
- AWS
- Google Cloud
Big Data
- MongoDB
- Hadoop
- Cassandra
Machine Learning
- Kubeflow
- TensorFlow
- PyTorch
Business Intelligence/Analytics and visualisation
- Microsoft Power BI (must)
- Microsoft Excel (must)
- Google Analytics (must)
- Adobe Analytics (must)
- Google Charts
- NLTK (must)
- TextBlob
- spaCy
- CoreNLP
- Datorama (advantage)
Languages
- Python (must)
- R (must)
- SQL (must)
Conversational AI
- Dialogflow
- Teneo
- Bot Framework / Bot Builder SDK (ideal)
- Watson Assistant
- Associated tools and technologies as they become available, and the platform evolves
- Load and merge data originating from diverse sources.
- Perform data cleansing and quality management.
- Pre-process and transform data for model building and analysis.
- Troubleshoot data quality issues and work with team members to reach solutions
- Perform descriptive analytics to discover trends and patterns in the data.
- Create visualisations, including dashboards, to provide insights on large data sets and input to finished reports.
- Develop predictive models for business solutions.
- Deploy predictive and other models to production.
- Build and train NLP models
- Analyse output products to assure data quality and conformance to requirements.
- Develop technical specifications for third-party platform data integration and streaming.
- Participate in continuous improvement efforts to increase available data quality and speed of delivery
- Address ad-hoc, domain-specific data analytics requirements from domain or cluster leaders, and continuously deliver user-centric data visualisations, publications, and products.
Outputs
Data Pre-processing and Transformation:
- Demonstrate ability to transform raw data from multiple data sources.
- Ability to understand the business requirements specification and use the suggested template to transform data.
- Ability to understand the differences between data types and transform the data types when required.
- Ability to write ETL jobs and automate pipelines when required.
- Ability to investigate anomalies when loading data into the required data source.
- Ability to source external data sources to be used in machine learning processes.
- Ability to identify and work with the appropriate platform tools for data transformation.
- Ability to explore the data for patterns and trends.
- Ability to translate and reproduce mock-up designs of data analytics dashboards.
Model Building and Deployment:
- Demonstrate ability to translate business use cases into machine learning problems.
- Understand the end-to-end process of building a machine learning model, including data transformations, train/test splitting, and saving and deploying machine learning models.
- Assist the Senior Data Scientist in deploying machine learning models to production.
- Monitor and debug machine learning pipelines in production.
- Ability to identify and work with the appropriate platform tools for machine learning and deployment.
Strategic support:
- Support senior managers and municipal infrastructure and trading services teams to plan and execute agreed performance agreement goals for career development
- Manage the team of data scientists to deliver all strategic and analytical outputs as required
- Work with team members when required to execute data science team goals
- Provide project support on domain specific projects
- Provide feedback to senior managers on work and deliverable progress
- Write assessment reports that include the performance of algorithms used for machine learning use cases
- Identify data science innovation that creates proprietary advantage for the company
- Work with the Senior Data Scientist to identify specific roles to specialise in within the data science team
- Proactively develop data products for emerging use-cases
- Provide advisory and support to municipal data teams.
Requisite knowledge
- ETL
- Machine learning
- Deep learning
- Programming
- Data Modelling
- Database configuration and management
- Data visualisation
- Data analysis
- Predictive analytics
- Agile
- Exposure to financial services, mining, and/or utilities industries (advantage)
- Business acumen
- Ability to develop machine learning tools using Python and R
- Ability to manage both structured and unstructured data using SQL
- Ability to visualise data using various tools
- Ability to model data for prediction
- Ability to manage time and project deliverables
- Ability to communicate with executives and senior management across sectors (THE COMPANY, local government, government, and industry)
Qualifications and experience
- Bachelor’s or Honours degree in Statistics, Mathematics, Applied Mathematics, Physics, Econometrics, Actuarial Science, or equivalent experience
- Master’s degree in Statistics, Mathematics, Applied Mathematics, Physics, or Econometrics (advantage)
- 3+ years of data science and analysis experience
- Proficient in Python, R, and data management technologies
Remuneration
R629k – R820k p/a total cost
How to apply
Please send your CV to Colin Khomeliwa, cv@khomeliwa.com, on or before Monday, 28 February 2022. The job title must appear in the subject line of the e-mail.