Job Description
This individual contributor is primarily responsible for participating in the design and development of data pipelines and automation for data acquisition and ingestion of raw data from multiple data sources and data formats under the guidance of more senior data scientists. This role is also responsible for developing detailed problem statements outlining hypotheses and their effect on target clients/customers; analyzing and investigating data sets and summarizing key characteristics; selecting, manipulating, and transforming data into features used in machine learning algorithms; training statistical models under the guidance of more senior data scientists; deploying and maintaining reliable and efficient models through production; verifying model performance; and working with internal and external stakeholders across domains to develop and deliver statistically driven outcomes.
Essential Responsibilities
Core Responsibilities:
- Pursues effective relationships with others by proactively providing resources, information, advice, and expertise to coworkers and members. Seeks, listens to, and addresses performance feedback; provides mentoring to team members. Pursues self-development; creates plans and takes action to capitalize on strengths and address weaknesses; influences others through technical explanations and examples. Adapts to and learns from change, challenges, and feedback; demonstrates flexibility in approaches to work; helps others adapt to new tasks and processes. Supports and responds to the needs of others to support a business outcome.
- Completes work assignments autonomously by applying up-to-date expertise in subject area to generate creative solutions; ensures all procedures and policies are followed; leverages an understanding of data and resources to support projects or initiatives. Collaborates cross-functionally to solve business problems; escalates issues or risks as appropriate; communicates progress and information. Supports, identifies, and monitors priorities, deadlines, and expectations. Identifies, speaks up about, and implements ways to address improvement opportunities for the team.
Specific Job Duties:
- Develops detailed problem statements outlining hypotheses and their effect on target clients/customers by defining scope, objectives, outcome statements and metrics.
- Participates in the design and development of data pipelines and automation for data acquisition and ingestion of raw data from multiple data sources and data formats under the guidance of more senior data scientists by transforming, cleansing, and storing data for consumption by downstream processes; writing and optimizing diverse SQL queries; and demonstrating a working knowledge of database fundamentals.
- Analyzes and investigates data sets and summarizes key characteristics by employing data visualization methods; and determining how best to manipulate data sources to discover patterns, spot anomalies, test hypotheses, and/or check assumptions.
- Selects, manipulates, and transforms data into features used in machine learning algorithms by leveraging techniques to conduct dimensionality reduction, feature importance, and feature selection.
- Trains statistical models under the guidance of more senior data scientists by using algorithms and data mining techniques; testing models with various algorithms to assess the input dataset and related features; and applying techniques to prevent overfitting such as cross-validation.
- Deploys and maintains reliable and efficient models through production.
- Verifies model performance by demonstrating a working knowledge of a variety of model validation techniques to assess and discriminate the goodness of model fit; and leveraging feedback and output to manage and strengthen model performance.
- Works with internal and external stakeholders across domains to develop and deliver statistically driven outcomes by delivering insights and value from heterogeneous data to investigate problems for multiple use cases; driving informed decision-making; and presenting findings to both technical and non-technical audiences.
Minimum Qualifications
- Minimum two (2) years experience working with Exploratory Data Analysis (EDA) and visualization methods.
- Minimum one (1) year machine learning and/or algorithmic experience.
- Minimum two (2) years statistical analysis and modeling experience.
- Minimum two (2) years programming experience.
- Bachelor’s degree in Mathematics, Statistics, Computer Science, Engineering, Economics, Public Health, or related field AND Minimum three (3) years experience in data science or a directly related field. Additional equivalent work experience of three (3) years for a total of six (6) years in a directly related field may be substituted for the degree requirement. Advanced degrees may be substituted for the work experience requirements.
Additional Requirements
- Targeted Skills: Advanced Quantitative Data Modeling; Applied Data Analysis; Data Extraction; Data Visualization Tools; Machine Learning; Microsoft Excel; Design Thinking; Business Intelligence Tools; Data Manipulation/Wrangling; Data Ensemble Techniques; Feature Analysis/Engineering; Open Source Languages & Tools; Model Optimization; Algorithms
Preferred Qualifications
- At least one (1) year experience in a lead role with or without direct reports.
- Master’s degree in Mathematics, Statistics, Computer Science, Engineering, Economics, Public Health, or related field.
Benefits
- Transportation.
- Life insurance.
- Medical insurance.
- Solidarity association.
- Growth plans.
- Additional days off.