We help companies leverage the power of their data by making sure their employees have the right data skills.
The ability to: 1) Recognize and treat data as an asset; 2) Understand how data quality impacts downstream analyses; 3) Identify analytical use cases for datasets; 4) Profile and summarize data characteristics; 5) Read and interpret data visualizations; 6) Craft insights into narratives; and, 7) Make appropriate decisions based on analysis results.
The ability to: 1) Extract measurable business objectives; 2) Translate stakeholder requirements into an analytics project; 3) Identify data requirements and data availability; 4) Select appropriate data tools; 5) Identify data talent needs and assemble a team; 6) Plan data project phases; 7) Design data user experiences; 8) Set success metrics for models or analyses; and, 9) Anticipate areas of risk in the data product.
The ability to: 1) Build ETL workflows; 2) Design and implement data pipelines; 3) Process (big) data in streams or batches and apply appropriate partitioning; 4) Set up processing graphs with task dependencies; 5) Schedule or automate pipeline jobs; 6) Implement data connections or build data hooks; 7) Establish data pipeline monitoring; and, 8) Apply best practices in data lineage and backfilling.
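The pipeline skills above center on processing graphs with task dependencies (points 2 to 5). A minimal sketch of that idea, using Python's standard-library `graphlib` to order hypothetical ETL tasks (the task names are illustrative, not from any particular tool):

```python
from graphlib import TopologicalSorter

# Hypothetical ETL graph: each task maps to the set of tasks it depends on.
dag = {
    "extract_orders": set(),
    "extract_customers": set(),
    "clean_orders": {"extract_orders"},
    "join": {"clean_orders", "extract_customers"},
    "load_warehouse": {"join"},
}

def run_pipeline(dag):
    """Execute tasks in an order that respects the dependency graph."""
    order = list(TopologicalSorter(dag).static_order())
    for task in order:
        # Placeholder for the real extract/transform/load work of each task.
        print(f"running {task}")
    return order
```

Production schedulers (Airflow, Dagster, and similar) build on exactly this topological-ordering idea, adding scheduling, retries, monitoring, and backfilling on top.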
The ability to: 1) Work with various data structures; 2) Set up and pull data from databases, warehouses and lakes to make it available for analyses or data applications; 3) Implement cloud storage; 4) Select appropriate data storage for structured or unstructured data; and, 5) Navigate and apply tools in the Hadoop Ecosystem.
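Point 2 above — pulling data out of a store and making it available for analysis — can be sketched with Python's built-in `sqlite3` standing in for a warehouse (the `sales` table and its contents are invented for illustration):

```python
import sqlite3

# In-memory SQLite database standing in for a data warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 100.0), ("US", 250.0), ("EU", 50.0)],
)

def totals_by_region(conn):
    """Pull an aggregated view of the table for downstream analysis."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())
```

The same pattern — connect, query, hand a tidy result to the analysis layer — carries over to cloud warehouses; only the driver and connection string change.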
The ability to: 1) Read in data from a variety of data sources; 2) Select relevant data subsets; 3) Format and reshape data; 4) Clean and treat outliers or missing data; 5) Combine multiple datasets into one analytical base table; 6) Aggregate data to the level needed for analysis; 7) Apply numerical and categorical transformations; 8) Perform feature engineering or enrichment; and, 9) Adapt wrangling processes to big data scale.
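A small example of point 4, treating missing data, using only the standard library (the records and the median-imputation choice are illustrative assumptions, not a prescribed method):

```python
from statistics import median

# Toy records with a missing age value.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 28},
]

def impute_age(records):
    """Replace missing ages with the median of the observed values."""
    observed = [r["age"] for r in records if r["age"] is not None]
    fill = median(observed)
    return [
        {**r, "age": r["age"] if r["age"] is not None else fill}
        for r in records
    ]
```

Median imputation is just one option; whether to impute, drop, or flag missing values is itself a wrangling decision that depends on the downstream analysis.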
The ability to: 1) Profile data structures, types and metadata; 2) Describe the data with statistical measures; 3) Analyze data distributions and relationships; 4) Interpret exploratory visualizations; and, 5) Translate exploratory analysis results into recommendations for data processing or modeling approaches.
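Point 2 above — describing data with statistical measures — reduces to a small profiling function. A sketch using Python's `statistics` module (the chosen set of summary measures is an assumption; real profiles typically add quantiles, missing-value counts, and type checks):

```python
from statistics import mean, stdev

def profile(values):
    """Summary statistics for a single numeric column."""
    return {
        "n": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }
```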
The ability to: 1) Apply statistical methods to test hypotheses; 2) Select the appropriate model or analysis type and method given objectives and data inputs and outputs; 3) Tune models to achieve desired performance while balancing bias and variance; 4) Select appropriate model evaluation metrics and use them to compare algorithms; 5) Estimate a model’s ability to generalize to new, unseen data; and, 6) Interpret model outputs and anticipate the impact of implementation.
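Points 4 and 5 above rest on two mechanical building blocks: holding out data to estimate generalization, and scoring predictions with a metric. A minimal pure-Python sketch (the split fraction, seed, and accuracy metric are illustrative choices):

```python
import random

def train_test_split(data, test_frac=0.2, seed=42):
    """Hold out a fraction of the data to estimate generalization."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Accuracy is only appropriate for balanced classification problems; metric selection (precision/recall, AUC, RMSE, and so on) is exactly the judgment call point 4 describes.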
The ability to: 1) Design deployment systems; 2) Set up machine learning project code bases; 3) Integrate data pipelines into deployment infrastructure; 4) Select appropriate deployment frameworks; 5) Package machine learning models into robust environments; 6) Apply continuous integration and continuous deployment best practices; 7) Conduct data, model and infrastructure tests; 8) Manage model and data versioning; 9) Set up monitoring and alerts; 10) Work with APIs and serverless architectures; 11) Select appropriate deployment hardware; and, 12) Debug and tune production systems.
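One concrete slice of model and data versioning (point 8) is content-addressing: deriving a deterministic identifier from a model's parameters so that identical artifacts always get the same version and any change produces a new one. A minimal standard-library sketch (the hyperparameter dict and 12-character hash length are illustrative assumptions):

```python
import hashlib
import json

def version_artifact(model_params):
    """Derive a deterministic content hash to version a model artifact."""
    # sort_keys makes the serialization canonical, so the same parameters
    # always hash to the same version regardless of dict insertion order.
    blob = json.dumps(model_params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]
```

Tools like DVC and MLflow apply the same content-hashing principle to full datasets and serialized model files rather than parameter dicts.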