Are QuantHub’s Data Science Skills Tests Valid and Unbiased?
The data science and tech industries have talked a great deal about the need to reduce “unconscious bias” in the hiring process in order to promote diversity and inclusion in the field.
Research shows that the data science and broader technology fields are particularly notorious for a lack of both gender and ethnic diversity in organizations and teams.
US Census data shows that African Americans, Hispanics, and women are consistently underrepresented in tech fields, with African Americans representing just 6%, Hispanics 7%, and women about 25% of the STEM workforce[i].
The figure below also shows no real signs of change in this trend, with ethnic minorities making up less than 12% of students in data science studies.
The irony is that the tech and data science industries have been shown to thrive on, and depend on, a diversity of skills, ideas and personalities to promote innovation and to retain competitive positioning.
As more and more evidence emerges that demonstrates the financial and organizational value of promoting diversity in teams, structured and objective interview techniques are becoming a mainstay of inclusive hiring practices.
Online AI-driven skill assessments like QuantHub are one such interview technique.
Assessments are being used to test all manner of technical skills, cultural fit and other candidate characteristics. Critics of these tests suggest that the companies, and more specifically the programmers, who create assessments are building both their own and society’s inherent biases into the test algorithms, structure and content.
As a provider of technical skills assessments, we had to ask, “Is this true for QuantHub?”
We believe that one of the most significant value-adding features of QuantHub’s skill assessment platform is its ability to compare the technical skills of hundreds of candidates “blindly” and objectively without bias. The platform must behave objectively in order to provide productive structure to the hiring process.
QuantHub enlisted a team of organizational psychologists and experts who have extensive experience designing online assessments and employee selection processes to independently assess its testing platform for bias against any group of job candidates.
What follows is a summary of this expert study, including the results of an analysis for bias in the platform.
Bias Study Background
Hiring data scientists and analytical talent is a time-consuming, complicated and highly technical process. It varies tremendously from one company to the next and from one job role to the next. As a result of this complexity and variation, candidate selection standards may become relaxed and/or biased.
QuantHub was created as a tool to help companies confidently and thoroughly screen data science candidates on technical skills in a way that addresses the complexity and urgency of this hiring process.
QuantHub measures job-related skills in statistics, programming, data exploration, data wrangling and modeling. The application is designed to “screen out” poorly qualified candidates. It does this by using item response theory to vary the difficulty of skill test questions presented to candidates based on the particular applicant’s pre-assessment skill level. Test question skill levels range from 1 (least sophisticated) to 5 (pro).
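To make the idea of adaptive difficulty concrete, here is a minimal sketch of how an item response theory model can drive question selection. It is an illustration only, not QuantHub’s implementation: the article does not say which IRT model the platform uses, so the sketch assumes the simple one-parameter (Rasch) model, treats the skill levels 1 to 5 as positions on the ability scale, and uses a hypothetical fixed-step update rule. The function names `p_correct`, `next_level` and `update_ability` are invented for this example.

```python
import math


def p_correct(ability, difficulty):
    """Rasch (1PL) model: probability of a correct answer.

    Assumption: ability and difficulty share one scale (here, levels 1-5).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def next_level(ability, levels=(1, 2, 3, 4, 5)):
    """Pick the question level closest to the current ability estimate."""
    return min(levels, key=lambda level: abs(level - ability))


def update_ability(ability, difficulty, correct, step=0.5):
    """Move the estimate up after a correct answer and down after a miss.

    The size of the move depends on how surprising the outcome was
    under the model. The step size is a hypothetical tuning value.
    """
    observed = 1.0 if correct else 0.0
    return ability + step * (observed - p_correct(ability, difficulty))


# Hypothetical session: start from a pre-assessment estimate of 3.0
# and process three answers (correct, correct, incorrect).
ability = 3.0
for answered_correctly in (True, True, False):
    level = next_level(ability)
    ability = update_ability(ability, level, answered_correctly)
    print(f"asked level {level}, new ability estimate {ability:.2f}")
```

In this sketch a candidate who keeps answering correctly is served progressively harder questions, while a miss pulls the estimate back down.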
To date QuantHub has been used to assess over 7,500 candidates.
We wanted to make sure that hiring managers can use QuantHub assessments as an effective resource for finding qualified candidates as opposed to being a biased barrier that eliminates potential talent.
We engaged Blankenship & Seay Consulting Group, expert industrial and organizational psychologists, to conduct an independent review of QuantHub skill assessment tests. Their goal was to determine the platform’s validity and lack of bias from a skill testing and demographic standpoint.
Job Description and Skill Testing Review
This part of the evaluation consisted of an examination of all QuantHub company content, a review of the Occupational Information Network job description database, and internal expert interviews.
The purpose of this examination was to ensure that job descriptions used by QuantHub, such as data analyst intern, data engineer and data scientist, were accurately defined and that the skills tested were appropriately matched to the tasks, knowledge and abilities associated with each of these roles.
The exercise aimed to determine that a valid link exists between the skills measured on the QuantHub assessment and those required on the “real world” job for which a candidate is being tested.
Item Review and Content Validity
The purpose of this review was to ensure that QuantHub’s assessments measure a candidate’s understanding of each of the various data science tasks, rather than simple knowledge or terminology.
The content validity review was executed in line with the Uniform Guidelines on Employee Selection Procedures. The review assessed test items and question content to ensure a valid link between the essential parts of a job’s duties, skills, abilities and personal characteristics and the candidate selection procedures and outcomes.
Demographic Adverse Impact Analysis
The issue of demographic bias within standardized tests and the hiring process, in general, is a common and ongoing concern. QuantHub was therefore especially determined to ensure that its technical skill assessments are appropriate and treat various demographic groups fairly.
Experts conducted an “adverse impact analysis” across ethnic groups and genders to determine whether there was any bias or other test factor that treated one or more groups unfairly. Adverse impact exists when there is a substantially different rate of selection in hiring or other employment decisions that works to the disadvantage of members of a certain race, sex or ethnic group.
In the case of QuantHub tests, adverse impact would be demonstrated if certain groups have a much higher passing (i.e. selection) rate than other groups for the overall test or for specific skill tests.
Adverse impact scores were computed for overall candidate test scores as well as for each skill tested separately on the QuantHub platform.
Summary of Study Results
1. QuantHub’s assessment platform measures what it is intended to measure.
- QuantHub test questions are evenly distributed between testing a candidate’s conceptual understanding and knowledge and testing their ability to apply a concept in practice. The platform also scores a candidate’s actual skill in an area, rather than simply counting right and wrong answers.
2. QuantHub’s test content is valid and appropriate for data science skill testing.
- QuantHub’s process for having test content developed by a large and diverse group of highly experienced data science professionals with relevant knowledge and experience was determined to be sufficient to ensure content validity.
3. QuantHub’s testing does not present adverse impact for any racial, gender or ethnic group.
- The pass rate for men was only slightly higher than for women.
- African Americans and Hispanic testers scored almost as well as white candidates, while Asians scored slightly better than white candidates.
- There was no adverse impact on any group for specific skill subset tests.
Below is a summary of the adverse impact test results for overall QuantHub test scores. Note that a “passing” score was 2.5. Only candidates who passed the test were considered for each of the groups to calculate an adverse impact ratio.
Adverse impact is defined by the “80% rule”. That is to say, adverse impact is indicated when a minority group’s pass rate is less than 80% of the majority group’s pass rate.
For example, in the case of male vs. female candidates, 72% of males and 68% of females passed the test. The female pass rate is therefore 94% of the male pass rate (68 ÷ 72). This is well above the 80% threshold, is not a significant difference in pass rates, and therefore does not demonstrate any adverse test impact on women.
|Candidate gender||% of candidates that “passed” (minimum score of 2.5 out of 5)||Adverse impact ratio (equal to or greater than 80% = no adverse impact on group)|
|40 and over||67%||94%|
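The 80% rule calculation can be sketched in a few lines. This is an illustrative example rather than the consultants’ actual analysis code: the function names are hypothetical, and the only data used are the male and female pass rates quoted in this article.

```python
def adverse_impact_ratio(group_pass_rate, reference_pass_rate):
    """Ratio of a group's pass rate to the reference group's pass rate."""
    if reference_pass_rate <= 0:
        raise ValueError("reference pass rate must be positive")
    return group_pass_rate / reference_pass_rate


def shows_adverse_impact(ratio, threshold=0.80):
    """Under the 80% rule, a ratio below the threshold indicates adverse impact."""
    return ratio < threshold


# Pass rates quoted in the article: 72% of men and 68% of women passed.
ratio = adverse_impact_ratio(0.68, 0.72)
print(f"Adverse impact ratio: {ratio:.0%}")
print("Adverse impact indicated:", shows_adverse_impact(ratio))
```

With these inputs the ratio rounds to 94%, which is above the 80% threshold, so no adverse impact is indicated for women.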
What do these results mean for data talent recruitment and hiring?
As the candidate pool for data science-related roles evolves and expands, it will become increasingly important for companies to confidently vet large numbers of candidates with diverse experiences in an efficient, yet unbiased manner.
Overall, the organizational psychology experts at Blankenship & Seay concluded that QuantHub is:
“doing a good job of doing what it was designed to do – screen out poor candidates. It provides an accurate measure of the skills required for data scientists with little to no adverse impact to minority candidates.”
This is good news for recruiters, Human Resources managers, hiring managers and analytics leaders looking to create a diverse team of individuals with strong skill sets in an efficient, validated manner.
QuantHub skill assessments are one piece of the complex technical hiring and vetting process. They can be used to create a more structured, unbiased hiring process that gives all potential candidate groups a fair shot at demonstrating their skills, or lack thereof.
[i] US Census Bureau, American Community Survey (September 2013).