For Part 2 of QuantHub’s webinar series, we were joined by three leaders in the Data Science field to discuss building teams, how to avoid working in silos, the importance of getting the right skills on the team, and how to develop a culture of excellence and continued collaboration. Raja Chakarvorty, the VP of Data Science at Protective Life; Jacob Kosoff, the Head of Model Risk Management & Validation at Regions Bank; and Nathan Black, the Chief Data Scientist at QuantHub, have many years of experience in this field. They generously shared their varied perspectives on these issues based on their teams’ needs and organizational priorities. Read more about each panelist here. The video of our webinar is below, but for those of you who are short on time, we’ve summarized 7 major issues discussed on the panel and some key takeaways.
Where Analytical Teams Should Reside in an Organization
There’s a lot of discussion in the industry about how data science teams should be structured within the organization. According to our panel, the short answer to where analytical teams should reside is, “it depends”. For Regions, being a financial services company, there is strong quantitative talent spread throughout the organization. As Jacob Kosoff explained, “There are 35 areas of the bank that actually deploy quantitative talent…There is always regular dialogue on the best processes or best kind of setup or architecture for where to deploy quantitative resources and which type of quantitative resources…that’s an ongoing conversation.”
Nathan Black cautioned that wherever the analytical team resides, there are risks to manage, such as underutilization of a centralized team or disconnection of a distributed team. He has observed that companies are looking for a balance, noting, “I’ve seen a lot of companies that are trying to balance between a centralized analyst team…versus a hub and spoke type of model.” He added that we are “seeing a movement toward creating physical environments where people actually collaborate…focusing on, ‘Okay, how do we actually make this work to answer the business question?’”
Required Technical Skills for Data Scientists
When the panelists were asked about the technical (or hard) skills required to do the job of a pure Data Scientist, their answers essentially boiled down to the primary task of a Data Scientist, which Raja Chakarvorty described as “writing a mathematical equation”. That is to say, writing an equation and using an algorithm to solve a business problem. Raja added that a Data Scientist also needs an understanding of predictive modelling and best practices, the ability to test and evaluate whether a model is good, and the ability to choose an appropriate algorithm.
At Regions, Jacob emphasized, “No one is perfect.” Regions looks for hard skills in three areas, while recognizing that no one candidate will excel in all three. Rather, Regions looks for strengths in one or two of the three hard-skill areas, and an ability and curiosity to grow in weaker areas. Those three areas are: IT skills (i.e., programming), math and stats, and finally domain knowledge (i.e., banking).
3 Core Skills Needed for Any Analytical Role
Throughout the conversation about necessary skills, no matter what the role, all three panelists kept emphasizing the need for 3 basic skills: curiosity, problem solving and critical thinking. As Raja put it when he described what a Data Scientist needs in addition to core programming skills, “…how you solve a problem. These things become the key basic ingredients which I look for when I’m actually hiring people and I’m trying to develop a data team.” Homing in on the need for problem solving, Nathan added, “A lot of skills and technologies are changing, but at the end of the day what are you trying to do? You’re trying to solve a problem using a data driven approach.”
The Difference between Data Engineer, Data Scientist and Data Analytics Roles
In general, the panel agreed that the role of the Data Engineer is, as Raja put it, “building that raw material and presenting it to the Data Scientists for consumption…preparing the data, developing the foundation data required, and presenting it to the Data Scientist.” Data Scientists then use the data to solve a business problem using either supervised or unsupervised learning or some kind of algorithm. As for the broader role of Data Analyst, Raja described it as, “providing a quick answer to the business using some simple matters, maybe a simple linear regression or a univariate analysis.”
That said, Nathan pointed out that companies often have an idea of what they want when they are hiring, with certain analytical roles and associated skillsets in mind, but at the end of the day the analytical person hired ends up doing essentially whatever is needed to make their role work, whether it’s data engineering, data science or simpler data analysis. He has observed, “people tend to form into the actual responsibilities that are required. So, they may come in with heavy math and stats…but then they’re asked to go grab the data or build a data pipeline to actually analyze.” Likewise, Nathan has observed that a person may be brought in to work on big data only to realize that the data needs to be broken down into smaller subsets to make it workable.
The Importance of Team Diversity and How to Achieve It
A big challenge of building data science teams is ensuring that all the skill sets needed to do the job are covered. Raja pointed out that how you do this depends on where you are in the process of building your team. He suggested that diversity is not always possible when you are just starting out. In that case, he recommends looking for a “jack-of-all-trades” in order to start addressing business problems and establish the team’s reputation within the organization.
To create diversity of skills on his team, Raja uses an approach which he dubbed “cross pollination”. He explained, “I am a big fan of having a diverse team and then cross pollinating. Let’s say I hire five people. Every individual has five skills with one common to the others and then, once they are on the team, all the five cross pollinate. That way you grow everyone on the team.”
At Regions, Jacob explained, they hire and manage with the goal of achieving diversity. He admitted, “It’s very rare to find an individual that has all the skill themselves to complete a project.” Jacob described how his associates have many different individual roles and skills (pulling data, building data platforms, implementing models) but said that “together they are problem solving.”
How to Break Down Silos and Operationalize Data Science
All three panelists agreed that the focus of the Data Science team must be on building trusting relationships within the organization in order to be effective and relevant. Jacob described how this is done at Regions: “the people on our team know the people in IT. They know the people in operations centers…proven competence plus integrity plus relationships equal credibility.” He added that data teams need to involve their business partners early in a project and build trust.
Raja added that the culture at Protective Life is very transparent and collaborative. He described it as a “360-degree exposure”. He tells his Data Scientists that, of their job duties, “80% is actually meeting with the business partners, understanding the business problem, taking the requirements to do the analysis to build the model, and then, working with IT and taking it to the logical conclusion, which is deployment.”
The Million Dollar Question: Do You Need a Master’s or PhD to be a Data Scientist?
At QuantHub, we’ve previously looked at the issue of whether data science recruits need a Master’s degree or higher. Many would-be candidates out there will be pleased to hear that the answer from our panel was, in a word, “no”. For Regions Bank, diversity trumps degrees. “We purposefully want a distribution of people with bachelor’s degrees, people with a distribution of masters, people with a distribution of PhD’s, and within that, even with the bachelors, we want a distribution of skills and a distribution of degrees,” Jacob explained. He added that Regions highly values people who are self-taught, as well as people who perhaps don’t have a Master’s but who know where they want to grow in the future and who are curious and committed to continuous learning.
That said, Raja pointed out that for hiring managers who have to review hundreds of resumes and quickly shortlist people, choosing people with a Master’s or PhD is the safest and easiest approach because “at least you are sure that person went through the coursework, did those fundamentals.” In his opinion, screening for a Master’s degree “just becomes a hiring tool.”
The Bottom Line
It’s clear from our panel discussion that there are a few key steps to building a successful data science team: recruiting a diverse team whose skill sets complement each other; building strong, collaborative relationships within the organization; and hiring analytics professionals who possess the three core skills of curiosity, problem solving and critical thinking, along with a commitment to continued learning. We’d like to thank our panelists for providing their practical and professional words of wisdom, and we invite readers to leave comments and thoughts after reading this or watching the video.
A full version of the webinar transcript, which contains more detailed discussions and covers additional topics, can be downloaded here.
Stay tuned for QuantHub’s next webinar, Part 3 of our series: Top 5 Pitfalls to Avoid When Developing Your Data Science Team. We’ll be covering the implications of bias and inadequate model validation processes for data science team performance and results.
More about our Panelists
Raja Chakarvorty is a Data and Analytics leader with over a decade of experience in the Insurance domain developing and implementing Data and Analytics strategy. He has delivered numerous data science projects with a significant positive impact on top and bottom line. Raja believes in building institutions and contributing to organizations by providing data-driven solutions to complex business problems. Raja’s niche is building Data Science teams and the foundational capabilities organizations need to support a data science program and generate a positive ROI. In his current role, Raja heads the Data Science program at Protective Life. He has previously worked as the Head of Personal Lines Data Science at Hanover Insurance Group and as a Manager of Research & Analytics at Travelers Insurance. Raja holds a PhD in Physics from the University of Notre Dame.
Jacob Kosoff is the Head of Model Risk Management and Validation (MRMV) at Regions Bank, serving in this role since May 2014. As the Head of Model Risk Management and Validation, Jacob is responsible for the management of the model governance and model validation teams and for overseeing the governance and validation for all models and analytical tools at Regions Bank. Jacob served as the Model Governance Manager at Regions prior to his current role.
Before joining Regions, Jacob served in multiple leadership roles at PNC in Pittsburgh, PA for 4½ years, including in Credit Review and Model Risk Management. Before PNC, he served as a model developer and senior economic analyst at Freddie Mac in McLean, Virginia for 3 years. Before Freddie Mac, he worked for 3 years at Genesis Analytics as an economic analyst and consultant, developing and implementing models for various banking clients in South Africa. Jacob has also served as a lecturer in the School of Economics and Business Sciences for 3 years at the University of the Witwatersrand, Johannesburg.
Jacob has a Master of Commerce Degree in Economics from the University of the Witwatersrand and a Bachelor of Science with Honors in Economics from Pennsylvania State University. Jacob currently serves on the Board of Directors for the Levite Jewish Community Center (LJCC) in Birmingham, AL. Jacob resides with his wife and three children in Birmingham, Alabama.
Nathan Black is a Data Science Professional and AI Researcher with over 5 years of experience leading and working alongside quant teams to develop cutting-edge, end-to-end data solutions in manufacturing, healthcare, food retail, finance, and education industries. Nathan has a proven track record of using data to help people thrive, assisting organizations in capturing value from data and technology through the deployment of BI, Prescriptive Modeling, and Artificial Intelligence applications.