Despite their expertise in data science, most students lack ethical knowledge
Undergraduate data science programs place a greater emphasis on math, statistics, and computing.
What's on the line:
According to our study, undergraduate training for data scientists, whose profession Harvard Business Review called the sexiest job of the 21st century, falls short of preparing students for the ethical use of data science.
Data science combines computer science and statistics and applies them to specific fields such as astronomy, linguistics, medicine, or sociology. The idea behind data crunching is to use large amounts of data to solve complex problems, such as how healthcare professionals can develop personalized medicine based on a patient's genetics and how businesses can predict purchases based on customer behavior.
The US Bureau of Labor Statistics predicts a 15% increase in data science careers between 2019 and 2029, and demand for data science education is growing along with it. Colleges and universities have responded by developing new programs or revising existing ones. The number of undergraduate data science programs in the United States increased from 13 in 2014 to at least 50 by September 2020.
As data science teachers and researchers, we were prompted by the growing number of programs to investigate what is and is not covered in undergraduate data science training.
In our study, we compared undergraduate data science curricula with the National Academies of Sciences, Engineering, and Medicine's expectations for undergraduate data science training. These expectations include ethics instruction.
We discovered that while most programs devote significant time to math, statistics, and computer science, they provide little training in ethical concerns such as privacy and systemic bias. Only half of the degree programs we looked at included an ethics course.
Here's why it's important:
Like any powerful tool, data science can be used ethically only by people trained in how to apply it and in understanding its effects. Our findings are consistent with previous research that found little emphasis on ethics in data science degrees. This suggests that undergraduate data science programs may produce a workforce lacking the education and experience needed to apply data science methods ethically.
It is not difficult to find examples of careless use of data science.
For example, policing models with bias built into their data may result in a heavier police presence in areas that were already overpoliced. Another example is that algorithms used in the United States healthcare system are biased in such a way that Black patients do not receive the same care as white patients with similar needs.
We believe that explicit training in ethical behavior will better prepare a socially responsible data science workforce.
The National Academies of Sciences, Engineering, and Medicine recommend training in ten areas, including ethical problem-solving, communication, and data management. Our study focused on undergraduate data science degrees at R1 institutions, which engage in high levels of research activity.
Future research should look into the extent of training and preparation in various aspects of data science at the master's and Ph.D. levels, as well as the nature of data science instruction at universities with different levels of research activity.
Because most data science programs are brand new, there is an opportunity to compare the instruction students receive to the expectations of employers.
We plan to expand our research by looking into the pressures driving curriculum development in other disciplines experiencing similar job market growth.