Weapons Of Math Destruction

✍️ Cathy O'Neil

Tags: cathy-oneil , data-science , woman , lang-en

Critical assessment of algorithms is essential for good Data Science

The book introduces the concept of weapons of math destruction (WMDs), which are data-based models with the following characteristics: 1) they are opaque, i.e., it is not easy (and sometimes impossible) to explain why a model gave a specific score to an individual; 2) they are pervasive, i.e., in certain areas (insurance and credit scores, for example) it is difficult to avoid being subjected to a WMD; 3) they usually create a damaging, self-reinforcing feedback loop, i.e., it is difficult for individuals to recover from a bad WMD score. The score ‘counts against’ their record and is very likely to further hinder their chances of future success.

The author, Cathy O’Neil, has had a remarkable trajectory, from academia to financial markets (where she witnessed WMDs being deployed first-hand) to activism and advocacy for ethics in data science. She clearly writes with a bias, but to be fair, it does not compromise the message that data-driven models should be fair, accountable, and transparent.

O’Neil shows several examples of WMDs used in our day-to-day lives, such as:
- the US News university rankings (exacerbate the high tuition problem in American universities)
- targeted ads for predatory loans (exacerbate the student loan problem)
- e-credit scores (using inadequate variables for the sake of ‘big dataism’ to compute credit scores)
- recidivism models (bad variable selection: using zip codes automatically targets people from rough neighborhoods).

I would add that the uncritical use of algorithms and machine learning is already a problem in science. Good methodology, analysis and auditing are essential.

The book does not provide concrete solutions. Only general guidelines are proposed, as each problem has its own peculiarities. Nevertheless, there are some common-sense recipes that should be followed: choose the evaluation metrics wisely, remember that more variables do not imply better and/or fairer models, and gather feedback to tweak the model.
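
To make the first recipe a bit more tangible, here is a minimal sketch of why the choice of metric matters. It is my own illustration, not something taken from the book: the records are made up, the groups "A" and "B" are hypothetical, and the function names are mine. The idea is simply that a single overall accuracy figure can look acceptable while the false positive rate (people wrongly flagged) differs a lot between groups.

```python
from collections import defaultdict


def accuracy(records):
    """Share of records where the prediction matches the actual label."""
    records = list(records)
    correct = sum(1 for _, actual, predicted in records if actual == predicted)
    return correct / len(records)


def false_positive_rate_by_group(records):
    """Per-group share of actual negatives (0) that were predicted positive (1).

    records: iterable of (group, actual, predicted) tuples with 0/1 labels.
    Groups with no actual negatives are left out of the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, actual, predicted in records:
        if actual == 0:
            negatives[group] += 1
            if predicted == 1:
                false_positives[group] += 1
    return {group: false_positives[group] / negatives[group] for group in negatives}


# Made-up toy data for illustration only: (group, actual, predicted).
toy_records = [
    ("A", 0, 0), ("A", 0, 0), ("A", 0, 0), ("A", 0, 0), ("A", 1, 1),
    ("B", 0, 1), ("B", 0, 1), ("B", 0, 0), ("B", 0, 0), ("B", 1, 1),
]

print("overall accuracy:", accuracy(toy_records))
print("false positive rate by group:", false_positive_rate_by_group(toy_records))
```

On this toy data the overall accuracy is 0.8, yet nobody in group A is wrongly flagged while half of the actual negatives in group B are. Which metric is the right one still depends on the problem, which is exactly why the book stops at general guidelines.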

‘Weapons of Math Destruction’ is a recommended read for any data scientist.