Written by Felix Burn-Forti, Managing Consultant at 4most
Machine Learning has been one of the hot topics in finance over the last few years, with benefits observed in most areas – many large institutions have prototyped and implemented techniques across decisioning, strategy optimisation, and fraud. The other key area in which Machine Learning can bring significant benefits is automation.
Over the next five to ten years, automation will pose one of the biggest challenges facing most industries. Credit Risk will be no different, having a long, entwined history with automation going back to the invention of scorecards by FICO in the late 1950s. Banks already employ sophisticated models to quantify regulatory capital requirements and to decide who to accept and who to reject, who to market to and how to market. Many of these areas have been automated using objective outcome data and well-established methods. However, this will not be the case for the next iteration of automation. Some recent events have highlighted areas where automation challenges our perception of existing processes. These are often incorrectly identified as “pitfalls” of automation, when they are simply reflections of reality (a reality that might go against how we believe things work). Appropriate and effective responses to the questions raised by these initial attempts can help us understand how to approach automation in a way that not only preserves but also improves both fairness for customers and outcomes for businesses.
The next phase of automation will be aimed at processes that are less obviously measurable. Amazon’s recent attempt at building a hiring model is a good example of this. Clearly it is difficult to objectively model employee performance, so Amazon decided to use employee performance reviews and past interview performance as a measure of employee competency. With the benefit of hindsight, it’s easy to see the potential drawbacks, but at that stage, the suggestion that there could be prejudice in the hiring system would have been laughed at.
However, a few years down the line, the model started producing unexpected results. It gave negative marks for mentioning the word “women” in a CV. It allocated negative marks for attending women-only colleges. Amazon’s data scientists must be amongst the best in the world, so it seems unlikely that they would structure the data in a way that would create this bias. The more likely cause is that the model was picking up on unintentional biases within the hiring process. Interviewers will very rarely be purposefully prejudiced, but purposeful thinking doesn’t have nearly as much impact on these decisions as we would like to hope.
Another interesting case has been seen in successful orchestras over the last few decades. Conductors were previously tasked with hiring musicians to join their orchestra; having dedicated their lives to music, it’s reasonable to assume that they could be impartial and base their choices purely on the musical talent of the applicant. For many years it was argued that women were less represented in orchestras because they simply didn’t have the required characteristics to be effective musicians. However, after the introduction of a blind audition process (using screens to hide the applicant), the proportion of women in orchestras increased fivefold. Clearly our intuition was wrong, as even experienced and dedicated professionals can be biased to their own detriment. A model built to predict orchestral appointments during that time would undoubtedly have picked up on this and scored women negatively, because there was no other objective measure (consistent across gender) that would have accounted for the discrepancy in hiring. At the same time, that insight could easily have been discarded because it would not have been believed! This is clearly a case in which our intuition that people always make better decisions than algorithms does not hold; automation can (and does) identify areas of human bias in decision making.
An automation project recently undertaken by 4most had similar results and uncovered some elements of bias within underwriter behaviour. The initial aim of the project was to replicate existing underwriter decisions; however, it was apparent from an early stage of the model development that this would lead to results that clearly contradicted the lending policies. When working first-hand with the underwriters it was not obvious that this bias was present, but the modelling algorithms soon identified that certain pieces of information which should not have been influencing the outcome were key predictors of the underwriters’ decisions. Following this discovery, it became evident that replicating the human decision-making process could be suboptimal and potentially unfair. As a result, the aim of the project pivoted from replication to optimisation in tandem with automation – two things that are not always considered to go together.
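To make the idea concrete, the kind of check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used on the project: the data is simulated, the names `affordability` and `excluded_flag` are hypothetical, and a plain logistic regression fitted by gradient descent stands in for whatever modelling algorithm is actually used. The simulated underwriter is deliberately made to penalise an attribute that policy says should be ignored, and the fitted model then surfaces that penalty as a clearly negative coefficient.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_decisions(n=2000, seed=0):
    """Synthetic underwriter decisions (illustrative data, not real cases).

    Each row is (affordability, excluded_flag, approved). The simulated
    underwriter is deliberately made to penalise excluded_flag, a
    hypothetical attribute that policy says should play no part.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        affordability = rng.gauss(0.0, 1.0)
        excluded_flag = 1.0 if rng.random() < 0.5 else 0.0
        p_approve = sigmoid(0.5 + 1.5 * affordability - 1.5 * excluded_flag)
        approved = 1.0 if rng.random() < p_approve else 0.0
        rows.append((affordability, excluded_flag, approved))
    return rows


def fit_logistic(rows, learning_rate=0.5, n_iter=500):
    """Fit a logistic regression by batch gradient descent on mean log-loss.

    Returns [intercept, affordability coefficient, excluded_flag coefficient].
    """
    weights = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for affordability, excluded_flag, approved in rows:
            features = (1.0, affordability, excluded_flag)
            z = sum(w * x for w, x in zip(weights, features))
            error = sigmoid(z) - approved
            for j, x in enumerate(features):
                grad[j] += error * x
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


# Fit a model that replicates the (simulated) human decisions and
# inspect how much weight it places on the attribute that should not matter.
decisions = simulate_decisions()
intercept, coef_affordability, coef_excluded = fit_logistic(decisions)
print(f"affordability coefficient: {coef_affordability:.2f}")
print(f"excluded_flag coefficient: {coef_excluded:.2f}")
```

On real case data, a materially non-zero coefficient on an attribute that policy excludes would be a prompt for investigation rather than proof of intent, since the attribute may also be acting as a proxy for something the model has not been given.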
Amazon’s attempt is reflective of the first stage of full automation. We try to replicate human processes as we have trust in what we have done so far. When these models challenge us and make us take a second look at our own unconscious processes and inherent biases, we should not be knocking them down. These models allow us to identify areas in which we have made suboptimal decisions. Responding to the challenges presented by the models is the key to mastering the question of the second stage of automation, namely: how do we improve on what we were doing before?
We shouldn’t ignore human perception, but we should trust algorithms enough to uncover where we have made mistakes. Whilst we may not feel like we have biases, analytics can help us to determine whether this is true or not. Rather than discard models that do not meet our expectations, we should aim to recognise where the bias is coming from and what we can do to ensure that customers are being treated fairly.
Automation is coming, and the search for efficiency and fairness need not come at the expense of making the right decision. Approached correctly, it can also open up an avenue for challenging human biases and ultimately improve the way we make decisions.