Foundations for Understanding Automated Decisions
There’s a lot of folklore surrounding Artificial Intelligence (AI). Elon Musk is worried, while Mark Zuckerberg believes it’s the answer to many issues, most recently fake news. For many of us, AI is magical, whether ominous like HAL and Ex Machina, or enrapturing like the Spotify Daily Mix and Alexa. And yet, while we usually love magic, we don’t trust it, because we don’t know how the rabbit got in the hat. When the same trick is applied to your insurance, job, visa or bail application, the apprehension, understandably, gets worse. If we want AI to be the magic bullet many believe (or are being told) it will be, this needs to change.
Understanding Automated Decisions is a research project that aims to do exactly this, and to do it in a way that makes as much sense on the street as it does in the lab. We call them ‘automated decisions’ and not ‘AI’ so that we can set the right context. Before moving on, here’s a broad overview of the goals we had for the research we conducted:
Be able to explain what makes a decision automated, and cut through the noise around ‘AI’;
Explore the current state of the art, and how we got to where we are with these systems;
Inform people how these systems function, and how they can go wrong;
Highlight the importance of fairness, transparency, accountability and explainability of these systems;
Explore how this can be achieved for the systems we already use, and how we should build systems for the future;
Look at how greater trust can be built (where deserved) using the tools we have, by making automated decisions more transparent and explainable, both technically and through design frameworks.
An automated decision is any decision that can be made (by a machine) with little or no ‘human involvement’. Machine learning is a promising area of applied automated decisionmaking with its roots in statistics, where a set of rules (or algorithms) can be designed to learn from data and make decisions and inferences without express human input. A more evolved version of this is deep learning, which takes inspiration from the way our brains function: the learning process uses simpler concepts to learn and develop more complex ones about other kinds of data - this is the closest we currently are to achieving ‘AI’.
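To make the idea concrete, here is a minimal sketch (in Python, using scikit-learn) of what ‘learning from data’ looks like in practice. The loan figures and the approve/deny framing are entirely invented for illustration, not taken from any real system.

```python
# A toy illustration of 'learning from data': the model infers its own decision
# rule from historical examples rather than being handed the rule explicitly.
# The loan data below is entirely invented.
from sklearn.tree import DecisionTreeClassifier

# Each row: [annual income in GBP thousands, existing debt in GBP thousands]
past_applicants = [[25, 10], [60, 5], [40, 20], [80, 2], [30, 15], [55, 8]]
repaid          = [0,        1,       0,        1,       0,        1]  # 1 = loan repaid

model = DecisionTreeClassifier(max_depth=2)
model.fit(past_applicants, repaid)           # the 'learning' step

new_applicant = [[45, 12]]
print(model.predict(new_applicant))          # an automated decision, no human in the loop
```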
The state at which a mechanical artefact (like HAL) is able to make human-like intellectual decisions and develop complex reasoning by itself is what is known as general artificial intelligence - true ‘AI’. A machine learning algorithm might beat you at Go, but it doesn’t know the difference between playing Go and Golf - AI might.
Many of these systems were created in earnest, to help us compute, predict and analyse better. Once we were able to combine this with a lot more data and computing power, we started seeing just how much we could do with these systems, and when we deployed them, it seemed like they really were doing a great job - until we realised they weren’t.
A big part of the problem is that the systems currently deployed weren’t developed with tenets like fairness, accountability, transparency and explainability in mind, at least not in the way we now demand them - leading to a trust deficit. Trust is a complicated thing, especially when you’re dealing with a machine, but here are some important things the 100-plus research papers and books we looked at showed us:
The technologies involved here are not new - concepts such as AI and machine learning have been around for decades!
Problems of bias and fairness in computer systems are also not new - even rudimentary computer systems running basic algorithms ran the risk of making biased, unfair decisions, depending on what they were optimised to do and the data they relied on.
Contemporary automated decisions rely on two things: access to unprecedented amounts of ‘raw’ data - some of it structured (purpose-built datasets), some unstructured (tweets, images, raw text) - and our ability to compute more, and faster, than ever before. Together, these are the steroids that allow today’s automation systems to work their ‘magic’.
Rooted in statistics, the predictions automated decisionmaking systems (ADS) make essentially rely on drawing complex correlations between different types of data. For example, some systems have been known to correlate the use of certain types of websites with success at certain types of jobs. How did they do this? By analysing millions of historical patterns of website use and employee performance, and making predictive assumptions based on those correlative trends. You could, however, still be a great employee without using that specific website - and if the automated system can’t tell you just how it picked an employee, you’ll have no way of picking up on red flags like these.
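Here is a hypothetical sketch of how such a correlational predictor might come about. The ‘website use’ feature, the data and the labels below are all invented; the point is only that a spurious correlation in historical data quietly becomes a rule.

```python
# Hypothetical, correlation-only 'hiring' predictor. The single feature is
# whether a candidate uses a particular website (1/0); the label is whether
# past employees were rated top performers. All data is invented.
from sklearn.linear_model import LogisticRegression

uses_site = [[1], [1], [1], [0], [0], [1], [0], [1]]
top_performer = [1, 1, 0, 0, 0, 1, 0, 1]

model = LogisticRegression().fit(uses_site, top_performer)

# A genuinely strong candidate who happens not to use the site still gets
# a low predicted probability - the model has learned correlation, not causation.
print(model.predict_proba([[0]])[0][1])
```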
Allowing ADS to ‘think’ for themselves - making correlations from vast amounts of data and tweaking their rulemaking structures autonomously - lets them optimise their performance, ostensibly far better than they could if a human had to scrutinise, verify and confirm each move. Imagine deciding to buy a cup of coffee, but having to wait for someone to sign off on your caffeine crunch: how you walked to the cafe, picked a coffee, interacted with the barista, and then paid. Annoying, right? Some argue that something similar applies when it comes to ‘scrutinising’ what an automated system is doing.
That said, we increasingly know that many of these systems aren’t the panacea for fairness, optimisation and efficiency we thought they were. Quite the contrary: the systems can draw on tainted, biased data, make correlations (see above) they shouldn’t have, and produce outputs that fulfil one criterion (accuracy) but not another (fairness). Fixing this, and soon, is very, very important.
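A tiny sketch of that tension, with made-up predictions and group labels: the decisions below look fine on accuracy, yet select members of one group at three times the rate of the other.

```python
# Made-up example: a set of decisions that scores well on one criterion
# (accuracy) while failing a simple fairness check (selection-rate parity).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]          # what actually happened
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]          # what the system decided
group  = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def selection_rate(g):
    preds = [p for p, grp in zip(y_pred, group) if grp == g]
    return sum(preds) / len(preds)

print(f"accuracy: {accuracy:.2f}")                            # 0.75 - looks fine
print(f"selected in group A: {selection_rate('A'):.2f}, "
      f"group B: {selection_rate('B'):.2f}")                  # 0.75 vs 0.25 - not fine
```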
The root cause of these problems is not evil computer scientists and business people - it’s the differing approaches of developers, users and policymakers. (There are, for instance, at least 21 definitions of fairness in computer science alone; imagine what happens when we get to law and philosophy.)
There is also a genuine move towards fixing these issues. Some of it stems from a drive within the field to improve systems, and laws like the upcoming General Data Protection Regulation, with its special focus on automated decisionmaking, act as an important catalyst for setting things right. ADS have already been deployed at several inflection points - social media, criminal justice, object detection, predictive policing. A lot of great work is happening in academic research, and a lot more still in big tech - however, the two seldom work together, and most of the time we can’t see what the companies are doing.
We might manage to build fair ADS, but unless we can scrutinise that fairness against long-held human values, we won’t know for sure - and until we can do that, we still won’t trust them. We might love the bunny (who doesn’t?), but we still really need to know how it got in the hat. Also, not all ADS deserve our trust - some may be specifically built to deceive.
Some promising solutions are emerging, which make genuine contributions towards balancing optimisation with explainability, allowing systems to explain their ‘thought process’.
Different solutions apply to different stages of the ADS process - some are needed to fix the ‘inner workings’: what data they used and how, what correlations they drew and why, and allowing someone to audit it all if needed. Others look at the end result, the output produced by the ADS in specific instances: why a loan for £1,000 and not £1,500?
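One way to answer that kind of output-level question is a counterfactual explanation: find the smallest change to the application that would have changed the outcome. The scoring rule below is a made-up stand-in for a real ADS, just to show the idea.

```python
# Toy stand-in for an ADS: the scoring rule and threshold are invented.
def loan_offer(income_k, debt_k):
    score = income_k * 2 - debt_k * 3
    return 1500 if score >= 70 else 1000

applicant = {"income_k": 40, "debt_k": 5}
offer = loan_offer(**applicant)              # 1000 - but why not 1500?

# Brute-force counterfactual: how much more income would have flipped the outcome?
for extra in range(0, 21):
    if loan_offer(applicant["income_k"] + extra, applicant["debt_k"]) == 1500:
        print(f"Offered {offer}; with {extra}k more income the offer would be 1500.")
        break
```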
We particularly liked ones that are largely independent of what sort of ADS is used, like LIME. A lot of solutions, however, while great, are still ‘in the lab’ and apply in very specific contexts. Some others we really like are Bayesian Rule Lists, audit tokenisation, deep neural network activation-based auditing, NLP-based explanations, generative interfaces, automated whitebox testing, and qualitative input easing (where relevant, we will explain these in subsequent blogs).
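As a flavour of what a model-agnostic explanation looks like, here is a rough LIME sketch, assuming the open-source lime package (pip install lime); the loan data, the feature names and the ‘hidden’ approval rule are invented for illustration.

```python
# Rough sketch of a model-agnostic, instance-level explanation with LIME.
# The data, feature names and 'hidden' rule are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
X = rng.uniform([20, 0], [100, 40], size=(200, 2))         # income (k), debt (k)
y = (X[:, 0] * 2 - X[:, 1] * 3 > 70).astype(int)           # hidden 'approve' rule

model = RandomForestClassifier(random_state=0).fit(X, y)   # the opaque ADS

explainer = LimeTabularExplainer(
    X, feature_names=["income_k", "debt_k"],
    class_names=["deny", "approve"], mode="classification")

# Which features pushed this particular decision, and by how much?
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=2)
print(explanation.as_list())
```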
There are also some great ideas on how we can design frameworks and interfaces that make these systems more transparent, explainable and trustworthy to the different sets of users relying on them (a loan officer seeing an ADS’s score, as well as a loan recipient seeing a loan decision).
That’s a wrap for now! In our next blog post, we will dig deeper into fairness, accountability and transparency, what these concepts mean for understanding and explaining automated decisionmaking, and the all-important question of trust.