are you kidding me why the FUCK would a fridge have ai integrated with a phone app
#'it's for energy effectiveness 😢😢😢'#bitch??? for decades of technology advancement we created control systems that work just on the basis of physical laws.#digital control systems exist too and they just check inputs and calculate outputs without a need for a stupid ai or app BE REAL
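like, a fridge thermostat is literally just a bang-bang (hysteresis) loop. a minimal sketch, purely illustrative, with made-up temperatures:

```python
# Minimal bang-bang (hysteresis) fridge controller: read a temperature,
# compare it against fixed thresholds, switch the compressor. No AI, no app.

TARGET_C = 4.0      # desired fridge temperature (made-up setpoint)
DEADBAND_C = 1.0    # hysteresis band to avoid rapid compressor cycling

def control_step(current_temp_c: float, compressor_on: bool) -> bool:
    """Return the compressor state for the next control cycle."""
    if current_temp_c > TARGET_C + DEADBAND_C:
        return True          # too warm: start cooling
    if current_temp_c < TARGET_C - DEADBAND_C:
        return False         # cold enough: stop cooling
    return compressor_on     # inside the deadband: keep the current state
```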
System tests: 10 traps to avoid
Having defined the test levels, and system tests in particular, it seems important to mention some very common traps that novice testers can easily fall into. Their effects range from a loss of product quality to a lower return on the investment made in testing. In any case, these practices can quickly prove counterproductive for companies.
1. Entering the “black box”
It can sometimes be tempting to look inside the system, for example to check directly that an element has been inserted, but be careful! This “intrusion” into the black box creates a dependency on the system’s implementation and exposes the tests to increased maintenance as the system evolves. Use it only when absolutely necessary, and only after weighing up the pros and cons!
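As a rough sketch (the base URL, the endpoints and the commented-out database access are all invented for this example), the same check can usually stay outside the black box by going through the public interface:

```python
# Hypothetical sketch: the base URL, endpoints and the commented-out database
# access are all invented for illustration.
import requests

BASE_URL = "http://localhost:8080"

def test_item_creation_observed_from_outside_the_black_box():
    created = requests.post(f"{BASE_URL}/items", json={"name": "widget"}).json()

    # Preferred: observe the effect through the public interface only.
    fetched = requests.get(f"{BASE_URL}/items/{created['id']}")
    assert fetched.status_code == 200
    assert fetched.json()["name"] == "widget"

    # Avoid (unless absolutely necessary): reaching into the implementation.
    # Schema and table names can change without the public behaviour changing.
    # row = db.execute("SELECT name FROM items WHERE id = ?", [created["id"]])
    # assert row[0] == "widget"
```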
2. Depending on the proper functioning of the system itself to carry out the tests
This is the case where we want to avoid the sometimes significant complexity of setting up the test context (execution preconditions, test data, etc.) and choose to use the system itself to put things in place before starting our test. We then risk losing sight of one of the objectives of the test, which is to provide information on the proper functioning of the entire system.
Let’s take an example:
We would like to validate a feature that deletes an element from a system. To ensure the test can be replayed, we need a piece of data to delete at each run. Several options are then available to us, including:
Running a script before starting the test to insert the data (the best option in many cases; see the sketch after this section)
Reinstalling all test data each time the system is deployed and therefore redeploying the system before each test launch
Having “enough” data to run the test a number of times (this option is clearly not the best because it only postpones the problem!)
Inserting the data using the system’s data addition functionality: this is the option that hides a major trap!
The risk is losing visibility into the status of all the system’s features if data insertion does not work.
Retrievals, modifications and deletions can no longer be tested if every test scenario starts by using the system to insert its test data and that insertion fails.
We find the same phenomenon if, in order to “save time” on setting up test contexts, we build one “super” scenario that carefully chains together all the functionalities to be tested: we will only see the first problem encountered and remain blind to other potential problems that could be detected later in the scenario. We then expose ourselves to what I would call “chain-effect bugs”: we fix one bug, we discover the next one, we fix it, we find another, and so on, and at any given moment we have no idea how many of the system’s features are broken.
This makes decision-making and planning very difficult, because the correction effort cannot be estimated. Some dependence on the proper functioning of certain system functionalities to validate others is inevitable, but it must be handled with care, and it is important never to lose sight of the objective of testing, which is to provide continuous information on each functional rule of the system.
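To sketch the first option from the list above (the seed script, the endpoint and the status code are hypothetical), the deletion test can seed its own data outside the system under test, so it never depends on the “add” feature working:

```python
# Hypothetical sketch: the seed script, endpoint and status code are invented.
# The data to delete is inserted outside the system under test, so the deletion
# test does not depend on the "add" feature working.
import subprocess

import pytest
import requests

BASE_URL = "http://localhost:8080"

@pytest.fixture
def seeded_item_id():
    # Insert directly via a seed script (or SQL fixture loader), not via the API.
    result = subprocess.run(
        ["./scripts/seed_item.sh", "--name", "to-delete"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()  # the script prints the id of the inserted item

def test_delete_item(seeded_item_id):
    response = requests.delete(f"{BASE_URL}/items/{seeded_item_id}")
    assert response.status_code == 204
```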
3. Reproducing the system’s behaviour in the tests
It may be tempting to reimplement a calculation algorithm, for example, in order to validate the result returned by the system. But this practice is very risky: if the two results differ, there is no way to tell whether the correct one is the result calculated by the system under test or the one calculated by the test tool. A common example is the generation of authentication tokens: it is not a good idea to compute a valid token in the test tool in order to validate the one returned by the system. The better approach is to use the received token and check that it works (while taking care not to fall into the previous trap!).
Be careful not to reimplement the complexity of the system you are trying to test in the test tool itself: the test tool must remain a “simple” observer of the system it validates, and that is the whole difficulty!
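A possible sketch of this approach (the login and /me endpoints and the credentials are assumptions): obtain the token from the system, then simply use it rather than recomputing it.

```python
# Hypothetical sketch: the login endpoint, the /me endpoint and the credentials
# are assumptions. The test uses the token instead of recomputing it.
import requests

BASE_URL = "http://localhost:8080"

def test_login_returns_a_usable_token():
    token = requests.post(
        f"{BASE_URL}/auth/login",
        json={"user": "alice", "password": "secret"},
    ).json()["token"]

    # Do not re-implement the signing algorithm in the test tool: exercise the
    # token against a protected endpoint and observe the system's verdict.
    me = requests.get(f"{BASE_URL}/me", headers={"Authorization": f"Bearer {token}"})
    assert me.status_code == 200
    assert me.json()["user"] == "alice"
```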
4. Waiting “for a while”
When some system processes take time or are asynchronous, you may want to wait a “certain time” before checking certain information. But how long should you wait?
If we wait too little, the test fails even though the functionality is working correctly.
If we wait too long, we waste execution time after the system has already finished processing, and this can have a very serious impact when we have hundreds or even thousands of tests.
We often end up gradually increasing this waiting time to avoid false positives, and thus wasting a lot of time running the tests. This is where the system’s secondary outputs come in handy: we can watch for the appearance of a log entry or the emission of a particular notification to trigger the next step of the test. It is then the system itself that tells us when to continue.
It is still necessary to decide on a reasonable timeout to conclude that something went wrong, but if the test is successful, we will have limited the time spent waiting as much as possible.
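A minimal polling helper along these lines could look as follows (the timeout values and the log file path in the usage comment are just placeholders):

```python
# Poll for a condition with a hard timeout instead of sleeping for a fixed time.
import time

def wait_until(condition, timeout_s=30.0, poll_interval_s=0.5):
    """Return as soon as condition() is truthy; raise if the timeout is exceeded."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(poll_interval_s)
    raise TimeoutError(f"condition not met within {timeout_s} seconds")

# Example (path is a placeholder): wait for the system's secondary output,
# here a log line, before moving on to the next step of the test.
# wait_until(lambda: "export finished" in open("/var/log/app.log").read())
```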
5. Using dynamic reference values
To conduct the tests, we define “expected data” that will be used to validate the system’s outputs. We must always be careful not to make too much of this data dynamic: it is fine to handle dynamically the values tied to the current date, or those subject to uniqueness constraints (generated GUIDs, for example), in the outputs we want to check, but the rest must be fixed, controlled data used as reference values. Otherwise, the risk is to validate the format or structure of the data, but not its veracity, i.e. its relevance given the test dataset.
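A small sketch, assuming an invented order structure (in a real test, order_from_system would come from a fixture or an API call): only the genuinely dynamic fields are neutralised before comparing against fixed reference values.

```python
# Hypothetical sketch: the order structure and field names are invented, and
# order_from_system would come from a fixture or an API call in a real test.
EXPECTED_ORDER = {
    "id": "<any-guid>",               # uniqueness constraint: neutralised below
    "customer": "ACME",
    "total": "42.50",
    "currency": "EUR",
    "created_at": "<any-timestamp>",  # depends on the current date: neutralised
}

def normalize(order: dict) -> dict:
    cleaned = dict(order)
    cleaned["id"] = "<any-guid>"
    cleaned["created_at"] = "<any-timestamp>"
    return cleaned

def test_order_matches_fixed_reference_values(order_from_system: dict):
    # Everything except the two dynamic fields is fixed and controlled,
    # so the test checks veracity, not just format or structure.
    assert normalize(order_from_system) == EXPECTED_ORDER
```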
6. Mass-updating reference data
This is often done when the system’s outputs change significantly, or when you want to establish a first set of reference data for an existing system. You have to be careful here, because one or more bugs can hide behind the new outputs. Even if everything seems to be explained by the current modification of the system, one or more regressions may still be lurking in the newly obtained outputs. Reference data will be used for a long time to secure product deliveries, and baking a bug into it can have a very significant effect on customers without anyone noticing.
Reference data must be validated manually and on a case-by-case basis to get the most benefit from system testing.
7. Implementing test-specific behaviour in the system
This is the famous “if test” in production code! A particular behaviour is triggered when a certain header or other piece of information identifies the request as coming from a test. This makes it possible to validate the system’s behaviour as part of the tests, but not its real behaviour in production. I must admit that I have sometimes had to use it in very specific cases where it proved essential, but this practice must be handled with great caution.
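For illustration only, with a made-up pricing function, this is the kind of branch in question; the test-mode path never exercises the real behaviour:

```python
# Sketch of the anti-pattern (a made-up pricing function): the test-mode branch
# never exercises the real behaviour that production users will get.
def compute_price(amount_cents: int, headers: dict) -> int:
    if headers.get("X-Test-Mode") == "1":        # the famous "if test"
        return amount_cents                      # test-only path: no tax applied
    return amount_cents + round(amount_cents * 0.2)  # real path (e.g. 20% tax)
```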
8. Relying on a single global test report
Depending on the test tool used, the test cases can be run and reports generated in different ways. It is essential to have, at the end of the tests, a report on the status of each system functionality. We must avoid having a single indicator giving the global status (which would very often be “red”), forcing us to dig through the console or a file to identify the problem or problems encountered.
9. Only running the tests after a system modification
While it is important to run the system tests after each deployment following a system change, this is not enough. It is also important to regularly ensure that the system remains functional in its environment. The environment can evolve, and a system that is rarely modified can still be broken by external causes. It is therefore advisable to run the existing tests on a regular schedule, even when there are no system modifications. Otherwise, there is a risk of finding a number of surprises when the next change is made to the system.
10. Testing only the happy paths
While it is essential to validate the proper functioning of a system’s happy paths (its passing cases), error cases are just as important, and sometimes even more so if we are interested in non-functional quality characteristics such as usability or security. Properly informing users when they misuse the system will, for example, improve the quality of their experience.
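As a small sketch (endpoint, status code and message are assumptions), an error-case test checks not only that the misuse is rejected but also that the message to the user is clear:

```python
# Hypothetical sketch: the endpoint, status code and error message are assumptions.
import requests

BASE_URL = "http://localhost:8080"

def test_creating_an_item_without_a_name_is_rejected_with_a_clear_message():
    response = requests.post(f"{BASE_URL}/items", json={})
    assert response.status_code == 400
    # The error case matters as much as the rejection itself: the user should
    # be told clearly what went wrong.
    assert "name" in response.json()["error"]
```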
In conclusion…
You have probably already encountered one or more of these traps, and the right answer is always a decision adapted to each situation. The important thing, when assessing the risk or making the decision, is to understand as well as possible the long-term impact it can have.
In software development, quality management is largely about the relevance of long-term investments. If you sometimes have to make decisions that increase the technical debt in your product, that is not a problem, as long as you do not forget to manage that debt properly as soon as possible. But that may be the subject of another article…
The article “System tests: 10 traps to avoid” first appeared on the Digital Analytics Blog (https://ift.tt/2NvuDNc).
Credit Scores Could Soon Get Even Creepier and More Biased
There are lots of conversations about the lack of diversity in science and tech these days. In response, people constantly ask, “So what? Why does it matter?” There are many ways to answer that question, but perhaps the easiest is this: because a homogenous team produces homogenous products for a very heterogeneous world.
This is Design Bias, a monthly Motherboard column in which writer Rose Eveleth explores the products, research programs, and conclusions made not necessarily because any designer or scientist or engineer sets out to discriminate, but because to them the “normal” user always looks exactly the same. The result is a world that’s biased by design. -the Editor
Are you trustworthy? For centuries this was a qualitative question, but no longer. Now you have a number, a score, that everybody from loan officers to landlords will use to determine how much they should trust you.
Credit scores are often presented as objective and neutral, but they have a long history of prejudice. Most changes in how credit scores are calculated over the years—including the shift from human assessment to computer calculations, and most recently to artificial intelligence—have come out of a desire to make the scores more equitable, but credit companies have failed to remove bias, on the basis of race or gender, for example, from their system.
More recently, credit companies have started to use machine learning and offer “alternative credit” as a way to reduce bias in credit scores. The idea is to use data that isn’t normally included in a credit score to try and get a sense for how trustworthy someone might be. All data is potential credit data, these companies argue, which could include everything from your sexual orientation to your political beliefs, and even what high school you went to.
But introducing this “non-traditional” information to credit scores runs the risk of making them even more biased than they already are, eroding nearly 150 years of effort to eliminate unfairness in the system.
How credit scores evolved from humans to AI
In the 1800s, credit was determined by humans—mostly white, middle-class men—who were hired to go around and inquire about just how trustworthy a person really was. “The reporter’s task was to determine the credit worthiness of individuals, necessitating often a good deal of snooping into the private and business lives of local merchants,” wrote historian David A. Gerber.
These reporters’ notes revealed their often racist biases. After the Civil War, for instance, a Georgia-based credit reporter called a liquor store named A.G. Marks’ liquor “a low Negro shop.” One reporter from Buffalo, N.Y. wrote in the 1880s that “prudence in large transactions with all Jews should be used.”
Early credit companies knew that impressionistic records were biased, and introduced a more quantitative score to try and combat the prejudices of credit reporters. In 1935, for example, the Federal Home Owners’ Loan Corporation created a map of Atlanta, showing neighborhoods where mortgage lending was “best,” coded in green, compared to “hazardous,” coded in red.
This solution, it turned out, codified the discrimination against minorities by credit companies. Neighborhoods coded red were almost exclusively those occupied by racial minorities. These scores contributed to what’s called “redlining,” a systematic refusal by banks to make loans or locate branches in these “hazardous” areas.
The FICO score, the three-digit number most of us associate with credit scores today, was one of the biggest attempts at fixing the bias in the credit system. Introduced in 1989 by data analytics company FICO (known as Fair, Isaac, and Company at the time), the FICO score relies on data from your bank, such as how much you owe, how promptly you pay your bills, and the types of credit you use.
“By removing bias from the lending process, FICO has helped millions of people get the credit they deserve,” it says on its website.
Today, there are laws meant to prevent discrimination in credit scores on the basis of race, color, religion, national origin, sex, marital status, or age. But the reality is less rosy. For decades, banks have targeted historically redlined communities with predatory mortgages and loans, which many communities have yet to recover from.
Experts estimate that the higher rates of foreclosure on predatory mortgages wiped out nearly $400 billion in communities of color between 2009 and 2012. The companies that buy up debts and take people to court target people of color more than any other group.
Study after study shows that credit scores in communities that are mostly occupied by people of color are far lower than those nearby occupied by white people. A 2007 study done by the Federal Reserve Board found that the mean score of Blacks in the United States was half that of white people.
How algorithms can bring down credit scores
Against this backdrop, banks are turning to algorithms and machine learning to try and “innovate” on the ways credit scores are determined.
Though it’s unclear exactly what sorts of algorithms credit companies use, the latest in machine learning is “deep learning.” Deep-learning programs are trained on huge amounts of data in order to “learn” patterns and generate an output, such as a credit score. Machine learning depends on a quality training dataset for accuracy, which means that AI programs can absorb whatever prejudices are baked into the data their creators feed them. For example, Amazon ditched an AI-driven hiring tool trained on resumes after it realized that the program was biased against women.
Financial technology (“fintech”) startups are feeding non-traditional data into their algorithms, which take those inputs and generate a credit score. Companies such as ZestFinance, Lenddo, SAS, Equifax, and Kreditech are selling their AI-powered systems to banks and other companies, to use for their own creditworthiness decisions. (Equifax, Lenddo, and ZestFinance did not respond to a request for comment.)
LenddoEFL, for example, offers a Lenddo Score that “complements traditional underwriting tools, like credit scores, because it relies exclusively on non-traditional data derived from a customer’s social data and online behavior.” Lenddo even offers an option to allow creditors to install the Lenddo app onto their phones that can analyze what is typed into a search bar.
In return, customers are offered quick decisions and the illusion of agency. If everything you do informs your credit score, a person might start thinking that if they just search for “good” things on Google, check in at the “right” places on Facebook, and connect with the right people on social media, they can become lendable.
“It suggests in some ways, that a person could control their behavior and make themselves more lendable,” said Tamara K. Nopper, who has done research into alternative data and credit.
In reality, these systems are likely noticing and interpreting signals that customers might not realize: Your zip code alone, in many cases, can tell a bank how likely it is that you’re white. If you went to a historically Black college or university, that data could be used against you. If you use eHarmony, you might get a different credit score than if you use Grindr.
One study from last year on using so-called “digital footprints” to generate credit scores found that iPhone users were more likely to pay loans back than Android users. “The simple act of accessing or registering on a webpage leaves valuable information,” the authors wrote.
The algorithms likely wouldn’t be told the race or gender of an applicant, but that doesn’t prevent the system from making guesses and reflecting existing biases. A 2018 study found that both “face-to-face and FinTech lenders charge Latinx/African-American borrowers 6-9 basis points higher interest rates.”
Researchers have previously raised the alarm that AI programs could effectively reinstate redlining and similar practices by churning through deeply biased or incomplete data to produce a seemingly objective number. In 2017, Stanford University researchers found that even an ostensibly “fair” algorithm in a pretrial setting can be injected with bias in favour of a particular group, depending on the composition of the training data.
Last year even FICO recognized that an over-reliance on machine learning “can actually obscure risks and shortchange consumers by picking up harmful biases and behaving counterintuitively.”
Why AI bias is so hard to fix
Biases in AI can affect not just individuals with credit scores, but also those without any credit at all, as non-traditional data points are used to try to bring new borrowers in.
There is still a whole swath of people in the United States known as the “unbanked” or “credit invisibles.” They have too little credit history to generate a traditional credit score, which makes it challenging for them to get loans, apartments, and sometimes even jobs.
According to a 2015 Consumer Financial Protection Bureau study, 45 million Americans fall into the category of credit invisible or unscoreable—that’s almost 20 percent of the adult population. And here again we can see a racial divide: 27 percent of Black and Hispanic adults are credit invisible or unscoreable, compared to just 16 percent of white adults.
To bring these “invisible” consumers into the credit score fold, companies have proposed alternative credit. FICO recently released FICO XD, which includes payment data from TV or cable accounts, utilities, cell phones, and landlines. Other companies have proposed social media posts, job history, educational history, and even restaurant reviews or business check-ins.
Lenders say that alternative data is a benefit to those who have been discriminated against and excluded from banking. No credit? Bad credit? That doesn’t mean you’re not trustworthy, they say, and we can mine your alternative data and give you a loan anyway.
But critics say that alternative data looks a lot like old-school surveillance. Letting a company have access to everything from your phone records to your search history means giving up all kinds of sensitive data in the name of credit.
“Coming out of the shadows also means becoming more visible and more trackable,” Nopper told me when I reported on alternative credit for Slate. “That becomes an interesting question about what banking the unbanked immigrant will mean for issues of surveillance when more and more activities are documented and tracked.”
Experts worry that the push to use alternative data might lead, once again, to a situation similar to the subprime mortgage crisis if marginalized communities are offered predatory loans that wind up tanking their credit scores and economic stability.
“If a borrower’s application or pricing is based, in part, on the creditworthiness of her social circles, that data can lead to clear discrimination against minorities compared to white borrowers with the same credit scores,” wrote Lauren Saunders, the associate director of the National Consumer Law Center, in a 2015 letter to the U.S. Department of the Treasury expressing concerns about these tactics.
Just this week on Twitter, Sen. Elizabeth Warren demanded to know what federal government financial institutions “are doing to ensure lending algorithms are fair and non-discriminatory.”
In other words, in trying to reduce the number of “credit invisible” people out there the banking industry might have created an even bigger problem.
So far there have been no high-profile cases brought to trial that allege discrimination based on alternative credit. The methods here are new, and they often target folks who don’t necessarily have the time and money to pursue legal options even if they do feel discriminated against (remember, alternative credit is often aimed at those who don’t even have a bank account).
In the United States, banks and businesses legally must be able to explain why an “adverse credit decision” was made—why someone wasn’t offered a loan or a line of credit. For companies who want to use machine learning, this can be a challenge because AI systems often make connections between pieces of data that they can’t necessarily explain.
“Credit-scoring tools that integrate thousands of data points, most of which are collected without consumer knowledge, create serious problems of transparency,” wrote the authors of a recent study on big data and credit. “Consumers have limited ability to identify and contest unfair credit decisions, and little chance to understand what steps they should take to improve their credit.”
Artificial intelligence programs basically put data in a blender and the resulting milkshake is the number they produce. It can be very hard, if not impossible, for machine-learning experts to pick apart a program’s decision after the fact. This is often referred to as the “explainability” problem, and researchers are currently working on methods for human scientists to peek under the hood and see how these programs make decisions.
Nopper wonders if there’s another way. Even as politicians question the algorithms around credit scores, they’re not making arguments for the end of credit. “There’s not necessarily this call to end marketplace lending, just a call to regulate it,” Nopper said. “How did these institutions become so pervasive in our imagination that we can’t think of true alternatives?”
Nopper points to the campaign for public banking in New York City. Rather than allowing private companies to use their algorithms and surveillance systems to determine credit, could there be a publicly run bank with the explicit purpose of serving the community, not their shareholders?
North Dakota already has a state-owned bank that was founded to help protect residents against predatory loans. Rep. Josh Elliot, a Democrat, has proposed something similar for Connecticut.
If banks aren’t set up to maximize their own profits, they might take a different tack when it comes to credit, one that is less open to systemic bias.