From Black Feminism to Algorithmic Bias
Artificial intelligence may have cracked the "code" of certain tasks that usually require human intelligence, but in order to learn, these algorithms need vast amounts of data generated by human life. They comb through this information, search for commonalities and correlations, and then offer a classification or prediction (is the lesion cancerous? will you default on your loan?) based on the patterns they detect. Their wisdom, however, comes only from their training data, which means that our limitations, our prejudices, our blind spots, our ignorance, become theirs as well.
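To make the mechanics concrete, here is a deliberately tiny sketch in Python. The loan records and income brackets are invented purely for illustration; real systems use far richer features and models, but the principle is the same: the prediction is nothing more than a pattern tallied from the training data.

    from collections import defaultdict

    # Hypothetical loan records: (income_bracket, defaulted) pairs.
    # Any pattern here is invented for illustration only.
    history = [
        ("low", True), ("low", True), ("low", False),
        ("high", False), ("high", False), ("high", True),
    ]

    # "Learning" here is just tallying how often each group defaulted.
    counts = defaultdict(lambda: [0, 0])  # bracket -> [defaults, total]
    for bracket, defaulted in history:
        counts[bracket][0] += int(defaulted)
        counts[bracket][1] += 1

    def predict_default_rate(bracket):
        """Predict by the pattern found in the training data, nothing more."""
        defaults, total = counts[bracket]
        return defaults / total

    print(predict_default_rate("low"))  # about 0.67: the model echoes its data

Whatever skew exists in the historical records comes straight back out as a "prediction"; the model has no other source of knowledge.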
Last year, researchers tested three leading facial recognition systems (developed by Microsoft, IBM and Face++) on their ability to identify the gender of people with different skin colors. All three identified light-skinned men with better than 99% accuracy. That is hardly surprising, since the datasets these systems learn from skew heavily white and male: one widely used training set was composed of about 78% men and 84% white faces. But when the systems were tested on photos of Black women, the error rate climbed to roughly 34%, and the darker the skin, the worse they performed; for the darkest-skinned women it reached about 47%, roughly the odds of a coin toss. When a Black woman appears in front of the system, the system does not recognize her.
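An audit of this kind is, mechanically, very simple. The sketch below uses invented prediction records whose error counts only loosely echo the figures above; the point is the method: break accuracy out by subgroup instead of reporting one aggregate number.

    from collections import defaultdict

    # Hypothetical (subgroup, prediction_correct) records; the counts
    # are invented to loosely echo the figures reported above.
    records = (
        [("lighter-skinned men", True)] * 99
        + [("lighter-skinned men", False)] * 1
        + [("darker-skinned women", True)] * 53
        + [("darker-skinned women", False)] * 47
    )

    tally = defaultdict(lambda: [0, 0])  # subgroup -> [errors, total]
    for group, correct in records:
        tally[group][0] += 0 if correct else 1
        tally[group][1] += 1

    errors_overall = sum(e for e, _ in tally.values())
    print(f"overall error rate: {errors_overall / len(records):.0%}")  # 24%
    for group, (errors, total) in tally.items():
        print(f"{group}: {errors / total:.0%} error rate")  # 1% vs 47%

A single aggregate score of 76% accuracy would hide the near-coin-toss failure on one subgroup entirely.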
Safiya Noble, who recently published Algorithms of Oppression, said: "People have organized to fight for civil rights and to eliminate discriminatory lending practices, and have gone to court to try to change these practices under the protection of the law. Now we have a similar discriminatory decision-making mechanism, but it is carried out by an incomprehensible algorithm, and you cannot take an algorithm to court. We are gradually being reduced to systematic grading and decision-making; these systems are the products of human beings, but the human figure behind them is becoming more and more blurred."
Type "CEO" into Google Images and you will search for a series of similar white male faces. There are only a handful of women who can be seen, most of them white women and a few people of color. At last year's machine learning conference in California, a host had to find the first picture of a female CEO, Barbie, after flipping through a bunch of white men in black suits.
The amount of data matters enormously to how an AI system operates. The more complex the system, that is, the more layers in the neural network needed for tasks such as translating speech, recognizing faces or calculating the likelihood that a person will default on a loan, the more data must be collected. But not everyone appears equally in the data.
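Measuring that inequality is the first, easiest step. Here is a sketch, assuming a hypothetical face dataset whose demographic labels are invented to echo the skew described above:

    from collections import Counter

    # Invented demographic labels for a hypothetical face dataset,
    # skewed the way the datasets discussed above are skewed.
    labels = (
        ["lighter-skinned man"] * 70 + ["lighter-skinned woman"] * 14
        + ["darker-skinned man"] * 12 + ["darker-skinned woman"] * 4
    )

    shares = Counter(labels)
    for group, n in shares.most_common():
        print(f"{group}: {n / len(labels):.0%} of training images")
    # lighter-skinned man: 70% ... darker-skinned woman: 4%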
Spend enough time in deep conversation with artificial intelligence experts and they will all come back to the same truth: garbage in, garbage out. It is possible to avoid sampling bias and ensure that a system is trained on a large, balanced body of data, but if the data itself carries society's prejudice and discrimination, the algorithm will be no better than the humans who produced it.
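The distinction matters: a dataset can be perfectly balanced in who it samples and still teach a model to discriminate, because the labels record past human decisions. A toy illustration, with invented hiring records sampled 50/50 by gender:

    from collections import defaultdict

    # Equal numbers of men and women (no sampling bias), but the
    # invented historical "hired" labels encode a biased decision-maker.
    history = (
        [("man", True)] * 40 + [("man", False)] * 10
        + [("woman", True)] * 15 + [("woman", False)] * 35
    )

    rates = defaultdict(lambda: [0, 0])  # gender -> [hired, total]
    for gender, hired in history:
        rates[gender][0] += int(hired)
        rates[gender][1] += 1

    for gender, (hired, total) in rates.items():
        print(f"p(hired | {gender}) = {hired / total:.0%}")
    # p(hired | man) = 80%, p(hired | woman) = 30%: the model learns
    # the prejudice in the labels, not a truth about the applicants.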
These statistical correlations are called latent bias, which is why, in an image database used by artificial intelligence researchers, 68 per cent of the photos associated with the word "cooking" show women. It also explains why Google Translate struggles with languages that use gender-neutral pronouns: Turkish generally does not indicate a doctor's gender, but the English machine translation assumes that if there is a doctor in the house, he must be male. The assumption even spreads to advertising on the internet: in 2015, researchers found that Google was six times more likely to show ads for jobs paying more than $200,000 a year to men than to women.
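One way researchers surface such latent bias is by measuring distances between word vectors. The three-dimensional vectors below are invented toys (real embeddings such as word2vec have hundreds of dimensions), but the measurement, cosine similarity, is the standard one:

    import math

    # Toy word vectors, invented for illustration only. In real
    # embeddings the geometry is learned from text co-occurrence,
    # which is exactly where the latent bias comes from.
    vectors = {
        "man":     [0.9, 0.1, 0.2],
        "woman":   [0.1, 0.9, 0.2],
        "cooking": [0.2, 0.8, 0.3],
    }

    def cosine(a, b):
        """Standard cosine similarity between two vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    print(cosine(vectors["cooking"], vectors["woman"]))  # ~0.98
    print(cosine(vectors["cooking"], vectors["man"]))    # ~0.39

No one told the model that cooking is a "female" word; the association is just geometry distilled from the statistics of its training text.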
Kathryn Hume said: "The system picks up on the correlation between occupation and gender, but the drawback is that there is no intention behind it; it is just mathematics detecting a correlation. It does not realize that this is a sensitive issue." In this technology, futurism and conservatism pull against each other: AI is evolving much faster than the data it learns from, which is rooted in the past, so it is destined not only to reflect and replicate human prejudices but to make those prejudices even more ingrained.