IBM pushes boundaries of AI, but insists companies take an ethical approach

Researchers at IBM are pushing the boundaries of what artificial intelligence and machine learning can do, but remain wary of the ethical implications that accompany the proliferation of this technology

Researchers at IBM’s R&D labs in Zurich have begun applying artificial intelligence (AI) and machine learning (ML) technologies to a host of new and unusual contexts as they attempt to discover what the technology is capable of.

These applications range from creating new fragrances and streamlining healthcare operations, to producing stealthy, hard-to-detect malware and teaching machines how to debate.

The various projects are at different stages of fruition, but are all part of IBM’s wider strategy of developing core AI and using it to transform industries. However, as AI and ML technologies proliferate, increasing focus should be given to the ethical implications of using them, something IBM is keen to highlight.

Challenging biases

“One of the concerns [with AI] is around bias – the fact that, especially in AI systems that are data-driven based on ML approaches, you can inject some bias,” says Francesca Rossi, global leader of AI ethics at IBM Research.

“If the training data is not inclusive enough, not diverse enough, then the AI system will not be fair in its decisions or recommendations for humans. You want to avoid that, so there is an issue around detecting and mitigating bias in training data, but also in models,” she says.
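
One simple way to make "detecting bias" concrete is to compare how often a system returns a favourable outcome for each group. The following is a minimal sketch of that idea, not IBM's method: the `decisions` records, the `approved` field and the helper names are all made up for illustration.

```python
from collections import defaultdict


def selection_rates(records, group_key, outcome_key="approved"):
    """Return the fraction of favourable outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Largest difference in selection rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy, invented decisions (assumption: not real data from any system).
decisions = [
    {"gender": "f", "approved": True},
    {"gender": "f", "approved": False},
    {"gender": "f", "approved": False},
    {"gender": "f", "approved": False},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": False},
]

rates = selection_rates(decisions, "gender")
gap = parity_gap(rates)
print(rates, gap)
```

A large gap does not prove a system is unfair, but it flags where the training data or the model deserves a closer look. The same check can be run on the training labels and on the model's outputs.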

There are several kinds of bias. One is interaction bias, exemplified by Microsoft’s infamous chatbot Tay, which learned by observing how people interacted on Twitter. However, Twitter is not necessarily the best representation of real human interaction, and the bot had to be shut down within 24 hours because it began making racist statements.

Another, much more subtle kind of bias comes in the form of product recommendation, like the kind you see on Netflix or YouTube that recommends content you may be interested in based on what you have previously viewed. In these contexts the bias can be fairly innocuous, but when it comes to news or social media these AI filter bubbles become much more damaging through their creation of echo chambers.

Natural language bias, however, is the hardest type of bias to eliminate in AI.

“Certain types of bias are baked into language,” says Barry O’Sullivan, president of the European Artificial Intelligence Association. “If you look at the proximity of certain noun phrases to other noun phrases, in many languages you’ll find nouns that refer to authority positions, like president or leader or director, are very close in text with gender nouns, and that’s not the case for females who tend to be associated maybe with caring roles or supporting roles.”
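
The "proximity" O’Sullivan describes is often measured on word vectors, where words that appear in similar contexts end up pointing in similar directions. The sketch below illustrates the measurement only: the two-dimensional `toy_vectors` are invented and deliberately constructed to show a skew, and `association_gap` is a hypothetical helper, not a function from any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association_gap(word, group_a, group_b, vectors):
    """Mean similarity of `word` to group_a terms minus group_b terms."""
    sim_a = sum(cosine(vectors[word], vectors[t]) for t in group_a) / len(group_a)
    sim_b = sum(cosine(vectors[word], vectors[t]) for t in group_b) / len(group_b)
    return sim_a - sim_b


# Invented vectors (assumption): real embeddings have hundreds of
# dimensions and are learned from large text collections.
toy_vectors = {
    "he": [1.0, 0.0],
    "she": [0.0, 1.0],
    "director": [0.9, 0.1],
    "nurse": [0.1, 0.9],
}

director_gap = association_gap("director", ["he"], ["she"], toy_vectors)
nurse_gap = association_gap("nurse", ["he"], ["she"], toy_vectors)
print(director_gap, nurse_gap)
```

A positive gap means the word sits closer to the first group of terms, a negative gap closer to the second. Run against embeddings trained on real text, a check of this kind is one way to surface the associations the quote refers to.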

According to Rossi, the intersectionality of bias means it may not be possible to completely eliminate bias. “You don’t want to be biased on various protected variables, like race and gender and age, but then by removing maybe one thing you might introduce a little more bias on another because they intersect with one another,” she says.

O’Sullivan adds that bias is ultimately a question of consensus. “We have to think very carefully about how we want to deal with bias because there are questions that we often can’t agree on, such as whether something has a bias or not. In general, the question of ethics and ethical codes ultimately comes down to whether or not there’s a consensus as to what is acceptable, and what is not,” he says.

But where does this consensus come from?

The trolley problem

The trolley problem is a theoretical question applied to self-driving cars. What it essentially boils down to is, if a crash is unavoidable and someone is going to die as a result, how does the car make that decision?

“Does it kill the young woman, or does it kill the two professional men, or does it sacrifice the two children in favour of the person who’s got world-leading expertise in neurosurgery?” says O’Sullivan.

A study published in Nature, called the Moral Machine experiment, asked 2.3 million people across the globe what they would like the autonomous vehicle to do in this life-or-death situation. It found that people’s ethical values and preferences differed wildly by culture and geographic location.

“There is actually no consensus around what ethical principles should be because they’re very much dependent on societal norms,” says O’Sullivan. “That’s really challenging for AI researchers.”

A self-driving car will never be designed specifically to kill someone, and the stark life-or-death choice is an edge case. Statistical versions of the trolley problem, however, are likely to occur in the real world.

IBM’s Rossi describes a scenario where you are in a self-driving car travelling in the same lane as a truck travelling in the opposite direction towards your car, while there is also a bicycle travelling parallel to you. The choices are to move closer to the bike to create some space between you and the truck, but in doing so risk hitting the cyclist, or move to the other side which would mean less space between you and the truck. “It’s a matter of increasing one risk over another,” she says.

One way to overcome disagreements in ethical opinion is to make the technology transparent – if we cannot agree on what is ethical, we at least need to know how the system is reaching its conclusions.

Building trust through transparency

To embed this transparency, IBM is proposing the use of factsheets for AI services, which it describes as being similar to food labels in the sense that, when you look at one, you can see the data the service was trained on, who trained it, when it was trained, and so on.

The factsheet would comprise four main pillars: fairness – using data and models free of bias; robustness – making systems secure; explainability – letting people know what is going on inside the black box; and lineage – providing details of development, deployment and maintenance so the system can be audited throughout its lifecycle.
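
As a rough illustration of what such a label might look like in machine-readable form, the sketch below models a factsheet as a small record with one field per pillar. The field names, the `FactSheet` class and the example values are assumptions made for illustration; they are not IBM's actual factsheet format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

# The four pillars named in the article.
PILLARS = ("fairness", "robustness", "explainability", "lineage")


@dataclass
class FactSheet:
    """Hypothetical 'food label' for an AI service."""
    service_name: str
    trained_by: str
    trained_on: str            # date the model was trained
    training_data: List[str]   # names of the datasets used
    fairness: str = ""
    robustness: str = ""
    explainability: str = ""
    lineage: str = ""

    def missing_pillars(self) -> List[str]:
        """Return the pillars that have no statement filled in."""
        return [p for p in PILLARS if not getattr(self, p).strip()]


# Invented example values (assumption).
sheet = FactSheet(
    service_name="example-classifier",
    trained_by="Example Team",
    trained_on="2019-01-01",
    training_data=["example-dataset-v1"],
    fairness="Selection rates compared across gender and age groups.",
    robustness="Tested against perturbed inputs.",
    explainability="Per-decision feature attributions available.",
)

print(json.dumps(asdict(sheet), indent=2))
print(sheet.missing_pillars())
```

Here the lineage statement has been left blank on purpose, so the check reports it as missing. A buyer or auditor could run the same check before accepting a service.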

“We think every AI system, when it’s delivered, should be accompanied by something that describes all the design choices and also how we took care of bias, explainability, and so on,” says Rossi. “We think trust in the tech is achieved by being very transparent, not just about the data policy, but also very transparent on the design choices.”

IBM’s Project Debater, for example, is the first AI system to engage in a live public debate with humans, which it is able to do through unscripted reasoning.

“In this setup, we believe humans still prevail,” says Noam Slonim, the principal investigator for Project Debater. “Humans rely more on rhetoric, and when we measure it they usually deliver speech better than the system. On the other hand, the system advantage is usually reflected by the fact it can pinpoint high-quality evidence to support its case.”

Because AI can search huge amounts of data very quickly, Project Debater could be used to assist human decision-making, as it can quickly identify facts and evidence both in support of and in opposition to the arguments presented.

The more transparent and explainable the system is, the more those using it for these purposes will be able to trust the information it provides.

Building trust can also be helped if companies act responsibly with the powerful technologies in their hands. This is especially true in the context of rapid technological advancement, according to O’Sullivan. “Just because we can do something with AI, doesn’t mean we should go and do it,” he says.

“Humans ultimately need to be responsible for AI systems. An AI system should never be used as a way of a human being relinquishing or displacing his or her responsibility for taking a decision, so if an AI system is under your control, you are responsible,” adds O’Sullivan.

“Trust in the tech is not enough. You also want the final users to trust whoever produces that technology,” says Rossi. “You need to build a system of corporate responsibility.”

One way IBM is trying to do this is through collaborative initiatives with international bodies, national governments and other organisations developing AI technologies.

“You want to hear the voice of everybody, not just those who produce the AI system, but those communities that are affected by the AI,” concludes Rossi.

Some of the artificial intelligence applications being explored by IBM

Medgate’s AI-powered symptom checker

One of the AI-powered technologies being explored by IBM Research is a decision support system for patients originally envisioned by digital health company Medgate.

Medgate wants to decentralise and automate the provision of healthcare in Switzerland, and operates a number of consultations and clinics that fit healthcare around the individual.

It offers teleclinics, for example, which allow patients to have a phone call or video chat with their doctor at any time of day, providing 24/7 access to medical consultations. Since 2000, the company claims to have undertaken 7.4 million of these teleconsultations.

These are accompanied by mini-clinics, which are essentially doctors’ surgeries minus the physical doctor. Instead, the doctor will be present by video, while the patient and a medical assistant have access to the broad range of diagnostic devices available at the clinics.

Both types of clinic are also connected to a partner network of more than 1,700 doctors, 50 clinicians and 200 pharmacies, to which a patient can be referred should they need further medical attention or guidance.

All of these services are integrated through the Medgate app, which, although in essence a booking app for the consultations and clinics, acts as a hub through which patients can organise their healthcare needs.

However, not every patient who uses Medgate’s telefeatures necessarily needs to speak with a doctor. To detect and identify the cases that would best be served by a direct referral, Medgate is enlisting an AI-based chatbot, which essentially acts as an intelligent symptom checker.

Medgate claims this will save 15-20% of its costs in Switzerland.

“This is the pattern of the modern healthcare industry – in hospitals you have 90 doctors and five cooks; the future is 90 doctors, 20 software engineers,” says Medgate CEO Andy Fischer, who admits there are still a number of challenges with the technology.

One of these is technological acceptance by patients and physicians.

“Traditionally, healthcare is something that’s been very personal, it’s embedded in our language. [We think,] ‘When I’m sick, I have to see a doctor’, [but] it should be, ‘When I’m sick, I need to make sure my doctor has enough data to decide’,” says Fischer.

“But it’s such a historical, traditional thing that medicine is something personal.”

Beyond building trust in the technology, there are also direct technical challenges, such as teaching the artificial intelligence contextual information that it cannot get from textbooks: for example, the time of day, how close the patient is to a hospital, and how nervous a person is.

“This was, and will be, the major component of our collaboration with IBM,” says Fischer.

Symrise’s AI-generated fragrances

IBM Research has also partnered with fragrance and flavourings manufacturer Symrise to create perfume based on AI-generated digital fragrance models.

The technology builds on previous IBM research into using AI to pair flavours for recipe creation. The ML algorithms sift through thousands of raw materials and pre-existing fragrance formulas to identify patterns and novel combinations that have never been tried.

On top of this, the technology includes algorithms that can learn and predict what the human response to each fragrance would be, how much of a raw material to use, and whether any material can be substituted by another.

Using this data, Symrise and IBM have already created a new perfume, called Philyra, which was generated by AI with the specific design objective of creating a fragrance for Brazilian millennials.

“That’s the part that’s amazing to me – that in 1.7 million formulas, finding something that hasn’t been done before is pretty hard to do,” says David Apel, a senior perfumer at Symrise who has been using the technology.

“The creative process is superfast and very innovative, and very interesting to me because I can still pursue my own methodology of how I create fragrance while she’s [the AI] doing something else in the background,” adds Apel. “She just feeds me ideas, so in that respect it’s a personal assistant that’s never sleeping.”

IBM’s Project Debater adapts AI to human logic

Developing broad AI that learns across disciplines is a core focus for IBM, which has historically been very active in this space. In 1997, for example, IBM’s Deep Blue system beat the chess world champion, and in 2011 IBM Watson defeated the top champions of US game show Jeopardy!.

Now, Project Debater, which absorbs massive amounts of data from a diverse set of information and perspectives to help it make arguments and well-informed decisions, has been used in a number of live public debates with humans.

“I think it’s fair to say that humans really do not have a chance when facing the machines at board games, but these board games, in my opinion, also represent the comfort zone of artificial intelligence,” says Noam Slonim, the principal investigator for Project Debater.

“In search problems, computers are much, much better than humans, so we can use the computational power to overcome human performance. The AI can use a tactic or logic that humans cannot understand in order to win the game. These observations are not true for debate,” says Slonim.

Unlike with board games, where the victor is clear-cut, the value of arguments is inherently subjective, meaning the AI must adapt its logic to human rationale when arguing a point, and do so extremely quickly in an unscripted manner.

Data-driven speech writing and delivery, listening comprehension that can pick up on the nuances and complexities of human language, and a modelling system that uses human dilemmas to enable principled arguments to be formed are the three capabilities that underpin Project Debater’s ability for unscripted reasoning.

IBM’s DeepLocker uses AI to understand cyber threats

With the shift to machine learning and artificial intelligence being touted by IBM as the next major progression in IT, researchers have also begun developing a new breed of highly evasive malware that conceals itself until it reaches a specific, pre-programmed victim.

Most security software today is rules based, but AI can circumvent these rules by learning and understanding them over time. The AI model in DeepLocker is specifically trained to behave within the rules it has learnt unless it is presented with a victim-specific trigger.

These triggers include visual, audio and geolocation features. During a demonstration, for example, one of the researchers working on the technology, cryptographer Marc Stöcklin, trained DeepLocker to recognise the face of an IBM employee.

Upon seeing the employee’s face through a laptop webcam, the malware released its malicious payload and infected the machine.

By developing this technology, IBM Research claims it can better understand the cyber threats of the future, likening its method to examining the virus to create a vaccine.

