“Busting Myths” of AI by Richard Socher, Chief Scientist at Salesforce Research. An AI-Hub Europe Exclusive

Coming up with a precise definition of AI for a broad audience is not easy, because AI is changing and developing at a very rapid pace. Those who work on it and explore it rarely use the term AI. One thing can be said for the foreseeable future, however: AI is and remains a tool made by people and for people. And that’s the point. It does not work on its own, but in interaction with people or on specific tasks. It links computing power with tasks that require a specific kind of intelligence. The facets of human intelligence, in contrast, are not easy to outline.

It is well known that AI research became tangible for the first time with the development of the chess computer. It had to be fed millions to billions of games and replay them against itself in order to defeat human players in this particular discipline. Today, however, millions of people use AI in different scenarios every day – examples include Internet searches or voice interfaces such as Siri or Alexa. This is also where AI and everyday life come together, because AI is never AI for its own sake. It is always (artificial) intelligence in combination with a concrete challenge or application. I call this AI+x, for a certain x, such as playing Go or recognizing a disease in X-rays.

Deep Learning Breakthroughs
The fact that AI is gaining so much momentum right now is due, on the one hand, to the computing and storage capacity that is now available. On the other hand, there is much more data available today that an AI can use for training. Above all, however, the current enthusiasm is due to the breakthrough of deep learning in 2010. The machine has learned to learn more independently. At the same time, machine learning differs considerably from human learning. People can learn from a few examples; machines need thousands or millions of iterations and examples. If we want to convert spoken sentences into text, we need thousands of hours of material. Each word must occur several hundred times. For images, too, we have to show thousands of examples of an object before an algorithm can detect it. People, by contrast, can understand words from context after just one or a few examples. If we try this in AI (in a research area called “zero-shot learning”), the results are usually not very accurate yet.

The following sentence provides a good example: “The girl has picked up the Uffjah.” Nobody knows what exactly an “Uffjah” is, but our general knowledge allows us to draw intuitive conclusions about its basic characteristics within fractions of a second. It must be a tangible object that is not too big, not too heavy, and not threatening – otherwise, a child could not or would not pick it up.


Conversational Conventions
The same applies to social intelligence and knowledge of conventions. If, for example, an appointment is to be made for “one of the next Tuesdays”, then it is clear to everyone that a Tuesday within the next two to three weeks is meant – and not one thousands of weeks away. Or that the calendar may still have a gap at 11 pm, but you don’t schedule a meeting that late – unless you are in a completely different time zone, a complicating exception. Points like these are implicitly obvious to humans but difficult to teach to software. They are rules, but rules with many exceptions, and that is hard for algorithms.

Language, probably the most exciting manifestation of human intelligence, posed and still poses great challenges for AI research. With the breakthroughs in deep learning, development is accelerating. A striking example of failed entity disambiguation made headlines in 2011: Anne Hathaway hosted the Oscars, and the shares of Berkshire Hathaway went through the roof at the same time. What happened was that a very simple natural language processing (NLP) algorithm used by day traders confused the actress and the stock because it could not take the context into account. Today, speech recognition is already available to every smartphone user, but there are other, more complex applications that can make everyday business life easier. As is well known, data is the new gold, but companies still struggle to extract meaning and insight from all of that data. Current advances in NLP are helping to refine speech recognition and to bring such applications into use more quickly.
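The confusion described above can be illustrated with a minimal Python sketch. It is not the trading algorithm from the anecdote, whose details are not known here; the keyword table, the list of finance words and both function names are assumptions made up for illustration. The first function matches a surname without looking at context, the second only reports a company when a finance-related word also appears in the headline.

```python
import re

# Hypothetical keyword-to-ticker table (illustrative assumption).
COMPANY_KEYWORDS = {"hathaway": "BRK.A"}

# Hypothetical words suggesting a headline is about finance rather than film.
FINANCE_CONTEXT = {"shares", "stock", "earnings", "investor", "buffett", "berkshire"}


def tokenize(text):
    """Lowercase a headline and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def naive_match(headline):
    """Return the ticker for every keyword that appears, ignoring context."""
    return [COMPANY_KEYWORDS[w] for w in tokenize(headline) if w in COMPANY_KEYWORDS]


def context_match(headline):
    """Return a ticker only if a finance-related word appears as well."""
    words = tokenize(headline)
    if not FINANCE_CONTEXT.intersection(words):
        return []
    return [COMPANY_KEYWORDS[w] for w in words if w in COMPANY_KEYWORDS]


film = "Anne Hathaway dazzles at the Oscars"
finance = "Berkshire Hathaway shares rise after earnings"

print(naive_match(film), context_match(film))
print(naive_match(finance), context_match(finance))
```

The naive matcher treats the film headline as news about the company, while the context check filters it out. A hand-written word list like this is itself brittle, which is why learned models that represent context are attractive for this problem.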

AI in decathlon mode
One of the most exciting research areas in AI is currently multitask learning, i.e. solving different tasks with the same algorithm. All successful models of recent years have focused on a single problem: one algorithm that plays Go, one that translates, another that answers questions, and so on. That way, AI will never become as generally useful and broadly applicable as our brain. This is exactly what my research group has now worked on with decaNLP, the natural language decathlon. We have developed a model that can solve ten different language processing problems and answer questions it has never seen before in training.

What sets decaNLP apart: previous approaches focused on continuously refining specific areas of application in imitating human language, but did not master the holistic understanding and execution of tasks. With the new model, AI can now draw conclusions and answer questions to an unprecedented extent. This means that it does not have to be trained for every single formulation of a question, but can understand relationships on its own.

With decaNLP, a general model for language is now available that can handle ten natural language processing tasks at once: question answering, machine translation, summarization, natural language inference, sentiment analysis, semantic role labeling, relation extraction, goal-oriented dialogue, semantic parsing (database queries) and pronoun resolution. In the past, a separate model had to be created and trained for each of these tasks. Data scientists now need only one model for ten tasks. The zero-shot learning mentioned earlier is provided by the neural network in decaNLP: the multitask question answering network (MQAN) can generalize to completely new tasks on the basis of different but related tasks. This allows improvements in transfer learning for machine translation and named entity recognition, and in domain adaptation for sentiment analysis and natural language inference. For example, MQAN could lead to a new generation of chatbots that enable more natural and targeted human-machine interaction.
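One way to picture how a single model can serve several tasks is to pose every task as a question over a context, as the name “multitask question answering network” suggests. The Python sketch below only illustrates that data format. The `Example` type, the `to_model_input` helper, the prompt wording and all example texts are assumptions invented here; none of it is decaNLP’s actual code or data.

```python
from collections import namedtuple

# Every task shares one shape: a question, a context, and the expected answer.
# All names and texts below are invented for illustration.
Example = namedtuple("Example", ["task", "question", "context", "answer"])

EXAMPLES = [
    Example("sentiment analysis",
            "Is this review positive or negative?",
            "The film was a delight.",
            "positive"),
    Example("machine translation",
            "What is the translation from English to German?",
            "Good morning.",
            "Guten Morgen."),
    Example("summarization",
            "What is the summary?",
            "One model was trained on ten language tasks at the same time.",
            "One model, ten tasks."),
]


def to_model_input(example):
    """Join question and context into the single text a shared model would read."""
    return f"question: {example.question} context: {example.context}"


for ex in EXAMPLES:
    print(ex.task, "->", to_model_input(ex))
```

Because the task is carried by the question rather than by a task-specific model, a new question about a familiar kind of context can, in principle, be attempted without building a separate system for it.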

Advances in Computer Vision
Another area in which deep learning has made great progress in recent years is computer vision, i.e. image recognition. One of the biggest challenges for AI in recent years has been the development of so-called end-to-end trainable models. You take raw data, for example the pixels of an image, and want to predict a final result from it, such as whether the image shows a cat or a dog. If you let the raw data flow into these models, they try to learn more and more complex representations and shapes. Since they start with pixels, the first layer can only identify simple edges and colours, which is surprisingly similar to the early visual cortex in the human brain.

On the next level, they combine these simple lines and colours into more complex textures and patterns. As they go deeper and deeper into the different layers, they find more and more parts of objects and eventually combine them to identify complete objects. This entire process, with all its intermediate representations, is now learned automatically from labeled data: images with their pixels and the result or object that the algorithm is to predict.
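What a first-layer edge detector computes can be sketched in a few lines of plain Python. The filter weights here are written by hand purely for illustration; in an end-to-end trainable model they would be learned from labeled images. The function name `filter_response` and the tiny test image are assumptions made for this sketch.

```python
def filter_response(image, kernel):
    """Slide a small kernel over the image and sum the element-wise products.

    This is the sliding-window operation used in convolutional layers
    (technically a cross-correlation), evaluated only where the kernel fits.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x6 grayscale image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-written vertical-edge filter; a trained network would learn such weights.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

response = filter_response(image, vertical_edge)
for row in response:
    print(row)
```

The response is zero over the uniform regions and large where dark meets bright, which is the kind of simple edge signal that deeper layers then combine into textures, parts and whole objects.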

Everyday helper and life saver
The advances in computer vision and multitask learning are opening up many new opportunities. For example, computer vision can be used in oncology as a cost-effective and simple way to count blood cells or in radiology to interpret a CT scan so that patients receive faster and more individualized care. In emergency rooms, it can help with triage by making initial diagnoses of strokes or cerebral hemorrhages, for example, and thus saving lives.

The goal of AI is to help people. Just as the invention of the vacuum cleaner or the washing machine allowed us to do household chores more efficiently, in future we will have more time to concentrate on things that are more interesting and important. AI helps us to operate applications and devices even better and more easily. One thing will always apply: AI is only as good as its training. If you feed a machine intelligence negative content, it will internalize and multiply it, as happened with Tay, Microsoft’s Twitter bot, which was taken offline after just one day. Regulation and ethical issues must therefore be addressed urgently in this context.