This is a discussion post in which I try to formulate the main problems of neural networks whose solution could produce a breakthrough in AI technology. It mainly concerns networks that work with text (GPT, BERT, ELMo, etc.). As they say, a good formulation of a problem is half of its solution. But I cannot find these solutions myself, so I am hoping to "ask the audience": many people face the same problems and may already "see" a solution.
1. The seemingly simplest one: the neural network does not account for facts
The network learns from examples but cannot refer to them in an answer. "It was yesterday" is an almost impossible answer for it. It learns general facts but does not know about the specific ones; in cognitive science these are called semantic and episodic memory, respectively.
A solution might seem simple, but a neural network is a classifier, and individual precedents cannot be classes: a contradiction. Bots very often need exactly this kind of answer, and they handle facts very badly unless it is a hard-coded pattern like "set the alarm for... / what time did you set the alarm for?". The problem is aggravated by the fact that there are always exceptions, which the network cannot take into account if it did not see enough examples of them; and if there are enough examples, it is no longer an exception. In short, a network can say "this is a hat", but it cannot say which hat is mine (there was only one example).
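To make the semantic/episodic distinction concrete, here is a toy sketch (my own illustration, not a solution proposed in the text): a classifier generalizes over classes, while an explicit store of individual episodes can answer "which one is mine" by nearest-neighbor recall of a single stored example. The embeddings and metadata below are made up.

```python
# "Episodic memory" as an explicit store of single examples,
# queried by nearest neighbor; contrast with a classifier,
# which only outputs a class label like "hat".

episodic_store = []  # list of (embedding, metadata) pairs

def remember(embedding, metadata):
    episodic_store.append((embedding, metadata))

def recall(query, distance):
    # Return the single stored episode closest to the query.
    return min(episodic_store, key=lambda e: distance(query, e[0]))

# Usage with trivial 2-D "embeddings" (hypothetical values):
dist = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
remember((0.9, 0.1), {"class": "hat", "owner": "me"})
remember((0.2, 0.8), {"class": "hat", "owner": "brother"})
_, meta = recall((0.85, 0.15), dist)
print(meta["owner"])  # -> me
```

The point of the sketch: one example is enough for recall, whereas a classifier would need many examples per class to learn anything.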
2. “Common sense”
A well-known problem, sometimes called the "dark matter of AI". There are interesting approaches to a solution, such as attempts to combine symbolic (logical) AI with the neural network approach, but that is an attempt to go backwards instead of forward. The problem is that "common sense" is implicit knowledge of the world that was never in the training dataset. Nobody even writes down such banalities; children acquire them at the age of 4-6, before they can write. The loud failures of the Compreno and Cyc projects show that it is impossible to describe all such facts explicitly. Somehow they are deduced on the fly. There are no good ideas for a solution yet, except restricting the vocabulary. For example, the word "schoolboy" should impose "filters" on the vocabulary of the answer, so that the candidate answers do not contain the words "army" or "marriage" if we are talking about the schoolboy himself, and not about his elder brother being at a wedding. How to do this in a neural network is not clear (to me).
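The "dictionary restriction" idea above can at least be sketched at decoding time: suppress banned words by masking their scores before choosing an output. The vocabulary, scores, and ban list below are invented for illustration; this shows the mechanism, not how to decide *which* words to ban, which is the actual unsolved part.

```python
# Sketch of vocabulary filtering at decoding time: banned words
# get a score of -inf, so their probability becomes exactly 0.
import math

vocab = ["school", "lessons", "army", "marriage", "homework"]
logits = [2.0, 1.5, 2.5, 1.0, 0.5]   # raw model scores (hypothetical)
banned = {"army", "marriage"}         # filter triggered by "schoolboy"

masked = [l if w not in banned else -math.inf
          for w, l in zip(vocab, logits)]

# Softmax over the masked scores.
exps = [math.exp(l) for l in masked]
total = sum(exps)
probs = [e / total for e in exps]

best = vocab[probs.index(max(probs))]
print(best)  # -> "school", even though "army" had the top raw score
```

Libraries such as Hugging Face `transformers` expose this mechanism as generation parameters (e.g. lists of banned word ids), but the hard problem remains choosing the filter from context.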
3. An equally important problem, possibly related to the previous one: building reasoning
Neural networks cannot make syllogisms, i.e. the simplest inferences with consistent intermediate conclusions. The flip side of the same problem is the inability to pursue a goal in reasoning, or even to stick to a given meaning. GPT can build a news text on a given topic, but it is useless to tell it "write a news item to defame X". At best it will write about X being vilified by others, and explicitly, not between the lines as we humans can. The conclusion of a syllogism is also a goal: the premises must be correlated with the conclusion, and it must be kept in mind from the very first premise. It is not even clear yet from which side to approach this in a network. Maybe someone knows?
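What a syllogism requires is easy to state symbolically: the conclusion must be derivable from the premises by chaining. A minimal sketch (the rules here are the classic hypothetical example, not anything from a real system):

```python
# Rules of the form ("x", "y") meaning "every x is y".
rules = {("man", "mortal"), ("greek", "man")}

def entails(x, y, rules, seen=None):
    # Is "every x is y" derivable by chaining the rules?
    seen = set() if seen is None else seen
    if (x, y) in rules:
        return True
    seen.add(x)  # guard against cycles in the rule set
    return any(mid not in seen and entails(mid, y, rules, seen)
               for (a, mid) in rules if a == x)

print(entails("greek", "mortal", rules))  # -> True
```

For a symbolic engine this chaining is trivial; the open question in the text is how to get a purely neural model to hold the goal (the conclusion) in mind while producing the intermediate steps.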
4. And one more problem, which is not even dark matter, but a black hole of AI
These are analogies and metaphors. AI understands everything only literally. It is useless to tell it "similar to X". The network may complete a description, but it cannot describe an analogue. Maybe this is just a matter of the right dataset, but it seems to me the problem is deeper and, like point 3, reveals a root "flaw" of current AI architectures. Our language is metaphorical, hence the "curse of linguists": homonymy. Through metaphors, the same lexemes are used in a pile of different "concepts", and we navigate this easily. This is partly addressed by intent detection, but again that determines only the "topic", not the whole concept, which consists of more than the name of the intent and its associated answer patterns, as in bots.
So far, these four are enough for discussion, although there are more specific but no less important problems, for example in building bots. It is enough to talk to Alice (Yandex's assistant) and they become intuitively obvious. But formulating them is not so simple: to state exactly what the problem is means half-guessing how to solve it. That is the harder part.