In our continuing series on the basic concepts of Artificial Intelligence, today we take a closer look at ‘expert systems,’ a somewhat (but not entirely) obsolete branch of symbolic AI. For a long time, expert systems were the most promising, most hyped products of AI research. But the philosophical attacks by Dreyfus, Winograd and others, together with a lingering sense that expert systems had failed to deliver on their promises, contributed to the 1980s disillusionment with AI, since dubbed the “AI winter,” which ended only with the advent of deep learning in the early 2010s.
What is an ‘expert system’?
Expert systems are rule-based inference machines for particular domains of knowledge. They are intended to replace “experts” in that domain.
Here’s a very simplified example of the kind of rules an expert system might use to identify microorganisms from some basic observational data:
Example rule of a medical diagnosis expert system (MYCIN):
- IF the stain of the organism is gram-positive
- AND the morphology of the organism is coccus
- AND the growth conformation of the organism is clumps
- THEN (probability=0.7) the identity of the organism is staphylococcus.
Big expert systems had thousands of such rules.
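Such a rule is essentially structured data. Here is a minimal sketch, in Python, of how a rule like the MYCIN example above might be represented and matched against observed facts. The field names and encoding here are hypothetical illustrations, not MYCIN’s actual LISP representation:

```python
# A minimal sketch of a MYCIN-style rule encoded as data.
# Field names are illustrative; the real MYCIN stored rules in LISP.

rule = {
    "if": [
        ("stain", "gram-positive"),
        ("morphology", "coccus"),
        ("growth_conformation", "clumps"),
    ],
    "then": ("identity", "staphylococcus"),
    "certainty": 0.7,
}

def rule_applies(rule, facts):
    """True if every IF clause matches an observed (attribute, value) fact."""
    return all(clause in facts for clause in rule["if"])

# Facts entered by the user as (attribute, value) pairs:
observed = {
    ("stain", "gram-positive"),
    ("morphology", "coccus"),
    ("growth_conformation", "clumps"),
}

if rule_applies(rule, observed):
    attribute, value = rule["then"]
    print(f"{attribute} = {value} (certainty {rule['certainty']})")
```

A real system would of course store thousands of such rules and index them for fast matching, but the underlying idea is no more complicated than this.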
Expert systems were advertised for use in all areas where the knowledge of a domain could be encoded in large numbers of relatively simple and clear rules. Such areas included diagnostic medicine, biology, engineering (diagnosis of car or computer faults), the processing of credit applications in banks, and similar domains.
Expert systems replace experts
Let’s look again at the definition above: “Expert systems are rule-based inference machines for particular domains of knowledge. They are intended to replace experts in that domain.”
- Expert systems are rule-based. So they are examples of symbolic AI systems. The world they know about is represented as a collection of explicit rules that contain references to the objects the system knows about (microorganisms, stains, morphologies, conformations; or: computers, screens, keyboards, error messages, beeping sounds).
- Expert systems are inference machines. Their main purpose is to perform logical inferences, that is, to deduce a conclusion from a given set of premises. Given that the computer is turned on, electricity is present, the screen lights up, but the boot process stops with error message number 42, the system can deduce what is wrong and inform the user about the fault and how to fix it (or whom to contact). This is the biggest difference between ‘classic era’ expert systems and modern deep learning applications, and also the biggest drawback of an expert system: a deep learning system can deal with ‘fuzzy’ or ambiguous inputs, and it can learn to draw conclusions from noisy or incomplete data. A classic expert system, on the other hand, needs a complete set of explicit rules, and it can only process its input data insofar as it can deductively draw conclusions from them.
- For particular domains of knowledge. All expert systems of the classic era of symbolic AI applied only to very narrow domains. This was necessary since the whole of the domain had to be encoded into distinct, clear, complete and contradiction-free formal rule systems. Clearly, this could only be achieved for relatively narrow domains.
- They are intended to replace experts. This answers the question “why bother?”: human experts are a precious and limited resource. They take half a human lifetime and immense cost to produce: think of the years of learning and practice that are required to make a heart surgeon, a cancer specialist, a medieval art expert, an experienced rescue pilot, a foreign relations expert, or a specialist in planetary geology. Experts are in very short supply in most places around the world, particularly where they would be needed most: vast parts of the globe cannot afford specialised doctors, fake artworks can be sold as real because specialists are not available to analyse them, and experts in planetary geology are too fragile to strap onto a rocket and send crashing into interesting celestial bodies. In all these cases, we’d like very much to have a machine that can do the job. Unlike humans, machines can be easily replicated: one needs to create only a single expert system, which can then be copied again and again to produce thousands of identically performing systems, something that is very clearly not the case with humans. Every single doctor has to begin training as a baby, spend years and years learning how to eat, how to speak, how to wash his hands, how to read and write; at which point he’s just barely able to actually start learning useful stuff: chemistry, physics, biology, anatomy, physiology, histology, radiology and so on, for years and years, until finally he’s a rookie doctor, inexperienced, awkward and afraid. Now follow hundreds of badly treated patients, unnecessary mistreatments and the occasional avoidable death, experiments that go wrong again and again, errors of judgement, gaps in one’s knowledge, until, many years later, the doctor has finally reached competence in his discipline.
And now only begins the long path to real expertise, which will take another ten years or so, years of stress, doubt, uncertainty, long nights in the library, in the lab, at the patient’s bedside, months and years of staring at X-ray images and lab results, of slowly learning to see connections, to see patterns, to feel confidence and certainty, to recognise the special case where the lab fails, where the X-ray is misleading and ambiguous, and to know the truth of each case.
The prospect of replacing all that with a set of rules was a great, utopian promise. No wonder it drove the interest in AI almost single-handedly for decades.
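Before we turn to how such a system is structured, the kind of deduction sketched in the computer-fault example above can be captured in a tiny forward-chaining loop. All rule and fact names below are invented for illustration; real systems added conflict resolution, explanations and much more:

```python
# Toy forward-chaining sketch of the computer-fault example.
# Each rule is (set of premises, conclusion); all names are invented.

rules = [
    ({"power_on", "screen_lights_up", "boot_error_42"}, "disk_controller_fault"),
    ({"disk_controller_fault"}, "advise_replace_controller"),
]

def forward_chain(facts, rules):
    """Repeatedly fire every rule whose premises are all satisfied,
    adding its conclusion to the fact set, until nothing new follows."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain({"power_on", "screen_lights_up", "boot_error_42"}, rules)
# 'advise_replace_controller' is now in the derived facts, so the
# system can tell the user what to do about the fault.
```

Note how the second rule only fires because the first one has added a new fact: chains of such small deductive steps are what gave these systems their apparent depth.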
Basic structure of an expert system
[Figure: Expert system architecture]
Expert systems generally have a similar basic architecture:
- At the core of it is a database of the domain knowledge that the expert system contains. This is called the knowledge base. Often, this knowledge takes the form of ‘if/then’ rules (as in the MYCIN example above). The ‘if/then’ format allows for a trivial inference logic, where the system tries to match the ‘if’ clauses with the facts provided by the user, emitting the statements after the ‘then’ clauses as the expert’s diagnosis of the problem.
- This logical inference is performed by a component called the inference engine. The inference engine must not only match ‘if’ clauses with the user-entered facts, but it must also have some way to deal with contradictory or unclear facts in the knowledge base. For instance, a person might be a vegetarian, but in an emergency situation they might eat meat to survive. The inference engine will also generally implement some sort of logical calculus, so that it can process logical relations between facts. Most likely this will be a variant of propositional or predicate logic, but inexact reasoning (probabilistic inference, certainty factors, or fuzzy logic) is also a common approach, because the “facts” of our everyday experience are generally not “true” or “false” in a binary way, but “likely true” or “likely false.” A small child living with two grown-ups of different gender as a family is likely to be their child, but this is not certain. The child may be adopted, or the adults may be its grandparents, and so on. If the inference engine does not consider the more unlikely options at all, it will sometimes make errors that a human expert would avoid. So it’s a good idea to attach probabilities to facts, in order to get a better model of the world. The problem with probabilities in the knowledge base is, obviously, twofold: (a) many probabilities of everyday ‘facts’ are not known. How likely is it really that a child living with two adults as a family is their child? One might perhaps find some statistics on that, but finding the correct values for such facts can be difficult and will certainly drive up the cost of developing the expert system. (b) Our everyday thinking does not really seem to work with probabilities. Human common sense is bad at probability calculations, as whole books on probability fallacies demonstrate. If expert systems are to mimic how experts really think, they must also model human likelihood estimation, even where it diverges from the formally correct probability calculus.
- The code that controls the dialogue between the user and the system is called the user interface. This is of little theoretical interest, but of huge practical importance. A bad user interface can render the best expert system unusable. A brilliant user interface can cover up many deficiencies of the underlying system and provide value to the user even if the actual expert system is less sophisticated. User interfaces can be menu-driven, or based on a command line. They can accept problem descriptions in natural language or even in human speech. They might even use pictures or other graphical elements as part of the user interaction.
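To make the inexact-reasoning point concrete: MYCIN famously attached certainty factors to its conclusions and combined supporting evidence with a simple formula rather than full probability theory. The combination rule for two positive certainty factors below is MYCIN’s published one; the surrounding code is a simplified sketch:

```python
# MYCIN-style combination of two positive certainty factors supporting
# the same conclusion: CF = cf1 + cf2 * (1 - cf1).
# A second piece of evidence increases confidence, but the result
# never exceeds 1.0, which is what makes the scheme workable.

def combine_positive(cf1, cf2):
    """Combine two positive certainty factors for the same hypothesis."""
    return cf1 + cf2 * (1 - cf1)

# Two rules each suggest the same organism, with certainties 0.7 and 0.4:
cf = combine_positive(0.7, 0.4)
print(round(cf, 2))  # 0.82
```

Note that this is not a probability calculation: certainty factors were an engineering compromise, chosen because experts found it far easier to state a number like “0.7 suggestive evidence” than to supply consistent conditional probabilities.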
[Figure: Expert system individual roles]
A big expert system, with all these different components, cannot usually be created by a single person. The development of an expert system is therefore divided among several people with different roles:
- The domain expert is the person who is going to be replaced by the system. It is his knowledge that the system aims to model and reproduce.
- The knowledge engineer has the task of extracting the domain expert’s knowledge and encoding it in some machine-readable form, so that the expert system can access it. The knowledge engineer is, therefore, the human interface between the expert (who does not need to understand IT) and the programmers (who don’t need to understand anything about the expert’s domain).
- The system engineers are the programmers who are actually building the software. They program the user interface, the inference engine and all other parts of the system, using the knowledge base that the knowledge engineer has provided.
- Finally, the user is the one who will be using the expert system in the field, accessing it through its user interface in order to get expert advice.
The promise of replacing costly and precious human experts by mass-produced and cloned machines was a very strong incentive for the industry (and governments that supported it) to spend years and billions in the development of expert systems. The dream never quite came true. Why it failed, and why it perhaps never had a chance, we will see in the next post in this series.