David Casacuberta, Saray Ayala and Jordi Vallverdú
Embodying cognition: A Morphological Perspective
Abstract
After several decades of success in different areas and numerous effective applications, algorithmic Artificial Intelligence has revealed its limitations. If in our quest for artificial intelligence we want to understand natural forms of intelligence, we need to shift from platform-free algorithms to embodied and embedded agents. Under the embodied perspective, intelligence is not so much a matter of algorithms as of the continuous interactions of an embodied agent with the real world. In this paper we adhere to a specific reading of the embodied view usually known as enactivism, to argue that 1) it is a more reasonable model of how the mind really works; 2) it has both theoretical and empirical benefits for Artificial Intelligence; and 3) it can be easily implemented in simple robotic sets like Lego Mindstorms (TM). In particular, the authors will explore the computational role that morphology can play in artificial systems. They will illustrate their ideas by presenting several Lego Mindstorms robots in which morphology is critical for the robot's behaviour.
Keywords: multiple realizability, morphological computation, enactivism, cognitivism, XOR.
1. From symbols to bodies
Artificial Intelligence (AI) can be approached with a purely engineering frame of mind, looking for algorithms that work and are able to solve a problem. However, one can also adopt a philosophical frame of mind, and consider AI a conceptual tool to gain better insight into what the mind is and how it works. Within this frame of mind, just solving problems is not enough: we want our theory to have, to a certain degree, psychological reality. We want our model to embed some of the earthly properties that human minds have. Currently, discussion revolves mainly around three models of what the mind is: symbolic cognitivism, connectionism and the embodied mind. In this paper we adhere to the third model; in particular, to a special branch usually known as enactivism, to argue that 1) it is a more reasonable model of how the mind really works; 2) it has both theoretical and empirical benefits for AI; and 3) it can be easily implemented in simple robotic sets like Lego Mindstorms (TM).
Much has already been written about the differences between these three models of mind, and about which is the superior one. To our understanding, despite their success in creating models of subjects like mathematical reasoning, face recognition, visual perception or even the creation of artworks, both the cognitivist and the connectionist approaches have one major flaw of considerable philosophical importance: they cannot produce a credible account of the relationship between mind and world. Whether they use local symbolic representations or distributed subsymbolic ones, both models are based on an abstract reconstruction of a specific domain of the physical world, and both the selection of representations and the way they are connected to real-life events and objects have been articulated beforehand by the cognitive system (Thompson 2007). Connectionism tries to generate a more plausible description of the mind by better capturing its neurological basis. This leads to a more dynamic account of representations: instead of being something stable, they are distributed along the whole system as well as self-organised, maintaining a certain co-variation with the environment. However, both symbolic cognitivism and connectionism consider the world and the mind as two completely different entities, with a highly regulated protocol of interaction.
The embodied mind shares some characteristics with connectionism. It also proposes a self-organised system and it is based on a dynamic approach. However, in this approach dynamicism is extended to the correspondence between mind and world. Instead of a simple coordinated correspondence between symbols (or subsymbols) and real-life objects, the embodied mind paradigm is based on a system of non-linear causality in which, by means of sensorimotor integrations, brain, body and environment continuously influence one another, making it impossible to separate the three into clear-cut parts. In order to have such a system, it is essential that the cognitive entity has some sort of body that can obtain continuous information from the real world in order to co-vary and co-adapt with it (Thompson 2007). This is why the paradigm we are discussing is usually called the embodied mind. First of all we need to avoid the tendency to interpret the notion of embodiment in its weakest sense: that is, that a mind needs a body. The embodied mind paradigm argues for something much stronger: that mind is just the result of circular and continuous processes of causality between brain activity, body and environment, with no possibility of making a clear distinction among them, nor any chance of building a theoretical model in which mind can be described autonomously from body and environment (Pfeifer and Iida, 2005).
The particular reading of the embodied mind paradigm we adhere to here, known as enactivism, is based on the following ideas (Varela, Thompson and Rosch 1991):
1) Living beings are autonomous entities, responsible for their own goals, which are not simply imposed from the outside.
2) The nervous system is also an autonomous entity, responsible for maintaining its own coherent and meaningful patterns.
3) Cognition is the skillful know-how that co-varies with the environment and its evolution. Every cognitive action is both situated and embodied.
4) Cognitive processes are not formally prespecified; they are relational domains in continuous coupling with the environment.
A large part of the literature takes living beings as the main metaphor. In their seminal book, Varela et al. (1991) developed most characteristics of their model by analysing the way cells behave and represent the environment. Nevertheless, this shouldn't be considered a vitalist model claiming that only living beings can achieve real consciousness. Continuous coupling with the environment and self-established goals are the only requirements, as is shown in the aforementioned book when Varela et al. argue for the relevance of Brooks' robots, presenting them as artificial systems that have some of the main characteristics of an embodied mind (Brooks 1991).
In this work we will defend the enactive approach by exploring the critical role morphology plays in artificial systems. The structure of this work is as follows. First we will point out the benefits of the enactive approach. We will then explore the (in)compatibility between the embodied view and the multiple realizability of mental processes, digging into the debate between two different readings of the embodied perspective, a functionalist and a reductionist one. We will illustrate our explanation with a thought experiment (a robot computing the XOR function courtesy of its morphology), concluding that the functionalist stance does not really match the enactive view. This thought experiment serves as the inspiration for our own proposal: three robots that compute XOR courtesy of their morphology. Before introducing the robots, we will review our previous research, which constituted our first approximation to the possibility of morphology playing a role in computation.
2. The quest for enactive AI
Despite the fact that philosophers like Dreyfus (Dreyfus 1972; 1992) are convinced of an unbridgeable gap between living and artificial beings that makes an enterprise like AI impossible, one can reverse the line of thought and attempt to discover whether and how the key characteristics of living beings can be reproduced in artificial systems, either by means of simulations or robotics. This is what the enactive paradigm tries to understand. Following Froese and Ziemke (2009) we can state two main systemic requirements for being able to speak of enactive AI: autonomy and adaptivity. Despite claims of mystery (Flanagan 2009), these two properties are not beyond scientific and philosophical analysis; they are not restricted to the realm of living beings, and can be satisfied in an artificial system.
Instead of trying to rapidly dismiss the Heideggerian criticisms of current AI raised by the already mentioned Dreyfus, or the more biologically based ones of Di Paolo and Iizuka (2008) or Moreno and Etxeberria (2005), we believe it is better to take this challenge seriously. Once all arguments and counterarguments are settled, it is clear that current approaches to AI which don't include some sort of enactive/embodied perspective face several challenges which need to be addressed.
One of these problems is what Dreyfus calls the Big Remaining problem, closely related to what Dennett called the Frame problem (Dennett 1984). This problem refers to the (im)possibility of a formal model of a cognitive system being able to "directly pick up significance and improve our sensitivity to relevance" (Dreyfus, 2007). If artificial systems cannot get meaning from the outside, but merely follow a set of formal rules, we are missing the main characteristics that make an agent a real one, besides such systems not being really useful in real-life contexts, as the frame problem shows. The problem of instantiating meaning in artificial systems is not solved by simply adding sensors that connect to the environment. As we mentioned in the previous section, the embodied mind paradigm implies more than just having a body. Following Di Paolo & Iizuka (2008) as well as Froese & Ziemke (2009), getting motor systems and sensors into a loop is a necessary condition for autonomy and adaptivity, but it is far from sufficient. As long as this feedback between environment and cognitive system is imposed from the outside, we won't have a real enactive system, which needs to set its goals from the inside.
The need for intrinsically posed goals is not demanded only by the enactive perspective. It was stated as early as Kant (1790) and can be found in authors who defend a biological, Darwinian approach to functionalism, like Millikan (1991) or Flanagan (2009). Whether from an a priori analysis of the concept of autonomy or from an attempt to naturalise it, the consequence is largely the same: in order for a system that adapts to the environment to be autonomous, it has to be the system itself that sets the goals, not an outside element which postulates those criteria from the beginning. As the biosemantic model defended in Millikan (1991) states, this doesn't imply any type of vitalism or appeal to mystery. The fact that the aims and plans of cognitive living systems are intrinsic can be explained by the process of natural selection.
Following the ideas stated in Froese & Ziemke (2009), we will present the basic methodological principles behind enactive AI. They are:
1) Methodology is viewed in a scientific light and not as an engineering process. We want to understand the mind, not only solve practical problems in face recognition or make guesses about the behaviour of the stock market.
2) Behaviour cannot be prefigured by formal guesses about how the world is and then implemented in the system. Behaviour emerges from the continuous interactions between the system and its environment.
3) An optimal enactive AI system needs to find a balance between robustness and flexibility, as one can see in the natural world.
What is the main difference between enactivism and plain embodiment? Despite the fact that Thompson (2007) uses both terms almost synonymously, we believe, following Froese & Ziemke (2009), that there are interesting differences between them. Basically, we will use the term enactivism to distinguish this view from a more general approach to the notion of embodiment, one which seems content with arranging a closed sensorimotor loop that allows co-variation between internal models in the brain and the outside world. Although this is necessary, it is not sufficient: in order to assure the real autonomy of agents, more needs to be added to the system.
How can we develop AI that conforms to the enactive principles? The most feasible way (and probably the only one) is to forget completely about multiple realizability and the omnipotent power of Turing machines, and to include both the physical structure of the system (the body of the appliance, shall we say) and the environment as key elements in computation. We will explore this in the next section.
3. Enactive AI? Morphology to the rescue!
In order to develop AI that conforms to the principles of the enactive framework, first we have to face the assumed multiply realizable nature of minds. The Multiple Realizability Thesis (MRT; Putnam, 1967) has for many years been a good justification for the Cartesian-like methodology characteristic of the disciplines studying mind over the past decades (like philosophy of mind, psychology and cognitive science). This methodology operates under the assumption that mind can be explored and explained with no (or little) attention to the body. Putnam had conceptual and empirical arguments for MRT. Both constituted arguments against the Identity Theory, a cutting-edge theory at the time, which claimed that states and processes of the mind are states and processes of the brain (Place, 1956; Feigl, 1958; Smart, 1959). The conceptual argument originates from the assumption that the Turing machine is the right way to conceive minds. The empirical argument draws attention to the fact that mental states of the sort humans possess may be had by creatures that differ from us physically, physiologically, or neurobiologically. If we are willing to attribute to other creatures, like octopuses or a potential extraterrestrial form of life, the same mental states that we have (e.g. pain or hunger), then we have to detach mind from any particular physical structure (e.g. the human brain). The important criterion for mental sameness here is not physical sameness, but functional sameness. Therefore, the particular matter that minded creatures are made of, and the particular body they have, are of minor importance when studying and explaining how their minds work.
This disembodied methodology has also dominated AI, again because of some form of multiple realizability. As stated by the cognitivist paradigm, cognition has to do with algorithms (operating over representations), and, until recently (the 1980s), AI was exclusively concerned with finding effective algorithms. Algorithms are platform-free, that is, the same algorithm can be implemented in different physical structures. Algorithmic minds are free from the (constraints of the) body.
However, MRT has been called into question. Recent works argue that the evidence in favor of MRT as an empirical claim is not as convincing as many philosophers have claimed (Bickle, 1998, 2003; Bechtel & Mundale 1999; Shapiro 2000, 2004; Polger, 2002). It seems reasonable to say, for example, that in the same way that an octopus and I do not share the neural mechanisms underlying the psychological state of being hungry, we also do not share the same psychological state of being hungry (Bechtel & Mundale, 1999). Putnam's MRT uses a coarse-grained analysis when referring to psychological kinds. This is a legitimate practice in science. But when it comes to considering brain states, Putnam uses a fine-grained analysis. When the same grain is employed at both levels (psychological and neurological), the picture we get is a different one. A coarse-grained analysis allows for similarities at both physical and psychological levels, while a fine-grained inspection drives us to the conclusion that particular mental processes might require particular physical structures. Neural plasticity, in turn, has been adduced as evidence for MRT (Block & Fodor, 1972). For example, the capacity of the brain to process language sometimes in the right hemisphere is said to be evidence for MRT. We should be cautious, however, about that conclusion. In those arguments, the emphasis is placed on the location of the mental activity, not on the processes by means of which the mental activity is produced. The processing of language in the right hemisphere might be done in the same way (by means of the same processes) as in the left hemisphere, and therefore does not necessarily constitute an interesting case of multiple realizability. An interesting case involves different processes producing the same function. As long as the neural plasticity argument elaborates only on differences in location, it lends no support to MRT. And in response to Putnam's conceptual argument, we can simply note that the power of the Turing Machine metaphor, and in general of the computational functionalism developed by Putnam (1960, 1967), has waned over the last years.
Our first conclusion, then, is that, on the one hand, as an empirical claim about minds, MRT cannot be used as a justification anymore, at least not in the unchallenged way that has dominated the scene until recently. On the other hand, as a foundation for a theoretical approach and methodology for developing artificial intelligent agents, operating under the assumption that mental processes are independent of the physical structure, it is unsatisfactory. We have seen that, although successful in some domains, algorithmic AI is not providing us with an understanding of how natural forms of intelligence work.
The specific line of criticism against MRT that most affects our goal here is the one that, according to some, follows from accepting the tenets of the embodied mind program. We can find two different readings of the embodied mind view, corresponding to two very different senses of embodiment, only one of which challenges MRT. A functionalist reading claims that the body plays a computational role in cognition, although the physical details of implementation are not important (Clark, 2006; 2007; 2008). Under this interpretation, mental processes are still multiply realizable, but this time the implementational base spreads to include the body and the environment. A reductionist reading, however, defends that the details of implementation are a constraint on mind, and so mental processes are not multiply realizable in different bodies and across different environments. Differences in morphology are going to make a difference in mental processes (Shapiro, 2004; Noë, 2004).
The reductionist reading advocates a more radical interpretation of the embodied view, which may develop into a paradigm alternative to the representationalist and computationalist models of mind. The notions of coupling, action and sensorimotor coordination are contrasted with those of function, computation and information-processing. The functionalist reading, by contrast, proposes a reconciliatory picture, in which the new notions from the embodied program are integrated into the (old) functionalist model of the mind, where representations and computations are still the keystone of cognition, and mental processes keep their platform-free privilege. But now the body and the environment are participants as important as the brain.
The reductionist interpretation matches the enactive trend within cognitive science. A good illustration of this fact is the sensorimotor approach to perception developed in O'Regan & Noë (2001) and Noë (2004). According to this approach, perception consists of the implicit knowledge we have of sensorimotor regularities, that is, the relations between movement and change, and sensory stimulation (what are called sensorimotor dependencies). Perception here is a matter of actions, not internal representations. The range of actions an organism performs has to do, in turn, with the sort of body it has. It follows that differently embodied organisms, engaging in different sensorimotor loops with the world, are going to perceive the world in different ways. An organism with two eyes separated by 180º (one in what we would call the front, the other in the back) will engage in different sensorimotor loops when visually perceiving an object. Its gross-bodily and eye movements will relate to visual input in a way that differs from the way we humans relate to visual input. Thus, this approach to perception has the (radical) consequence that the particularities of our bodies are a constraint on how we perceive.
The functionalist reading of the embodied mind program defends, as we said, that the fine-grained details of an organism's body are not a constraint on mind. In particular, they do not determine how we perceive. Although embodiment, action and embedment are significant features when we consider thought and experience, the (embodied) functionalist says, their contributions are more complex than a mere direct relation. There is a buffer zone between the fine details of body and motion-dependent sensory input, on one side, and the internal representations that determine perception, on the other. Perception ultimately depends on representations and computational processes that are insensitive to the fine details of the world-engaging sensorimotor loops. The specific sensorimotor dependencies are only the contingent means of picking up external information. It is this higher level of information-processing that determines experience. Thus, differently embodied organisms, interacting with objects in different ways, could, in principle, have the same perceptual experience, as long as they have access to the same gross information and can thus form the same internal representations.
The sensorimotor approach relates mental processes, in particular perception, to action, bringing mentality out of the realm of internal representations. This contrasts with the (less radical) view we just mentioned, also within the embodied mind paradigm, where perception, and mental processes in general, are still a matter of (internal) representations. For this reason, the sensorimotor approach to perception provides us with a good starting point for figuring out what an enactive AI should be like. And it does so not only because of its strong points, but also because of its limitations. Sensorimotor loops by themselves do not allow us to talk of an agent's intentional action (other than metaphorically). A notion of selfhood or agency is needed (Thompson, 2005). A detailed analysis of how to remedy this lack can be found in Froese & Ziemke (2009). Here, we are only concerned with one of their conclusions: the aboutness of our cognition "is not due to some presumed representational content that is matched to an independent external reality (by some designer or evolution)" (ibid, p. 33), but has to do with the continuous activity of the organism. Life and (its corollary) movement are here in a continuum with mind.
In order to develop an enactive AI, we need to rely on this more radical (reductionist) interpretation of the embodied program. Hence, in exploring how we can bring the enactive approach to the AI lab, we first need to set aside the multiple realizability of natural minds (Putnam's MRT) and of algorithms, and focus our attention on how to develop systems that inhabit their (particular) bodies, which, in turn, inhabit their (particular) environments. This will provide our AI projects with better results and, more importantly, with a better understanding of how (natural) organisms interact with their environment. At this point, the brain-in-a-body (controller-in-an-actuator) caricature that used to rule the mind sciences disappears. The clean division between mechanical design and controller design is, therefore, no longer useful. Natural organisms evolve with particular bodies in particular environments, and exploit their potentialities. Since intelligent behaviour is not the result of a pre-programmed set of functions instructing a passive body, in order to build intelligent robots we need to explore the many ways natural organisms exploit their physical structures and their surroundings, and how intelligent behaviour emerges from that. The goal for enactive AI is not to simulate the abstract operations happening in the brain (algorithmic AI), but the physical interactions between the whole organism (with its particular body) and its environment.
It is time to explore the potential of embodiment in artificial systems. We will do that by means of the notion of morphological computation. This notion was first introduced and explained by Chandana Paul (2004), and refers to the phenomenon of the bodily details taking charge of some of the work that, otherwise, would need to be done by the controller (be it the brain or a neural network). That is, computation obtained through interactions of physical form. We will introduce this notion in more detail with a particular example that in turn will serve us as the inspiration for our own proposal.
Paul (2004, 2006) offers an example of morphological computation in which a robot controlled by two perceptrons (Rosenblatt, 1962) gets to exhibit, courtesy of its morphology, an XOR-like behaviour. The possibilities of the morphology of an agent's body have been exploited in different ways in the study of adaptive behaviour. There are several examples of robots and vehicles that try to make the most of the details of embodiment in order to minimize the control required (Brooks, 1999; Braitenberg, 1984). Paul's robot uses two perceptrons as controllers. Perceptrons are very simple connectionist networks consisting of two layers (an input layer and an output layer), modifiable connection weights and a threshold function.
Perceptrons can compute functions such as (inclusive) OR and AND, but not more complex ones such as exclusive-or, also written as XOR (Minsky & Papert 1969) (see fig. 1). Perceptrons learn to generalize on the basis of physical similarity. If a connectionist network has been trained to classify pattern 111000, then it will tend to classify a novel pattern 111001 in a similar way, on the basis of the (physical) similarity (or similarity of form) between these two patterns. Perceptrons, then, are suited to processing linearly separable functions, such as (inclusive) OR and AND (see figs. 2 & 3), but not linearly inseparable ones such as XOR. To compute XOR we need one or two extra nodes (hidden units) between the input and the output layers (see fig. 4). Such a hidden unit becomes active when both inputs are active, sending to the output unit a negative activation equivalent to the positive activation that the latter receives from the input units.
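To make the contrast concrete, here is a minimal sketch in Python (our own illustration; the particular weights and thresholds are choices we make for the example, not values taken from the cited works) of threshold units computing OR and AND, together with the hidden-unit arrangement just described for XOR:

    def threshold_unit(inputs, weights, threshold):
        # Fire (output 1) when the weighted sum of the inputs reaches the threshold.
        return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

    def OR(a, b):
        return threshold_unit((a, b), (1, 1), 1)   # active if at least one input is on

    def AND(a, b):
        return threshold_unit((a, b), (1, 1), 2)   # active only if both inputs are on

    def XOR(a, b):
        # The hidden unit detects the "both inputs on" case and sends the output
        # unit a negative activation that cancels the inputs' positive activation.
        hidden = AND(a, b)
        return threshold_unit((a, b, hidden), (1, 1, -2), 1)

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", OR(a, b), AND(a, b), XOR(a, b))

No single-layer assignment of weights and threshold can reproduce the XOR column (this is the point of Minsky & Papert's result); the extra unit does the work that, in Paul's robot, the body will take over.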
The robot receives two inputs, A and B. One network computes OR and the other computes AND. Each network is connected to a motor. The network computing OR is connected to motor 1 (M1), which turns a single wheel at the center, causing forward motion. The AND network is connected to motor 2 (M2), which serves to lift the wheel off the ground. Thus, M1 will activate the wheel if either or both inputs are active, and M2 will raise the wheel off the ground if and only if both inputs are active (see fig. 5). When both inputs A and B are off, both networks output 0; the wheel is not raised from the ground, but neither does it move, so the robot is stationary. When input A is active and input B is off, the AND network outputs 0, so the wheel stays grounded; but the OR network outputs 1, so M1 turns the wheel and the robot moves forward. When only B is active, the same thing happens: the AND network delivers 0 and the OR network delivers 1, so the robot moves forward. The interesting case is when A and B are both active. In this case the OR network makes M1 turn the wheel, but the AND network lifts the wheel off the ground, so the robot remains stationary. Summarizing the behaviour of the robot in a table, we discover that it looks like the truth table of the XOR function (see fig. 6). The explanation is that "the robot's behaviour is not simply determined by the output of the neural networks, but also by the actuated components of the body" (Paul, 2004, p. 2).
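The logic of the example can be condensed into a few lines. The following sketch (again our own, hypothetical reconstruction in Python, not Paul's implementation; the function names are ours) simulates the two controllers and the contribution of the body:

    def OR(a, b):                      # perceptron controller driving M1
        return 1 if a + b >= 1 else 0

    def AND(a, b):                     # perceptron controller driving M2
        return 1 if a + b >= 2 else 0

    def robot_moves(a, b):
        m1_turns_wheel = OR(a, b)      # M1 turns the wheel if either input is active
        m2_lifts_wheel = AND(a, b)     # M2 lifts the wheel off the ground if both are
        # The body does the rest: forward motion requires a turning wheel that is
        # still touching the ground. Neither controller computes XOR; the robot does.
        return 1 if m1_turns_wheel and not m2_lifts_wheel else 0

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", robot_moves(a, b))   # reproduces the XOR truth table

The crucial step is the conjunction inside robot_moves: it is not implemented in either network but in the physics of wheel and ground, which is precisely what the notion of morphological computation captures.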
4. Previous research: simulations and Lego NXT robots.
Prior to the current research, we worked on simulating synthetic emotions, always trying to create very simple situations in which we could elucidate the value of basic emotions (pain and pleasure) for the emergence of complex activities. Our interest in emotions is not only a question of affinity with the topic, but also stems from the strong belief that emotions are basic intentional forces for living entities. Therefore, emotions should be the keystone of the whole edifice of AI and robotics. Since emotions are a natural system that most intelligent living creatures use to calibrate their relationship with the environment and with their own plans and goals, they are key factors to study if we want to understand how autonomous living systems develop their own goals.
We developed two different computer simulations, which we called TPR and TPR 2.0. The models and results were published in Vallverdú & Casacuberta (2008) and Vallverdú & Casacuberta (2009a, 2009b), respectively. Let us summarize them.