“A wise man proportions his belief to the evidence.”
An Enquiry Concerning Human Understanding, David Hume 1748
What then are we doing when we consider all this evidence? How do we use it to make sense of the world? Stuff happens, endlessly; yet we do not experience the world as an endless array of chaos. Quite the opposite: we find events understandable for the most part. How? By bringing to every experience a model of the way the world works that we have developed over a lifetime of learning. These models are bundles of hypotheses consisting of beliefs, hunches, intuitions, and prejudices, all molding our expectations. They lead us to expect that certain things will or will not happen; that is, they consist of a set of probabilities by which some things are considered much more likely than others. Drop a ball and we expect it to accelerate downwards; let go of a helium-filled balloon and we expect it to float upwards. Both the posterior and the likelihood terms in the Bayes equation deal with these relationships between observations and models. We say that it is highly likely that the helium balloon will rise given that helium is lighter than air. The likelihood term in this instance would look like this: p(D|H) = p(helium-filled balloon rises | helium is lighter than air) = extremely likely. Seeing the balloon rise confirms our understanding of how the world works, strengthening our model. On the other hand, if the helium-filled balloon were to drop like a rock we would be surprised. Surprise can now be given a precise definition. It is what happens when we experience data that is unexpected given our models. Surprise engages curiosity, saying ‘look right here, right here there is something new to learn’. Instead of confirming our models the unexpected invites us to modify them, to adapt.
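One way to put a number on surprise is the surprisal measure borrowed from information theory, -log2 p(D|H). This is a common convention rather than anything the argument here depends on. The sketch below is only an illustration: the two likelihood values are invented for the balloon example, and the function name surprisal_bits is hypothetical.

```python
import math

def surprisal_bits(likelihood: float) -> float:
    """Surprise, in bits, of observing data whose likelihood under our model is p(D|H)."""
    if not 0.0 < likelihood <= 1.0:
        raise ValueError("likelihood must be in the interval (0, 1]")
    return -math.log2(likelihood)

# Assumed, purely illustrative likelihoods under the model
# H = "helium is lighter than air". They are not measurements.
p_rises = 0.999   # p(balloon rises | H)
p_drops = 0.001   # p(balloon drops like a rock | H)

rise_surprise = surprisal_bits(p_rises)
drop_surprise = surprisal_bits(p_drops)

print(f"balloon rises: {rise_surprise:.3f} bits of surprise")
print(f"balloon drops: {drop_surprise:.3f} bits of surprise")
```

Under these assumed numbers the expected outcome carries almost no surprise, while the unexpected one carries a great deal, which matches the intuition that only the falling balloon tells us there is something new to learn.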
Previously I suggested that belief could be understood as a primitive, fundamental feature of all living things. The interplay of models and data exposes the roots of how these beliefs come about and lead to evolutionary adaptations. The fixed instincts ethologists study in animal behavior can be seen as models carved in biology as selection has adapted each species to its environment. Nest-building birds are engaged in an act of planning that embodies an expectation of future mates, exploiting the statistical regularities of their environments learned through the selection of their ancestors. Successful hunting and foraging are arts of exploiting models of the environment efficiently. Where and when to hunt and where and when to construct a nest are life-and-death gambles with only probabilities to guide the behaviors to success. In higher animals the associative learning of Skinner’s dancing pigeons and Pavlov’s salivating dogs causes them to act as if their models of the world have been updated to account for the new data they have encountered. The pigeons use the dancing displays they were born with, but when they engage in those displays changes as they come to “believe” the displays are related to the appearance of food. The dogs salivate as they always will with a meal, but what has changed is when: now, on hearing a bell, they consider it highly likely that food is forthcoming. Positive and negative reinforcements powerfully shape what can be changed: the assignments of values on their malleable probability scales.
The interplay of data and models also formalizes a process of evaluation we use every day. Cognitive psychologists call the models we have of how the world works and how social interactions occur our folk physics and folk psychology. To explain why these cognitive models are found universally in our species, these researchers postulate that they come from functional modules in our brains that have been shaped by evolution. The modules provide propensities that shape the models we build from our experiences. Using these models we come to expect certain behaviors in our environments, from rocks and stock prices to people. This classification of events on a scale from the improbable to the most likely provides those who do it well the tool needed to plan effectively. These models also provide a way to understand the experiences of our past. By applying this filtering of likelihood we explain to ourselves how the various outcomes we have experienced follow from the models. Given that my model is correct, the chance that X will occur, or did occur, makes sense. In the mysterious alchemy of the neuron soup in our skulls, by using these models we find meaning in events. Isolated sound-bite information does not bring understanding; it needs to be integrated into a relevant context. Unlike the evidence alone in the denominator of the Bayes equation, the likelihood term concerns itself with the evidence given the particular parameter or model being used. I understand why my partner spoke kind words to me this morning because we are in love; kind words are more likely than having her add poison to my breakfast, given our affection. I comprehend why my neighbor’s roof collapsed when a tree was blown onto it in a storm because a roof is likely to hold up under snow but not a Douglas fir, given its strength.
I consider Acme Widgets to be a sound corporation given its past performance and purchase its stock because my model of the company considers it probable that future evidence will include a higher valuation. The examples can be multiplied endlessly; they are all simply illustrations of conditional probability, our main tool for dealing with the uncertain world we find ourselves in.
When events do not confirm our models we are surprised. Sometimes this is delightful, as when the punch line of a good joke collapses the model we thought we were dealing with or a magician pulls off the seemingly impossible right before our eyes. Because the models are grounded in probability there is a constant shift in balancing which ones apply when and how. I reach into the refrigerator to get the milk I believe is there; if I find there is none to be had, I update the probability of my model in which the refrigerator contains milk. If the stock I purchased in Acme Widgets continues to go down in price instead of up, I can question either the evidence or my evaluation of the model about the soundness of the corporation. If my stake in Acme Widgets consists of my life savings this surprise is not going to be delightful. When the stakes are high, surprises can be nasty and leave us confused. The world does not make sense and we find this a painful state, seeking to escape it as quickly as possible. There are a couple of options available to reduce our confusion. One is to deny the data and keep our models intact. We can ignore it or downplay its importance or rationalize it away. We can also compartmentalize, isolating the data so that its full implications are not allowed to modify the models we hold dear and the cognitive dissonance it would provoke is kept at bay. The other option is to revise the models we are using, to search for models that will take into account the new information, to learn from the experience. This is often more difficult for a number of reasons. When there is not a new model immediately available the state of confusion is prolonged as the search continues. It also entails recognizing that we were wrong, that we do not understand the world as well as we thought, which can involve feelings of shame and embarrassment. A powerful incentive must be available to carry us down this more difficult path.
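The refrigerator example can be worked through with the Bayes equation directly. What follows is a minimal sketch under stated assumptions: the prior of 0.9 and the two likelihoods are invented for illustration, and the function name bayes_update is hypothetical. The small chance of missing milk that is really there stands in for the option of questioning the evidence rather than the model.

```python
def bayes_update(prior: float, lik_if_true: float, lik_if_false: float) -> float:
    """Return p(H|D) for a hypothesis H and its negation, given the data D.

    prior        -- p(H), belief in the model before seeing the data
    lik_if_true  -- p(D|H), likelihood of the data if the model is right
    lik_if_false -- p(D|not H), likelihood of the data if the model is wrong
    """
    # p(D): the "evidence alone" term in the denominator of the Bayes equation.
    evidence = prior * lik_if_true + (1.0 - prior) * lik_if_false
    return prior * lik_if_true / evidence

# Assumed, illustrative numbers only.
prior_milk = 0.9             # p(H): I believe there is milk in the refrigerator
p_miss_if_present = 0.05     # p(D|H): I fail to find milk that is actually there
p_miss_if_absent = 1.0       # p(D|not H): I find no milk because there is none

posterior_milk = bayes_update(prior_milk, p_miss_if_present, p_miss_if_absent)
print(f"belief in milk before looking: {prior_milk:.2f}")
print(f"belief in milk after finding none: {posterior_milk:.2f}")
```

With these assumed values the belief falls sharply but not to zero, which leaves room for a second look behind the juice carton before the model is abandoned altogether.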
One such incentive is simply the desire to know, to adjust to reality as it is given and not as we might wish it to be. Understanding is how we gain a sense of control. Research has shown that cancer patients given a full explanation of their condition deal better with their medical procedures and recover sooner than those kept in the dark. There seems to be an undeniable cognitive drive to not fool ourselves, a drive easy to understand for its fitness value. This may seem Pollyannaish in light of all the evidence indicating how easy it is to delude ourselves, but the point is that once we are convinced that something is not true we can no longer make ourselves believe in it by an act of will. As we say, knowledge has a price.
Let us take a look at a classic example of this interplay of likelihood, data and models in a non-trivial case. In 1802 the Reverend William Paley published Natural Theology, or Evidences of the Existence and Attributes of the Deity collected from the Appearances of Nature. In it he made an argument for the existence of God from the design we observe throughout the natural world. The (in)famous example was of a man crossing a heath and suddenly coming across a pocket watch. He would be justified in believing that such an intricately designed thing must have had a designer. By analogy, when we see the intricate design of living things, each so carefully adapted to survive in its habitat, we must rationally posit a designer. The probability that such intricate design could have arisen through the workings of blind chance was so minuscule that arguably the only rational conclusion was an assent to the belief in a creator. At the time this was indeed a perfectly reasonable conclusion. It remains true that the probability of blind chance creating the wonders of the biological world is vanishingly small; in the limit one could say it is impossible. Looking at the argument from our own time, we recall the danger of the black swan: almost impossible is not in fact impossible. A few decades later, in 1838, Darwin conceived his theory of evolution by natural selection, which he published in 1859. The theory is capable of explaining the very intricate design evident in every creature without the postulate of a special creation, by showing that natural selection is anything but blind chance. In Richard Dawkins’s telling phrase evolution is indeed ‘The Blind Watchmaker’ (Dawkins 1986). I rehash this well-known bit of intellectual history to illustrate a particular point. The probability one would rationally assign the hypothesis of a supernatural creator changed with the addition of knowledge. The posterior probabilities changed.
This occurred as evidence accumulated that was highly likely only if the evolutionary hypothesis was correct; that is, the likelihood term added more weight. The question of whether you personally choose to believe in a creator or not involves subjective Bayesian considerations and is beside the point being made here. The cultural change in the probability assigned to a creator due to new evidence is a form of objective Bayesian thinking. It is as if culture had a thought and now the argument from design no longer functions as it once did. This is the Bayesian process writ large. Many still believe in a special creation, many don’t, but all absorb the cultural change. Today neuroscience and genetics are playing the same role, providing evidence slowly being assimilated into our cultural understanding of what life is and what it means to be human by changing the probabilities of various hypotheses. For example, the mind-first school of philosophy, which holds that first there was thought and then there was matter, is becoming less cogent as data accumulate that show the roots of consciousness are processes within the brain. It is part of the power of the Bayesian model that it is capable of capturing these cultural dynamics, which change our degree of belief in a hypothesis through our collective engagement with new evidence.
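The way accumulating evidence shifts belief between rival hypotheses can be sketched as repeated applications of the Bayes equation, with each posterior serving as the prior for the next observation. The code below is an illustration under loud assumptions: the hypothesis labels "A" and "B", the starting beliefs, and the likelihoods are all invented, and they are not estimates for any real historical debate. The function name update is hypothetical.

```python
def update(beliefs: dict, likelihoods: dict) -> dict:
    """One Bayes step: weight each hypothesis by p(D|H), then renormalize."""
    unnormalized = {h: beliefs[h] * likelihoods[h] for h in beliefs}
    evidence = sum(unnormalized.values())  # p(D), the denominator
    return {h: value / evidence for h, value in unnormalized.items()}

# Assumed starting point: hypothesis A is the established view, B the newcomer.
beliefs = {"A": 0.95, "B": 0.05}

# Assumed stream of five observations, each four times more likely under B.
observations = [{"A": 0.2, "B": 0.8} for _ in range(5)]

history = [beliefs["B"]]  # belief in B after 0, 1, 2, ... observations
for likelihoods in observations:
    beliefs = update(beliefs, likelihoods)
    history.append(beliefs["B"])

for step, belief_in_b in enumerate(history):
    print(f"after {step} observations: p(B) = {belief_in_b:.3f}")
```

No single observation settles the matter in this sketch. The shift comes from the likelihood term adding weight again and again, which is the sense in which a strongly held starting belief can still be overturned by evidence that keeps favoring its rival.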