Review of “Anthropic Bias: Observation Selection Effects in Science and Philosophy”

I’m going to do the unthinkable: on a site entitled Research Reviews, I’ll actually be reviewing some research! Before you die of shock, let me assure you that I will not be reviewing what generally comes to mind when people think of “research” (no labs were harmed, or used, in the making of this research), and that of course you will be getting something for free out of it (the entire book Anthropic Bias: Observation Selection Effects in Science and Philosophy). Also, as is typical of this blog, I will be reviewing certain topics in or relevant to the sciences, such as the anthropic principle and fine-tuning.

The Book: What it isn’t

The reviewed work here is the book Anthropic Bias: Observation Selection Effects in Science and Philosophy by Dr. Bostrom. Although it covers topics like multiverse cosmology and the anthropic principle, it differs in several ways from most books that deal with these subjects. First, it is not filled with equations and formulae as is Barrow & Tipler’s (in)famous The Anthropic Cosmological Principle, nor is it sensationalist and overly simplistic like Krauss’ A Universe From Nothing or Schroeder’s The Hidden Face of God. It’s a work of scholarship, published by Routledge (an academic publishing company) as part of the series Studies in Philosophy. It’s not the kind of book you’ll find in bookstores, nor would most, I think, find it light reading. Second, it deals with multiverse theory and the anthropic principle only secondarily. The book is really a treatise on the best way to approach a particular kind of problem that we face in the sciences but are actually more likely to encounter in popular discourse. To illustrate the kind of problem this book concerns (observation selection effect bias), I’ll give two examples, one simple and the other simplistically summarized.

Anthropic Bias: Examples of bias from observation selection effects

Example 1: Extraterrestrial intelligent life must exist

This is one of several examples that Bostrom gives in his introduction, but I choose it because I have addressed this specific question here and I have found myself trying to explain the problems involved to an almost inevitably skeptical audience. Many people believe that even if it is incredibly unlikely for life to develop on any given planet, there must be a huge number of planets with life, including intelligent life, because there exists an astronomically (bad pun intended) large number of planets in the universe. Moreover, there are many of them even relatively nearby that are “Earth-like” and found in what astrobiologists, among others, call “habitable zones” (HZs). And after all, what are the chances that Earth is the only planet in the entire universe on which complex (including microscopic, multicellular organisms) or intelligent life arose?

Well, this final question approaches the right way to think about this issue. To estimate how many planets have life, all we need to do is take the number of favorable outcomes (planets with complex life) and divide by the number of all planets. Simple. Of course, if we knew of any other planet with complex life, we wouldn’t be asking this question. But I don’t want to present my approach to this problem (although it is similar, in form and conclusion, to Bostrom’s). I want to give his:

“Let’s look at an example where an observation selection effect is involved: We find that intelligent life evolved on Earth. Naively, one might think that this piece of evidence suggests that life is likely to evolve on most Earth-like planets. But that would be to overlook an observation selection effect. For no matter how small the proportion of all Earth-like planets that evolve intelligent life, we will find ourselves on a planet that did (or we will trace our origin to a planet where intelligent life evolved, in case we are born in a space colony). Our data point—that intelligent life arose on our planet—is predicted equally well by the hypothesis that intelligent life is very improbable even on Earth-like planets as by the hypothesis that intelligent life is highly probable on Earth-like planets…
The impermissibility of inferring from the fact that intelligent life evolved on Earth to the fact that intelligent life probably evolved on a large fraction of all Earth-like planets does not hinge on the evidence in this example consisting of only a single data point. Suppose we had telepathic abilities and could communicate directly with all other intelligent beings in the cosmos. Imagine we ask all the aliens, did intelligent life evolve on their planets too? Obviously, they would all say: Yes, it did. But equally obvious, this multitude of data would still not give us any reason to think that intelligent life develops easily. We only asked about the planets where life did in fact evolve (since those planets would be the only ones which would be “theirs” to some alien), and we get no information whatsoever by hearing the aliens confirming that life evolved on those planets (assuming we don’t know the number of aliens who replied to our survey or, alternatively, that we don’t know the total number of planets). An observation selection effect frustrates any attempt to extract useful information by this procedure.”
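The logic of this passage can be made concrete with a minimal Python sketch. It is a toy illustration, not taken from Bostrom’s book, and everything in it is an assumption made for the example: the function names, the number of planets, and the two candidate probabilities (0.001 for “life is rare” and 0.9 for “life is common”). It simulates planets that each develop intelligent life with probability p, then surveys only the planets that have observers, just as in the telepathic survey above.

```python
import random


def fraction_of_observers_seeing_life(p, n_planets=100_000, seed=0):
    """Simulate planets that each evolve intelligent life with probability p.

    Only inhabited planets contain anyone to answer the survey, so we return
    the fraction of *respondents* whose home planet evolved life.
    """
    rng = random.Random(seed)
    inhabited = [rng.random() < p for _ in range(n_planets)]
    respondents = [has_life for has_life in inhabited if has_life]
    if not respondents:
        return None  # nobody exists to be surveyed
    return sum(respondents) / len(respondents)


def posterior_rare(prior_rare, likelihood_rare, likelihood_common):
    """Bayes' rule for two rival hypotheses ("rare" vs. "common")."""
    numerator = prior_rare * likelihood_rare
    return numerator / (numerator + (1 - prior_rare) * likelihood_common)


# Hypothetical values chosen only for illustration.
for p in (0.001, 0.9):
    print(p, fraction_of_observers_seeing_life(p))

# The datum "life evolved on my planet" has likelihood 1 under both
# hypotheses, so the posterior is the same as the prior.
print(posterior_rare(0.5, 1.0, 1.0))
```

Whatever value of p is used, every respondent reports that life evolved on their planet, so the survey result is the same under both hypotheses and Bayes’ rule leaves the prior untouched. The information that would distinguish the hypotheses (how many planets sent no reply) is exactly what the survey cannot see.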

Example 2: The Anthropic Principle

The anthropic principle is usually divided into classes (especially strong and weak) and is highly nuanced, so I will just keep things simple by approaching it as sort of an inverse of example 1. In the first example, we looked at the flawed reasoning that leads to the idea that complex life is likely abundant in the universe. The most frequent arguments involve a flawed inference from the fact that life arose here to how likely it is to arise elsewhere, because knowing only that it arose here is consistent both with the hypothesis that life arose only on Earth and the hypothesis that complex life is abundant in the universe.

Not long ago, back when Neil deGrasse Tyson was Carl Sagan, scientists in general had pretty high hopes for the Search for Extra-Terrestrial Intelligence (SETI). To some extent that hasn’t changed, but a combination of the complete failure of SETI and an increased understanding of the sheer number of variables that have to be just right for life to arise and evolve has prompted many scientists working in astrobiology to conclude that complex life is probably rare, that if intelligent life exists elsewhere we’re never going to know, or even that we are alone. The arguments for and against these and other beliefs about life in the universe are for another time. Here, I simply want to introduce a very simple definition of the anthropic principle as it is relevant here:

“The anthropic principle is the name given to the observation that the physical constants in the cosmos are remarkably finely tuned, making it a perfect place to host intelligent life. Physicists offer a “many-worlds” explanation of how and why this might be the case.
My feeling is that a misanthropic principle could also be applicable. I use this term to express the idea that the possible environments and biological opportunities in this apposite cosmos are so vast, varied and uncooperative (or hostile), either always or at some time during the roughly 3-to-4 billion years intelligent life requires to emerge, that it is unlikely for intelligence to form, thrive and survive easily.” (Alone in the Universe)

The claim is that many fundamental “parameters” (e.g., the cosmological constant, the four fundamental forces, etc.) don’t just appear to allow for life, but are instead “remarkably finely tuned” for it. Again, I don’t want to introduce too much of my take here, so to quote from Bostrom:

“Another example of reasoning that invokes observation selection effects is the attempt to provide a possible (not necessarily the only) explanation of why the universe appears fine-tuned for intelligent life in the sense that if any of various physical constants or initial conditions had been even very slightly different from what they are then life as we know it would not have existed. The idea behind this possible anthropic explanation is that the totality of spacetime might be very huge and may contain regions in which the values of fundamental constants and other parameters differ in many ways, perhaps according to some broad random distribution. If this is the case, then we should not be amazed to find that in our own region physical conditions appear “fine-tuned”. Owing to an obvious observation selection effect, only such fine-tuned regions are observed. Observing a fine-tuned region is precisely what we should expect if this theory is true, and so it can potentially account for available data in a neat and simple way, without having to assume that conditions just happened to turn out “right” through some immensely lucky—and arguably a priori extremely improbable—cosmic coincidence.”

Popular Physics: What this book isn’t

Paul Davies is a physicist and author of a number of popular science books, including The Goldilocks Enigma and The Eerie Silence. The first book is on fine-tuning and the anthropic principle, while the second is on life in the universe. In the second, Davies concludes with his own views: one as a scientist, one from a philosophical perspective, and one as a person. Wearing his “scientist hat”, he concludes, “my answer is that we are probably the only intelligent beings in the observable universe, and I would not be very surprised if the solar system contains the only life in the observable universe. I arrive at this dismal conclusion because I see so many contingent features involved in the origin and evolution of life, and because I have yet to see a convincing theoretical argument for a universal principle of increasing organized complexity…”

Both of Davies’ books are quite like many others, some that agree and many that don’t, in that they offer glimpses into the nature of scientific research related to the origins of life and the finely-tuned parameters (or why they actually aren’t finely tuned, although this is a minority position), but don’t require any real background knowledge. Then there are books that are at least semi-popular, such as Penrose’s The Road to Reality or the aforementioned The Anthropic Cosmological Principle, but are largely inaccessible to most readers (my father received his undergraduate degree in physics from an Ivy League college, is an extremely intelligent individual, and didn’t get much past chapter 1 of Barrow & Tipler’s book).

That’s one thing I find particularly delightful about Bostrom’s book. It is technical in that it tackles reasoning and logic in a highly nuanced way. Although examples are given frequently to illustrate logical implications or flaws in particular inferences, the questions and issues tackled are fleshed out completely without skimping on any issue related to the rationale, logic, validity, or justification for any argument.

Philosophical Texts on Reasoning and Rationality: What this book is better than

Better yet is that this book deals with subjects like whether the cosmos is finely tuned for intelligent life and if so what this means. The book is fundamentally concerned with advancing a coherent, logical, and justifiable framework for addressing the kinds of questions found in the examples. I have many books with similar goals: Heuristics and Biases: The Psychology of Intuitive Judgment, Probability Theory: The Logic of Science, Acceptable Premises: An Epistemic Approach to an Informal Logic Problem, Abductive Reasoning: Logical Investigations into Discovery and Explanation, Against Coherence: Truth, Probability, and Justification, Bayesian Epistemology, The Algebra of Probable Inference, Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning, Model-Based Reasoning in Science and Technology: Theoretical and Cognitive Issues, and many dozens more. I enjoyed many of them and found all to be useful, but would recommend few if any to the general reader. That’s because they aren’t just technical, but only technical. They are “dry”, not just because they demand the reader deal with sophisticated nuances, but because they introduce their subject matter as their subject matter.

Now, there’s nothing wrong with this. In fact, it’s very hard to write a book about something, especially an academic monograph, without talking almost exclusively about that something. Of course, most books on methods in the sciences, certain kinds of reasoning or logics, epistemology, etc., give plenty of examples. But they are of the kind that we find e.g., in Bostrom’s 5th chapter “The self-sampling assumption in science” where there are sections on SSA in thermodynamics or evolutionary biology. Few books are able to recognize how far two related subjects (in this case, fine-tuning and the anthropic principle), both the main topic of countless popular books, can serve to introduce and cover in no small detail something like a specific kind of abstract reasoning. Bostrom not only found such a perfect way to thoroughly introduce the reader to so abstract a topic, he proceeds to cover it in more detail with a variety of interesting examples, and then uses a popular and fascinating probability paradox (the doomsday argument, one of the paradoxes in Eckhardt’s Paradoxes in Probability Theory, which cites Bostrom here) as yet another way to flesh out still finer points of his approach.

Thank you, and please help yourself to our complimentary gift on your way out

To embarrass myself by quoting the children’s television show Reading Rainbow, “but you don’t have to take my word for it.” If you think that fine-tuning, multiverse theory, the anthropic principle, and scientific reasoning might be interesting topics, but you don’t want to spend the money, another great thing about this book is that it is available in FULL for free and LEGALLY (well, I think legally, as it is available from the book’s website). So please, help yourself:

Anthropic Bias – complete text

Posted in Philosophy, Probability

Dictionaries don’t define words, and words don’t really make up language

There are many academic topics which most people not in a related field know little or nothing of. There are others, such as astronomy, quantum physics, climate science, brain sciences, etc., that are interesting enough to be discussed (mostly inaccurately) in blogs, popular science magazines, newspapers, YouTube, TV, and so on. Then there are fields like linguistics. Most linguists (and often those who aren’t linguists but study language as neuroscientists or cognitive psychologists) have had the experience of telling others what their field is only to be met with an initially positive, receptive audience that expects to hear one thing and, upon hearing something quite different, behaves quite differently (sometimes even a bit hostile, at least in a few accounts I’ve heard). This reaction arises because getting even an inaccurate view of brain research or quantum physics requires watching or reading about it, and getting something more requires diligent study, but everybody speaks and most people can read too. Most also either speak more than one language or have at least taken some foreign language classes.

It turns out that knowing more than one language often translates into knowing less about language. Worse still, widespread literacy (especially for languages with a long tradition of dictionary use) also creates rather fundamental misconceptions. So fundamental, in fact, that it took linguists a long time to realize how much even they were misled about the nature of language thanks to dictionaries and a long tradition of grammar schools and grammarians (yes, you can blame all of your problems on anybody who tried to teach you “proper” grammar, especially those who responded to “Can I go to the bathroom?” with “I don’t know, can you?” or introduced you to participles).

But let’s start with something simpler than whether or not Georgian has gerundives. Let’s start with words and simple structures, because this is really about language, words, and meaning. It would seem natural to suppose that a great part of language consists of words and a kind of mental “dictionary” of their meanings. It was so natural that linguists like Chomsky tried to understand language this way, thinking that meanings could be relegated to this mental “list” of dictionary-like meanings (called the “lexicon”) and the rest consisted of purely formal (syntactic) rules for manipulating the words of any given language to produce grammatically “correct” sentences. So, for example, linguists (even before Chomsky) produced tree diagrams in which sentences were broken down into smaller and smaller chunks within larger chunks. The main chunks were noun phrases (NPs), verb phrases (VPs), and prepositional phrases (PPs), familiar to those who were tortured…uh, I mean taught, traditional grammar in school.

For most of the part of my life during which my grandfather was alive, I knew him as a professor of classics and linguistics at Cornell and then professor emeritus. The books of his I received during his life (only a few) and those after his death were mostly about languages and were mostly language textbooks or grammars on e.g., Old Norse, Welsh, Cornish, Sanskrit, Old Church Slavonic, Greek Romany (he wrote that one, actually), etc. The only actual linguistics textbook I have of his (the first I read, and what made me decide, wrongly, that linguistics wasn’t for me) was An Introduction to the Principles of Transformational Syntax by Akmajian & Heny (MIT Press, 1975). It was filled with tree diagrams in which a sentence S would be broken into two sub-branches NP & VP which were in turn broken into further sub-branches until at the bottom were the words of some sentence. The point was to describe the ways in which certain rules could transform or generate these grammatical structures (generate/manipulate the branches) using combinatorial methods (mathematical/formal rules that could be programmed into a computer). Once these rules were discovered, all one would need to generate or parse grammatically correct sentences would be these rules and access to an external dictionary like those thought to be in everybody’s head.
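To give a feel for what “rules plus a dictionary” means in practice, here is a minimal Python sketch of a toy phrase-structure grammar. It is only an illustration under stated assumptions: the rewrite rules, the tiny lexicon, and the function names are invented for this example and are not taken from Akmajian & Heny or any other textbook.

```python
import random

# Hypothetical rewrite rules: each category expands into a sequence of
# sub-categories (S -> NP VP, and so on).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}

# Hypothetical "lexicon": a plain list of words per part of speech.
LEXICON = {
    "Det": ["the", "a"],
    "N": ["linguist", "dictionary", "bucket"],
    "V": ["kicked", "read"],
    "P": ["on", "near"],
}


def generate(symbol, rng, depth=0, max_depth=4):
    """Expand a category into a tree of (label, children) or (label, word)."""
    if symbol in LEXICON:
        return (symbol, rng.choice(LEXICON[symbol]))
    options = RULES[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the first (shortest) expansion
        # so that recursion through PP is guaranteed to stop.
        options = [options[0]]
    expansion = rng.choice(options)
    return (symbol, [generate(s, rng, depth + 1, max_depth) for s in expansion])


def leaves(tree):
    """Read the words off the bottom of the tree, left to right."""
    label, body = tree
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


tree = generate("S", random.Random(0))
print(" ".join(leaves(tree)))
```

Every string this produces is “grammatical” by construction, and the words are slotted in purely by part of speech. A sentence such as “the linguist kicked the bucket” can come out of these rules, but nothing in the rules or the word list says that it might mean anything other than a literal kick.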

This failed. Time was when basically all natural language processing (NLP) and AI/machine learning more generally was based on generative linguistics (Chomskyan-based linguistics like that described in the aforementioned linguistics text), which was also the foundation for cognitive scientists’ understanding of language. Nowadays, NLP and related areas in machine learning/AI use advanced statistical methods and specialized databases like FrameNet rather than generative linguistics, and lots of different linguists with varying approaches and theories came to subscribe to an umbrella category of linguistic theories called cognitive linguistics (NOT generative linguistics). Meanwhile, even generative linguists increasingly had to admit that the “lexicon” wasn’t just a list of words if it existed at all, and studies of languages with radically different structures than those long known to scholars (Greek, Latin, Hebrew, German, French, English, etc.) posed significant problems even when it came to identifying whether languages had things like nouns and adjectives, and if so what empirical means there might be to determine such parts of speech.

Other than the problems posed by languages in which e.g., it seems as if everything is a verb (there are examples more extreme than Navajo, but I haven’t studied them and given the structure of Navajo I would be scared to), the big issue, and the one which ultimately became central to the model of grammar cognitive linguists employ, was that it seemed no matter how many rules linguists added for a given language, most actual speech consisted of exceptions to these rules. A landmark study was Nunberg, Sag, & Wasow’s paper “Idioms.” They challenged the view of “many linguists…[who] have been implicitly content to accept a working understanding of idioms that conforms more or less well to the folk category”, a view which essentially was content to regard idioms as idiomatic: a small part of any given language that could be ignored or at worst require entries in the “lexicon” that were larger than one word. In the paper, the authors showed that not only are idioms not so idiomatic as thought, but they also possess rules & structures internal to them. The authors categorized idioms by these internal rules and structures, which we won’t cover, but it is important to talk a bit about their nature, as idioms were the basis for constructions and these are the basic unit of language (which we’ll get to shortly).

The easiest type of idiom to understand follows the “folk category” understanding, such as “birds of a feather flock together.” Despite its length, this idiom is basically like a single word. Even trying to change the tense is “ungrammatical” (“*birds of a feather flocked together” is ungrammatical). Other idioms can be “decomposed” into meaningful parts which can be analyzed individually but only in the context of the whole idiom. Consider “spill the beans”. Clearly “Jack spilled the beans on the whole affair” is different from the (non-idiomatic) “Jack spilled the beans on the floor”. The idiom means “divulge information”, and we can split it into “spill=divulge” and “the beans=information”. This is not true of an idiom like “kick the bucket”, which is not as fixed as the “birds of a feather” idiom (we can say “kicked the bucket”, for example), but can’t be decomposed into meaningful components. However, both “pull strings” (another decomposable idiom) and “kick the bucket” aren’t syntactically idiomatic in that they have a regular VP structure; the main problem with these “grammatical” idioms is that we can’t simply relegate them to the “lexicon”, because “kick” in “kick the bucket” has no decomposable meaning and, while “pull” in “pull strings” does, it only does in this idiom.

Other idioms are even worse. The division of idioms into “grammatical” and “extragrammatical” comes from the even more groundbreaking work with idioms by Fillmore, Kay, & O’Connor in a paper that basically founded Construction Grammar (and therefore construction grammars). Extragrammatical idioms don’t even follow predictable syntactic structures, including e.g., by and large, all of a sudden, believe you me, easy does it, be that as it may, first off, so far so good, make certain, no can do, etc.

The last dimensionality/category we’ll cover (and we’ve already introduced a lot of the notions in construction grammar) is schematicity. This is a fancy way of referring to the ways in which some idioms are actually more like grammatical rules. Syntactic structures like PPs or NPs are highly schematic (they were treated as meaningless structures that applied to basically all possible grammatical sentences). That’s what makes them grammar as opposed to part of the “lexicon”. But there are idiomatic constructions like “the X-er, the Y-er” that are almost as purely syntactic: the higher they climb, the harder they fall; the more you practice, the better you’ll be; the more you act like that, the less likely I am to give you what you want; etc. The “the” part in front of the X-er & Y-er structures is actually distinct from the definite article “the”; it’s from the Old English instrumental demonstrative. Also note that this idiom is so “syntactic” that we have to use variables to describe it, just the way we describe syntactical structures. That’s how schematic it is.
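The idea that this construction needs variables, just as a syntactic rule does, can be sketched with a regular expression. This is a deliberately crude Python illustration: the pattern and the function name are assumptions made for this example, and the test for a comparative (“more”, “less”, or any word ending in “-er”) would wrongly accept a word like “other”.

```python
import re

# Crude stand-in for "a comparative": "more", "less", or a word ending in -er.
COMPARATIVE = r"(?:more|less|\w+er)"

# "the X-er ..., the Y-er ..." with named slots for the two variables.
THE_X_THE_Y = re.compile(
    rf"^the\s+(?P<x>{COMPARATIVE})\b(?P<x_rest>[^,;]*)[,;]\s*"
    rf"the\s+(?P<y>{COMPARATIVE})\b(?P<y_rest>.*)$",
    re.IGNORECASE,
)


def match_construction(sentence):
    """Return the (X, Y) comparatives if the sentence fits the schema, else None."""
    m = THE_X_THE_Y.match(sentence.strip())
    if m is None:
        return None
    return m.group("x").lower(), m.group("y").lower()


print(match_construction("the higher they climb, the harder they fall"))
print(match_construction("the more you practice, the better you’ll be"))
print(match_construction("Jack spilled the beans on the floor"))
```

The fixed material here is just the two instances of “the” and the comparative slots; everything else is open. That is what it means for an idiom to sit near the “grammar” end of the scale rather than the “word” end.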

Now we can easily describe construction grammar in its barebones form. There are actually many such grammars, from the original Construction Grammar to Radical Construction Grammar or Cognitive Grammar and even Word Grammar, but they all share a fundamental property that separated them from the models of grammar before them: that the lexicon and grammar are not distinct components but lie along a continuum and that therefore constructions are the basic units of language, not lexemes (which is pretentious-speak for “words”). Some constructions correspond or can correspond to traditional parts of speech like nouns or to words. But the realization that grammatical structures, not just words, were meaningful and that this lexico-grammatical continuum existed showed us that even when we can say that a word is a construction and part of another construction like a noun phrase, it’s still true that meaning comes not from some idealized mental dictionary but is internal to the constructions in which the words appear. It turns out that about half of language consists of “prefabricated constructions” in which structure and/or meaning are internal to units that are larger than words. Put differently, about half the time we use words the meaning can’t be understood as additive (i.e., the sum of the parts of the phrase/sentence). Moreover, even if we idealize words as having independent meaning, this meaning isn’t like a dictionary entry but an encyclopedia entry.

There is one last nail in the coffin of the traditional understanding of language (at least that I’ll cover). It is related to (and involves) the encyclopedic nature of lexical meaning. Simply put, meanings are flexible. Period. Not just because they might occur in some idiom or because they might act like the modal verb “might”, but also because of things like novel usages in which phenomena like metonymy come into play. One very broad category of ways in which meanings are extended regularly and “on the fly” so to speak (I deliberately used two prefabs/idioms there) is via metaphor. My favorite example comes from a linguist who overheard part of a conversation in a pub. A member of a group of friends had left for a while, and upon returning discovered that a female member of the group had left. After asking his friends where she was, he received the answer “she left about two beers ago.” Now, normally when we wish to indicate units of time, we don’t use beers. But here the ability to comprehend novel metaphorical extensions allowed the hearer (and the linguist) to understand that “two beers” referred to the (approximate) amount of time it takes to drink two beers.

So, to wrap up, I’ll summarize the key points. Language isn’t a bunch of grammatical rules we apply to atomic elements that linguists call lexemes and most people call words. It’s vastly more complex, dynamic, convoluted, and most of all inherently and thoroughly meaningful. Not only do words lack any “dictionary”-like meaning, or even more generally any meaning apart from the constructions in which they appear, but the “structures” in language convey meaning, as do various linguistic (and/or cognitive) mechanisms like metaphor. Hence debates over what a word means that rely on dictionaries aren’t just subject to the quality of the dictionary, but are fundamentally problematic. Words don’t have dictionary-like meanings, and debates over what atheism means or what hypotheses are or any number of topics discussed here and elsewhere that are based on disagreements over what certain terms mean can’t be resolved by quoting dictionaries. Sometimes the terms may be technical enough that there exists among specialists an agreed upon definition. Sometimes other facets of language and linguistic use can help resolve disputes which are based on lexical semantics. Sometimes logic helps. But quoting your average dictionary is only one step up from simply declaring your personal definition to be the definition, and there is never a THE definition of any word (words are inherently polysemous).

Posted in Linguistics

Multi-worlds and many-universes: On the universes of the multiverse

As I just addressed dimensions in physics, it seems natural to address universes next. As with misinterpretations of dimensions, people frequently conceptualize a multiverse with alternate universes that are essentially “higher dimensions” in the mystical/spiritual sense, science fiction writers imagine travelling to other universes in a multiverse, and finally it is only natural to suppose that by “multiverse” physicists mean that our universe isn’t the only one (despite the prefixation).
But the truth is that there is so much more (and less) to multiverse theory and it doesn’t usually involve any actual other universes (at least not in the sense often thought). Let’s begin at the beginning (it seems an à propos place).

How split-ends create universes: The Many-Worlds Interpretation of Quantum Mechanics

A long time ago (the 50s) in a galaxy far, far away (the Milky Way, which is still far, far away, just not from our perspective), there lived an individual by the name of Hugh Everett III. As you can imagine given this pretentious name, Dr. Everett retired from physics early in order to make millions. However, this wasn’t his plan. In his doctoral thesis (written under the supervision of the great J. A. Wheeler), Everett proposed the basics of what is now called the many-worlds interpretation (MWI) of quantum mechanics. He proposed a way of dealing with the so-called “collapse of the wave-function.” Put as briefly as possible, quantum mechanics is statistical/probabilistic in nature, but the equations that define how quantum systems “evolve” in time (such as the famous Schrödinger equation) are deterministic. So our math tells us that the system behaves one way, but when we try to measure the system it acts quite differently: The state of the system “jumps” or “collapses” upon measurement in a very unsatisfactory way (it’s wayyy more involved than this but I’m massively simplifying for brevity). One big problem with this mysterious “collapse” notion is that it doesn’t do a good job of explaining how the quantum realm, which is supposed to be the foundation for reality, yields the classical world we experience. Everett proposed we resolve this by understanding that the possible outcomes are really “actual”, and that our classical world is constantly emerging from infinitely many realized outcomes of infinitely many quantum interactions, and each outcome is a “branch” realized by this infinite splitting of universes into others.

So, when you get split ends, the universe splits. But this multiverse theory isn’t really different universes (or rather, it’s more like different histories of the same universe). It’s not like there’s “our” universe which is just one among many. After all, “our” universe is constantly splitting too, so there isn’t really even a single “us” (granting the MWI is true, of course). Another way to think about it that won’t help you is that the “wave function” never collapses because there’s only one wave function that encompasses all reality. The possible outcomes of quantum mechanics aren’t possibilities, they’re just “branching worlds” of this wave function. If it were possible to travel to one of these alternate “universes”, there would be no point to MWI. Everett (who didn’t use the term “many-worlds”, which was coined by DeWitt), didn’t really even go as far as suggesting there exist alternate histories in which e.g., World War I remained “The War to End All Wars” (i.e., no WWII).

Not a multiverse, just a universe with some really deep pockets

The next “multiverse” theory is even less like a multiverse. The good news is it’s much simpler. It’s well-known that the universe is expanding. It’s even more well known that a very widely (although not universally) accepted theory (the Big Bang theory) posits that this expansion began a while ago (probably even before the 50s!) from a “point” out of which the entire universe emerged. It’s not very widely known how problematic the nature of this “bang” is. Obviously, the laws of physics break down at the “bang” itself, but they continue to fail after the initial moments of the universe (expansion faster than light, no atoms, immensely high temperatures and pressures, etc.). With some fairly minimal assumptions, the Big Bang theory also gets us a multiverse (it’s a “buy one get one free” sort of thing). Simply put, as the universe expanded you can think of it as sort of “ripping” into various pieces. Our piece is bounded by our particular cosmic horizon, a sort of limit that prevents us from observing anything beyond it. These pieces are often called universes, and this is an (incredibly simplified) version of the multiverse theory: a set of universes originating from the same cause, with the same laws of physics, impossible to “reach” or “travel to”, and fairly boring. In fact, some physicists don’t like to call these pieces “universes” at all:
“Some refer to the separate expanding universe regions in chaotic inflation as ‘universes’, even though they have a common causal origin and are all part of the same single space–time. In our view (as ‘uni’ means ‘one’) the Universe is by definition the one unique connected existing space–time of which our observed expanding cosmological domain is a part. We will refer to situations such as in chaotic inflation as a multidomain universe, as opposed to a completely causally disconnected multiverse.”
Ellis, G. F., Kirchner, U., & Stoeger, W. R. (2004). Multiverses and physical cosmology. Monthly Notices of the Royal Astronomical Society, 347(3), 921-936.

From Branches to Bubbles, Pieces to Pockets: Inflationary Cosmology Take 2

But the story of “universes” resulting from inflation doesn’t end here. In a similar multiverse theory, not only do the “pieces” or “pocket universes” differ more radically, but the “gaps” in the multiverse allow for “bubble universes” that not only have different laws of physics but perhaps also the possibility of interaction (they can careen into one another, which isn’t exactly the kind of interaction between universes you get in science fiction). However, I’m not going to explain this one. I’m going to use it as an example to show that my explanations aren’t as bad as they seem by quoting another, fairly non-technical introductory piece on multiverse cosmologies:
“In the fashionable variant known as eternal inflation, due to A. Vilenkin and A. Linde, our “universe” is just one particular vacuum bubble within a vast–probably infinite–assemblage of bubbles, or pocket universes. If one could take a god’s-eye-view of this multiverse of universes, inflation would be continuing frenetically in the overall superstructure, driven by exceedingly large vacuum energies, while here and there “bubbles” of low-, or at least lower-, energy vacuum would nucleate quantum mechanically from the eternally inflating region, and evolve into pocket universes. When eternal inflation is put together with the complex landscape of string theory, there is clearly a mechanism for generating universes with different local by-laws, i.e. different low-energy physics. Each bubble nucleation proceeding from a very large vacuum energy represents a symbolic “ball” rolling down the landscape from some dizzy height at random, and ending up in one of the valleys, or vacuum states. So the ensemble of physical by-laws available from string theory becomes actualized as an ensemble of pocket universes, each with its own distinctive low-energy physics. The total number of such universes may be infinite, and the total variety of possible low-energy physics finite, but stupendously big.”
Davies, P. C. W. (2004). Multiverse cosmological models. Modern Physics Letters A, 19(10), 727-743.

Naturally, you not only use the word “frenetically” in everyday discourse, but of course are more than well aware of the ways in which extra dimensions required by string theory are explained in terms of compactification to space-like regions in which they determine the physical laws for each particular region.

“As Above, So Below”: Combining the Multiverse with Many-Worlds

Certain physicists have decided that, as long as we’re admitting the possibility of infinitely many bubble universes eternally popping into existence and having differing laws of physics, and because this sounds a lot like the many-worlds interpretation of quantum mechanics, it would be a good idea to say that these quite independently developed theories formulated to address fundamentally distinct issues are nonetheless the same. I have no illusions about my inability to condense into a paragraph anything remotely resembling a clear account of how some physicists derive an equivalence between the MWI and multiverse cosmology. So I’ll leave this one with “nothing is possible, because every possibility is actualized”.

“I found God! He was hiding in a holographic anthropic multiverse”

Scientists may not be as objective as we’d like, but at least they rely on fairly minimal assumptions as opposed to e.g., historians of prehistory or theologians. Except when they don’t:
“Despite the growing popularity of the multiverse proposal, it must be admitted that many physicists remain deeply uncomfortable with it. The reason is clear: the idea is highly speculative and, from both a cosmological and a particle physics perspective, the reality of a multiverse is currently untestable…For these reasons, some physicists do not regard these ideas as coming under the purvey of science at all. Since our confidence in them is based on faith and aesthetic considerations (for example mathematical beauty) rather than experimental data, they regard them as having more in common with religion than science…To the hard-line physicist, the multiverse may not be entirely respectable, but it is at least preferable to invoking a Creator. Indeed anthropically inclined physicists like Susskind and Weinberg are attracted to the multiverse precisely because it seems to dispense with God as the explanation of cosmic design” (emphases added)
Carr, B. (2007). Introduction and Overview. In B. Carr (Ed.). Universe or Multiverse? Cambridge University Press.

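I like the second emphasized portion (the bit about the multiverse seeming to “dispense with God as the explanation of cosmic design”), mostly thanks to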

Amoroso, R., & Rauscher, E. (2009). The Holographic Anthropic Multiverse: Formalizing the Complex Geometry of Reality (Series on Knots and Everything Vol. 43). World Scientific.

What could this obscure sounding combination of cosmology, theoretical physics, and mathematics have to do with why some physicists like the multiverse because it “seems to dispense with God as the explanation” of the cosmos? Because Amoroso and Rauscher’s cosmology is (in their words) “a theistic cosmology”, despite being a multiverse cosmology and despite espousing the holographic principle (yes, it’s related to holograms; no, the world isn’t a hologram like the holograms we’re familiar with are).

Wrapping this up (finally)

I could keep going for some time and not scratch the surface. But the point is that invariably even the most exotic multiverse theories aren’t the kind described in everything from popular science to science fiction films. Even those which hold that some 10^(10^29) miles away there is an identical version of you are still really (if real, that is) part of this universe, and the very thing that seems to warrant calling them universes is that they are untestable, undetectable, and impossible to ever travel to or visit (even if you had a DeLorean equipped with a flux capacitor AND a warp drive).


A note on nature of spacetime dimensions (and more!)

If there are more than three dimensions, can I get there by meditating?

I’ve often heard people fascinated by the idea that reality is actually 4-dimensional, or that physics beyond the standard model proposes that there really are 10-dimensions (or more!). And this is fine: it’s pretty interesting stuff. The problem is that all too often comments about 4D spacetime or the dimensions suggested by string theory (or some successor to it like M-theory) are accompanied by comments about how these extra-dimensions suggest that there could be an astral plane dimension, or elicit questions like “what if one of these dimensions is a spiritual dimension?” and so forth.

My problem isn’t that people are curious about mysticism, spirituality, etc. (I am too, albeit from a much more skeptical foundational perspective than many). The issue is the conflation of very, very different senses of the word “dimension”.

Dimensions in physics are mathematical, even when they aren’t

In common parlance (which doesn’t include the use of the word “parlance”), dimensions are individual, or separate from one another. Hence one might refer to a spiritual dimension vs. a material or physical dimension, or speak of the different dimensions of some business problem, or refer to the racial dimension of a sociocultural problem, etc.

Unfortunately, a lot of mathematics education unintentionally reinforces this misconception of dimensions being “separable” like this. Students of pre-college algebra learn about the x,y-plane (or Cartesian plane). They get used to working with problems in coordinate geometry, trigonometry, even calculus in this “2-dimensional” plane. Only it isn’t actually 2-dimensional.

Some (elementary) mathematics that is pointless if you know it and probably incomprehensible if you don’t (you can skip it)

The truth is that even physics students have a hard time here, because they are initially taught about spatial dimensions (defined in terms of x, y, & z axes), and learn to understand this “space” in terms of special vectors i, j, & k (vectors are mathematical “objects” that are, simplistically, composed of individual values; you can think of a vector in the Cartesian plane as quite similar to a point in the plane, only instead of a pair of values (x,y) we have a single vector x which consists of an x-value and a y-value and instead of a point the vector can be thought of as a line from the point (0,0) to the values of the x,y-components of the vector). The vector i, for example, has an x-value of 1, a y-value of 0, and a z-value of 0, j has an x-value of 0, a y-value of 1, and a z-value of 0, and k has 0,0,1. This is very useful for doing mathematics required for classical mechanics, because in classical mechanics things like velocity are vectors in our familiar 3-dimensional space. Then the physics student takes a course in linear algebra, or moves into more advanced differential geometry, or studies dynamical systems, or studies electromagnetism, or is otherwise introduced into the “real” world of mathematical spaces in which there are no special vectors for the x-axis, the y-axis, and the z-axis because frequently one has to work in 1-dimensional space or 10,000-dimensional space (which would require 10,000 vectors like i, j, & k, each one having 0’s except on the “axis” for that “dimension”). At this point, all the familiarity with these special 3-dimensional vectors and how to work with them becomes more of a conceptual stumbling block than a step on the path to more sophisticated topics.
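If you prefer code to prose, here is a minimal Python sketch of those “special vectors”. The function names (standard_basis, combine) and the sample velocity components are made up for illustration; nothing here comes from a physics library. It just shows that i, j, & k are the n = 3 case of a recipe that works for any number of dimensions.

```python
def standard_basis(n):
    """Return the n 'special vectors' of n-dimensional space as tuples.

    Vector number `row` has a 1 on its own axis and 0 everywhere else.
    """
    return [tuple(1 if col == row else 0 for col in range(n)) for row in range(n)]


def combine(coefficients, basis):
    """Build a vector as the sum of coefficient * basis vector."""
    n = len(basis)
    return tuple(sum(c * e[axis] for c, e in zip(coefficients, basis))
                 for axis in range(n))


# The familiar 3-dimensional case: i, j, and k.
i, j, k = standard_basis(3)
print(i, j, k)

# A hypothetical velocity with components 2, -1, and 5 along x, y, and z.
velocity = combine((2, -1, 5), (i, j, k))
print(velocity)

# The same recipe in 10 dimensions: 10 vectors, each with a single 1.
print(len(standard_basis(10)))
```

The only thing that changes between 3 dimensions and 10 (or 10,000) is the argument n, which is roughly why the attachment to i, j, & k stops being helpful.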

…And we’re back: How to think about dimensions

All this is overly complicated, though, and I mention it briefly only to demonstrate that even so brief a treatment as this one can still be overly complicated. A much better way to think about dimensions is to consider why coordinate geometry problems worked out on a dry-erase board or chalkboard by the teacher or on paper by the student aren’t actually 2-dimensional. As chalk is scraped across a chalkboard, it leaves behind a 3-dimensional residue on a 3-dimensional surface. No matter how “thin” you make a sheet of paper, it is a 3-dimensional object. NOTHING truly 2-dimensional exists in a 3-dimensional world. Mathematically, every point of a 2-D plane sitting in a 3-D space still takes 3 coordinates to pin down, and in reality there are no 2-D planes in a 3-D reality.

You can also think of how a line is defined in the x,y-plane. Every point on the line has 2-dimensions (each has an x-value and a y-value). In fact, the “real” 1D space is just the number line (technically this is a 1D Euclidean space, and not all spaces with the same dimensions are the same, but for simplicity they are here).

The point is that the 3-dimensional world we experience isn’t composed of the separate dimensions in any way that most can readily conceptualize and certainly not that any can experience. Adding an extra dimension for time means that everything described in this spacetime is always and everywhere described by single points or sets of points that are defined by 4-dimensions. If there are really 10-dimensions, or 26, etc., then we are always and everywhere living as 10-dimensional or 26-dimensional beings. There is no special status accorded to these dimensions such that one could be the astral plane, heaven, hell, Nirvana, higher “universal” consciousness, or whatever, any more than there is some special status accorded to the 3rd spatial dimension z compared to x and y (technically, this isn’t true, but the special status of extra dimensions has to do with manifolds, topology, and other mathematical notions that the surgeon general cautions can lead to headaches, psychotic breakdowns, permanent social disorders, and writing blogs nobody in their right mind would read).


Movie Trivia: 300

You may wonder how movie trivia “fits” the nature of this blog. Clearly, you haven’t read much of it because even I can’t figure out the nature of this blog, but one thing I know is that I wrote a lot of the posts on a whim due to something I found interesting. With that in mind, let’s talk quotes:

Bad Guy Envoy of Xerxes: “Our arrows will blot out the sun!”

Spartan soldier who clearly works out in the same gym Hercules does: “Then we will fight in the shade.” (if these are off, it’s because I’m going on memory and I don’t care here).

Now, many a “gun nut” will know of the quote in 300 in which Leonidas responds to a demand (“Spartans! Lay down your weapons”) with “Come and get them”. That’s because there are hats, shirts, etc., with the Greek μολὼν λαβέ (molon labe), or “having come, take [them]”. Apparently “you can have my guns when you pry them from my cold dead fingers” wasn’t cutting it, so somebody decided to quote from Plutarch.

What’s less well known is that the “fight in the shade” bit also comes from an ancient source (Cicero’s Tusculanae Disputationes), although it’s not quite as dramatic as in the film:

e quibus unus, cum Perses hostis in conloquio dixisset glorians: “solem prae iaculorum multitudine et sagittarum non videbitis”, “in umbra igitur” inquit “pugnabimus.”

(“One of them [the Spartans], when an enemy Persian soldier said to him, boasting, ‘You will not see the sun because of the cloud of our javelins and arrows;’ [the Spartan] replied ‘Then we will fight in the shade'”)

(translation mine, and off of the cuff with little thought to e.g., whether I should have broken up the reply as in the text into “‘in the shade’ he replied ‘we will fight'” and other such nuances)


Was Einstein good at math?

How to evaluate diverging sources and why it is important:

There is a widely held notion that Einstein was bad at math, or flunked math, or that “every boy in the streets of Göttingen understands more about four dimensional geometry than Einstein” (a remark attributed to the great mathematician and contemporary of Einstein, David Hilbert). Yet we find sources claiming the following:

“Contrary to another myth, Einstein did not have difficulties in mathematics. Indeed, his preteen replacement of religious zeal with scientific fervor involved mathematics too. He was given a book on Euclidean geometry, which he devoured, even trying to prove theorems on his own before reading the solutions in the book. In his autobiography he referred to this math textbook as the ‘holy geometry book’– an extraordinary phrase for such a prosaic subject, but perhaps significantly it was an unconscious reference to his geometry book replacing the previous other “Holy Book” around the same time in his life. He went on to higher mathematical texts, teaching himself and mastering calculus by age sixteen. All of which explains the letter from his mathematics teacher that was in his pocket as he crossed into Italy.”
Topper, D. R. (2013) How Einstein Created Relativity out of Physics and Astronomy (Astrophysics and Space Science Library). Springer.

So did Einstein really have no difficulties with mathematics? Does the equivalence of “Einstein” and “genius” hold for mathematics as it holds in general usage (e.g., “He’s no Einstein” or “She’s like an Einstein when it comes to problem solving”)? Well, the first problem is reading between the lines of the above. When I was in 2nd grade, I read Moby Dick cover-to-cover. I devoured it, mainly because I knew enough to know it was great literature. I understood almost nothing at that point of what I read. It was beyond me. In the introduction to a delightful math book I know of, the author notes how fascinated he was not just by a calculus text he read as a youngster, but by a particular integral. Yet he didn’t really understand the integral or the text. And as for finding geometry sacred, the Greeks beat Einstein to this position on geometry by some ~2500 years, yet it was some ~1000 years until any culture was able to produce mathematicians able to make the jump of a pre-college student from geometry to algebra (actually, these days algebra tends to be introduced before geometry, but the jump is still possible to make in a year or less rather than a millennium). What we don’t find in the above is why Einstein’s famous 1905 derivation of E=mc² was wrong (Ohanian, H. C. (2009). Did Einstein prove E=mc²? Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 40(2), 167-173.), why so many of the mathematical developments in work begun by Einstein like special and general relativity were developed by other physicists (Minkowski, Lorentz, etc.), or why Einstein relied on mathematicians or physicists better in mathematics than he:

“Einstein is generally recognized as the greatest physicist of the 20th century and perhaps the greatest physicist since Newton, though Faraday and Clerk Maxwell are close competitors… But unlike Newton, Einstein was not a mathematician. He used mathematics in an essential way but he did not create it and he relied on his colleagues for technical help.”
Atiyah, M. (2007). Einstein and Geometry. In S. R. Wadia (Ed.) The Legacy of Albert Einstein: A Collection of Essays in Celebration of the Year of Physics (pp. 15-23). World Scientific.

“If the air of Ulm [Einstein’s birthplace] carries some mathematical miasma, it did little good to Einstein. He always remained a rather mediocre mathematician. In his autobiography (or ‘obituary,’ as he liked to call it) written in his late years in Princeton, he confessed, ‘I had excellent teachers, so I really could have gotten a profound mathematical education…That I neglected mathematics to a certain extent had its cause not only in that my interest in natural sciences was stronger than in mathematics, but also in the following strange experience. I saw that mathematics was split up into many specialties, each of which could absorb the short lifespan granted to us. Thus I saw myself in the position of Buridan’s ass, which was unable to decide on a particular bundle of hay…’ Einstein preferred to leave any difficult mathematical labors to others, such as his mathematician friend Marcel Grossman, during his years in Zurich and the several other mathematical assistants he later employed during his years in Berlin and Princeton, whose job it was to grind out the mathematical details that Einstein found too troublesome. He called them his Rechenpferde, or his ‘calculating horses,’ a reference to Clever Hans the horse…”
Ohanian, H. C. (2008). Einstein’s mistakes: The Human Failings of Genius. WW Norton & Company.

Any significant reading in a given subject at an academic level will yield divergences as well as outright contradictions. It is thus important to be able to evaluate sources before one happens upon direct contradictions or divergences.


Why you are irrational and illogical

Humans are amazingly smart. We not only have the most powerful brains in the known universe (which is a stupid statement because “known” here must implicitly refer to what we know and therefore the statement is basically “so far as we know, we are better able to know than anything we know”), we’ve managed to construct systems which are able to learn the way many living systems do, only with the advantage of unparalleled precision and “memory”. On the other hand, we’re morons who continually manage to make inferences we believe to be supported by logic, mathematical reasoning, rationality, etc., that are actually wholly illogical. What gives?

It turns out that logic and probability (which is required for valid inferences involving chances, among many, many other everyday applications) do not wholly come naturally to anybody, and generally don’t come naturally at all to most. Here I intend to give some “fun” (or hopefully at least mildly interesting) illustrations as to why we can be so dumb when we think we are being so rational.

The first method I’ll use is to demonstrate the counterintuitive nature of probability and logic by some classic examples:

1) The Full Monty (Hall). Imagine that I run some booth at a carnival which consists of a table and 3 boxes. You are intrigued, not because my booth looks like it promises anything remotely resembling entertainment, but because you wonder how I could possibly make a dime with such an unappealing display. So you ask me to explain what the “game” is and what you can win. I tell you that within one of the three boxes is 100 grand (I don’t tell you I mean the candy bar). The other two are empty. But this isn’t simply a game of guessing which box holds the prize; I sweeten the pot: you pick a box, and without telling you what’s in it, I open one of the other two boxes (it will never be the box with the prize). Now I give you the chance to change your choice.

You, being a rational person with no small amount of brain power, see right through this “sweetened pot”. The chances that you chose correctly initially were 1 in 3, because there were three boxes and one prize. Now, I’ve shown that one of the boxes doesn’t have the prize, and your chances are better (1 in 2), but I’ve tried to trick you into thinking there is any good reason to change your mind. After all, there are now two boxes, you’ve chosen one already, the chances are 1 in 2, so what would be the point of switching choices?

A lot. Consider a similar situation, only this time there are a billion boxes. The chances that you select the correct one with your initial choice are 1 in a billion. This time, after you make your initial choice, I again reduce your options to 2 boxes by opening all the boxes that don’t have prizes other than the one you picked and one other box. You know the prize must be in one of the remaining boxes, but how on earth did you manage to choose a box such that your chances went from 1 in a billion to 1 in 2 (apparently) and there’s no point in switching your initial choice? The answer is that you almost certainly chose the wrong box. The probability was 1 in a billion, but even though there are only 2 boxes left the chances aren’t 1 in 2. As soon as you select a box, whether there are initially 3 or initially a billion, in order to reduce your options to 2 boxes I have to open up a non-prize box or non-prize boxes. Either you initially selected the prize box, in which case I can open any other box or boxes I wish so long as I leave 1 unopened, or you didn’t, in which case I must open boxes (or a box in the case of 3) until only the prize box and the box you selected are left. Even in the case of 3 boxes (and obviously in the case of a billion), the chances are you chose wrong. Even in the case of 3 boxes, if I open one more I have to do so on the condition that it isn’t the prize box, which means the probability is conditional (it is contingent upon your initial choice). So even in the case of 3 boxes, you should switch your choice.
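If the argument still feels like a trick, you can let a computer play the game a few hundred thousand times. The following is a minimal simulation sketch in Python, not anything rigorous: the function names (play, win_rate), the trial count, and the use of 100 boxes as a stand-in for a billion are all my own assumptions for illustration. The host follows the rule described above: every box is opened except your pick and one other, and the prize box is never opened.

```python
import random


def play(n_boxes, switch, rng):
    """Play one round; return True if the player ends up with the prize."""
    prize = rng.randrange(n_boxes)
    pick = rng.randrange(n_boxes)
    if pick == prize:
        # The player already holds the prize, so the host may leave any
        # other box closed: choose one uniformly among the n_boxes - 1 others.
        r = rng.randrange(n_boxes - 1)
        other = r if r < pick else r + 1
    else:
        # The player missed, so the host is forced to leave the prize box closed.
        other = prize
    final = other if switch else pick
    return final == prize


def win_rate(n_boxes, switch, trials=100_000, seed=0):
    """Fraction of simulated rounds won with the given strategy."""
    rng = random.Random(seed)
    wins = sum(play(n_boxes, switch, rng) for _ in range(trials))
    return wins / trials


for n in (3, 100):
    print(n, "boxes, stay:  ", win_rate(n, switch=False))
    print(n, "boxes, switch:", win_rate(n, switch=True))
```

With 3 boxes, staying should win about a third of the time and switching about two thirds; with 100 boxes, switching should win about 99% of the time. The exact figures wobble with the seed and the number of trials.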

2) Smoking doesn’t cause cancer. Imagine that I am the little guy fighting the big, bad Tobacco Industry in some Hollywood-like dramatization (I’m working with a guy who looks a lot like Al Pacino, and I bear a striking resemblance to Russell Crowe). I end up in court, but with the results of years of research. I can show that the incidence of cancer among the smoking population is a billion times higher than average. My research designs are flawless, my use of statistics perfect, and the finding (the difference between the cancer incidence among smokers vs. non-smokers) is accurate. I also lucked out because the jury happens to consist of several eminent logicians, a few professional statisticians, and two mathematicians. My lawyer and I agree the jury will inevitably rule in my favor. Only they don’t. Why?

Mostly because they are using rules of logic and analytic reasoning. In this example, sticking to formal reasoning yields the wrong answer, because the jury consists of individuals who know that correlation doesn’t equal causation and have forgotten that the reason we frequently think it does is sound: correlation makes causation more likely, and the greater the correlation the greater the chances (in general). Also, the statisticians unfortunately all work in the social and behavioral sciences, and are prone to consider my results in terms of the kinds of arguments marshalled to show that marijuana is a gateway drug: people addicted to heroin, cocaine, etc., are much more likely than the average person to have smoked marijuana and, in addition, they almost always start drug use with marijuana. Ergo, gateway drug. Only, 1) almost all alcoholics started drinking milk, but milk isn’t a gateway drink and 2) the “gateway drug” fallacy involves comparing the wrong populations. These problems are related and the “gateway drug/drink” story is a classic example of the “correlation equals causation” fallacy. If I run several studies on samples from the population of alcoholics, I will find a very high correlation between drinking milk and being an alcoholic. That’s because I started with alcoholics, so anything that holds true in general of people is going to be highly correlated with alcoholism. If I compare the incidence of milk-drinking in the population in general to the population of alcoholics, I’ll find that I get the same “correlation” with non-alcoholics. This isn’t quite true of marijuana vs. heroin, crack, etc. Here, the population in general has a much lower average rate of marijuana usage and amount of marijuana usage than the population of addicts. But I’m still ignoring a key population: people who use or have used marijuana. When I start with this population, and try to use marijuana to predict addiction to crack, heroin, etc., I will fail. In fact, my predictive power won’t be much better than if I used milk for my predictive model.
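The “wrong population” problem is easier to see with numbers, so here is a small Python sketch. Every count in it is invented purely for illustration (these are NOT real statistics), and the variable names are hypothetical. All it does is compute the same overlap two ways: the share of addicts who used marijuana, and the share of marijuana users who are addicts.

```python
# All counts below are invented for illustration; they are not real statistics.
addicts = 10_000
marijuana_users = 400_000
addicts_who_used_marijuana = 9_500
milk_drinkers = 990_000
addicts_who_drank_milk = 9_900


def conditional(joint_count, condition_count):
    """Estimate P(A | B) as count(A and B) / count(B)."""
    return joint_count / condition_count


# Start from addicts and look backwards: the "gateway" framing.
p_marijuana_given_addict = conditional(addicts_who_used_marijuana, addicts)
p_milk_given_addict = conditional(addicts_who_drank_milk, addicts)

# Start from users and look forwards: the question a prediction must answer.
p_addict_given_marijuana = conditional(addicts_who_used_marijuana, marijuana_users)
p_addict_given_milk = conditional(addicts_who_drank_milk, milk_drinkers)

print(f"P(marijuana | addict) = {p_marijuana_given_addict:.1%}")
print(f"P(milk | addict)      = {p_milk_given_addict:.1%}")
print(f"P(addict | marijuana) = {p_addict_given_marijuana:.1%}")
print(f"P(addict | milk)      = {p_addict_given_milk:.1%}")
```

With these made-up counts, nearly every addict used marijuana (and drank milk), yet only a couple of percent of marijuana users are addicts, so guessing “addict” from “marijuana user” is wrong almost every time. The two conditional probabilities answer different questions, and the gateway argument quietly swaps one for the other.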

But we all know that smoking does cause cancer, right? Not exactly. In most uses of the word “cause”, the answer is yes. But scientifically, we tend to restrict claims that x causes y to what is called “necessary and sufficient conditions”. That is, to claim x causes y, we require that for y to happen, x must happen (x is necessary), and that whenever x happens, y happens too (x is sufficient). These might seem to be equivalent statements (logic again!), but they aren’t. Consider billionaires. In order to be a billionaire, you have to be a millionaire because you have to have millions to have billions. Thus being a millionaire is a necessary condition for being a billionaire. But not all millionaires are billionaires. So being a millionaire is not a sufficient condition for being a billionaire. The reason the very rational, highly educated jury came to the wrong conclusion is that they correctly realized that my evidence didn’t demonstrate that smoking causes cancer, but they failed to realize that my evidence made the probability that smoking causes cancer incredibly high. Unlike the “gateway drug/drink” fallacy, I (being the diligent, Russell Crowe-looking researcher I am) didn’t test only smokers or only those with cancer but compared the incidence of cancer among BOTH the population of smokers and non-smokers. Now, it could be that my results are due to a third variable such as some gene that tends to both cause cancer and make people inclined to smoke. But if so, the Tobacco Industry lawyers would have such an explanation, and the fact that, with all their money and resources, they don’t is actually evidence that the jury should have considered (absence of evidence IS evidence of absence, it just is often not very good evidence).

3) One-eyed flying purple people-eaters. This example is quick, neat, and dirty. Consider the statement “all numbers are even.” This is obviously wrong, but why? Well, because all it takes to show it’s wrong is a single counter-example like 3. This sounds reasonable: to show that a property X doesn’t hold for all Y, one need only find a single Y for which X doesn’t hold. But consider the statement “all one-eyed flying people-eaters are purple.” I argue (as logicians, mathematicians, etc., do) that this is true. Moreover, that it is clearly and obviously true, and that nobody familiar with logic or analysis could possibly think otherwise (I’m wrong about this, but not for the obvious reasons). You, being a rational, intelligent individual, assert that I’m clearly and obviously wrong (and a moron to boot). Well, with the statement “all numbers are even”, we proved this false merely by offering the counter-example 3 (there are infinitely many other counter-examples we could have offered, but we only need one). To show that the statement “all one-eyed flying people-eaters are purple” is false, there must exist at least a single one-eyed flying people-eater who isn’t purple. When you can show me this person, then you can tell me I’m wrong.

4) To prove it, assume it’s true. When taking courses in logic, set theory, argumentation, or other topics which involve formal, logical proofs, students often find the proofs for one kind of statement illogical/irrational. Consider the statement “If x, then y”. How might you prove this in e.g., a mathematical/formal logic course? Let’s make this less abstract. We’ll start with the example “if x is a swan, then x is white.” It seems pretty intuitive that for anything x, for this statement to be true it must be true that there exist no black swans. But what about a statement like “if I’m the king of the universe, then the moon is made of green cheese”? This statement is in fact true. We’re back to the one-eyed flying people-eater problem. That’s because the truth of any statement “if x, then y” (whether involving swans or me ruling the universe) doesn’t depend (at least directly) on reality. To see why, imagine how you’d determine whether or not I was right if I told you “if you go outside, you’ll get wet from the rain.” The easiest way would be to go outside, because if you went outside and didn’t get wet, you could say I was wrong. If you didn’t go outside, though, you can’t possibly say that what I said was logically false because I only stated that on the condition that you went outside, then you would get wet from the rain. Remember when I mentioned conditional probability in the first example? Well, this is the very related issue of “conditionals”. When I say something like “if x, then y”, what I have logically claimed is that “on the condition that x is true, then y must be true.” If x is false, then necessarily the conditional is true. The reason why “if I’m the king of the universe, then the moon is made of green cheese” is true is because I’m not the king of the universe (yet…). Conditionals make a conditional assertion (the “if x” part), and to be wrong, the condition has to be met and the “then y” part turn out false. 
By making the “if x” part false, I make it impossible for the condition to be met and therefore impossible for the conditional statement to be false. Similarly, when I say “all one-eyed people-eaters are purple”, I am saying that a property holds true of every member of a group that doesn’t exist. You can’t prove it false because it is vacuously true.
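For the skeptics, here is a short Python sketch of both ideas. The helper name implies and the stand-in variables are my own inventions; Python has no built-in “if x, then y” operator, so the sketch uses the standard definition of the material conditional (“not x, or y”) and Python’s built-in all(), which returns True when given nothing to check.

```python
from itertools import product


def implies(x, y):
    """Material conditional: false only when x is true and y is false."""
    return (not x) or y


# The full truth table: four rows, only one of which is False.
for x, y in product([True, False], repeat=2):
    print(f"x={x!s:<5} y={y!s:<5} if x then y: {implies(x, y)}")

# Hypothetical stand-ins for the example in the text.
i_am_king_of_the_universe = False
moon_is_green_cheese = False
print(implies(i_am_king_of_the_universe, moon_is_green_cheese))  # True

# "All one-eyed flying people-eaters are purple", checked against every
# known specimen. The list is empty, so there is nothing to contradict it.
one_eyed_flying_people_eaters = []
print(all(eater["color"] == "purple" for eater in one_eyed_flying_people_eaters))  # True
```

The last line is vacuous truth in action: all() goes looking for a counter-example, finds no members to inspect, and so reports True.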

5) Vacuously true and the null/empty set. There’s a famous paradox called “The Barber’s Paradox.” It’s sort of interesting: there’s a town in which a barber shaves all and only the beards of men who don’t shave themselves. Who shaves the barber? He can’t shave himself, because he only shaves those who don’t shave themselves. But if he doesn’t shave himself, then he must shave himself because he shaves every man who doesn’t shave himself.

Turns out this kinda stupid kinda interesting example is more important than you might think. The man who founded formal/symbolic logic, Gottlob Frege, did so in order to achieve something even more incredible in mathematics (which I won’t go into). The problem is that he allowed logical statements of the form “X is the set that contains only and all sets that don’t contain themselves.” Although we substitute barbers and beards for a symbol X and sets, the logic is the same (it’s called Russell’s paradox, because Bertrand Russell wrote a letter to Frege in which he demonstrated that Frege’s formal system made possible statements which couldn’t be formally evaluated; unfortunately for Russell, an equally simplistic paradox was to unravel his and Whitehead’s monumental 3-volume work Principia Mathematica). As a result, logicians and mathematicians realized they had to be very careful about defining sets. In particular, there must exist one and only one “empty” or “null” set (the set with no members/elements) and this set must be a subset of every set. The fact that it is a subset of every set is actually intuitive. Consider the set of all people and the subset of people who are purple, have one eye, and fly. How many members belong to this subset? None. The more important issue is that there is only one “empty” set. Imagine this isn’t true and consider again the set of people and the subset of people who are purple, have one eye, and fly, but this time also consider the set of people who are dragons. If these sets are different, then one of them must have some element/member that the other doesn’t. But neither has any members, making this impossible.
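The two facts about the empty set can be seen with Python's built-in sets. This is a toy sketch under obvious assumptions: the three-person "population" and the trait names are invented for illustration, and a finite example shows the idea without proving it.

```python
# Hypothetical stand-in for "the set of all people", with made-up traits.
people = {
    "Alice": {"purple": False, "one_eyed": False, "flies": False, "dragon": False},
    "Bob": {"purple": False, "one_eyed": False, "flies": False, "dragon": False},
    "Carol": {"purple": False, "one_eyed": False, "flies": False, "dragon": False},
}

everyone = set(people)

# Subset of people who are purple, have one eye, and fly.
purple_one_eyed_fliers = {
    name
    for name, traits in people.items()
    if traits["purple"] and traits["one_eyed"] and traits["flies"]
}

# Subset of people who are dragons.
dragons = {name for name, traits in people.items() if traits["dragon"]}

# The empty subset really is a subset of the set of people...
is_subset = purple_one_eyed_fliers <= everyone
# ...and of any other set, related or not.
subset_of_unrelated = set() <= {1, 2, 3}
# Two sets with no members cannot differ, so they are the same set.
same_set = purple_one_eyed_fliers == dragons == set()

if __name__ == "__main__":
    print(is_subset, subset_of_unrelated, same_set)
```

Two sets can only differ by one having a member the other lacks, so the two descriptions pick out the same set.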

Conclusion example) Why the most logical intelligent systems are worse at recognizing faces than babies. I hope that these examples and the discussion of them have revealed something about the counter-intuitive nature of logic and probability (it’s a bit difficult to demonstrate that something is hard to understand while making it understandable, so if either the examples were too difficult to be understood or too easy to be counter-intuitive, you’ll just have to trust me here). So why do we do things like illogically infer that correlation is causation and find perfectly logical arguments or statements illogical? Because we live in the real world. Consider facial recognition software. Chances are, you’ve never actually written a facial recognition program or any program that allows a computer to “learn” to recognize/classify things (whether distinguishing letters from scribbles or passing CAPTCHAs). But you have probably used a calculator. Computers are called computers because that’s what they do: compute. Computing is another word for calculating, and your simple calculator is really the same as the world’s best supercomputer (just slower). Getting a computer to recognize faces or letters is not that different from trying to get your calculator to do this. Computers understand nothing. To get them to do anything, you have to reduce it to mere logical rules (because every computer program ultimately works only because of “logic gates”, which are physical realizations of a few simple, logical “operations” from formal logic). To understand what I mean, we will finally see the real version of logical statements. In formal logic, a statement like “there is only one Pope” must be rendered into symbolic form such as ∃x[(Px & Hx) & ∀y(Py → y=x)] (in words, “there exists an x such that x is the Pope and x is human, and for all y, if y is the Pope then y is x”).
The point is to be able to take something inherently meaningful, like language, and reduce it to meaningless symbols such that one can decide whether or not it is true without having any idea what it means. In fact, in logic courses one learns to do “proofs” or “derivations” in which one uses rules of logic to show that e.g., ~(A v B) ≡ (~A & ~B) without needing to know what “A” or “B” mean. That’s because students learn that certain symbols are “operators” that require a particular, specific operation and others are things that must be operated on. Another word for “operated on” is “computed” or “calculated”, which is why, in order to get computers to recognize faces, we have to reduce faces and the differences between them to pure, meaningless formal logic.
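The equivalence ~(A v B) ≡ (~A & ~B) mentioned above is a good example of something a machine can confirm with no idea what “A” or “B” mean. Here is a minimal sketch that tries every combination of truth values. The function names lhs and rhs are just labels I chose for the two sides, and a truth-table check like this stands in for the step-by-step derivation a logic course would ask for.

```python
from itertools import product


def lhs(a: bool, b: bool) -> bool:
    """Left side: ~(A v B)."""
    return not (a or b)


def rhs(a: bool, b: bool) -> bool:
    """Right side: (~A & ~B)."""
    return (not a) and (not b)


# Compare the two sides on all four assignments of truth values to A and B.
rows = [(a, b, lhs(a, b), rhs(a, b)) for a, b in product([True, False], repeat=2)]
equivalent = all(left == right for _, _, left, right in rows)

if __name__ == "__main__":
    for a, b, left, right in rows:
        print(f"A={a!s:5} B={b!s:5} ~(A v B)={left!s:5} (~A & ~B)={right}")
    print("equivalent on every row:", equivalent)
```

The check is pure symbol-pushing: nothing in it depends on what the letters stand for.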

Humans don’t think logically or rationally much of the time because it is far more important to be able to recognize causes and patterns: to look at lots of smokers with cancer and see smoking as a cause of cancer, to recognize trees as trees even though no two trees are exactly the same in the way computers would need them to be, etc. Kurt Gödel was a brilliant mathematician and probably the greatest logician who ever lived. Towards the end of his life, he came to believe that people were trying to poison him, and he trusted only his wife to prepare his meals. Unfortunately, she grew ill at one point and had to spend an extended amount of time in a hospital.

Gödel, however, was nothing if not logical. He had two premises:

P1) People are trying to kill me by putting poison in any food I eat.
P2) The only food that I can believe isn’t poisoned is food provided by my wife.

From these premises follows this conditional proposition:

C) If my food isn’t provided by my wife, I can’t believe that it isn’t poisoned.

From this logical inference he concluded (validly) that the best course of action would be not to eat anything until his wife returned, as either she would return before he starved to death, or he would die of starvation, which meant the same outcome as eating food not prepared by his wife (death). The problem is that this highly logical, valid inference was ridiculous because it ignored how wildly improbable it was that there was some mass conspiracy to poison him. Turns out, common sense may not be logical, but it is far more useful most of the time. It turns out that our intuition that tossing a fair coin 100 times and getting all heads is less probable than getting some particular sequence of heads and tails (an intuition that is false) is useful because it allows us to look past the specific sequence and recognize the more important truth: it’s more probable that tossing a fair coin 100 times will result in some mix of heads and tails than all heads. It turns out that because we don’t analyze language according to logical rules, we don’t interpret “If you’re hungry, there’s food on the table” to be equivalent to “if there’s not food on the table, you’re not hungry”. Turns out being a little illogical goes a long way to being right more often than not (particularly in the tens of thousands of years humans existed before things like number systems, formal logic, or writing even existed).
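The coin-toss point above can be made exact. The following is a small sketch using exact fractions. The variable names are mine, and it assumes a fair coin with independent tosses, as in the example.

```python
from fractions import Fraction

TOSSES = 100

# Any one fully specified sequence of 100 tosses has the same probability,
# whether it is all heads or some particular jumble of heads and tails.
p_specific_sequence = Fraction(1, 2) ** TOSSES
p_all_heads = p_specific_sequence
p_all_tails = p_specific_sequence

# "Some mix of heads and tails" covers every sequence except those two.
p_some_mix = 1 - p_all_heads - p_all_tails

# How many times more likely a mix is than all heads.
ratio = p_some_mix / p_all_heads

if __name__ == "__main__":
    print("P(all heads)            =", float(p_all_heads))
    print("P(one specific jumble)  =", float(p_specific_sequence))
    print("P(some mix)             =", float(p_some_mix))
    print("mix is more likely by a factor of 2**100 - 2:", ratio == 2**TOSSES - 2)
```

So the intuition is wrong about any single sequence and right about the category: all heads is one sequence out of 2^100, while the mixed category contains all but two of them.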
