My Shot At Explaining Bayes’ Theorem

by Scott

As we all know, Bayes’ Theorem is hot.

And there are lots of good resources out there for Bayes’ Theorem (see An Intuitive Explanation of Bayesian Reasoning, ©2003 by Eliezer S. Yudkowsky).

Mine is a sort of mnemonic by way of explanation. Bayes’, they say, is “deeply intuitive,” and yet I usually found myself needing a refresher. A lot of the example exercises one works through are about test error rates or other abstractions, and so that whole “we know [latex]P(B\mid A)[/latex] but what we really want is [latex]P(A\mid B)[/latex]” business mostly turns to mush, because many of these exercises are fairly removed from “intuition.” What is needed is an example application of Bayes’ that one can mentally refer back to when looking at a new problem. Well, something close to our everyday experience, and something that we all eventually think about, turns out to be a good way to explain Bayes’ Theorem.

Saturday Idyll Speculation

Maybe it hits you in Peet’s Coffee on a busy Saturday morning, or maybe when you are walking through Bernal Heights, but after seeing one too many of these men pushing Swiss premium double-wide prams, eventually you say to yourself, “Man, it wasn’t like this when I was a newsboy growing up, playing stick ball. Twins used to be really rare, maybe I knew one set or two growing up, but these days it seems like twins are everywhere!” This reaction is reflective of the statistics. The CDC (National Vital Statistics Reports, Vol. 57, No. 7, January 7, 2009) referred to it as:

The rapid, unprecedented rise in multiple birth rates of the last several decades..

Of course, the answer is, you say to yourself: “invitro” (Assisted Reproductive Technology, or ART, as the statisticians call it). Everybody’s doing “invitro” these days and you’re always hearing about the twins, octo-moms, etc. that result. “Invitro” must supercharge the rate of occurrence of twins.

So as you sip your large coffee (cream, no sugar) and stare at the new dad with twins, you wonder: “did they use invitro?” Or maybe your mother calls and tells you that a distant relative is having twins: “huh, wonder if they used invitro?” You realize that you can never know with certainty whether this specific couple used “invitro,” because twins do occur naturally. Maybe this couple just happened to have the rare natural twins. But if twins are rare naturally and common via “invitro,” shouldn’t that tip the scales somehow toward “invitro?”

Well, let’s dive into the stats a bit. The natural incidence of twins in pregnancy is generally put at about 1 in 80.


So approximately 1.25% (1 / 80) of successful pregnancies from “the natural way” will result in a set of twins.

Looking at the 2007 ART Success Rates Report (2007 Assisted Reproductive Technology Report, Centers for Disease Control and Prevention), which details the outcomes of all “invitro” pregnancies in the US, we find that:

12,792 out of 43,412 “invitro” deliveries were Twins

A whopping 29.47% (12,792 / 43,412) of successful pregnancies from “invitro” will result in a set of twins.

Ha! You knew it! It’s not exactly a smoking gun, but that’s a big difference. You are 23.57x (29.47% / 1.25%) more likely to have twins if you use “invitro” than if you do not. So that tells us something, but you still don’t have a statistical guess about this particular individual.
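For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two rates and their ratio. The counts are the ones quoted above; the variable names are mine and purely illustrative.

```python
# Counts from the 2007 ART report, as quoted above.
art_twin_deliveries = 12_792
art_total_deliveries = 43_412

# Natural twin rate, taken as roughly 1 in 80 (the figure used above).
p_twins_natural = 1 / 80

# Share of "invitro" deliveries that were twins.
p_twins_given_invitro = art_twin_deliveries / art_total_deliveries

# How many times more likely twins are with "invitro" than without.
ratio = p_twins_given_invitro / p_twins_natural

print(f"P(twins | invitro) = {p_twins_given_invitro:.2%}")  # 29.47%
print(f"P(twins | natural) = {p_twins_natural:.2%}")        # 1.25%
print(f"ratio              = {ratio:.2f}x")                 # 23.57x
```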

In statistical parlance we now have [latex]P(twins)[/latex] (read: probability of having twins; strictly, our 1.25% is the rate among natural pregnancies, which we are using as a stand-in for the overall rate) and [latex]P(twins\mid invitro)[/latex] (read: probability of twins given “invitro,” i.e. if you have “invitro,” what are the chances of twins). But that’s not what we want; we’ve already got the twins! They’re both working on $3 muffins there before you. What we want is: what is the probability that the parent used “invitro,” given the twins we see before us?

Statistically, we want:

[latex]P(invitro\mid twins)[/latex]

And it’s in the “given twins” part where people start talking about “information updating” when talking about Bayes’ Theorem.

Taking a step back, let’s pretend we know nothing about the individual in question. Having never seen them, somebody phones you up and tells you this individual has just delivered an indeterminate number of infants. Then they ask you: “What’d ya think the chances they used invitro?” At this point you’d have to throw your hands up in the air and say something along the lines of “How should I know?! If forced to pick a probability, I’d just have to go with the general prevalence among the population.”

You would just say the odds of “invitro” for this individual are commensurate with those of the population. The baseline rate.

Looking again at the 2007 ART Success Rates Report, we recall that:

In 2007, there were 43,412 total “invitro” deliveries (Singleton, Twin, and Triplet+) in the US

And looking at the National Vital Statistics Final Birth Data for 2006 (close enough, no final data for 2007 yet) and massaging the data a bit, we find that:

In 2007 (estimated), there were 4,192,614 total deliveries (Singleton, Twins, and Triplet+) in the US

I guess that’s the baseline rate: a measly 1.035% (43,412 / 4,192,614). So, knowing nothing except that an individual delivered an indeterminate number of infants, I guess you need to start here. (They call this “your prior.” Consider it slang; trying to think “prior to what?” just confuses the issue. It has no meaning beyond: it’s the uneducated guess.) But what about this twin stuff? Isn’t this the smoking gun? Can’t we use this “information” somehow?

This is where Bayes’ Theorem comes in. Bayes’ bridges the gap between “it’s much more likely to have twins using ‘invitro’ than naturally” and “that difference in likelihood increases the probability that they used ‘invitro’ by this much.” Bayes’ Theorem for our case is:

[latex]P(invitro\mid twins)=\frac{P(twins\mid invitro)}{P(twins)} \times P(invitro)[/latex]

We’ve already calculated [latex]\frac{P(twins\mid invitro)}{P(twins)}[/latex], taking the 1.25% natural rate as our [latex]P(twins)[/latex]: it’s 23.57x (29.47% / 1.25%) more likely to have twins if you use “invitro” than if you do not. And [latex]P(invitro)[/latex] is just the baseline rate of 1.035%. So, the probability of the man with the pram having used “invitro,” given the twins you see before you, is 24.41% (23.57 x 1.035%). This makes sense: if you’re 24x more likely one way than another, then the probability you went that way is 24x greater than the baseline rate.
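Here is a small Python sketch of that calculation. The first function is the formula exactly as written above. The second is my own addition, not part of the walkthrough: it builds [latex]P(twins)[/latex] from both routes (the law of total probability), on the assumption that the 1-in-80 figure really describes non-“invitro” pregnancies only. Function and variable names are hypothetical.

```python
def posterior_shortcut(p_b_given_a, p_b, p_a):
    """Bayes' Theorem as written above: P(A|B) = P(B|A) / P(B) * P(A)."""
    return p_b_given_a / p_b * p_a


def posterior_full(p_b_given_a, p_b_given_not_a, p_a):
    """Same theorem, with P(B) expanded as
    P(B|A) * P(A) + P(B|not A) * (1 - P(A))."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b


p_twins_given_invitro = 12_792 / 43_412  # from the 2007 ART report
p_twins_natural = 1 / 80                 # natural rate, assumed ~1 in 80
p_invitro = 43_412 / 4_192_614           # baseline rate (the "prior")

# Treats the natural twin rate as if it were the overall P(twins).
shortcut = posterior_shortcut(p_twins_given_invitro, p_twins_natural, p_invitro)

# Assumption: 1/80 is P(twins | no invitro), so P(twins) mixes both routes.
full = posterior_full(p_twins_given_invitro, p_twins_natural, p_invitro)

print(f"shortcut: {shortcut:.2%}")
print(f"full:     {full:.2%}")
```

The shortcut reproduces the 24.41% above. Under the stated assumption the fuller version comes out a bit lower, a shade under 20%, because “invitro” twins themselves nudge the overall twin rate up.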

I like this explanation as a mnemonic because all the categories are crisp in my mind. The unknown is real clear: what are the chances they had “invitro” given the twins that I see before me  (sure as hell can’t ask them, that’d require open and honest communication :)

It also highlights: Why do you think the twin tells you something?  Well, the probability of having a twin is much higher if you have “invitro” than if you don’t.  And you can see that the higher that multiple goes the more and more likely it is that they’ve had “invitro.” The explanatory power of the evidence (twins) increases as that multiple increases.

And the final thing this example highlights is: why do you need the general rate of “invitro?” Well, we went as far as we could with the differing rates of twins. And we were left with 24x, but there was no way to tune that to this tangible individual. You needed Bayes’ to get a number that is “my best estimate” of whether this person used “invitro” or not.

And the bearing of the baseline rate?  At this point you are usually prompted to consider these two extreme cases:

What if I told you only one “invitro” delivery was allowed in the US per year and the rest natural?
What if I told you only one natural delivery was allowed in the US per year and the rest “invitro?”
Would that change your estimate that this individual had “invitro?”

And you’re supposed to give a begrudging nod. “Damn it, Thomas, you’re right again!”
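To put numbers on those two extremes, here is a rough Python sketch. It is my illustration rather than part of the original argument, and it assumes [latex]P(twins)[/latex] is built from both routes (“invitro” and natural), since simply multiplying the 23.57x ratio by a prior close to 1 would give a “probability” above 1. The function name and its default arguments are hypothetical.

```python
def p_invitro_given_twins(prior, p_twins_invitro=12_792 / 43_412,
                          p_twins_natural=1 / 80):
    """Posterior P(invitro | twins) for a given prior P(invitro).

    P(twins) is computed as a mix of the two routes, weighted by the prior.
    """
    p_twins = p_twins_invitro * prior + p_twins_natural * (1 - prior)
    return p_twins_invitro * prior / p_twins


total_deliveries = 4_192_614

# Extreme 1: only one "invitro" delivery per year, the rest natural.
one_invitro = p_invitro_given_twins(1 / total_deliveries)

# Extreme 2: only one natural delivery per year, the rest "invitro".
one_natural = p_invitro_given_twins((total_deliveries - 1) / total_deliveries)

print(f"only one invitro delivery: {one_invitro:.7f}")
print(f"only one natural delivery: {one_natural:.7f}")
```

With the twin rates held fixed, the first case lands at a few in a million and the second is indistinguishable from certainty. Same evidence, wildly different answers, which is the whole point of the baseline rate.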

A quick recap:

Bayes’ Theorem:

[latex]P(A\mid B)=\frac{P(B\mid A) \times P(A)}{P(B)}[/latex]

Our case:

[latex]P(A\mid B)= P(invitro \mid twins)[/latex]

[latex]P(B\mid A)= P(twins\mid invitro)[/latex]

[latex]P(B)= P(twins)[/latex]

[latex]P(A)= P(invitro)[/latex]

Postscript on “choosing your prior”

Obviously, choosing that baseline rate has a big impact on our final estimate of probability. We took a real broad swipe by using total US births as the denominator in our estimation of this rate. Access to “invitro” is not evenly distributed across demographics and socioeconomic strata. For example, 1.52 million of the 4.26 million births in the US in 2006 were to women 24 years old and younger. An educated guess would say not many of these women had the means or motivation to investigate “invitro” (and I know, sure as hell, the couple you are estimating is not 24 or younger; no one in SF would ever have children at such an age!).

If we exclude these births (holding “invitro” totals the same) it would, roughly, bring our prior rate up to 1.626% and our estimated probability that the individual used “invitro” to 38.32%.
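A rough Python sketch of that adjustment follows. It assumes the rounded 1.52 million figure can be subtracted straight from the 4,192,614 deliveries (births and deliveries are not quite the same thing), so treat the result as approximate. Variable names are mine.

```python
art_deliveries = 43_412
total_deliveries = 4_192_614
excluded = 1_520_000  # births to women 24 and younger, rounded as quoted above

# Same 23.57x ratio as before: "invitro" twin rate over the natural rate.
likelihood_ratio = (12_792 / 43_412) / (1 / 80)

# Shrink the denominator, hold the "invitro" total the same.
refined_prior = art_deliveries / (total_deliveries - excluded)

# Same shortcut multiplication used in the main walkthrough.
refined_posterior = likelihood_ratio * refined_prior

print(f"refined prior:     {refined_prior:.3%}")
print(f"refined posterior: {refined_posterior:.2%}")
```

This lands within a few hundredths of a percentage point of the 1.626% and 38.32% quoted above; the small gap is presumably down to rounding in the 1.52 million figure.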

It’s pretty clear that with judicious refinement of the “prior” (with the corresponding adjustment to [latex]P(twins)[/latex] to reflect the higher incidence of twins within your subgroup) you will probably get close to near certainty that if the man with the pram has twins, they used “invitro.”

A note on that adjustment: we took a little shortcut above, for the sake of the mnemonic value of the explanation, by assuming that “invitro” usage was so small that it did not affect the national rate of multiple births. I’ll leave it as an “exercise for the reader” to figure out the implication of this.