Probability and Statistics
Katherine Rawlins
August 7, 2000, River Falls, WI
Outline:
-
Introduction to statistics
-
The addition and multiplication of probabilities
-
"There are lies, damn lies, and statistics"; how numbers can mislead
-
Histograms and the Normal distribution
-
Significance and Confidence
-
"If you're one in a million, there are a thousand Chinese people just like
you"; the unlikely does happen
Introduction to Statistics
-
Important in every field of science
-
Essential for AMANDA
-
Important in everyday life
Two broad categories of probabilities:
-
Measured empirically
(Ex: 18 out of every 1,000 teenage girls get pregnant every year. 85%
of small businesses fail within the first 5 years.)
-
Computed from assumptions or models
(Ex: The probability of rolling a six on a die is 1/6. The particle
has a probability of 0.01 for tunneling through the barrier.)
AMANDA: "The probability of a neutrino interacting with ice or rock and
triggering our detector is...."
Which type is it?
...So, what are my chances of being hit by a stray neutrino?
And why can nobody give me a straight answer to this question?
To answer this question, we need 2 pieces of information:
-
The FLUX (number per sq. meter per sec) of neutrinos passing through
-
The probability that any single neutrino will interact
Both of these numbers CHANGE with ENERGY.
Flux:
At solar neutrino energies (MeV): 10^12 neutrinos/sq.m/sec
(There are 10^4 square cm in a square meter...) = 10^8 neutrinos/sq.cm/sec
At 1 GeV: 1 neutrino/sq.cm/sec
At higher energies:
A steeply falling spectrum!
from: "Cosmic Rays" by Michael W. Friedlander, 1989.
The important line here is the H(x5), representing proton cosmic rays.
Notice that both the x and y axes are on a log scale.
Interaction Probability
At solar energies: Pretty darn low (one in 10^18, or probability of
10^-18)
At 1 GeV: One in 10^12 (probability = 10^-12)
At 1 TeV: One in 10^6 (probability = 10^-6)
At 1 PeV: One in 1000 (probability = 10^-3)
from: "Cosmic Rays and Particle Physics" by Thomas K. Gaisser, 1990.
So, the number of interactions per square centimeter per second
is:
-
solar: 10^-10 interactions/sq.cm/sec
-
1 GeV: 10^-12 interactions/sq.cm/sec
-
1 TeV: 10^-12 interactions/sq.cm/sec
-
1 PeV: 10^-15 interactions/sq.cm/sec
The size of AMANDA is about 1 square kilometer = 10^10 square cm.
(This is the "interaction volume", not the physical size)
The size of a person is about 1 square meter = 10^4 square cm.
The number of seconds in a year is about 3 x 10^7 sec/year.
Use these to compute the number of neutrinos that interact in AMANDA,
or in you, each year.
Keep in mind these are all order of magnitude estimates.
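Here is a minimal sketch of that exercise in Python. It only multiplies the rates listed above by an area and by the number of seconds in a year. The function and variable names are made up for this illustration, and the results are order-of-magnitude estimates at best.

```python
# Order-of-magnitude estimate: interactions per year = rate x area x time.
# Rates (interactions per sq. cm per sec) are the values listed above.
RATES_PER_CM2_PER_SEC = {
    "solar": 1e-10,
    "1 GeV": 1e-12,
    "1 TeV": 1e-12,
    "1 PeV": 1e-15,
}

AMANDA_AREA_CM2 = 1e10   # about 1 square kilometer
PERSON_AREA_CM2 = 1e4    # about 1 square meter
SECONDS_PER_YEAR = 3e7


def interactions_per_year(rate_per_cm2_per_sec, area_cm2):
    """Expected interactions in one year for a target of the given area."""
    return rate_per_cm2_per_sec * area_cm2 * SECONDS_PER_YEAR


for label, rate in RATES_PER_CM2_PER_SEC.items():
    in_amanda = interactions_per_year(rate, AMANDA_AREA_CM2)
    in_person = interactions_per_year(rate, PERSON_AREA_CM2)
    print(f"{label:>6}: AMANDA ~ {in_amanda:.0e}/year, person ~ {in_person:.0e}/year")
```

With these inputs, the solar row works out to about 3 x 10^7 interactions per year for AMANDA and about 30 per year for a person.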
Addition and Multiplication of Probabilities
EXAMPLE 1a: "What is the probability that I will die on a Tuesday in September?"
Assume:
-
All seven days are equally likely
-
All days of the year are equally likely
-
The day of the week and the day of the year are independent.
Consider an ensemble of universes.
Answer:
Probability (Tues. AND Sept.)
=
Probability (Tues.) X Probability (Sept.)
...and in general, for independent events A and B:
Probability (A AND B)
=
Probability (A) X Probability (B)
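To put a number on Example 1a, here is a small sketch using exact fractions. It assumes a 365-day year (no leap years) and the three assumptions listed above. The variable names are just for illustration.

```python
from fractions import Fraction

# Assumption: every weekday is equally likely.
p_tuesday = Fraction(1, 7)
# Assumption: every day of a 365-day year is equally likely; September has 30 days.
p_september = Fraction(30, 365)

# Independent events: multiply the probabilities.
p_tuesday_and_september = p_tuesday * p_september

print(p_tuesday_and_september)          # exact fraction
print(float(p_tuesday_and_september))   # decimal approximation
```

The exact answer is 6/511, a bit over 1%.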
EXAMPLE 1b: "What is the probability that I will die on a Tuesday or a
Wednesday?"
Make the same assumptions (all days of the week equally likely) and again
consider the ensemble of universes.
Answer:
Probability (Tues. OR Weds.)
=
Probability (Tues.) + Probability (Weds.)
...and in general, for events A and B that cannot both happen:
Probability (A OR B)
=
Probability (A) + Probability (B)
EXAMPLE 1c: "What is the probability that I will die on a Tuesday or in
September?"
This one's a little trickier. It's an "OR" question, but the probabilities
do NOT just add up. Why? Look again at the ensemble of universes.
Some of the universes (the Tuesdays in September) would be counted twice!
Instead, compute the total amount of "white space", the universes that
satisfy neither criterion, and subtract this fraction from one.
This time:
Probability (Tues. OR Sept.)
=
1 - { Probability(not Tues) x Probability(not Sept) }
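As a check on Example 1c, the sketch below compares three numbers: the naive sum, the "white space" formula above, and an equivalent form that subtracts the double-counted overlap. That last form is not on the slide; it is the standard rule P(A) + P(B) - P(A AND B). A 365-day year and independence are assumed.

```python
from fractions import Fraction

p_tue = Fraction(1, 7)       # assumption: weekdays equally likely
p_sep = Fraction(30, 365)    # assumption: 365-day year, 30 days in September

# Naive sum: counts the Tuesdays in September twice.
naive_sum = p_tue + p_sep

# "White space" method: one minus the probability of neither.
via_complement = 1 - (1 - p_tue) * (1 - p_sep)

# Equivalent form: add, then subtract the overlap once.
via_overlap = p_tue + p_sep - p_tue * p_sep

print("naive sum:     ", naive_sum)
print("via complement:", via_complement)
print("via overlap:   ", via_overlap)
```

The two correct methods agree at 109/511, and the naive sum is too large by exactly P(Tues. AND Sept.).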
EXAMPLE 2: "What are the odds that my high school sweetheart becomes a
telemarketer and calls me?"
We need to know:
-
What are the odds that he becomes a telemarketer? (P_tele)
-
Of all the area codes he could have been assigned to, what are the odds
that he gets assigned to mine? (P_area)
-
Probability that my home phone number is listed? (P_homelisted)
-
Probability that I am home at the time he calls? (P_home)
-
Probability that my work phone number is listed? (P_worklisted)
-
Probability that I am at work at the time he calls? (P_work)
-
Probability that I even HAVE a high school sweetheart? (P_sweet)
Then:
P = (P_tele) (P_area) {(P_homelisted) (P_home) + (P_worklisted) (P_work)}(P_sweet)
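The same formula written as a function might look like the sketch below. Every input value in the example call is a made-up placeholder, chosen only to show how the AND terms multiply and the two OR terms (home, work) add. None of them is a measured number.

```python
def p_sweetheart_calls(p_tele, p_area, p_homelisted, p_home,
                       p_worklisted, p_work, p_sweet):
    """Probability from Example 2: product of the ANDs, sum of the two ORs."""
    # Reached at home OR at work: treated as mutually exclusive, so they add.
    p_reached = p_homelisted * p_home + p_worklisted * p_work
    # Everything else must happen together, so multiply.
    return p_tele * p_area * p_reached * p_sweet


# Hypothetical placeholder inputs, for illustration only.
example = p_sweetheart_calls(
    p_tele=0.01, p_area=0.004,
    p_homelisted=0.5, p_home=0.5,
    p_worklisted=0.1, p_work=0.3,
    p_sweet=0.5,
)
print(example)
```

Note that setting any one of the multiplied factors to zero sends the whole probability to zero, while losing only the home term or only the work term does not.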
EXAMPLE 3: "What are the odds that there exists intelligent life elsewhere
in our galaxy capable of communicating with us?"
We need to know:
-
Rate of star formation in the galaxy (R*)
-
Probability that a star has planets (fp)
-
Number of Earth-like planets per star that has planets (Ne)
-
Probability that life will originate on an Earth-like planet (fl)
-
Probability that a lifeform will evolve intelligence (fi)
-
Probability that an intelligent civilization will develop technology for
interstellar communication and use it (fc)
-
Lifetime of the average technological civilization (L)
Then:
Ncivilizations = R* fp Ne fl fi fc L
The Drake Equation:
Ncivilizations = R* fp Ne fl fi fc L
Written down in 1961 by Frank Drake, radio astronomer who conducted the
first radio search for extraterrestrial life (Project Ozma) at NRAO.
It has been used ever since. We can estimate the first three numbers
reasonably well, but can only guess at the others!
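A quick sketch shows how the equation behaves when you plug numbers in. The inputs below are arbitrary placeholders, not estimates from this talk or from any reference; the point is only that the answer scales directly with each guess.

```python
def drake_equation(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """N_civilizations = R* fp Ne fl fi fc L, a straight product of the factors."""
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime


# Arbitrary placeholder inputs, for illustration only.
guess = dict(r_star=1.0, f_p=0.5, n_e=2.0, f_l=1.0, f_i=0.1, f_c=0.1, lifetime=1000.0)

n_civilizations = drake_equation(**guess)
print(n_civilizations)

# Change one guess by a factor of 10 and the answer changes by a factor of 10.
longer_lived = drake_equation(**{**guess, "lifetime": 10000.0})
print(longer_lived / n_civilizations)
```

Because the factors simply multiply, the uncertainty in the result is dominated by whichever factors are the wildest guesses.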
EXAMPLE 4: The "Birthday Problem":
In a room of N people, what is the probability that two have the same birthday?
How big does N have to be before the probability passes 50/50?
Solution:
It's actually easier to do the inverse problem... What is the probability
that no two people have the same birthday?...
Then the answer will be:
P(somebody shares a birthday) =
1 - P(nobody shares)
No two people share a birthday:
Person 2 does not share a birthday with Person 1
AND
Person 3 does not share a birthday with Person 1 OR Person 2,
AND
Person 4 does not share a birthday with Person 1 OR Person 2 OR Person
3,
AND
.....
Person N does not share a birthday with Person 1 OR Person 2 OR Person
3 OR.... OR Person N-1.
Notice that this is just a big combination of AND (multiplying) and
OR (adding)!
Let's compute each one:
In general: P (x shares with y) = P(x,y) = 1/365.
I'll call P (x does not share with y) = P(not x,y) = 1 - P(x,y)
P (not 2, 1) = 1 - 1/365
= 364/365
P (not 3, 1 or 2) = 1 - P(3,1 or 2)
= 1 - ( P(3,1) + P(3,2) )
= 1 - 1/365 - 1/365
= 363/365
P (not 4, 1 or 2 or 3)
= 1 - P(4,1 or 2 or 3)
= 1 - ( P(4,1) + P(4,2) + P(4,3) )
= 1 - 1/365 - 1/365 - 1/365
= 362/365
....
Following this pattern,
P (not N, anybody else) = (365 - N + 1)/365
and so...
P (no two people share) =
(364/365) x
(363/365) x
(362/365) x
.....
(365 - N + 2)/365 x
(365 - N + 1)/365
  364 x 363 x 362 x ... x (365 - N + 1)
= ---------------------------------------
  (365)^(N-1)
P (somebody does share) = 1 - this.
from http://www.mste.uiuc.edu/reese/birthday/intro.html
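The product above is easy to evaluate in a loop. The sketch below assumes 365 equally likely birthdays (no leap days, no seasonal effects) and uses function names invented for this illustration. It also answers the question of how big N must be to pass 50/50.

```python
def p_no_shared_birthday(n, days=365):
    """Probability that n people all have different birthdays."""
    p = 1.0
    # Person k+1 must avoid the k birthdays already taken: (days - k)/days.
    for k in range(1, n):
        p *= (days - k) / days
    return p


def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    return 1.0 - p_no_shared_birthday(n, days)


def smallest_group_over(threshold=0.5, days=365):
    """Smallest n for which the shared-birthday probability exceeds threshold."""
    n = 1
    while p_shared_birthday(n, days) <= threshold:
        n += 1
    return n


for n in (10, 22, 23, 50):
    print(n, round(p_shared_birthday(n), 4))

print(smallest_group_over())  # 23
```

Under these assumptions the probability first passes 50% at N = 23, where it is about 0.507.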