contents/probability/distributions/distributions.md
20 additions & 20 deletions
@@ -42,11 +42,11 @@ Both of the above examples are rather boring, because the value of $$P(n)$$ is t
 An example of a discrete probability function where the probability actually depends on $$n$$, is when $$n$$ is the sum of numbers on a __roll of two dice__.
 In this case, $$P(n)$$ is different for each $$n$$ as some possibilities like $$n=2$$ can happen in only one possible way (by getting a 1 on both dice), whereas $$n=4$$ can happen in 3 ways (1 and 3; or 2 and 2; or 3 and 1).

-The rolling two dice is a great case study for how we can construct a probability distribution, since the probability varies and it is not immediately obvious how it varies.
+The example of rolling two dice is a great case study for how we can construct a probability distribution, since the probability varies and it is not immediately obvious how it varies.
 So let's go ahead and construct it!

 Let's first define the domain of our target $$P(n)$$.
-We know that the lowest sum of two dice is 2 (a 1 on both dice), so $$n \geq 2$$ for sure. Similarly, the maximum is sum of two sixes, or 12, so $$n \leq 12$$ also.
+We know that the lowest sum of two dice is 2 (a 1 on both dice), so $$n \geq 2$$ for sure. Similarly, the maximum is the sum of two sixes, or 12, so $$n \leq 12$$ also.

 So now we know the domain of possibilities, i.e., $$n \in [2..12]$$.
 Next, we take a very common approach - for each outcome $$n$$, we count up the number of different ways it can occur.
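The counting step described in this hunk can be sketched in a few lines of Python. This is only an illustration, not code from the changed file: the names `faces` and `frequency` are made up here, and the dice are assumed to be fair and six-sided.

```python
from collections import Counter
from itertools import product

# Faces of one six-sided die (assumed fair).
faces = range(1, 7)

# f(n): number of ordered (first die, second die) pairs whose faces sum to n.
frequency = Counter(a + b for a, b in product(faces, repeat=2))

# Print each possible sum next to the number of ways it can occur.
for n in sorted(frequency):
    print(n, frequency[n])
```

Enumerating ordered pairs counts "1 and 3" and "3 and 1" separately, which matches the way the text counts three ways of getting $$n=4$$.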
@@ -72,7 +72,7 @@ But we can get the probability by dividing $$f(n)$$ by the _total_ number of pos
 For two dice, that is $$N = 6 \times 6 = 36$$, but we could also express it as the _sum of all frequencies_,

 $$
-N = \sum_n f(n)
+N = \sum_n f(n),
 $$

 which would also equal 36 in this case.
@@ -81,14 +81,14 @@ This process is called __normalization__ and is crucial for determining almost a
 So in general, if we have the function $$f(n)$$, we can get the probability as

 $$
-P(n) = \frac{f(n)}{\displaystyle\sum_{n} f(n)}
+P(n) = \frac{f(n)}{\displaystyle\sum_{n} f(n)}.
 $$

-Note that $$f(n)$$ does not necessarily have to be the frequency of $$n$$- it could be any function which is _proportional_ to $$P(n)$$, and the above definition of $$P(n)$$ would still hold.
-And it's easy to check that the sum is now equal to 1, since
+Note that $$f(n)$$ does not necessarily have to be the frequency of $$n$$– it could be any function which is _proportional_ to $$P(n)$$, and the above definition of $$P(n)$$ would still hold.
+It's easy to check that the sum is now equal to 1, since
 Once we have the probability function $$P(n)$$, we can calculate all sorts of probabilities.
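As a rough sketch of normalization for the two-dice case, the snippet below divides each frequency by the sum of all frequencies and then adds up $$P(n)$$ over a range of outcomes. The names `frequency`, `total`, `probability`, and `prob_between` are hypothetical, and exact fractions are used only so that the checks avoid floating-point rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# f(n): number of ways two fair six-sided dice can sum to n.
frequency = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# N: the sum of all frequencies.
total = sum(frequency.values())

# P(n) = f(n) / N, kept as exact fractions.
probability = {n: Fraction(count, total) for n, count in frequency.items()}


def prob_between(a, b):
    """Sum P(n) over all outcomes n with a <= n <= b."""
    return sum(p for n, p in probability.items() if a <= n <= b)


print(total)                      # number of equally likely ordered rolls
print(sum(probability.values()))  # normalization check
print(prob_between(2, 4))         # probability that the sum is 2, 3, or 4
```

Because $$f(n)$$ only needs to be proportional to $$P(n)$$, multiplying every frequency by the same constant would leave `probability` unchanged.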
@@ -97,13 +97,13 @@ For brevity, we will use the notation $$\mathbb{P}(a \leq n \leq b)$$ to denote
 And to calculate it, we simply have to sum up all the probabilities for each value of $$n$$ in that range, i.e.,

 $$
-\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n)
+\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n).
 $$

 ## Probability Density Functions

 What if instead of a discrete variable $$n$$, we had a continuous variable $$x$$, like temperature or weight?
-In that case, it doesn't make sense to ask what the probability is of $$x$$ being _exactly_ a particular number - there are infinite possible real numbers, after all, so the probability of $$x$$ being exactly any one of them is essentially zero!
+In that case, it doesn't make sense to ask what the probability is of $$x$$ being _exactly_ a particular number – there are infinite possible real numbers, after all, so the probability of $$x$$ being exactly any one of them is essentially zero!
 But it _does_ make sense to ask what the probability is that $$x$$ will be _between_ a certain range of values.
 For example, one might say that there is 50% chance that the temperature tomorrow at noon will be between 5 and 15, or 5% chance that it will be between 16 and 16.5.
 But how do we put all that information, for every possible range, in a single function?
@@ -125,7 +125,7 @@ This is the defining feature of a probability density function:
 So if $$dx$$ is infinitesimally small, then the area of the green sliver becomes $$P(x)dx$$, and hence,

 $$
-\mathbb{P}(x_0 \leq x \leq x_0 + dx) = P(x)dx
+\mathbb{P}(x_0 \leq x \leq x_0 + dx) = P(x)dx.
 $$

 So strictly speaking, $$P(x)$$ itself is NOT a probability, but rather the probability is the quantity $$P(x)dx$$, or any area under the curve.
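One way to see that a density value is not itself a probability is a case where the density exceeds 1. The sketch below assumes a uniform density on $$[0, 0.5]$$, an example chosen here for illustration and not taken from the changed file; `density`, `x0`, `dx`, and `sliver` are made-up names.

```python
def density(x, width=0.5):
    """Uniform density on [0, width]: the constant 1/width inside, 0 outside."""
    return 1.0 / width if 0.0 <= x <= width else 0.0


x0 = 0.25  # a point inside the domain
dx = 1e-3  # a small, but not infinitesimal, interval width

# Approximate probability of x0 <= x <= x0 + dx: density times width.
sliver = density(x0) * dx

print(density(x0))  # a density value larger than 1
print(sliver)       # the corresponding probability, well below 1
```

The density is 2 everywhere on the domain, which would be impossible for a probability, while the area of any sliver stays between 0 and 1.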
@@ -134,19 +134,19 @@ That is why we call $$P(x)$$ the probability _density_ at $$x$$, while the actua
 Thus, to obtain the probability of $$x$$ lying within a range, we simply integrate $$P(x)$$ between that range, i.e.,

 $$
-\mathbb{P}(a \leq x \leq b ) = \int_a^b P(x)dx
+\mathbb{P}(a \leq x \leq b ) = \int_a^b P(x)dx.
 $$

 This is analogous to finding the probability of a range of discrete values from the previous section:

 $$
-\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n)
+\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n).
 $$

-And the fact that all probabilities must sum to 1 translates to
+The fact that all probabilities must sum to 1 translates to

 $$
-\int_D P(x)dx = 1
+\int_D P(x)dx = 1.
 $$

 where $$D$$ denotes the __domain__ of $$P(x)$$, i.e., the entire range of possible values of $$x$$ for which $$P(x)$$ is defined.
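The two integrals in this hunk can be approximated numerically. The sketch below assumes an example density $$P(x) = 2x$$ on the domain $$D = [0, 1]$$, which does not come from the changed file, and uses a plain midpoint Riemann sum as a stand-in for a proper integration routine; `density` and `integrate` are hypothetical helpers.

```python
def density(x):
    """Example density P(x) = 2x on the domain D = [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(func, a, b, steps=10_000):
    """Midpoint Riemann sum approximating the integral of func over [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx


# Probability that x lies in [0.5, 1]: the area under the curve on that range.
prob_upper_half = integrate(density, 0.5, 1.0)

# Integrating over the whole domain should give a value very close to 1.
total = integrate(density, 0.0, 1.0)

print(prob_upper_half)
print(total)
```

For this density the exact answers are $$1 - 0.5^2 = 0.75$$ and $$1$$, so the printed values should agree with them up to floating-point error.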
@@ -157,7 +157,7 @@ Just like in the discrete case, we often first calculate some density or frequen
 We can get the probability density function by normalizing it in a similar way, except that we integrate instead of sum:
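As a minimal sketch of this continuous normalization, the snippet below divides an unnormalized function by its integral over the domain. The function $$f(x) = x(1 - x)$$ on $$[0, 1]$$ is a hypothetical choice made for this example, and `integrate` is a simple midpoint rule, not a production integrator.

```python
def integrate(func, a, b, steps=10_000):
    """Midpoint Riemann sum approximating the integral of func over [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx


def unnormalized(x):
    """Hypothetical unnormalized density f(x) = x(1 - x) on [0, 1]."""
    return x * (1.0 - x)


# Normalization constant: the integral of f over the whole domain.
norm = integrate(unnormalized, 0.0, 1.0)


def density(x):
    """Normalized density P(x) = f(x) / (integral of f over the domain)."""
    return unnormalized(x) / norm


print(norm)                          # close to the exact value 1/6
print(integrate(density, 0.0, 1.0))  # close to 1 after normalization
```

The structure mirrors the discrete case: the sum over all frequencies is replaced by an integral over the domain, and everything else stays the same.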