
Commit 05e9758

shudipto-amin, kazi-shudipto-amin, leios, and Amaras authored
New chapter "Metropolis" with Python Implementation (#929)
* Add new chapter: metropolis_hastings; in python
* Add markdown and python files for metropolis
* Add image for metropolis
* Update .gitignore, SUMMARY.md, and metropolis
* Untrack .ipynb_checkpoints
* Untrack ipynb_checkpoints
* Fix algorithm steps list
* Really fix markdown list
* Add plot of P to chapter
* Add metropolis animation and update plot of P(x)
* Add minor update to chapter text
* Generate gif and mp4 for random walk
* Complete first draft!
* Final version before Pull Request
* Add metropolis citation
* Remove unnecessary lines from code and bib file
* Fix display of code in md
* Fix random walk capitalization.
* Apply Amaras' suggestions from code review. This commit makes minor edits which resolve conflicts with PEP8 and improve readability of code. It also makes the chapter a sub-chapter of Monte Carlo. (Co-authored-by: James Schloss <jrs.schloss@gmail.com>; Sammy Plat <amaras@vivaldi.net>)
* Fix the code import lines in md file.
* Change to in metropolis.py
* Add probability section to metropolis.md. However, this is temporary, as this section needs to go into a separate chapter.
* Move Probability section in metropolis to own chapter.
* Fix SUMMARY.md spelling mistake from previous commit
* Add figures for probability distribution chapter
* Update image of normal distribution
* Finish first draft of probability chapter
* Minor edits to distributions.md.
* "Minor changes to distributions.md"
* Complete the Example/Application section of metropolis.
* Address most issues in review of PR#929
* Application sections created.
* g() definition moved up to a better place
* More discussion on g.
* Missing citations provided
* Clarity provided (hopefully)
* Add image of 1D_particles
* Update citations in md file
* Add testing function to metropolis python code. Also modifies markdown to describe test. Also adds citation for RMSD.
* Numpyfy metropolis f function and fix errors.
* Implement generator to iterate.
* Reformat output of test and nrmsd error reporting.
* Add description of video and fix code display.
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Put sentences in `metropolis.md` on separate lines.
* Put sentences in distributions.md chapter on separate lines.
* Addresses issues raised by Leios' 1st review of probability chapter.
* Adds intro to beginning.
* Fixes plurality of die.
* Clarifies some notation in discrete section.
* Removes explanation of calculus.
* Other minor edits.
* Add minor edits.
* Update contents/metropolis/metropolis.md title (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Simplify intro to contents/metropolis/metropolis.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Minor formatting to contents/metropolis/metropolis.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Fix spelling contents/metropolis/metropolis.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Simplify line 71 of contents/metropolis/metropolis.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update contents/metropolis/metropolis.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Update example application in metropolis according to Leios comments.
* Minor edit: contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Minor edit: contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Minor edit: contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Minor edit: contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Minor edit: contents/probability/distributions/distributions.md (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Apply minor suggestions from Leios' code review. Mostly typos, spelling, punctuation, and grammar. (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Add minor edits from Leios' code review
* Add minor edit
* Apply minor edit suggestions from Leios. Mostly punctuation and grammar. (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Add minor formatting, such as periods after equations.
* Fix integer interval notation using hack.
* Add minor edits missed previously in probability chapter.
* Add minor edits to metropolis, mostly punctuation.
* Apply minor edits from Leios. Mostly grammar and sentence structure. (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Add intro line to Algorithm section of metropolis.
* Add name to contributor.md
* Apply suggestions from Leios. Minor edits for grammar and clarity. Applies to both metropolis and probability chapters. (Co-authored-by: James Schloss <jrs.schloss@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Shudipto <32040453+shudipto-amin@users.noreply.github.com>; Kazi Shudipto Amin <kazi.amin@ucalgary.ca>; James Schloss <jrs.schloss@gmail.com>; Sammy Plat <amaras@vivaldi.net>)
1 parent ffaa890 commit 05e9758

16 files changed: +692 −1 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion

@@ -478,7 +478,7 @@ paket-files/
 # Python Tools for Visual Studio (PTVS)
 __pycache__/
 *.pyc
-
+*.ipynb_checkpoints*
 # Cake - Uncomment if you are using it
 # tools/**
 # !tools/packages.config

CONTRIBUTORS.md

Lines changed: 1 addition & 0 deletions

@@ -61,4 +61,5 @@ This file lists everyone, who contributed to this repo and wanted to show up her
 - Hugo Salou
 - Dimitri Belopopsky
 - Henrik Abel Christensen
+- K. Shudipto Amin
 - Peanutbutter_Warrior

SUMMARY.md

Lines changed: 2 additions & 0 deletions

@@ -20,9 +20,11 @@
 * [Multiplication as a Convolution](contents/convolutions/multiplication/multiplication.md)
 * [Convolutions of Images (2D)](contents/convolutions/2d/2d.md)
 * [Convolutional Theorem](contents/convolutions/convolutional_theorem/convolutional_theorem.md)
+* [Probability Distributions](contents/probability/distributions/distributions.md)
 * [Tree Traversal](contents/tree_traversal/tree_traversal.md)
 * [Euclidean Algorithm](contents/euclidean_algorithm/euclidean_algorithm.md)
 * [Monte Carlo](contents/monte_carlo_integration/monte_carlo_integration.md)
+* [Metropolis](contents/metropolis/metropolis.md)
 * [Matrix Methods](contents/matrix_methods/matrix_methods.md)
 * [Gaussian Elimination](contents/gaussian_elimination/gaussian_elimination.md)
 * [Thomas Algorithm](contents/thomas_algorithm/thomas_algorithm.md)
metropolis.py

Lines changed: 90 additions & 0 deletions

@@ -0,0 +1,90 @@
import numpy as np


def f(x, normalize=False):
    '''
    Function proportional to target distribution, a sum of Gaussians.
    For testing, set normalize to True to get the target distribution exactly.
    '''
    # Gaussian heights, width parameters, and mean positions respectively:
    a = np.array([10., 3., 1.]).reshape(3, 1)
    b = np.array([4., 0.2, 2.]).reshape(3, 1)
    xs = np.array([-4., -1., 5.]).reshape(3, 1)

    if normalize:
        norm = (np.sqrt(np.pi) * (a / np.sqrt(b))).sum()
        a /= norm

    return (a * np.exp(-b * (x - xs)**2)).sum(axis=0)


def g():
    '''Random step vector.'''
    return np.random.uniform(-1, 1)


def metropolis_step(x, f=f, g=g):
    '''Perform one full iteration and return the new position.'''

    x_proposed = x + g()
    a = min(1, (f(x_proposed) / f(x)).item())

    x_new = np.random.choice([x_proposed, x], p=[a, 1 - a])

    return x_new


def metropolis_iterate(x0, num_steps):
    '''Iterate the Metropolis algorithm for num_steps using initial position x0.'''

    for n in range(num_steps):
        if n == 0:
            x = x0
        else:
            x = metropolis_step(x)
        yield x


def test_metropolis_iterate(num_steps, xmin, xmax, x0):
    '''
    Calculate the error in the normalized density histogram of data
    generated by metropolis_iterate(), using the
    normalized root-mean-square deviation (NRMSD) metric.
    '''

    bin_width = 0.25
    bins = np.arange(xmin, xmax + bin_width/2, bin_width)
    centers = np.arange(xmin + bin_width/2, xmax, bin_width)

    true_values = f(centers, normalize=True)
    mean_value = np.mean(true_values - min(true_values))

    x_dat = list(metropolis_iterate(x0, num_steps))
    heights, _ = np.histogram(x_dat, bins=bins, density=True)

    nmsd = np.average((heights - true_values)**2 / mean_value)
    nrmsd = np.sqrt(nmsd)

    return nrmsd


if __name__ == "__main__":
    xmin, xmax = -10, 10
    x0 = np.random.uniform(xmin, xmax)

    num_steps = 50_000

    x_dat = list(metropolis_iterate(x0, num_steps))

    # Write data to file
    output_string = "\n".join(str(x) for x in x_dat)

    with open("output.dat", "w") as out:
        out.write(output_string)
        out.write("\n")

    # Testing
    print(f"Testing with x0 = {x0:5.2f}")
    print(f"{'num_steps':>10s} {'NRMSD':10s}")
    for num_steps in (500, 5_000, 50_000):
        nrmsd = test_metropolis_iterate(num_steps, xmin, xmax, x0)
        print(f"{num_steps:10d} {nrmsd:5.1%}")
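The accept/reject rule used in `metropolis_step` above can also be exercised as a stand-alone sketch on a simpler, single-Gaussian target. The names `f_simple` and `metropolis_demo` below are illustrative, not part of the chapter's code, and the explicit accept/reject branch replaces the chapter's `np.random.choice` call:

```python
import numpy as np

def f_simple(x):
    """Unnormalized target density: a single Gaussian, exp(-x^2)."""
    return np.exp(-x**2)

def metropolis_demo(x0, num_steps, seed=0):
    """Minimal Metropolis sampler with uniform(-1, 1) proposal steps."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = []
    for _ in range(num_steps):
        x_proposed = x + rng.uniform(-1, 1)
        # Accept the proposal with probability min(1, f(x') / f(x)).
        if rng.uniform(0, 1) < min(1.0, f_simple(x_proposed) / f_simple(x)):
            x = x_proposed
        samples.append(x)
    return np.array(samples)

samples = metropolis_demo(x0=0.0, num_steps=20_000)
# exp(-x^2) is a normal density with variance 1/2, so the sample
# mean and variance should be close to 0 and 0.5 respectively.
print(samples.mean(), samples.var())
```

Because the target here is a single normalized-up-to-a-constant Gaussian, the sample statistics give a quick sanity check that the sampler is drawing from the right distribution.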

contents/metropolis/metropolis.md

Lines changed: 283 additions & 0 deletions
Large diffs are not rendered by default.

contents/metropolis/res/plot_of_P.png

154 KB
contents/probability/distributions/distributions.md

Lines changed: 214 additions & 0 deletions

@@ -0,0 +1,214 @@
# What's a probability distribution?

Probability distributions are mathematical functions that give the probabilities of a range or set of outcomes.
These outcomes can be the result of an experiment or procedure, such as tossing a coin or rolling dice.
They can also be the result of a physical measurement, such as measuring the temperature of an object or counting how many electrons are spin up.
Broadly speaking, we can classify probability distributions into two categories &ndash; __discrete probability distributions__ and __continuous probability distributions__.

## Discrete Probability Distributions

It's intuitive for us to understand what a __discrete__ probability distribution is.
For example, we understand the outcomes of a coin toss very well, and also those of a dice roll.
For a single coin toss, we know that the probability of getting heads $$(H)$$ is half, or $$P(H) = \frac{1}{2}$$.
Similarly, the probability of getting tails $$(T)$$ is $$P(T) = \frac{1}{2}$$.
Formally, we can write the probability distribution for such a coin toss as,

$$
P(n) = \begin{matrix}
\displaystyle \frac 1 2 &;& n \in \left\{H,T\right\}.
\end{matrix}
$$

Here, $$n$$ denotes the outcome, and we used the "set notation", $$n \in\left\{H,T\right\}$$, which means "$$n$$ belongs to a set containing $$H$$ and $$T$$".
From the above equation, we can also assume that any other outcome for $$n$$ (such as landing on an edge) is incredibly unlikely, impossible, or simply "not allowed" (for example, just toss again if it _does_ land on its edge!).

For a probability distribution, it's important to take note of the set of possibilities, or the __domain__ of the distribution.
Here, $$\left\{H,T\right\}$$ is the domain of $$P(n)$$, telling us that $$n$$ can only be either $$H$$ or $$T$$.

If we use a different system, the outcome $$n$$ could mean other things.
For example, it could be a number like the outcome of a __die roll__, which has the probability distribution,

$$
P(n) = \begin{matrix}
\displaystyle\frac 1 6 &;& n \in [\![1,6]\!]
\end{matrix}.
$$

This says that the probability of $$n$$ being a whole number between $$1$$ and $$6$$ is $$1/6$$, and we assume that the probability of getting any other $$n$$ is $$0$$.
This is a discrete probability function because $$n$$ is an integer, and thus only takes discrete values.

Both of the above examples are rather boring, because the value of $$P(n)$$ is the same for all $$n$$.
An example of a discrete probability function where the probability actually depends on $$n$$ is when $$n$$ is the sum of numbers on a __roll of two dice__.
In this case, $$P(n)$$ is different for each $$n$$, as some outcomes like $$n=2$$ can happen in only one way (by getting a $$1$$ on both dice), whereas $$n=4$$ can happen in $$3$$ ways ($$1$$ and $$3$$; or $$2$$ and $$2$$; or $$3$$ and $$1$$).

The example of rolling two dice is a great case study for how we can construct a probability distribution, since the probability varies and it is not immediately obvious how it varies.
So let's go ahead and construct it!

Let's first define the domain of our target $$P(n)$$.
We know that the lowest sum of two dice is $$2$$ (a $$1$$ on both dice), so $$n \geq 2$$ for sure.
Similarly, the maximum is the sum of two sixes, or $$12$$, so $$n \leq 12$$ also.

So now we know the domain of possibilities, i.e., $$n \in [\![2,12]\!]$$.
Next, we take a very common approach &ndash; for each outcome $$n$$, we count up the number of different ways it can occur.
Let's call this number the __frequency of__ $$n$$, $$f(n)$$.
We already mentioned that there is only one way to get $$n=2$$, by getting a pair of $$1$$s.
By our definition of the function $$f$$, this means that $$f(2)=1$$.
For $$n=3$$, we see that there are two possible ways of getting this outcome: the first die shows a $$1$$ and the second a $$2$$, or the first die shows a $$2$$ and the second a $$1$$.
Thus, $$f(3)=2$$.
If you continue doing this for all $$n$$, you may see a pattern (homework for the reader!).
Once you have all the $$f(n)$$, we can visualize them by plotting $$f(n)$$ vs. $$n$$, as shown below.

<p>
<img class="center" src="res/double_die_frequencies.png" alt="<FIG> Die Roll" style="width:80%"/>
</p>

We can see from the plot that the most common outcome for the sum of two dice is $$n=7$$, and the further away from $$n=7$$ you get, the less likely the outcome.
Good to know, for a prospective gambler!

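The counting we just did by hand can be checked with a short brute-force enumeration. This is only a sketch; the `Counter` named `f` plays the role of the frequency function $$f(n)$$ from the text:

```python
from collections import Counter

# Enumerate all 36 outcomes of two dice and count each sum n.
f = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

for n in range(2, 13):
    print(n, f[n])   # f(2) = 1, f(3) = 2, ..., f(7) = 6, ..., f(12) = 1
```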
### Normalization

The $$f(n)$$ plotted above is technically NOT the probability $$P(n)$$ &ndash; we know that the sum of all probabilities should be $$1$$, which clearly isn't the case for $$f(n)$$.
But we can get the probability by dividing $$f(n)$$ by the _total_ number of possibilities, $$N$$.
For two dice, that is $$N = 6 \times 6 = 36$$, but we could also express it as the _sum of all frequencies_,

$$
N = \sum_n f(n),
$$

which also equals $$36$$ in this case.
So, by dividing $$f(n)$$ by $$\sum_n f(n)$$ we get our target probability distribution, $$P(n)$$.
This process is called __normalization__ and is crucial for determining almost any probability distribution.
So in general, if we have a function $$f(n)$$, we can get the probability as

$$
P(n) = \frac{f(n)}{\displaystyle\sum_{n} f(n)}.
$$

Note that $$f(n)$$ does not necessarily have to be the frequency of $$n$$ &ndash; it could be any function which is _proportional_ to $$P(n)$$, and the above definition of $$P(n)$$ would still hold.
It's easy to check that the sum is now equal to $$1$$, since

$$
\sum_n P(n) = \frac{\displaystyle\sum_{n}f(n)}{\displaystyle\sum_{n} f(n)} = 1.
$$

Once we have the probability function $$P(n)$$, we can calculate all sorts of probabilities.
For example, let's say we want to find the probability that $$n$$ will be between two integers $$a$$ and $$b$$, inclusive (that is, including both $$a$$ and $$b$$).
For brevity, we will use the notation $$\mathbb{P}(a \leq n \leq b)$$ to denote this probability.
To calculate it, we simply sum up the probabilities for each value of $$n$$ in that range, i.e.,

$$
\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n).
$$

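Normalization and the range sum are easy to sketch in code for the two-dice example. This is illustrative only; `prob_between` is a hypothetical helper name, not part of the chapter:

```python
from collections import Counter

# Frequencies f(n) for the sum of two dice, found by enumeration.
f = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

# Normalize: P(n) = f(n) / sum_n f(n).
N = sum(f.values())                 # 36 for two dice
P = {n: f[n] / N for n in f}

def prob_between(a, b):
    """Probability that the sum n lies between a and b, inclusive."""
    return sum(P.get(n, 0) for n in range(a, b + 1))

print(sum(P.values()))      # 1.0: the probabilities are normalized
print(prob_between(2, 4))   # (1 + 2 + 3) / 36 = 1/6
```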
## Probability Density Functions

What if instead of a discrete variable $$n$$, we had a continuous variable $$x$$, like temperature or weight?
In that case, it doesn't make sense to ask what the probability is of $$x$$ being _exactly_ a particular number &ndash; there are infinitely many possible real numbers, after all, so the probability of $$x$$ being exactly any one of them is essentially zero!
But it _does_ make sense to ask what the probability is that $$x$$ will be _within_ a certain range of values.
For example, one might say that there is a $$50\%$$ chance that the temperature tomorrow at noon will be between $$5$$ and $$15$$, or a $$5\%$$ chance that it will be between $$16$$ and $$16.5$$.
But how do we put all that information, for every possible range, into a single function?
The answer is to use a __probability density function__.

What does that mean?
Well, suppose $$x$$ is a continuous quantity, and we have a probability density function, $$P(x)$$, which looks like

<p>
<img class="center" src="res/normal_distribution.png" alt="<FIG> probability density" style="width:100%"/>
</p>

Now, if we are interested in the probability of the range of values that lie between $$x_0$$ and $$x_0 + dx$$, all we have to do is calculate the _area_ of the green sliver above.
This is the defining feature of a probability density function:

__the probability of a range of values is the _area_ of the region under the probability density curve which is within that range.__

So if $$dx$$ is infinitesimally small, then the area of the green sliver becomes $$P(x)dx$$, and hence,

$$
\mathbb{P}(x_0 \leq x \leq x_0 + dx) = P(x)dx.
$$

So strictly speaking, $$P(x)$$ itself is NOT a probability, but rather the probability is the quantity $$P(x)dx$$, or any area under the curve.
That is why we call $$P(x)$$ the probability _density_ at $$x$$, while the actual probability is only defined for ranges of $$x$$.

Thus, to obtain the probability of $$x$$ lying within a range, we simply integrate $$P(x)$$ over that range, i.e.,

$$
\mathbb{P}(a \leq x \leq b ) = \int_a^b P(x)dx.
$$

This is analogous to finding the probability of a range of discrete values from the previous section:

$$
\mathbb{P}(a \leq n \leq b) = \sum_{n=a}^{b} P(n).
$$

The fact that all probabilities must sum to $$1$$ translates to

$$
\int_D P(x)dx = 1,
$$

where $$D$$ denotes the __domain__ of $$P(x)$$, i.e., the entire range of possible values of $$x$$ for which $$P(x)$$ is defined.

### Normalization of a Density Function

Just like in the discrete case, we often first calculate some density or frequency function $$f(x)$$, which is NOT $$P(x)$$ but proportional to it.
We can get the probability density function by normalizing it in a similar way, except that we integrate instead of sum:

$$
P(\mathbf{x}) = \frac{f(\mathbf{x})}{\int_D f(\mathbf{x})d\mathbf{x}}.
$$

For example, consider the following __Gaussian function__ (popularly used in __normal distributions__),

$$
f(x) = e^{-x^2},
$$

which is defined for all real numbers $$x$$.
We first integrate it (or do a quick Google search, as it is rather tricky) to get

$$
N = \int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}.
$$

Now we have a Gaussian probability distribution,

$$
P(x) = \frac{1}{N} e^{-x^2} = \frac{1}{\sqrt{\pi}} e^{-x^2}.
$$

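The normalization constant $$N = \sqrt{\pi}$$ is easy to verify numerically. The sketch below uses a plain Riemann sum on a fine grid; the $$\pm 10$$ limits are an arbitrary choice, far enough into the tails that the truncation error is negligible:

```python
import numpy as np

# Sample f(x) = exp(-x^2) on a fine grid.
x = np.linspace(-10, 10, 200_001)
dx = x[1] - x[0]
f = np.exp(-x**2)

N = f.sum() * dx            # Riemann-sum approximation of the integral
print(N, np.sqrt(np.pi))    # both are approximately 1.77245

P = f / N
print(P.sum() * dx)         # approximately 1: P is normalized
```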
In general, normalization allows us to create a probability distribution out of almost any function $$f(\mathbf{x})$$.
There are really only two rules that $$f(\mathbf{x})$$ must satisfy to be a candidate for a probability density distribution:
1. The integral of $$f(\mathbf{x})$$ over any subset of $$D$$ (denoted by $$S$$) has to be non-negative (it can be zero):
$$
\int_{S}f(\mathbf{x})d\mathbf{x} \geq 0.
$$
2. The following integral must be finite (and nonzero, otherwise the normalization would divide by zero):
$$
\int_{D} f(\mathbf{x})d\mathbf{x}.
$$

<script>
MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
</script>

## License

##### Images/Graphics

- The image "[Frequency distribution of a double die roll](res/double_die_frequencies.png)" was created by [K. Shudipto Amin](https://github.com/shudipto-amin) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

- The image "[Probability Density](res/normal_distribution.png)" was created by [K. Shudipto Amin](https://github.com/shudipto-amin) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

##### Text

The text of this chapter was written by [K. Shudipto Amin](https://github.com/shudipto-amin) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

[<p><img class="center" src="../cc/CC-BY-SA_icon.svg" /></p>](https://creativecommons.org/licenses/by-sa/4.0/)

##### Pull Requests

After initial licensing ([#560](https://github.com/algorithm-archivists/algorithm-archive/pull/560)), the following pull requests have modified the text or graphics of this chapter:
- none