Commit 536a511

reformat display math, add mystnb figure tags
1 parent a80c4a1 commit 536a511

1 file changed: +121 -24 lines changed

lectures/heavy_tails.md

Lines changed: 121 additions & 24 deletions
@@ -61,10 +61,10 @@ To explain this concept, let's look first at examples.
 The classic example is the [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution), which has density

 $$
-f(x) = \frac{1}{\sqrt{2\pi}\sigma}
-\exp\left( -\frac{(x-\mu)^2}{2 \sigma^2} \right)
-\qquad
-(-\infty < x < \infty)
+f(x) = \frac{1}{\sqrt{2\pi}\sigma}
+\exp\left( -\frac{(x-\mu)^2}{2 \sigma^2} \right)
+\qquad
+(-\infty < x < \infty)
 $$


@@ -78,6 +78,12 @@ We can see this when we plot the density and show a histogram of observations,
 as with the following code (which assumes $\mu=0$ and $\sigma=1$).

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram of observations
+    name: hist-obs
+---
 fig, ax = plt.subplots()
 X = norm.rvs(size=1_000_000)
 ax.hist(X, bins=40, alpha=0.4, label='histogram', density=True)
@@ -101,6 +107,12 @@ X.min(), X.max()
 Here's another view of draws from the same distribution:

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram of observations
+    name: hist-obs2
+---
 n = 2000
 fig, ax = plt.subplots()
 data = norm.rvs(size=n)
@@ -174,6 +186,12 @@ data = yf.download('AMZN', '2015-1-1', '2022-7-1')
 ```

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Daily Amazon returns
+    name: dailyreturns-amzn
+---
 s = data['Adj Close']
 r = s.pct_change()

@@ -194,7 +212,18 @@ Several of observations are quite extreme.
 We get a similar picture if we look at other assets, such as Bitcoin

 ```{code-cell} ipython3
-s = yf.download('BTC-USD', '2015-1-1', '2022-7-1')['Adj Close']
+:tags: [hide-output]
+data = yf.download('BTC-USD', '2015-1-1', '2022-7-1')
+```
+
+```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Daily Bitcoin returns
+    name: dailyreturns-btc
+---
+s = data['Adj Close']
 r = s.pct_change()

 fig, ax = plt.subplots()
@@ -211,6 +240,12 @@ The histogram also looks different to the histogram of the normal
 distribution:

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram (Normal vs Bitcoin returns)
+    name: hist-normal-btc
+---
 r = np.random.standard_t(df=5, size=1000)

 fig, ax = plt.subplots()
@@ -274,10 +309,6 @@ like
 We return to these points [below](https://intro.quantecon.org/heavy_tails.html#why-do-heavy-tails-matter).


-
-
-
-
 ## Visual comparisons
 In this section, we will introduce important concepts such as the Pareto distribution, Counter CDFs, and Power laws, which aid in recognizing heavy-tailed distributions.

@@ -300,6 +331,12 @@ distribution](https://en.wikipedia.org/wiki/Cauchy_distribution), which is
 heavy-tailed.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram of Cauchy distribution
+    name: hist-cauchy
+---
 n = 120
 np.random.seed(11)

@@ -353,6 +390,12 @@ The exponential distribution is a light-tailed distribution.
 Here are some draws from the exponential distribution.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram of Exponential distribution
+    name: hist-exponential
+---
 n = 120
 np.random.seed(11)

@@ -394,14 +437,22 @@ exponential random variable.

 In particular, if $X$ is exponentially distributed with rate parameter $\alpha$, then

-$$ Y = \bar x \exp(X) $$
+$$
+Y = \bar x \exp(X)
+$$

 is Pareto-distributed with minimum $\bar x$ and tail index $\alpha$.

 Here are some draws from the Pareto distribution with tail index $1$ and minimum
 $1$.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Histogram of Pareto distribution
+    name: hist-pareto
+---
 n = 120
 np.random.seed(11)

@@ -425,21 +476,27 @@ light and heavy tails is to look at the

 For a random variable $X$ with CDF $F$, the CCDF is the function

-$$ G(x) := 1 - F(x) = \mathbb P\{X > x\} $$
+$$
+G(x) := 1 - F(x) = \mathbb P\{X > x\}
+$$

 (Some authors call $G$ the "survival" function.)

 The CCDF shows how fast the upper tail goes to zero as $x \to \infty$.

 If $X$ is exponentially distributed with rate parameter $\alpha$, then the CCDF is

-$$ G_E(x) = \exp(- \alpha x) $$
+$$
+G_E(x) = \exp(- \alpha x)
+$$

 This function goes to zero relatively quickly as $x$ gets large.

 The standard Pareto distribution, where $\bar x = 1$, has CCDF

-$$ G_P(x) = x^{- \alpha} $$
+$$
+G_P(x) = x^{- \alpha}
+$$

 This function goes to zero as $x \to \infty$, but much slower than $G_E$.
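
Review note on the hunk above: the statement that $G_P$ goes to zero much more slowly than $G_E$ can be checked numerically. The snippet below is a minimal sketch and is not part of the commit. It assumes $\alpha = 1$ and a few evaluation points picked only for illustration, and the variable names are hypothetical.

```python
import numpy as np

alpha = 1.0                        # assumed rate / tail index, for illustration only
x = np.array([1.0, 10.0, 100.0])   # arbitrary evaluation points with x >= 1

G_E = np.exp(-alpha * x)           # exponential CCDF: exp(-alpha * x)
G_P = x ** (-alpha)                # standard Pareto CCDF (x_bar = 1): x^(-alpha)

# Print both tails side by side; the Pareto column shrinks far more slowly.
for xi, ge, gp in zip(x, G_E, G_P):
    print(f"x = {xi:6.1f}   G_E(x) = {ge:.3e}   G_P(x) = {gp:.3e}")
```

At these points the Pareto tail probability stays above the exponential one, and the gap widens as $x$ grows.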

@@ -505,13 +562,21 @@ The sample counterpart of the CCDF function is the **empirical CCDF**.

 Given a sample $x_1, \ldots, x_n$, the empirical CCDF is given by

-$$ \hat G(x) = \frac{1}{n} \sum_{i=1}^n \mathbb 1\{x_i > x\} $$
+$$
+\hat G(x) = \frac{1}{n} \sum_{i=1}^n \mathbb 1\{x_i > x\}
+$$

 Thus, $\hat G(x)$ shows the fraction of the sample that exceeds $x$.

 Here's a figure containing some empirical CCDFs from simulated data.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Empirical CCDFs
+    name: ccdf-empirics
+---
 def eccdf(x, data):
     "Simple empirical CCDF function."
     return np.mean(data > x)
@@ -690,7 +755,13 @@ def extract_wb(varlist=['NY.GDP.MKTP.CD'],
 Here is a plot of the firm size distribution for the largest 500 firms in 2020 taken from Forbes Global 2000.

 ```{code-cell} ipython3
-:tags: [hide-input]
+---
+tags: [hide-input]
+mystnb:
+  figure:
+    caption: Firm size distribution
+    name: firm-size-dist
+---

 df_fs = pd.read_csv('https://media.githubusercontent.com/media/QuantEcon/high_dim_data/main/cross_section/forbes-global2000.csv')
 df_fs = df_fs[['Country', 'Sales', 'Profits', 'Assets', 'Market Value']]
@@ -711,7 +782,13 @@ Here are plots of the city size distribution for the US and Brazil in 2023 from
 The size is measured by population.

 ```{code-cell} ipython3
-:tags: [hide-input]
+---
+tags: [hide-input]
+mystnb:
+  figure:
+    caption: City size distribution
+    name: city-size-dist
+---

 # import population data of cities in 2023 United States and 2023 Brazil from world population review
 df_cs_us = pd.read_csv('https://media.githubusercontent.com/media/QuantEcon/high_dim_data/main/cross_section/cities_us.csv')
@@ -732,7 +809,13 @@ Here is a plot of the upper tail (top 500) of the wealth distribution.
 The data is from the Forbes Billionaires list in 2020.

 ```{code-cell} ipython3
-:tags: [hide-input]
+---
+tags: [hide-input]
+mystnb:
+  figure:
+    caption: Wealth distribution (Forbes Billionaires in 2020)
+    name: wealth-dist
+---

 df_w = pd.read_csv('https://media.githubusercontent.com/media/QuantEcon/high_dim_data/main/cross_section/forbes-billionaires.csv')
 df_w = df_w[['country', 'realTimeWorth', 'realTimeRank']].dropna()
@@ -782,7 +865,13 @@ df_gdp1.dropna(inplace=True)
 ```

 ```{code-cell} ipython3
-:tags: [hide-input]
+---
+tags: [hide-input]
+mystnb:
+  figure:
+    caption: GDP per capita distribution
+    name: gdppc-dist
+---

 fig, axes = plt.subplots(1, 2, figsize=(8.8, 3.6))

@@ -828,6 +917,12 @@ Let's have a look at the behavior of the sample mean in this case, and see
 whether or not the LLN is still valid.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: LLN failure
+    name: fail-lln
+---
 from scipy.stats import cauchy

 np.random.seed(1234)
@@ -887,7 +982,9 @@ portfolio is $\mu$ and the variance is $\sigma^2$.

 If instead the investor puts share $1/n$ of her wealth in each asset, then the portfolio payoff is

-$$ Y_n = \sum_{i=1}^n \frac{X_i}{n} = \frac{1}{n} \sum_{i=1}^n X_i. $$
+$$
+Y_n = \sum_{i=1}^n \frac{X_i}{n} = \frac{1}{n} \sum_{i=1}^n X_i.
+$$

 Try computing the mean and variance.
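
Review note on the hunk above: the mean and variance of $Y_n$ can be explored by simulation before working them out by hand. The snippet below is a rough sketch and is not part of the commit. It assumes independent, normally distributed payoffs with illustrative values $\mu = 1$ and $\sigma = 2$; the hunk itself fixes neither a distribution nor these numbers, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 1.0, 2.0   # assumed payoff mean and standard deviation
n = 100                # number of assets in the equal-weight portfolio
reps = 10_000          # number of simulated portfolios

# Each row holds n independent asset payoffs (independence is an assumption).
X = rng.normal(mu, sigma, size=(reps, n))

# Equal-weight portfolio payoff Y_n = (1/n) * sum_i X_i for each simulated row.
Y_n = X.mean(axis=1)

print("sample mean of Y_n:    ", Y_n.mean())
print("sample variance of Y_n:", Y_n.var())
print("sigma^2 / n:           ", sigma**2 / n)
```

Under these assumptions the sample mean stays near $\mu$ while the sample variance is close to $\sigma^2 / n$, far below the single-asset variance $\sigma^2$.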

@@ -918,8 +1015,6 @@ For example, the heaviness of the tail of the income distribution helps
 determine {doc}`how much revenue a given tax policy will raise <mle>`.


-
-
 (cltail)=
 ## Classifying tail properties

@@ -964,7 +1059,9 @@ For example, every random variable with bounded support is light-tailed. (Why?)

 As another example, if $X$ has the [exponential distribution](https://en.wikipedia.org/wiki/Exponential_distribution), with cdf $F(x) = 1 - \exp(-\lambda x)$ for some $\lambda > 0$, then its moment generating function is

-$$ m(t) = \frac{\lambda}{\lambda - t} \quad \text{when } t < \lambda $$
+$$
+m(t) = \frac{\lambda}{\lambda - t} \quad \text{when } t < \lambda
+$$

 In particular, $m(t)$ is finite whenever $t < \lambda$, so $X$ is light-tailed.
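
Review note on the hunk above: the closed form $m(t) = \lambda / (\lambda - t)$ can be sanity-checked with a Monte Carlo estimate of $\mathbb E e^{tX}$. The snippet below is a minimal sketch and is not part of the commit. The values $\lambda = 2$ and $t = 0.5$ are assumptions, chosen so that $2t < \lambda$ and the estimator has finite variance; the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

lam = 2.0   # assumed rate parameter lambda
t = 0.5     # assumed evaluation point, with t < lam (and 2t < lam)

# NumPy parameterizes the exponential by its scale, which is 1 / rate.
X = rng.exponential(scale=1 / lam, size=1_000_000)

m_hat = np.mean(np.exp(t * X))   # Monte Carlo estimate of E[exp(tX)]
m_exact = lam / (lam - t)        # closed-form MGF from the text

print("Monte Carlo estimate:", m_hat)
print("closed form:         ", m_exact)
```

For $t \geq \lambda$ the expectation is infinite, so a simulation of this kind would not settle down.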

@@ -1023,7 +1120,7 @@ $$
 But then

 $$
-\mathbb E X^r = r \int_0^\infty x^{r-1} \mathbb P\{ X > x \} dx
+\mathbb E X^r = r \int_0^\infty x^{r-1} \mathbb P\{ X > x \} dx
 \geq
 r \int_0^{\bar x} x^{r-1} \mathbb P\{ X > x \} dx
 + r \int_{\bar x}^\infty x^{r-1} b x^{-\alpha} dx.
@@ -1254,7 +1351,7 @@ assumption leads to a lower mean and greater dispersion.
 The [characteristic function](https://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29) of the Cauchy distribution is

 $$
-\phi(t) = \mathbb E e^{itX} = \int e^{i t x} f(x) dx = e^{-|t|}
+\phi(t) = \mathbb E e^{itX} = \int e^{i t x} f(x) dx = e^{-|t|}
 $$ (lln_cch)

 Prove that the sample mean $\bar X_n$ of $n$ independent draws $X_1, \ldots,
