Commit 481d17b

first draft article
1 parent 362d8f3 commit 481d17b

73 files changed, with 974 additions and 361 deletions

Lines changed: 184 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,40 +1,212 @@
11
---
2-
title: "Microsoft Stock Prediction using LSTM or GRU"
2+
title: "MSFT Stock Prediction using LSTM or GRU"
33
date: 2024-06-16T00:00:00+01:00
44
description: "A short stock price analysis of MSFT, followed by a prediction test using GRU"
55
menu:
66
sidebar:
77
name: GRU
88
identifier: GRU
9-
parent: finance
9+
parent: stock_prediction
1010
weight: 9
11+
hero: images/stock-market-prediction-using-data-mining-techniques.jpg
1112
tags: ["Finance", "Deep Learning", "Forecasting"]
1213
categories: ["Finance"]
1314
---
1415

15-
## Pick a stock commodity
16+
## Introduction
1617

17-
'
18-
...
19-
'
18+
In this article, we will explore time series data extracted from the **stock market**, focusing on prominent technology companies such as Apple, Amazon, Google, and Microsoft. Our objective is to equip data analysts and scientists with the essential skills to effectively manipulate and interpret stock market data.
19+
20+
To achieve this, we will use the *yfinance* library to fetch stock information and visualization tools such as Seaborn and Matplotlib to illustrate various facets of the data. Specifically, we will explore methods to analyze stock risk based on historical performance, and implement predictive modeling using **GRU/LSTM** models.
21+
22+
Throughout this tutorial, we aim to address the following key questions:
23+
24+
1. How has the stock price evolved over time?
25+
2. What is the average **daily return** of the stock?
26+
3. How does the **moving average** of the stocks vary?
27+
4. What is the **correlation** between different stocks?
28+
5. How can we forecast future stock behavior, exemplified by predicting the closing price of Microsoft (MSFT) using LSTM or GRU?
29+
30+
***
31+
32+
## Getting Data
33+
The initial step involves **acquiring and loading** the data into memory. Our source of stock data is the **Yahoo Finance** website, renowned for its wealth of financial market data and investment tools. To access this data, we'll employ the **yfinance** library, known for its efficient and Pythonic approach to downloading market data from Yahoo. For further insights into yfinance, refer to the article titled [Reliably download historical market data from Yahoo! Finance with Python](https://aroussi.com/post/python-yahoo-finance).
34+
35+
### Install Dependencies
36+
```bash
37+
pip install -qU yfinance seaborn scikit-learn tensorflow
38+
```
39+
### Configuration Code
40+
```python
41+
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")
# Jupyter only: comment out the next line when running as a plain script
%matplotlib inline

# For reading stock data from Yahoo Finance
import yfinance as yf

# For time stamps
from datetime import datetime, timedelta

# Download window: `start` and `end` must be defined before the download;
# a one-year window ending today is assumed here
end = datetime.now()
start = end - timedelta(days=365)

# Get Microsoft data; auto_adjust=False keeps the 'Adj Close' column used below
data = yf.download("MSFT", start=start, end=end, auto_adjust=False)
62+
```
2063

2164
## Statistical Analysis on the price
65+
### Summary
66+
```python
67+
# Summary Stats
68+
data.describe()
69+
```
70+
71+
### Closing Price
72+
The closing price is the last price at which the stock is traded during the regular trading day. A stock’s closing price is the standard benchmark used by investors to track its performance over time.
73+
74+
```python
75+
plt.figure(figsize=(14, 5))
76+
plt.plot(data['Adj Close'], label='Close Price')
77+
plt.xlabel('Date')
78+
plt.ylabel('Close Price [$]')
79+
plt.title('Stock Price History')
80+
plt.legend()
81+
plt.show()
82+
```
83+
### Volume of Sales
84+
Volume is the amount of an asset or security that _changes hands over some period of time_, often over the course of a day. For instance, stock trading volume refers to the number of shares of a security traded between its daily open and close. Trading volume, and changes in volume over time, are important inputs for technical traders.
85+
```python
86+
plt.figure(figsize=(14, 5))
87+
plt.plot(data['Volume'], label='Volume')
88+
plt.xlabel('Date')
89+
plt.ylabel('Volume')
90+
plt.title('Trading Volume History')
91+
plt.show()
92+
```
2293

94+
### Moving Average
95+
The moving average (MA) is a simple **technical analysis** tool that smooths out price data by creating a constantly updated average price. The average is taken over a specific period of time, like 10 days, 20 minutes, 30 weeks, or any time period the trader chooses.
96+
97+
98+
```python
99+
ma_day = [10, 20, 50]

# Compute each moving average with a rolling window over the adjusted close
for ma in ma_day:
    column_name = f"{ma} days MA"
    data[column_name] = data['Adj Close'].rolling(ma).mean()

# DataFrame.plot creates its own figure, so the size is passed directly
data[['Adj Close', '10 days MA', '20 days MA', '50 days MA']].plot(figsize=(14, 5))
plt.xlabel('Date')
plt.ylabel('Price [$]')
plt.title('Stock Price and Moving Averages')
plt.show()
112+
```
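The comment in the block above notes that the moving averages can also be computed without an explicit loop body per column. A minimal sketch of that idea follows, using a small hypothetical price series (not real MSFT data) and short window sizes so the arithmetic is easy to check by hand:

```python
import pandas as pd

# Hypothetical closing prices, used only for illustration
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Adj Close")

# Illustrative window sizes; the article uses 10, 20 and 50 days
windows = [2, 3]

# Build all moving-average columns in one expression
ma = pd.DataFrame({f"{w} days MA": prices.rolling(w).mean() for w in windows})
print(ma)
```

The first `w - 1` rows of each column are `NaN`, because a window of `w` observations is not yet available there.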
23113

24114
## Statistical Analysis on the returns
115+
Now that we've done some baseline analysis, let's dive a little deeper and analyze the risk of the stock. To do so, we need to look at the daily changes of the stock, not just its absolute value. Let's use pandas to compute the daily returns for the **Microsoft** stock.
116+
```python
117+
# Compute daily return in percentage
118+
data['Daily Return'] = data['Adj Close'].pct_change()
25119

120+
# histogram of the daily returns
121+
plt.figure(figsize=(14, 5))
122+
data['Daily Return'].hist(bins=50)
123+
plt.title('MSFT Daily Return Distribution')
124+
plt.xlabel('Daily Return')
125+
plt.show()
26126

27-
## GRU Model
127+
# daily return over time
128+
plt.figure(figsize=(8, 5))
129+
data['Daily Return'].plot()
130+
plt.title('MSFT Daily Return')
131+
plt.show()
28132

29-
### Init
133+
```
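To make the daily return concrete: `pct_change()` computes `price[t] / price[t-1] - 1` for each day. The sketch below checks this on a tiny hypothetical series and uses the standard deviation of the returns as one simple, commonly used risk proxy. The prices and variable names are illustrative assumptions, not values from the MSFT data.

```python
import pandas as pd

# Hypothetical prices: up 2% on day two, down 2% on day three
prices = pd.Series([100.0, 102.0, 99.96])

# Daily return as computed in the article
returns = prices.pct_change()

# The same quantity written out explicitly
manual = prices / prices.shift(1) - 1

# Standard deviation of daily returns as a simple volatility estimate
risk = returns.std()
print(returns)
print(f"volatility: {risk:.4f}")
```

The first return is `NaN` because there is no previous day to compare against.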
134+
## Data Preparation
135+
```python
136+
# Create a new dataframe with only the 'Adj Close' column
137+
X = data.filter(['Adj Close'])
138+
# Convert the dataframe to a numpy array
139+
X = X.values
140+
# Get the number of rows to train the model on
141+
training_data_len = int(np.ceil(len(X)*.95))
30142

31-
### Training
143+
# Scale data
144+
from sklearn.preprocessing import MinMaxScaler
145+
scaler = MinMaxScaler(feature_range=(0,1))
146+
scaled_data = scaler.fit_transform(X)
32147

33-
### Testing Metrics
34-
* mean squared error
148+
scaled_data
149+
```
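One caveat about the scaling step: fitting the scaler on the full series lets the minimum and maximum of the held-out period influence the training inputs. A possible alternative, sketched below on hypothetical numbers, is to fit `MinMaxScaler` on the training rows only and reuse it for the remaining rows. The array and the `split` index are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical single-column price array (rows = days)
prices = np.array([[10.0], [12.0], [14.0], [20.0]])
split = 3  # first three rows are treated as training data

scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(prices[:split])  # learns min=10, max=14
test_scaled = scaler.transform(prices[split:])       # reuses the training range

print(train_scaled.ravel())
print(test_scaled.ravel())
```

With this approach, scaled test values can fall outside `[0, 1]` when the test period moves beyond the training range.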
150+
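Split the training data into overlapping windows to feed into the LSTM and GRU models: each window of `seq_length` consecutive scaled prices is one input sequence, and the price that follows it is the target.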
151+
```python
152+
# Create the scaled training data set
train_data = scaled_data[0:int(training_data_len), :]
# Split the data into x_train and y_train data sets
x_train = []
y_train = []
seq_length = 60
for i in range(seq_length, len(train_data)):
    # Input: the previous seq_length scaled prices; target: the next price
    x_train.append(train_data[i-seq_length:i, 0])
    y_train.append(train_data[i, 0])
    # Print the first two samples as a sanity check
    if i <= seq_length+1:
        print(x_train)
        print(y_train, end="\n\n")

# Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)

# Reshape to (samples, time steps, features), as expected by Keras recurrent layers
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
171+
```
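The windowing logic can also be wrapped in a small reusable function. The sketch below is one way to do it; `make_windows` is a hypothetical helper name, and the toy series stands in for the scaled prices so the resulting shapes are easy to verify.

```python
import numpy as np

def make_windows(series, seq_length):
    """Return (x, y): x holds sliding windows, y holds the value after each window."""
    series = np.asarray(series, dtype=float)
    x = np.array([series[i - seq_length:i] for i in range(seq_length, len(series))])
    y = series[seq_length:]
    # Add a trailing feature axis: (samples, time steps, features)
    return x.reshape(-1, seq_length, 1), y

# Toy series 0..9 with windows of length 3
x, y = make_windows(np.arange(10), seq_length=3)
print(x.shape, y.shape)
print(x[0, :, 0], y[0])
```

A series of length `n` yields `n - seq_length` samples, so the toy example above produces seven windows.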
172+
173+
## GRU
174+
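This part uses a Gated Recurrent Unit (GRU) network: two stacked GRU layers with dropout, followed by a dense layer that outputs the next scaled price.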
175+
```python
176+
from tensorflow.keras.models import Sequential
177+
from tensorflow.keras.layers import GRU, Dense, Dropout
35178

179+
gru_model = Sequential()
180+
gru_model.add(GRU(units=128, return_sequences=True, input_shape=(seq_length, 1)))
181+
gru_model.add(Dropout(0.2))
182+
gru_model.add(GRU(units=64, return_sequences=False))
183+
gru_model.add(Dropout(0.2))
184+
gru_model.add(Dense(units=1))
36185

37-
## LSTM Comparison
186+
gru_model.compile(optimizer='adam', loss='mean_squared_error')
187+
gru_model.fit(x_train, y_train, epochs=10, batch_size=4)
188+
```
189+
190+
## LSTM
191+
This part uses a Long Short-Term Memory (LSTM) network with the same layout as the GRU model, so the two can be compared directly.
192+
```python
193+
from tensorflow.keras.layers import LSTM
194+
195+
lstm_model = Sequential()
196+
lstm_model.add(LSTM(units=128, return_sequences=True, input_shape=(seq_length, 1)))
197+
lstm_model.add(Dropout(0.2))
198+
lstm_model.add(LSTM(units=64, return_sequences=False))
199+
lstm_model.add(Dropout(0.2))
200+
lstm_model.add(Dense(units=1))
201+
202+
lstm_model.compile(optimizer='adam', loss='mean_squared_error')
203+
lstm_model.fit(x_train, y_train, epochs=10, batch_size=4)
204+
```
205+
206+
207+
## Testing Metrics
208+
* mean squared error
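
As a rough sketch of how the mean squared error could be computed once predictions are available, the snippet below uses hypothetical actual and predicted prices. In practice the model outputs would first be mapped back to dollar values with `scaler.inverse_transform`. The function name `mse` and the numbers are assumptions for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical actual and predicted closing prices
actual = [100.0, 102.0, 101.0]
predicted = [101.0, 100.0, 101.0]

error = mse(actual, predicted)  # squared errors are 1, 4 and 0
rmse = error ** 0.5             # root of the MSE, in the same unit as the price
print(error, rmse)
```

The root mean squared error is often reported alongside the MSE because it is expressed in the same unit as the price itself.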
38209

210+
### Test Plot
39211

40212
## Possible trading performance

public/404.html

Lines changed: 3 additions & 3 deletions
@@ -24,7 +24,7 @@
2424
<link rel="icon" type="image/png" href="/images/site/favicon_hub02d7508a1c89b2b7812eab204efeb9a_4223_42x0_resize_box_3.png" />
2525

2626
<meta property="og:url" content="http://localhost:1313/404.html">
27-
<meta property="og:site_name" content="Stefano&#39;s Blog">
27+
<meta property="og:site_name" content="Stefano Giannini">
2828
<meta property="og:title" content="404 Page not found">
2929
<meta property="og:locale" content="en">
3030
<meta property="og:type" content="website">
@@ -125,7 +125,7 @@
125125
<a class="navbar-brand" href="/">
126126

127127
<img src="/images/site/main-logo_hu9ad2f25a877e6fef77c7a3dbef5094ad_6881_42x0_resize_box_3.png" id="logo" alt="Logo">
128-
Stefano&#39;s Blog</a>
128+
Stefano Giannini</a>
129129
<button
130130
class="navbar-toggler navbar-light"
131131
id="navbar-toggler"
@@ -479,7 +479,7 @@ <h5>Contact me:</h5>
479479
Toha
480480
</a>
481481
</div>
482-
<div class="col-md-4 text-center">© 2020 Copyright.</div>
482+
<div class="col-md-4 text-center">© 2024 Copyright.</div>
483483
<div class="col-md-4 text-end">
484484
<a id="hugo" href="https://gohugo.io/" target="_blank" rel="noopener">Powered by
485485
<img

public/categories/basic/index.html

Lines changed: 8 additions & 3 deletions
@@ -24,7 +24,7 @@
2424
<link rel="icon" type="image/png" href="/images/site/favicon_hub02d7508a1c89b2b7812eab204efeb9a_4223_42x0_resize_box_3.png" />
2525

2626
<meta property="og:url" content="http://localhost:1313/categories/basic/">
27-
<meta property="og:site_name" content="Stefano&#39;s Blog">
27+
<meta property="og:site_name" content="Stefano Giannini">
2828
<meta property="og:title" content="Basic">
2929
<meta property="og:locale" content="en">
3030
<meta property="og:type" content="website">
@@ -125,7 +125,7 @@
125125
<a class="navbar-brand" href="/">
126126

127127
<img src="/images/site/main-logo_hu9ad2f25a877e6fef77c7a3dbef5094ad_6881_42x0_resize_box_3.png" id="logo" alt="Logo">
128-
Stefano&#39;s Blog</a>
128+
Stefano Giannini</a>
129129
<button
130130
class="navbar-toggler navbar-light"
131131
id="navbar-toggler"
@@ -259,6 +259,11 @@
259259

260260

261261

262+
<li><a class="taxonomy-term " href="http://localhost:1313/categories/finance/" data-taxonomy-term="finance"><span class="taxonomy-label">Finance</span></a></li>
263+
264+
265+
266+
262267
<li><a class="taxonomy-term " href="http://localhost:1313/categories/physics/" data-taxonomy-term="physics"><span class="taxonomy-label">Physics</span></a></li>
263268

264269

@@ -531,7 +536,7 @@ <h5>Contact me:</h5>
531536
Toha
532537
</a>
533538
</div>
534-
<div class="col-md-4 text-center">© 2020 Copyright.</div>
539+
<div class="col-md-4 text-center">© 2024 Copyright.</div>
535540
<div class="col-md-4 text-end">
536541
<a id="hugo" href="https://gohugo.io/" target="_blank" rel="noopener">Powered by
537542
<img

public/categories/basic/index.xml

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
11
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
22
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
33
<channel>
4-
<title>Basic on Stefano&#39;s Blog</title>
4+
<title>Basic on Stefano Giannini</title>
55
<link>http://localhost:1313/categories/basic/</link>
6-
<description>Recent content in Basic on Stefano&#39;s Blog</description>
6+
<description>Recent content in Basic on Stefano Giannini</description>
77
<generator>Hugo -- gohugo.io</generator>
88
<language>en</language>
99
<lastBuildDate>Sat, 08 Jun 2024 08:06:25 +0600</lastBuildDate><atom:link href="http://localhost:1313/categories/basic/index.xml" rel="self" type="application/rss+xml" /><item>

public/categories/finance/index.html

Lines changed: 3 additions & 3 deletions
@@ -24,7 +24,7 @@
2424
<link rel="icon" type="image/png" href="/images/site/favicon_hub02d7508a1c89b2b7812eab204efeb9a_4223_42x0_resize_box_3.png" />
2525

2626
<meta property="og:url" content="http://localhost:1313/categories/finance/">
27-
<meta property="og:site_name" content="Stefano&#39;s Blog">
27+
<meta property="og:site_name" content="Stefano Giannini">
2828
<meta property="og:title" content="Finance">
2929
<meta property="og:locale" content="en">
3030
<meta property="og:type" content="website">
@@ -125,7 +125,7 @@
125125
<a class="navbar-brand" href="/">
126126

127127
<img src="/images/site/main-logo_hu9ad2f25a877e6fef77c7a3dbef5094ad_6881_42x0_resize_box_3.png" id="logo" alt="Logo">
128-
Stefano&#39;s Blog</a>
128+
Stefano Giannini</a>
129129
<button
130130
class="navbar-toggler navbar-light"
131131
id="navbar-toggler"
@@ -537,7 +537,7 @@ <h5>Contact me:</h5>
537537
Toha
538538
</a>
539539
</div>
540-
<div class="col-md-4 text-center">© 2020 Copyright.</div>
540+
<div class="col-md-4 text-center">© 2024 Copyright.</div>
541541
<div class="col-md-4 text-end">
542542
<a id="hugo" href="https://gohugo.io/" target="_blank" rel="noopener">Powered by
543543
<img

public/categories/finance/index.xml

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
11
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
22
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
33
<channel>
4-
<title>Finance on Stefano&#39;s Blog</title>
4+
<title>Finance on Stefano Giannini</title>
55
<link>http://localhost:1313/categories/finance/</link>
6-
<description>Recent content in Finance on Stefano&#39;s Blog</description>
6+
<description>Recent content in Finance on Stefano Giannini</description>
77
<generator>Hugo -- gohugo.io</generator>
88
<language>en</language>
99
<lastBuildDate>Sun, 16 Jun 2024 00:00:00 +0100</lastBuildDate><atom:link href="http://localhost:1313/categories/finance/index.xml" rel="self" type="application/rss+xml" /><item>

public/categories/index.html

Lines changed: 8 additions & 3 deletions
@@ -24,7 +24,7 @@
2424
<link rel="icon" type="image/png" href="/images/site/favicon_hub02d7508a1c89b2b7812eab204efeb9a_4223_42x0_resize_box_3.png" />
2525

2626
<meta property="og:url" content="http://localhost:1313/categories/">
27-
<meta property="og:site_name" content="Stefano&#39;s Blog">
27+
<meta property="og:site_name" content="Stefano Giannini">
2828
<meta property="og:title" content="Categories">
2929
<meta property="og:locale" content="en">
3030
<meta property="og:type" content="website">
@@ -125,7 +125,7 @@
125125
<a class="navbar-brand" href="/">
126126

127127
<img src="/images/site/main-logo_hu9ad2f25a877e6fef77c7a3dbef5094ad_6881_42x0_resize_box_3.png" id="logo" alt="Logo">
128-
Stefano&#39;s Blog</a>
128+
Stefano Giannini</a>
129129
<button
130130
class="navbar-toggler navbar-light"
131131
id="navbar-toggler"
@@ -293,6 +293,11 @@
293293

294294

295295

296+
<li><a class="taxonomy-term " href="http://localhost:1313/categories/finance/" data-taxonomy-term="finance"><span class="taxonomy-label">Finance</span></a></li>
297+
298+
299+
300+
296301
<li><a class="taxonomy-term " href="http://localhost:1313/categories/physics/" data-taxonomy-term="physics"><span class="taxonomy-label">Physics</span></a></li>
297302

298303

@@ -521,7 +526,7 @@ <h5>Contact me:</h5>
521526
Toha
522527
</a>
523528
</div>
524-
<div class="col-md-4 text-center">© 2020 Copyright.</div>
529+
<div class="col-md-4 text-center">© 2024 Copyright.</div>
525530
<div class="col-md-4 text-end">
526531
<a id="hugo" href="https://gohugo.io/" target="_blank" rel="noopener">Powered by
527532
<img

public/categories/index.xml

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
11
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
22
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
33
<channel>
4-
<title>Categories on Stefano&#39;s Blog</title>
4+
<title>Categories on Stefano Giannini</title>
55
<link>http://localhost:1313/categories/</link>
6-
<description>Recent content in Categories on Stefano&#39;s Blog</description>
6+
<description>Recent content in Categories on Stefano Giannini</description>
77
<generator>Hugo -- gohugo.io</generator>
88
<language>en</language>
9-
<lastBuildDate>Wed, 12 Jun 2024 08:06:25 +0600</lastBuildDate><atom:link href="http://localhost:1313/categories/index.xml" rel="self" type="application/rss+xml" />
9+
<lastBuildDate>Sun, 16 Jun 2024 00:00:00 +0100</lastBuildDate><atom:link href="http://localhost:1313/categories/index.xml" rel="self" type="application/rss+xml" />
1010
</channel>
1111
</rss>
