< div class ="author-profile ms-auto align-self-lg-center ">
< img class ="rounded-circle " src ='/images/author/profile_hu8a567cefac8c1a165d433ac0796ac418_3088978_120x120_fit_q75_box.jpg ' alt ="Author Image ">
< h5 class ="author-name "> Stefano Giannini</ h5 >
- < p class ="text-muted "> Friday, June 28, 2024 | 6 minutes</ p >
+ < p class ="text-muted "> Friday, June 28, 2024 | 7 minutes</ p >
</ div >
@@ -600,7 +600,7 @@ <h3 id="finding-arima-parameters-p-d-q">Finding ARIMA Parameters (p, d, q)</h3>
</ ul >
</ li >
</ ul >
- < h3 id ="finding-d-values-from-plots "> Finding d values from plots</ h3 >
+ < h3 id ="finding-d-parameter-from-plots "> Finding d parameter from plots</ h3 >
< p > Since stationarity was already checked in the previous section, this paragraph mainly serves graphical and comprehension purposes. Moreover, with the autocorrelation plots, it is possible to find better values of d that the ADF test cannot recognize.</ p >
< div class ="highlight "> < pre tabindex ="0 " style ="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4; "> < code class ="language-python " data-lang ="python "> < span style ="display:flex; "> < span > < span style ="color:#f92672 "> from</ span > statsmodels.graphics.tsaplots < span style ="color:#f92672 "> import</ span > plot_acf, plot_pacf
</ span > </ span > < span style ="display:flex; "> < span >
@@ -625,6 +625,20 @@ <h3 id="finding-d-values-from-plots">Finding d values from plots</h3>
</ span > </ span > < span style ="display:flex; "> < span > plt< span style ="color:#f92672 "> .</ span > tight_layout()
</ span > </ span > < span style ="display:flex; "> < span > plt< span style ="color:#f92672 "> .</ span > show()
</ span > </ span > </ code > </ pre > </ div > < p > < img alt ="png " src ="/posts/finance/stock_prediction/arima/images/find_d.png "> </ p >
+ < p > Indeed, from the plot, < em > d=2</ em > is probably a better choice, since only a few coefficients go above the confidence threshold.</ p >
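<p>The visual reading of the plot can be cross-checked numerically. The following is a minimal sketch, not the author's method: for each differencing order it counts how many sample autocorrelations fall outside an approximate ±1.96/√n band. The helper <code>count_significant_acf</code> and the synthetic <code>close</code> series are hypothetical stand-ins for <code>df.Close</code>, and the flat band is a simplification that may differ from the band drawn by <code>plot_acf</code>.</p>

```python
import numpy as np

def count_significant_acf(series, nlags=20):
    """Count lags 1..nlags whose sample autocorrelation lies outside +/-1.96/sqrt(n)."""
    x = np.asarray(series, dtype=float)
    x = x[~np.isnan(x)]                      # drop NaNs left by differencing
    n = len(x)
    x = x - x.mean()
    denom = np.dot(x, x)                     # lag-0 autocovariance (times n)
    acf = np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)])
    bound = 1.96 / np.sqrt(n)                # approximate 95% band, an assumption
    return int(np.sum(np.abs(acf) > bound))

# Hypothetical stand-in for df.Close: a seeded random walk
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(size=500))

for d in range(3):
    diffed = np.diff(close, n=d)             # n=0 returns the series unchanged
    print(f"d={d}: {count_significant_acf(diffed)} significant lags")
```

<p>A sharp drop in the count after differencing points to a candidate <em>d</em>; the count is only a rough guide and does not replace the plot or the ADF test.</p>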
+ < h3 id ="finding-p-parameter-from-plots "> Finding p parameter from plots</ h3 >
+ < p > As suggested previously, the Partial Autocorrelation (PACF) plot is used to find the < strong > p</ strong > parameter.</ p >
+ < div class ="highlight "> < pre tabindex ="0 " style ="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4; "> < code class ="language-python " data-lang ="python "> < span style ="display:flex; "> < span > plt< span style ="color:#f92672 "> .</ span > rcParams< span style ="color:#f92672 "> .</ span > update({< span style ="color:#e6db74 "> 'figure.figsize'</ span > :(< span style ="color:#ae81ff "> 15</ span > ,< span style ="color:#ae81ff "> 5</ span > ), < span style ="color:#e6db74 "> 'figure.dpi'</ span > :< span style ="color:#ae81ff "> 80</ span > })
+ </ span > </ span > < span style ="display:flex; "> < span > fig, axes < span style ="color:#f92672 "> =</ span > plt< span style ="color:#f92672 "> .</ span > subplots(< span style ="color:#ae81ff "> 1</ span > , < span style ="color:#ae81ff "> 2</ span > , sharex< span style ="color:#f92672 "> =</ span > < span style ="color:#66d9ef "> False</ span > )
+ </ span > </ span > < span style ="display:flex; "> < span > axes[< span style ="color:#ae81ff "> 0</ span > ]< span style ="color:#f92672 "> .</ span > plot(df< span style ="color:#f92672 "> .</ span > index, df< span style ="color:#f92672 "> .</ span > Close< span style ="color:#f92672 "> .</ span > diff()); axes[< span style ="color:#ae81ff "> 0</ span > ]< span style ="color:#f92672 "> .</ span > set_title(< span style ="color:#e6db74 "> '1st Differencing'</ span > )
+ </ span > </ span > < span style ="display:flex; "> < span > axes[< span style ="color:#ae81ff "> 1</ span > ]< span style ="color:#f92672 "> .</ span > set(ylim< span style ="color:#f92672 "> =</ span > (< span style ="color:#ae81ff "> 0</ span > ,< span style ="color:#ae81ff "> 5</ span > ))
+ </ span > </ span > < span style ="display:flex; "> < span > plot_pacf(df< span style ="color:#f92672 "> .</ span > Close< span style ="color:#f92672 "> .</ span > diff()< span style ="color:#f92672 "> .</ span > dropna(), ax< span style ="color:#f92672 "> =</ span > axes[< span style ="color:#ae81ff "> 1</ span > ], lags< span style ="color:#f92672 "> =</ span > < span style ="color:#ae81ff "> 20</ span > , color< span style ="color:#f92672 "> =</ span > < span style ="color:#e6db74 "> 'k'</ span > , auto_ylims< span style ="color:#f92672 "> =</ span > < span style ="color:#66d9ef "> True</ span > , zero< span style ="color:#f92672 "> =</ span > < span style ="color:#66d9ef "> False</ span > )
+ </ span > </ span > < span style ="display:flex; "> < span >
+ </ span > </ span > < span style ="display:flex; "> < span > plt< span style ="color:#f92672 "> .</ span > tight_layout()
+ </ span > </ span > < span style ="display:flex; "> < span > plt< span style ="color:#f92672 "> .</ span > show()
+ </ span > </ span > </ code > </ pre > </ div > < p > < img alt ="png " src ="/posts/finance/stock_prediction/arima/images/find_p.png "> </ p >
+ < p > A possible choice of < strong > p</ strong > can be 8 or 18, where the coefficient crosses the confidence interval.</ p >
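<p>Candidate lags can also be listed programmatically instead of being read off the plot. The sketch below is illustrative only: it estimates the partial autocorrelations with the Durbin-Levinson recursion and reports the lags outside an approximate ±1.96/√n band. The function names and the synthetic AR(1) series <code>ar1</code> are hypothetical, and the values may differ slightly from those of <code>plot_pacf</code>, which can use a different estimator.</p>

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r[0..nlags] of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)])

def pacf_durbin_levinson(x, nlags=20):
    """Partial autocorrelations for lags 0..nlags via the Durbin-Levinson recursion."""
    r = sample_acf(x, nlags)
    pacf = np.zeros(nlags + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)                   # AR coefficients of the order k-1 fit
    for k in range(1, nlags + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den                   # partial autocorrelation at lag k
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

def significant_pacf_lags(x, nlags=20):
    """Lags whose partial autocorrelation lies outside +/-1.96/sqrt(n)."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]
    bound = 1.96 / np.sqrt(len(x))           # approximate 95% band, an assumption
    pacf = pacf_durbin_levinson(x, nlags)
    return [k for k in range(1, nlags + 1) if abs(pacf[k]) > bound]

# Hypothetical stand-in for df.Close.diff().dropna(): a seeded AR(1) process
rng = np.random.default_rng(1)
noise = rng.normal(size=2000)
ar1 = np.zeros(2000)
for t in range(1, 2000):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

print(significant_pacf_lags(ar1))
```

<p>Isolated lags far from the origin can exceed the band by chance, so the list is best treated as a set of candidates to compare, for example in a grid search.</p>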
+ < h3 id ="finding-q-parameter-from-plots "> Finding q parameter from plots</ h3 >
< h3 id ="grid-search "> Grid Search</ h3 >
< p > Here’s a Python function to perform a grid search:</ p >
< div class ="highlight "> < pre tabindex ="0 " style ="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4; "> < code class ="language-python " data-lang ="python "> < span style ="display:flex; "> < span > < span style ="color:#66d9ef "> def</ span > < span style ="color:#a6e22e "> grid_search_arima</ span > (ts, p_range, d_range, q_range):
@@ -645,8 +659,7 @@ <h3 id="grid-search">Grid Search</h3>
</ span > </ span > < span style ="display:flex; "> < span > < span style ="color:#66d9ef "> return</ span > best_order
</ span > </ span > < span style ="display:flex; "> < span >
</ span > </ span > < span style ="display:flex; "> < span > best_order < span style ="color:#f92672 "> =</ span > grid_search_arima(ts_diff, range(< span style ="color:#ae81ff "> 3</ span > ), range(< span style ="color:#ae81ff "> 2</ span > ), range(< span style ="color:#ae81ff "> 3</ span > ))
- </ span > </ span > </ code > </ pre > </ div > < p > Indeed, from the plot, < em > d=2</ em > is probably a better solution since we have few coefficient that goes above the confidence threshold.</ p >
- < h2 id ="6-limitations-and-considerations "> 6. Limitations and Considerations</ h2 >
+ </ span > </ span > </ code > </ pre > </ div > < h2 id ="6-limitations-and-considerations "> 6. Limitations and Considerations</ h2 >
< p > While ARIMA models can be powerful for time series prediction, they have limitations:</ p >
< ol >
< li > < strong > Assumption of linearity</ strong > : ARIMA models assume linear relationships, which may not hold for complex financial data.</ li >
@@ -840,7 +853,9 @@ <h5 class="text-center ps-3">Table of Contents</h5>
< li > < a href ="#5-model-selection-and-diagnostic-checking "> 5. Model Selection and Diagnostic Checking</ a >
< ul >
< li > < a href ="#finding-arima-parameters-p-d-q "> Finding ARIMA Parameters (p, d, q)</ a > </ li >
- < li > < a href ="#finding-d-values-from-plots "> Finding d values from plots</ a > </ li >
+ < li > < a href ="#finding-d-parameter-from-plots "> Finding d parameter from plots</ a > </ li >
+ < li > < a href ="#finding-p-parameter-from-plots "> Finding p parameter from plots</ a > </ li >
+ < li > < a href ="#finding-q-parameter-from-plots "> Finding q parameter from plots</ a > </ li >
< li > < a href ="#grid-search "> Grid Search</ a > </ li >
</ ul >
</ li >