From 5ec7de803ffeed59e7cfb8b1fb235a3f3b6cf5de Mon Sep 17 00:00:00 2001
From: haritha-j
Date: Fri, 28 Aug 2020 17:46:03 +0530
Subject: [PATCH 1/8] add gsoc-gpu page

---
 assets/images/gpu_approx.png | Bin 0 -> 4089 bytes
 assets/images/gpu_knn.png    | Bin 0 -> 3741 bytes
 assets/images/gpu_radius.png | Bin 0 -> 14977 bytes
 gsoc-2020.md                 |   2 +-
 gsoc-gpu.md                  | 101 +++++++++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 assets/images/gpu_approx.png
 create mode 100644 assets/images/gpu_knn.png
 create mode 100644 assets/images/gpu_radius.png
 create mode 100644 gsoc-gpu.md

diff --git a/assets/images/gpu_approx.png b/assets/images/gpu_approx.png
new file mode 100644
index 0000000000000000000000000000000000000000..054edd35307901f57335d281578fee80d275dc6f
Binary files /dev/null and b/assets/images/gpu_approx.png differ
diff --git a/assets/images/gpu_knn.png b/assets/images/gpu_knn.png
new file mode 100644
index 0000000000000000000000000000000000000000..84982f9262f00a3f3077f7fc81ae995d4310e90a
Binary files /dev/null and b/assets/images/gpu_knn.png differ
diff --git a/assets/images/gpu_radius.png b/assets/images/gpu_radius.png
new file mode 100644
Binary files /dev/null and b/assets/images/gpu_radius.png differ
+radius-search
+knn-search
+
+**Student:** [Haritha Jayasinghe][haritha]
+
+**Mentors:** [Sérgio Agostinho][sergio], [Lars Glud][lars]
+
+
+## Modernising the GPU Octree module
+
+Octrees are specialized data structures with nodes that split into eight subchildren, and are a widely used structure when working with point clouds, since they can be used to efficiently perform many operations. The PCL GPU module is a crucial yet somewhat overlooked component of PCL, and is implemented using explicit warp-level programming in order to maximize performance.
+
+In many cases the GPU algorithms can execute tasks magnitudes faster than it’s CPU counterpart, which can be crucial when working with large point clouds. Unfortunately the GPU API is quite limited and often lacks much of the functionality offered in the equivalent CPU algorithms. The primary aim of this task was to bridge these inconsistencies.
+
+While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search and Approximate Nearest Neighbors search were both returning incorrect results while one of the implementations of the other remaining module (Radius Search) was also returning incorrect results.
+
+In addition these functions were utilizing outdated CUDA primitives which could easily cause further bugs in the future.
When diving into the code, it was also discovered that the GPU approximate nearest neighbours algorithm used a completely different traversal methodology from it's CPU counterpart.
+
+Due to these discoveries, the scope of the GPU modernization effort was expanded and prioritized over some of the other goals and thus a majority of the internship period was spent on addressing this task.
+
+### Fixes to Octree search methods and modernizing CUDA functions
+
+Related PRs -
+https://github.com/PointCloudLibrary/pcl/pull/4146
+https://github.com/PointCloudLibrary/pcl/pull/4313
+https://github.com/PointCloudLibrary/pcl/pull/4306
+
+After comprehensively going through the GPU search methods to Investigate their functionality and the causes of the above issues, it was determined that there were two separate bugs which were responsible for the above issues.
+
+In approximate nearest search and K nearest search, an outdated method was being utilized to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ .
+In radius search, radius was not shared between warp threads, thus the search was being conducted for incorrect radius values. Synchronizing the radius values across the threads fixed this issue.
+
+Since much of the code inside the above functions utilized an outdated concept of using volatile memory for sharing data between threads, they were also replaced by utilizing warp primitives to synchronize thread data.
+
+### Implementation of new traversal mechanism of approximate nearest search
+
+Related PRs -
+https://github.com/PointCloudLibrary/pcl/pull/4294
+
+The existing implementation of approximate nearest search utilized a simple traversal mechanism which traversed down octree nodes until an empty node is found.
Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However the CPU counterpart of the approximate nearest search algorithm uses a heuristic (distance from query point to voxel center) to determine the most appropriate voxel to traverse, in case an empty node is discovered. Thus this algorithm will always traverse to the lowest level of an octree. The same traversal method was adapted to the morton code based octree traversal mechanism and implemented for the two GPU approximate nearest search methods.
+
+In addition a new test was designed to assess the functionality of the new traversal mechanism and to ensure that it tallies with that of CPU approximate nearest search.
+
+### Modifying search functions to return square distances
+
+Related PRs -
+https://github.com/PointCloudLibrary/pcl/pull/4340
+https://github.com/PointCloudLibrary/pcl/pull/4338
+
+One noticeable flaw in the current GPU search implementations was the inability to return square distances to the identified result points. In order to counter this, the search methods were modified to keep track of and return the distances to the identified results. For Approximate nearest search and K nearest search this was relatively easy, and did not incur a time penalty.
+
+However this was not the case for Radius search, as any octree node that was located within the search radius from the search point was automatically added to the results without performing a distance calculation. Thus a new kernel was introduced to efficiently perform distance computations for points located within these nodes. Due to the additional penalty of performing these computations (benchmark pending) it was decided to preserve the existing radius search methods as well, without deprecation.
+
+Additional tests were added to ensure the accuracy of all returned distances.
+
+### Addition of a GPU octree tutorial
+
+
+This tutorial aims to provide users an in depth overview of the functionality offered by the modernized GPU octree module. It consists of;
+An introduction to the GPU octree module and search functions
+Differences between the CPU and GPU implementations
+Instructions on when to use the GPU module
+Code samples detailing the use of
+ Radius Search
+ Approximate nearest search
+ K nearest search
+Visualizations depicting the functionality of above functions
+
+### Other work
+
+Related PRs -
+https://github.com/larshg/pcl/pull/8
+https://github.com/larshg/pcl/pull/7
+
+Some of the other work carried out to support the modernization of the GPU octree module include;
+Addition of a CI job to ensure compilation of the GPU module
+Adding GPU octree tests to the main test module
+
+Furthermore, there were some noticeable instances of code repetition which reduced the manageability of the code and these methods were cleaned and code that implemented similar functionality was refactored.
+
+In addition the GPU module contains “host” methods for approximate nearest search and radius search, which are essentially methods which are synchronous versions of the GPU code which do not use CUDA kernels (i.e. these methods run purely on CPU). Of these, it was decided that the GPU approximate nearest search method could be deprecated as it provides no significant advantage over the traditional CPU approximate nearest search function, however since initial investigations suggest that the host radius search function offers noticeably faster performance over the traditional CPU radius search function, the deprecation of this method has been put off untill comprehensive tests are carried out to investigate this behaviour.
+
+### Future work
+
+One additional drawback with the current GPU K nearest neighbour search algorithm is that it is currently restricted to k = 1 only.
A review of the current code base makes it clear that significant modifications and additions are required to both the traversal mechanism as well as the distance computation kernel in order to extend support for any arbitrary K. This provides an interesting challenge to be tackled in the future, and which would bring additional value to the GPU octree module.
+
+## Introducing flexible type for indices
+
+## Summary
+
+The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. The scope of these tasks were initially under-estimated in the original proposal, and additional requirements were discovered, which resulted in skipping some of the other goals in favour of prioritizing these goals.
+
+The work carried out during the period ensure that the GPU octree search functions now produce accurate results, are verified by tests, adheres to modern CUDA standards, tallies in most cases with their CPU counterpart, and provides users an in depth overview of their usage and functionality, and sets up the groundwork needed for a full transition to flexible types for point indices to ensure that PCL meets the growing needs of its community.
From 8f1a46c42660105a39a942a2d16a8b47ea1f2ebb Mon Sep 17 00:00:00 2001
From: haritha-j
Date: Fri, 28 Aug 2020 17:54:39 +0530
Subject: [PATCH 2/8] modify image layout

---
 gsoc-gpu.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gsoc-gpu.md b/gsoc-gpu.md
index bf95973..8613664 100644
--- a/gsoc-gpu.md
+++ b/gsoc-gpu.md
@@ -5,9 +5,7 @@ short: GSoC 2020 GPU & Refactoring
 permalink: /gsoc-gpu/
 ---

-approx-search
-radius-search
-knn-search
+approx-search radius-search knn-search

 **Student:** [Haritha Jayasinghe][haritha]

From 343d485fb36fafa69c3240f2076387dcad678612 Mon Sep 17 00:00:00 2001
From: Haritha Jayasinghe
Date: Sun, 30 Aug 2020 10:49:23 +0530
Subject: [PATCH 3/8] fixes to GPU octree writeup

---
 gsoc-2020.md |  2 ++
 gsoc-gpu.md  | 79 +++++++++++++++++++++++-----------------------------
 2 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/gsoc-2020.md b/gsoc-2020.md
index f3717b1..c24df17 100644
--- a/gsoc-2020.md
+++ b/gsoc-2020.md
@@ -41,6 +41,8 @@ As well as to refactor and modernize the library by means of;
 * Introducing a fluent API for algorithms
 * Modernising the GPU Octree module to align with the it’s CPU counterpart

+**Final report:** [url](gsoc-gpu.md)
+
 ### Unified API for Algorithms

 **Student:** [Shrijit Singh][shrijit]
diff --git a/gsoc-gpu.md b/gsoc-gpu.md
index 8613664..dac348e 100644
--- a/gsoc-gpu.md
+++ b/gsoc-gpu.md
@@ -2,7 +2,7 @@
 layout: page
 title: Google Summer of Code 2020 - Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module
 short: GSoC 2020 GPU & Refactoring
-permalink: /gsoc-gpu/
+permalink: /gsoc-2020-gpu/
 ---

 approx-search radius-search knn-search
@@ -14,86 +14,77 @@ permalink: /gsoc-gpu/

 ## Modernising the GPU Octree module

-Octrees are specialized data structures with nodes that split into eight subchildren, and are a widely used structure when working with point clouds, since they can be used to efficiently perform many operations.
The PCL GPU module is a crucial yet somewhat overlooked component of PCL, and is implemented using explicit warp-level programming in order to maximize performance.
+Octrees are specialized data structures with nodes that split into eight subchildren, and are a widely used structure when working with point clouds, since they can be used to efficiently perform search operations. The PCL GPU module is an unstable and yet somewhat overlooked component of PCL, implemented using explicit warp-level programming in order to maximize performance.

-In many cases the GPU algorithms can execute tasks magnitudes faster than it’s CPU counterpart, which can be crucial when working with large point clouds. Unfortunately the GPU API is quite limited and often lacks much of the functionality offered in the equivalent CPU algorithms. The primary aim of this task was to bridge these inconsistencies.
+In many cases the GPU algorithms can execute tasks orders of magnitude faster than their CPU counterparts, which can be crucial when working with large point clouds. Unfortunately, the GPU API is quite limited and often lacks much of the functionality offered in the equivalent CPU algorithms. The primary aim of this task was to bridge these inconsistencies.

-While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search and Approximate Nearest Neighbors search were both returning incorrect results while one of the implementations of the other remaining module (Radius Search) was also returning incorrect results.
+While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module.
Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search and Approximate Nearest Neighbours search, were both returning incorrect results, while one of the implementations of the remaining method (Radius Search) was also returning incorrect results. In addition to these search methods, there are two 'synchronous' versions of the radius search and approximate nearest search methods provided by this module, which provide CPU-based implementations (i.e. non-parallelized versions that do not use CUDA kernels) of their GPU-based counterparts.

-In addition these functions were utilizing outdated CUDA primitives which could easily cause further bugs in the future. When diving into the code, it was also discovered that the GPU approximate nearest neighbours algorithm used a completely different traversal methodology from it's CPU counterpart.
+All of these functions were utilizing outdated CUDA primitives and idioms, risking deprecation in the near future. When diving into the code, it was also discovered that the GPU approximate nearest neighbours algorithm used a completely different traversal methodology from its CPU counterpart.

 Due to these discoveries, the scope of the GPU modernization effort was expanded and prioritized over some of the other goals and thus a majority of the internship period was spent on addressing this task.

 ### Fixes to Octree search methods and modernizing CUDA functions

-Related PRs -
-https://github.com/PointCloudLibrary/pcl/pull/4146
-https://github.com/PointCloudLibrary/pcl/pull/4313
-https://github.com/PointCloudLibrary/pcl/pull/4306
+Related PRs: [[#4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[#4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[#4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)

-After comprehensively going through the GPU search methods to Investigate their functionality and the causes of the above issues, it was determined that there were two separate bugs which were responsible for the above issues.
-
-In approximate nearest search and K nearest search, an outdated method was being utilized to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ .
-In radius search, radius was not shared between warp threads, thus the search was being conducted for incorrect radius values. Synchronizing the radius values across the threads fixed this issue.
+After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause:
+ 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with the warp-level primitives introduced in CUDA 9.0, as detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ .
+ 2. In radius search, the correct radius was not shared between warp threads. Thus the search was being conducted for incorrect radius values. Synchronizing the radius values across the threads fixed this issue.

 Since much of the code inside the above functions utilized an outdated concept of using volatile memory for sharing data between threads, they were also replaced by utilizing warp primitives to synchronize thread data.

 ### Implementation of new traversal mechanism of approximate nearest search

-Related PRs -
-https://github.com/PointCloudLibrary/pcl/pull/4294
+Related PRs: [[#4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)

-The existing implementation of approximate nearest search utilized a simple traversal mechanism which traversed down octree nodes until an empty node is found. Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However the CPU counterpart of the approximate nearest search algorithm uses a heuristic (distance from query point to voxel center) to determine the most appropriate voxel to traverse, in case an empty node is discovered. Thus this algorithm will always traverse to the lowest level of an octree. The same traversal method was adapted to the morton code based octree traversal mechanism and implemented for the two GPU approximate nearest search methods.
+The existing implementation of approximate nearest search utilized a simple traversal mechanism which traverses down octree nodes until an empty node is found. Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However, the CPU counterpart of the approximate nearest search algorithm uses a heuristic (distance from query point to voxel center) to determine the most appropriate voxel to traverse, in case an empty node is discovered. Thus this algorithm will always traverse to the lowest level of an octree. The same traversal method was adapted to the Morton-code-based octree traversal mechanism and implemented for the two GPU approximate nearest search methods.

-In addition a new test was designed to assess the functionality of the new traversal mechanism and to ensure that it tallies with that of CPU approximate nearest search.
+In addition a new test was designed to assess the functionality of the new traversal mechanism and to ensure that it tallies with that of the CPU approximate nearest search.

 ### Modifying search functions to return square distances

-Related PRs -
-https://github.com/PointCloudLibrary/pcl/pull/4340
-https://github.com/PointCloudLibrary/pcl/pull/4338
+Related PRs: [[#4338]](https://github.com/PointCloudLibrary/pcl/pull/4338) [[#4340]](https://github.com/PointCloudLibrary/pcl/pull/4340)

 One noticeable flaw in the current GPU search implementations was the inability to return square distances to the identified result points. In order to counter this, the search methods were modified to keep track of and return the distances to the identified results. For Approximate nearest search and K nearest search this was relatively easy, and did not incur a time penalty.

-However this was not the case for Radius search, as any octree node that was located within the search radius from the search point was automatically added to the results without performing a distance calculation. Thus a new kernel was introduced to efficiently perform distance computations for points located within these nodes. Due to the additional penalty of performing these computations (benchmark pending) it was decided to preserve the existing radius search methods as well, without deprecation.
+However, this was not the case for Radius search, as any octree node that was located within the search radius from the query point was automatically added to the results without performing a distance calculation. Thus a new kernel was introduced to efficiently perform distance computations for points located within these nodes.
Due to the additional penalty of performing these computations (benchmark pending), the original radius search method was kept.

 Additional tests were added to ensure the accuracy of all returned distances.

 ### Addition of a GPU octree tutorial


-This tutorial aims to provide users an in depth overview of the functionality offered by the modernized GPU octree module. It consists of;
-An introduction to the GPU octree module and search functions
-Differences between the CPU and GPU implementations
-Instructions on when to use the GPU module
-Code samples detailing the use of
- Radius Search
- Approximate nearest search
- K nearest search
-Visualizations depicting the functionality of above functions
+This tutorial aims to provide users with an in-depth overview of the functionality offered by the modernized GPU octree module. It consists of:
+ - An introduction to the GPU octree module and search functions;
+ - Differences between the CPU and GPU implementations;
+ - Instructions on when to use the GPU module;
+ - Code samples detailing the use of:
+   - Radius Search;
+   - Approximate nearest search;
+   - K nearest search;
+ - Visualizations depicting the functionality of the above functions.
-### Other work +### Accessory tasks -Related PRs - -https://github.com/larshg/pcl/pull/8 -https://github.com/larshg/pcl/pull/7 +Related PRs: [[7]](https://github.com/larshg/pcl/pull/7) [[8]](https://github.com/larshg/pcl/pull/8) -Some of the other work carried out to support the modernization of the GPU octree module include; -Addition of a CI job to ensure compilation of the GPU module -Adding GPU octree tests to the main test module +Some of the other work carried out to support the modernization of the GPU octree module include: +- Addition of a CI job to ensure compilation of the GPU module +- Adding GPU octree tests to the main test module -Furthermore, there were some noticeable instances of code repetition which reduced the manageability of the code and these methods were cleaned and code that implemented similar functionality was refactored. +### Conclusion and future work -In addition the GPU module contains “host” methods for approximate nearest search and radius search, which are essentially methods which are synchronous versions of the GPU code which do not use CUDA kernels (i.e. these methods run purely on CPU). Of these, it was decided that the GPU approximate nearest search method could be deprecated as it provides no significant advantage over the traditional CPU approximate nearest search function, however since initial investigations suggest that the host radius search function offers noticeably faster performance over the traditional CPU radius search function, the deprecation of this method has been put off untill comprehensive tests are carried out to investigate this behaviour. +There were noticeable instances of code repetition which reduced the manageability of the code. These methods were cleaned and code that implemented similar functionality was refactored. 
-### Future work +In addition, from the synchronous search methods mentioned earlier, it was decided that the GPU approximate nearest search method could be deprecated as it provides no significant advantage over the traditional CPU approximate nearest search function. However, preliminary benchmarks suggest that the synchronous radius search function offers noticeably faster performance over the traditional CPU radius search function. Therefore, the deprecation of this method has been put off until comprehensive tests are carried out to investigate this behaviour. -One additional drawback with the current GPU K nearest neighbour search algorithm is that it is currently restricted to k = 1 only. A review of the current code base makes it clear that significant modifications and additions are required to both the traversal mechanism as well as the distance computation kernel in order to extend support for any arbitrary K. This provides an interesting challenge to be tackled in the future, and which would bring additional value to the GPU octree module. +One additional drawback with the current GPU K nearest neighbour search algorithm is that it is currently restricted to k = 1 only. A review of the current code base makes it clear that significant modifications and additions are required to both the traversal mechanism as well as the distance computation kernel in order to extend support for any arbitrary K. This provides an interesting challenge to be tackled in the future, which would bring additional value to the GPU octree module. -## Introducing flexible type for indices +## Introducing flexible types for indices ## Summary -The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. 
The scope of these tasks were initially under-estimated in the original proposal, and additional requirements were discovered, which resulted in skipping some of the other goals in favour of prioritizing these goals. +The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. The scope of these tasks were initially under-estimated in the original proposal, and additional requirements were discovered, which resulted in skipping some of the other goals in favour of prioritizing the above tasks. -The work carried out during the period ensure that the GPU octree search functions now produce accurate results, are verified by tests, adheres to modern CUDA standards, tallies in most cases with their CPU counterpart, and provides users an in depth overview of their usage and functionality, and sets up the groundwork needed for a full transition to flexible types for point indices to ensure that PCL meets the growing needs of its community. +The work carried out during the period ensure that the GPU octree search functions now produce accurate results, are verified by tests, adheres to modern CUDA standards, tallies in most cases with their CPU counterpart, and provides users an in-depth overview of their usage and functionality. Furthermore, it sets up the groundwork needed for a full transition to flexible types for point indices to ensure that PCL meets the growing needs of its community. 
From ea4d840c95a8608ce8a7e6850030663b832a55ac Mon Sep 17 00:00:00 2001 From: Haritha Jayasinghe Date: Sun, 30 Aug 2020 15:39:27 +0530 Subject: [PATCH 4/8] Add section on flexible indices --- gsoc-gpu.md | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 3 deletions(-) diff --git a/gsoc-gpu.md b/gsoc-gpu.md index dac348e..36dc400 100644 --- a/gsoc-gpu.md +++ b/gsoc-gpu.md @@ -26,7 +26,7 @@ Due to these discoveries, the scope of the GPU modernization effort was expande ### Fixes to Octree search methods and modernizing CUDA functions -Related PRs: [[#4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[#4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[#4313]](https://github.com/PointCloudLibrary/pcl/pull/4313) +Related PRs: [[4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[4313]](https://github.com/PointCloudLibrary/pcl/pull/4313) After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause: 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ . 
@@ -36,7 +36,7 @@ Since much of the code inside the above functions utilized an outdated concept o ### Implementation of new traversal mechanism of approximate nearest search -Related PRs: [[#4294]](https://github.com/PointCloudLibrary/pcl/pull/4294) +Related PRs: [[4294]](https://github.com/PointCloudLibrary/pcl/pull/4294) The existing implementation of approximate nearest search utilized a simple traversal mechanism which traverses down octree nodes until an empty node is found. Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However the CPU counterpart of the approximate nearest search algorithm uses a heuristic (distance from query point to voxel center) to determine the most appropriate voxel to traverse, in case an empty node is discovered. Thus this algorithm will always traverse to the lowest level of an octree. The same traversal method was adapted to the morton code based octree traversal mechanism and implemented for the two GPU approximate nearest search methods. @@ -44,7 +44,7 @@ In addition a new test was designed to assess the functionality of the new trave ### Modifying search functions to return square distances -Related PRs: [[#4338]](https://github.com/PointCloudLibrary/pcl/pull/4338) [[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340) +Related PRs: [[4338]](https://github.com/PointCloudLibrary/pcl/pull/4338) [[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340) One noticeable flaw in the current GPU search implementations was the inability to return square distances to the identified result points. In order to counter this, the search methods were modified to keep track of and return the distances to the identified results. For Approximate nearest search and K nearest search this was relatively easy, and did not incur a time penalty. 
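As a rough illustration of what "keeping track of and returning the distances" means for a nearest search, the CPU-only sketch below records the squared distance alongside the winning index. It is a brute-force stand-in written for this write-up, not the GPU kernel from the PRs above; `Point`, `NearestResult` and `nearestWithSqrDistance` are hypothetical names.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical minimal point type; the real module works on PCL point types.
struct Point {
  float x, y, z;
};

// Result of a nearest search: the index of the closest point and the
// squared distance to it. index is -1 when the cloud is empty.
struct NearestResult {
  std::ptrdiff_t index;
  float sqr_distance;
};

inline float squaredDistance(const Point& a, const Point& b) {
  const float dx = a.x - b.x;
  const float dy = a.y - b.y;
  const float dz = a.z - b.z;
  return dx * dx + dy * dy + dz * dz;
}

// Brute-force nearest search. The squared distance is already computed
// while comparing candidates, so returning it adds no extra work.
inline NearestResult nearestWithSqrDistance(const std::vector<Point>& cloud,
                                            const Point& query) {
  NearestResult best{-1, std::numeric_limits<float>::max()};
  for (std::size_t i = 0; i < cloud.size(); ++i) {
    const float d2 = squaredDistance(cloud[i], query);
    if (d2 < best.sqr_distance) {
      best.index = static_cast<std::ptrdiff_t>(i);
      best.sqr_distance = d2;
    }
  }
  return best;
}
```

Because the comparison already needs the squared distance, carrying it in the result is essentially free, which is consistent with the observation that the nearest-search variants incurred no time penalty.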
@@ -83,6 +83,62 @@ One additional drawback with the current GPU K nearest neighbour search algorith ## Introducing flexible types for indices +As laser scans and LIDAR becomes more popular, the need arises for handling clouds with a very large number of points. However, many algorithms within the Point Cloud Library are incapable of handling point clouds containing over 2 billion points due to their indices being limited to 32 bits, which caps the size of supported point clouds at 2 billion. Furthermore, there currently isn’t one standard type being used for indices, instead, a variety of types such as `int`, `long`, `unsigned_int`, and others are being used. Therefore, there is a pressing need to switch to a standard type for indices. + +On the flip side, due to the increased memory usage of types with larger capacity, the memory efficiency of the library may be significantly reduced, and further complications with caching etc. may arise, which can be a serious concern considering the large variety of platforms that PCL is used on. Thus, the ideal solution would be to allow the user to choose which point type to utilize at compile-time, based on his intended use case and platform. + +This flexibility can be offered to the user by transitioning the PCL library’s various modules to the `pcl::index_t` type. + +### Providing compile time options to select index types + +Related PRs: [[4166]](https://github.com/PointCloudLibrary/pcl/pull/4166) + +CMake options were added to allow users to select: +- Sign of index (signed / unsigned – signed by default); +- Size of index in bits (8 / 16 / 32 / 64 – 32 by default); +at compile-time, from PCL 1.12 onwards. 
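The trade-off behind these options can be checked with a little arithmetic. The sketch below, which uses only the C++ standard library, computes the largest index each candidate type can hold and the memory needed to store a given number of indices. The helper names `maxIndex` and `indexStorageBytes` are made up for this illustration and are not part of PCL.

```cpp
#include <cstdint>
#include <limits>

// Largest index value representable by IndexT. For a signed 32-bit type
// this is 2147483647, which is the "2 billion" cap mentioned above.
template <typename IndexT>
constexpr std::uint64_t maxIndex() {
  return static_cast<std::uint64_t>(std::numeric_limits<IndexT>::max());
}

// Bytes needed to store `count` indices of type IndexT, e.g. in an
// indices vector. Wider index types raise the cap but cost more memory.
template <typename IndexT>
constexpr std::uint64_t indexStorageBytes(std::uint64_t count) {
  return count * sizeof(IndexT);
}

static_assert(maxIndex<std::int32_t>() == 2147483647ull,
              "signed 32-bit indices stop just above 2 billion");
static_assert(maxIndex<std::uint32_t>() == 4294967295ull,
              "unsigned 32-bit indices double the range");
```

For one billion indices, `indexStorageBytes<std::int32_t>` gives 4 000 000 000 bytes while `indexStorageBytes<std::int64_t>` gives 8 000 000 000 bytes, which is the memory cost that motivates leaving the choice to the user.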
+ +### Adding a CI job for testing 64bit unsigned index type + +Related RPs: [[4184]](https://github.com/PointCloudLibrary/pcl/pull/4184) + +An additional job was added to the CI pipeline to check for any failures that may arise when compiling/running tests with 64 bit, unsigned indices, as opposed to the default 32 bit, signed indices. The CI job was initially configured to only build the modules that have been transitioned to the new index type, so that, as more modules are transitioned, they could be added to the build configuration. + +### Transitioning fundamental classes to the `index_t` type + +Related PRs: [[4173]](https://github.com/PointCloudLibrary/pcl/pull/4173) [[4199]](https://github.com/PointCloudLibrary/pcl/pull/4199) [[4198]](https://github.com/PointCloudLibrary/pcl/pull/4198) [[4205]](https://github.com/PointCloudLibrary/pcl/pull/4205) [[4211]](https://github.com/PointCloudLibrary/pcl/pull/4211) [[4224]](https://github.com/PointCloudLibrary/pcl/pull/4224) [[4228]](https://github.com/PointCloudLibrary/pcl/pull/4228) [[4231]]( https://github.com/PointCloudLibrary/pcl/pull/4231) [[4256]](https://github.com/PointCloudLibrary/pcl/pull/4256) [[4257]](https://github.com/PointCloudLibrary/pcl/pull/4257) + +A set of fundamental classes such as `pcl::PointCloud` lie at the core of PCL. These classes contain various data representations which did not have a common type of index. In order to mitigate this issue, all such indices from these classes were switched to the `index_t` type. + +For situations where unsigned indices were required, a new type called `uindex_t` was also introduced, which acts as an unsigned version of the `index_t`. 
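A minimal sketch of how a signed index type and its unsigned companion could be derived from two compile-time settings is shown below. It only illustrates the idea: the macros `SKETCH_INDEX_SIZE` and `SKETCH_INDEX_SIGNED` and the `sketch` namespace are assumptions made for this example, and the actual PCL option names and implementation may differ.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for values a build system would inject;
// the defaults mirror the report: 32 bits, signed.
#ifndef SKETCH_INDEX_SIZE
#define SKETCH_INDEX_SIZE 32
#endif
#ifndef SKETCH_INDEX_SIGNED
#define SKETCH_INDEX_SIGNED 1
#endif

namespace sketch {

// Map (bit width, signedness) to a fixed-width integer type.
// Unsupported combinations fail to compile because the primary
// template is left undefined.
template <int Bits, bool Signed> struct int_type;
template <> struct int_type<8, true>   { using type = std::int8_t; };
template <> struct int_type<8, false>  { using type = std::uint8_t; };
template <> struct int_type<16, true>  { using type = std::int16_t; };
template <> struct int_type<16, false> { using type = std::uint16_t; };
template <> struct int_type<32, true>  { using type = std::int32_t; };
template <> struct int_type<32, false> { using type = std::uint32_t; };
template <> struct int_type<64, true>  { using type = std::int64_t; };
template <> struct int_type<64, false> { using type = std::uint64_t; };

template <int Bits, bool Signed>
using int_type_t = typename int_type<Bits, Signed>::type;

// The index type follows both settings; the unsigned companion keeps the
// configured width but is always unsigned.
using index_t = int_type_t<SKETCH_INDEX_SIZE, (SKETCH_INDEX_SIGNED != 0)>;
using uindex_t = int_type_t<SKETCH_INDEX_SIZE, false>;

}  // namespace sketch
```

With the default settings `sketch::index_t` is `std::int32_t` and `sketch::uindex_t` is `std::uint32_t`; compiling with `-DSKETCH_INDEX_SIZE=64 -DSKETCH_INDEX_SIGNED=0` would make both 64-bit unsigned, which corresponds to the configuration exercised by the CI job described above.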
+ +This transition was carried out for the following classes: +- PointCloud +- PCLPointCloud2 +- PCLBase +- PCLPointField +- Correspondences +- Vertices +- PCLImage + +During the above transition process, it was discovered that significant additional work was required to address the numerous sign comparison warnings and other errors that arose from the transition in some of the above classes, which took up considerable time. + +Furthermore, any changes beyond transitioning the above fundamental classes would have required additional workarounds to carry on, if they were to be carried out before the changes to the fundamental classes had been merged. (These features were planned to be merged in PCL 1.12). Thus, work was shifted to the GPU module at this point. + +In addition, while the common module had already been modified to make it compatible with `index_t`, the tests for this module had not been modified. The tests were updated through a straightforward replacement of integer vectors with `index_t` vectors. + +### Transitioning the octree module + +Related PRs: [[4179]](https://github.com/PointCloudLibrary/pcl/pull/4179) + +All indices within the octree module were converted to `index_t` and its derivatives. This was also a fairly straightforward process of replacing types, once the fundamental types had been transitioned. The tests for the octree module were also modified to achieve the same effect. + +### Conclusion and future work + +The work carried out primarily focuses on setting the stage for an easy transition towards flexible index types. To this end, the `index_t` type has been adopted in the fundamental classes and resulting complications have been addressed. The CI pipeline has also been modified to verify the success of this transition. + +However, since the above changes only partially cover the transition to flexible index types, additional work must be carried out to complete the transition.
Specifically, the rest of the modules must be converted to index_t, and findings from the work done for the transition of the octree module demonstrate a fairly straightforward path towards the conversion of these modules and their tests. + ## Summary The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. The scope of these tasks were initially under-estimated in the original proposal, and additional requirements were discovered, which resulted in skipping some of the other goals in favour of prioritizing the above tasks. From 8bcab6aac7803cdaea59b945d09db902acc3f538 Mon Sep 17 00:00:00 2001 From: Haritha Jayasinghe Date: Mon, 31 Aug 2020 01:37:12 +0530 Subject: [PATCH 5/8] restructure gsoc content and fixes to gpu write-up --- gsoc-gpu.md => assets/gsoc-2020/gpu.md | 28 +++++++++++++++---- gsoc-2020.md => assets/gsoc-2020/index.md | 4 +-- assets/images/{ => gsoc-2020}/gpu_approx.png | Bin assets/images/{ => gsoc-2020}/gpu_knn.png | Bin assets/images/{ => gsoc-2020}/gpu_radius.png | Bin 5 files changed, 24 insertions(+), 8 deletions(-) rename gsoc-gpu.md => assets/gsoc-2020/gpu.md (88%) rename gsoc-2020.md => assets/gsoc-2020/index.md (98%) rename assets/images/{ => gsoc-2020}/gpu_approx.png (100%) rename assets/images/{ => gsoc-2020}/gpu_knn.png (100%) rename assets/images/{ => gsoc-2020}/gpu_radius.png (100%) diff --git a/gsoc-gpu.md b/assets/gsoc-2020/gpu.md similarity index 88% rename from gsoc-gpu.md rename to assets/gsoc-2020/gpu.md index 36dc400..342c436 100644 --- a/gsoc-gpu.md +++ b/assets/gsoc-2020/gpu.md @@ -2,7 +2,7 @@ layout: page title: Google Summer of Code 2020 - Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module short: GSoC 2020 GPU & Refactoring -permalink: /gsoc-2020-gpu/ +permalink: /gsoc-2020/gpu/ --- approx-search radius-search knn-search @@ -18,7 +18,18 @@ Octrees are specialized data structures with nodes that split into eight subchil In many 
cases the GPU algorithms can execute tasks orders of magnitude faster than it’s CPU counterpart, which can be crucial when working with large point clouds. Unfortunately the GPU API is quite limited and often lacks much of the functionality offered in the equivalent CPU algorithms. The primary aim of this task was to bridge these inconsistencies. -While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search and Approximate Nearest Neighbors search were both returning incorrect results while one of the implementations of the remaining method (Radius Search) was also returning incorrect results. In addition to these search methods, there are two 'synchronous' versions of the radius search and approximate nearest search methods provided by this module, which provide CPU based implementations (i.e. non parallelized versions that do not use CUDA kernels) of their GPU based counterparts. +The primary search methods provided by the GPU octree module are listed below +- A. Approximate Nearest Search + 1. Asynchronous (GPU based) Approximate Nearest Search + 2. Synchronous (CPU based) Approximate Nearest Search +- B. Radius Search + 1. Asynchronous (GPU based) Radius Search with common radius + 2. Asynchronous (GPU based) Radius Search with individual radius for each query + 3. Asynchronous (GPU based) Radius Search for specified indices with common radius + 4. Synchronous (CPU based) Radius Search +- C. Asynchronous (GPU based) K Nearest Search + +While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. 
Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search (C) and Asynchronous Approximate Nearest Neighbors search(A-1) were both returning incorrect results while one of the implementations of the Radius Search (B-2) was also returning incorrect results. The two 'synchronous' versions of the radius search and approximate nearest search methods listed above (A-2 & B-4) provide CPU based implementations (i.e. non parallelized versions that do not use CUDA kernels) of their GPU based counterparts. All of these functions were utilizing outdated CUDA primitives and idioms, risking deprecation in the near future. When diving into the code, it was also discovered that the GPU approximate nearest neighbours algorithm used a completely different traversal methodology from it's CPU counterpart. @@ -29,7 +40,7 @@ Due to these discoveries, the scope of the GPU modernization effort was expande Related PRs: [[4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[4313]](https://github.com/PointCloudLibrary/pcl/pull/4313) After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause: - 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ . + 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. 
This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed [here.](https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/) 2. In radius search, the correct radius was not shared between warp threads. Thus the search was being conducted for incorrect radius values. Synchronizing the radius values across the threads fixed this issue. Since much of the code inside the above functions utilized an outdated concept of using volatile memory for sharing data between threads, they were also replaced by utilizing warp primitives to synchronize thread data. @@ -53,7 +64,8 @@ However this was not the case for Radius search, as any octree node that was loc Additional tests were added to ensure the accuracy of all returned distances. ### Addition of a GPU octree tutorial - + +Related PRs: [[4347]](https://github.com/PointCloudLibrary/pcl/pull/4347) This tutorial aims to provide users an in-depth overview of the functionality offered by the modernized GPU octree module. It consists of: - An introduction to the GPU octree module and search functions; @@ -85,7 +97,7 @@ One additional drawback with the current GPU K nearest neighbour search algorith As laser scans and LIDAR becomes more popular, the need arises for handling clouds with a very large number of points. However, many algorithms within the Point Cloud Library are incapable of handling point clouds containing over 2 billion points due to their indices being limited to 32 bits, which caps the size of supported point clouds at 2 billion. Furthermore, there currently isn’t one standard type being used for indices, instead, a variety of types such as `int`, `long`, `unsigned_int`, and others are being used. Therefore, there is a pressing need to switch to a standard type for indices. 
-On the flip side, due to the increased memory usage of types with larger capacity, the memory efficiency of the library may be significantly reduced, and further complications with caching etc. may arise, which can be a serious concern considering the large variety of platforms that PCL is used on. Thus, the ideal solution would be to allow the user to choose which point type to utilize at compile-time, based on his intended use case and platform. +On the flip side, the usage of types with larger capacity leads to increased memory usage and cache misses. This might not be optimal for resource constrained platforms like embedded devices. Thus, the ideal solution would be to allow the user to choose which point type to utilize at compile-time, based on his intended use case and platform. This flexibility can be offered to the user by transitioning the PCL library’s various modules to the `pcl::index_t` type. @@ -100,7 +112,7 @@ at compile-time, from PCL 1.12 onwards. ### Adding a CI job for testing 64bit unsigned index type -Related RPs: [[4184]](https://github.com/PointCloudLibrary/pcl/pull/4184) +Related PRs: [[4184]](https://github.com/PointCloudLibrary/pcl/pull/4184) An additional job was added to the CI pipeline to check for any failures that may arise when compiling/running tests with 64 bit, unsigned indices, as opposed to the default 32 bit, signed indices. The CI job was initially configured to only build the modules that have been transitioned to the new index type, so that, as more modules are transitioned, they could be added to the build configuration. @@ -144,3 +156,7 @@ However, since the above changes only partially cover the transition to flexible The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. 
The scope of these tasks were initially under-estimated in the original proposal, and additional requirements were discovered, which resulted in skipping some of the other goals in favour of prioritizing the above tasks. The work carried out during the period ensure that the GPU octree search functions now produce accurate results, are verified by tests, adheres to modern CUDA standards, tallies in most cases with their CPU counterpart, and provides users an in-depth overview of their usage and functionality. Furthermore, it sets up the groundwork needed for a full transition to flexible types for point indices to ensure that PCL meets the growing needs of its community. + +[haritha]: https://github.com/haritha-j +[sergio]: https://github.com/SergioRAgostinho +[lars]: https://github.com/larshg \ No newline at end of file diff --git a/gsoc-2020.md b/assets/gsoc-2020/index.md similarity index 98% rename from gsoc-2020.md rename to assets/gsoc-2020/index.md index c24df17..8b2b2e1 100644 --- a/gsoc-2020.md +++ b/assets/gsoc-2020/index.md @@ -22,7 +22,7 @@ After a long hiatus, PCL is once more participating in the Google Summer of Code Extending PCL's use case by generating bindings for its use with interface languages like Python, for rapid development and maximal speed. The approach makes use of Pybind11 to expose PCL's C++ code and generate bindings in the form of python modules by using necessary type information. It supports automatic regeneration of the bindings when the underlying C++ code changes, to work with PCL's active development cycle. 
-### [Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module](gsoc-gpu.md) +### [Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module](/gsoc-2020-gpu) **Student:** [Haritha Jayasinghe][haritha] @@ -41,7 +41,7 @@ As well as to refactor and modernize the library by means of; * Introducing a fluent API for algorithms * Modernising the GPU Octree module to align with the it’s CPU counterpart -**Final report:** [url](gsoc-gpu.md) +**Final report:** [url](/gsoc-2020-gpu) ### Unified API for Algorithms diff --git a/assets/images/gpu_approx.png b/assets/images/gsoc-2020/gpu_approx.png similarity index 100% rename from assets/images/gpu_approx.png rename to assets/images/gsoc-2020/gpu_approx.png diff --git a/assets/images/gpu_knn.png b/assets/images/gsoc-2020/gpu_knn.png similarity index 100% rename from assets/images/gpu_knn.png rename to assets/images/gsoc-2020/gpu_knn.png diff --git a/assets/images/gpu_radius.png b/assets/images/gsoc-2020/gpu_radius.png similarity index 100% rename from assets/images/gpu_radius.png rename to assets/images/gsoc-2020/gpu_radius.png From 1b6fb802eeb6a64518f90091914857c0f3db85aa Mon Sep 17 00:00:00 2001 From: Haritha Jayasinghe Date: Mon, 31 Aug 2020 01:49:16 +0530 Subject: [PATCH 6/8] wrong folder --- {assets/gsoc-2020 => gsoc-2020}/gpu.md | 0 {assets/gsoc-2020 => gsoc-2020}/index.md | 0 2 files changed, 0 insertions(+), 0 deletions(-) rename {assets/gsoc-2020 => gsoc-2020}/gpu.md (100%) rename {assets/gsoc-2020 => gsoc-2020}/index.md (100%) diff --git a/assets/gsoc-2020/gpu.md b/gsoc-2020/gpu.md similarity index 100% rename from assets/gsoc-2020/gpu.md rename to gsoc-2020/gpu.md diff --git a/assets/gsoc-2020/index.md b/gsoc-2020/index.md similarity index 100% rename from assets/gsoc-2020/index.md rename to gsoc-2020/index.md From 76b291086c32c02c93a209080947e34948b8926f Mon Sep 17 00:00:00 2001 From: haritha-j Date: Mon, 31 Aug 2020 02:13:24 +0530 Subject: [PATCH 7/8] Change images and paths for gsoc 
pages --- _config.yml | 2 +- assets/images/gsoc-2020/gpu_approx.png | Bin 4089 -> 0 bytes assets/images/gsoc-2020/gpu_knn.png | Bin 3741 -> 0 bytes assets/images/gsoc-2020/gpu_radius.png | Bin 14977 -> 11886 bytes gsoc-2020/gpu.md | 2 +- gsoc-2020/index.md | 4 ++-- 6 files changed, 4 insertions(+), 4 deletions(-) delete mode 100644 assets/images/gsoc-2020/gpu_approx.png delete mode 100644 assets/images/gsoc-2020/gpu_knn.png diff --git a/_config.yml b/_config.yml index c46a30f..8e6c94a 100644 --- a/_config.yml +++ b/_config.yml @@ -44,7 +44,7 @@ navigation: header_pages: - downloads.md - - gsoc-2020.md + - gsoc-2020/index.md - about.md # Exclude from processing. diff --git a/assets/images/gsoc-2020/gpu_approx.png b/assets/images/gsoc-2020/gpu_approx.png deleted file mode 100644 index 054edd35307901f57335d281578fee80d275dc6f..0000000000000000000000000000000000000000 Binary files a/assets/images/gsoc-2020/gpu_approx.png and /dev/null differ diff --git a/assets/images/gsoc-2020/gpu_knn.png b/assets/images/gsoc-2020/gpu_knn.png deleted file mode 100644 index 84982f9262f00a3f3077f7fc81ae995d4310e90a..0000000000000000000000000000000000000000 Binary files a/assets/images/gsoc-2020/gpu_knn.png and /dev/null differ
zCsZW#>_>uV$aaE_Lg?AYDoM0gg}U%(F^{9PlSzeXRe@4>MJNrv|d5!OV;?^&gw zs~BpKY;69+PDUv1Pw-tuX%X_>>xQPsg74G3l>|{t9Nx5Lr_|>3Cm5Cw4{)KOCp4hX z46E|qfM58azCJkJ>kb(R&!uqRtju$16h!laK6O3eT}~u>99{-(|2fhwoA!3dWnhnX zaTY}#!Je*LgxMT%P!@T4{PT0^hL2$Al#S^7nT>JD#e0N6*isb4fUe8E%r~c(QG!cc zjjS$XP=}+O7|^9?;g_jqTKxUnr+&t@-eP3;hm5rl0~#M?%M@elU_Y`8=8ZcuWHX(Du8Oj(|#_@tdy3%YV^-lN4R79UxnHb~N3Z zyaYP3^D}y*No5LB--RG=oGth=z~kaAKmy%;g0O6^XolrG!>0yPnL{X<$!5hUh>3~G z=3Gm@KKKZBh0zQaT|xJsG(MC@Lm$XRzRQg8WY8D`NyrrLYV1yFB_`nftYxShBn=#5jn->;mlRLWtM)<;V!ByEKoZ)IsYj|CK?C5YBT?SsB42a#t0?t*Q2Q z-0>p6i%f&8HnGTUVK7(tMnS-t@to9`Ks&zXiw8M>dXiuO_+#@_97-)q-LfM=O#pn) zv%Avc?oexX)(Yllyo@cA#|tnEspdgE%U{T#`;=rkE(8xpLF^=%GwLr4r2=NXq9?{O z7YS*){eHRM4Kl+1pF}{IJ$SA(lg(#(+y~v)I6RC1VA-M0+{}%?*^(haV`-l+9(1xG zrR7K9X^Ny(TPUsXLenz3W^iAojW`OMqXXJBwiJ*cocV+AzHMcK&}%gUFm0d|28k|S zU;jsIGow|LEETqB4B@y)Ks5~I5KgrHaTQo*SXc7D8mFDqApo*?kImo9^@BMzj2C37 zt$0rFBFTa8R}*{esX=)w{PE-9I+P4G>r^3-$~}fiB_J$;R%7fh%<|%}(oTE#W!=yw u0CE}A1`2lF4U?Ar#jQc3{Ou+(d`=`RzwPzum5cGkWK9j-+ZAe$BK`|W)!tYD literal 14977 zcmeHuc|4Twzy3oh%2tVzNZEHamaKh9vJA<-OLnsFdl5w_vNX0L`xdf}t!y!vWM9VE zWnafOhI5ba`M%Ei@AvolopZll#cO7Gp69;b%XMAv%iAYvic}XEFF+7PrS$mWQwTbH z1N=KjP6}Q*uwfqrUnra(>$^b^bqnzyNjxt#_{1$p>ES%(IUV`A3mckW5SD#n64`M>plJMjP7 z0q2ZAFnFIkHWLa?-M7J>fy7-{LkkKDbP6igUqVf4M`ow55ds^JBtm~XRyjF2?TIr( z(4=@MZ5iK1){ll;QY_Tr1N@Nl%0uZNo*!Rs3D9#_jIE+jC{LuZ<8{h&PS}Q`Vgm?z z5X+{Nw&0Vd+ZfEI#Ab2Ed23dq$&>P2$bPHEU2ZWe{RB3plA$qllAemH2UmnSFE7PY z4hYgq$3;BNk>3!u;#j zNt#^tXXDfZPw?0v+<-1JFoO?uJjHT66)t=91=W3JH~gp5S717p`QT9p?I%2R7%6ZNzlRJa(`d}C&vK+79YjApx~MENYOUd&_k8x zd36J6wFJwhi#K1v;HXBFvSm;KO$xqm%Gu`9nc(NFF{H=2W2>GLsq@;ti0YaerP#0Y z!2_J|zV8Ee-Bi`;eLG7_Jm>zlr>CW-JK4W4G3J;m!Xs&Ixvd+e#oKLdS<3xVMIYU>*iKCja_>2k|1jxQC~qp)n1>X@9zPsm~3GSsQJ4 zs4&=!?6A1|N%eBPJP9=EUF0It=tJAz->XGU21WmB$1_7r@A*)HyXENtdl;|gDd#n6 zTf}jnETNAx-Tx_u)ykk`=OJsT>G4)#iAjCFjF96I_Jn|yJ({9vB&<52Sh8#n`_hlU z%iPNEkYV~r`!go!Xhs`*A94=Xig_)|n@rKQ%1WD>nhFw%Dm@qDXli1jh7)*mqw!!q zIdA$K0|Nt2@gl2{v9YwY^b2tw`nM{N#`u?Nzw1(f2j&SCs`|b>n4WdemZhP+^zenT 
z$OfN)y!P1hq0lT5r{wx2KR&qTlr#9~Gi2T+m2{NlbiOBHNU6^q#uy)zn8*Dv!7{Vg zX6>4e-MO+qP8#%rj>oi+H*-dsS#HqHE`pfskd^#JtShTgzT?su^C1Ut051ZG%Ej6vyP9COG!!D6j3$36v_-veWS5u zi>NdxI(+-XOehKC9T?b9@Zn}-2A6<Ow(2h7;o?JMQSZnh$fOo=U125E*u-1_A=>hoP(9`g7fM*u zWEO?@Nul9pI)9`Z=kAv~LX*(qr|#Y{RI5?RkRRa?H_n2?6NV3Kk382Lh5yd2tqH*U z1RIZBcc(oxFge~?egOfgmipc6%$cjE{q@^iu)`iNx3 zS(6e5yp~psY*o#~R3z%I(u?cG`}UqlOcC1)v6DN4lzA9OFNsu+l3m8PqY$+6aN7`- zRdeyC{n_>$TL~s=uYlzH(UntIx(Nf(E-$CO6LmTtR0PnRg@UaH--IAuh~Hc|XYYPx zY&KR5RhD$-_bjQqSJbDG+N@1m2&96@AeU7Ptsse4fTI7>z9oT+;d8iEvz=PGrVc29 zMj|=))r2aUE1t%^T_$zcmNbrdOMy-Bh0HDuJonL`FE(qr-`uZo)Y{rAaE_MYwj8<0 zr)oF1iaZw=m)=Ep&yNMa*Y~gu0e)UyRTIr{Y7!`Vc%^8W5A`Uscf9bex3s}-?S#;p zO|OJ};bCpZhXVD|18=qFdxCdhLG0?EYlWdEl!zG8S0h1ToSA{Q(v@O(Ry?_1(4eBY z|Ihfl=jpD6V*9w-c(I_lg|O#0oiIF5l8WpFHO&=^6t)gCplMOu8-wdfQI2J)RuX-j;cSA`z-Tvdt%v0f+^d#1GQ z*J;%R{erjRw(VAD;(TSk&BRwyC^*0Xm+hyI&Ur=-hl@HVShC3Y+g!%bIq9x*8Cmz% zwu~=*569knwn(mPTs0!JAfp~zapEv1S$1ZWTU)Ps z`qj6KQhZs&`Lc%A_!ZN=ydv*79shUyd>E}ns#hHSM>O9 zdIm%l7GYP?BR)n5JMgyN1zF>Xym=S7RPVuD`5rOw3$f)t&icz<;xXB6xo(hgvQ zof`gLa`Ho#N-tm6v6@CTJd>1ypxu&r-{Qa?+KVu4FB8G{0a1}d{Mo{F?@=S#ClZ<; zo{QKDC3i`GZ$2|gb{W@_r$p>c`4~que|aA`i`1SBv)Hm*Po;kK_4P5B`jhsm%duoo%4A1=1`%Q&M!*E(bxRPRYOh|tF>w^~@gaB(F^ zJ7_pwgEp+;$^@-AdUG3i+Co8F=|`Fo^Zmep)|@*@a#mXzo82fszqum6dd zd`*N~!PX+oG3cJ2rF-;XZ_SZav+892V2gF?oj?&<-hU6;&sBwT&xV$R1D}F{2-9{S zHC*N)=&80hkoqEvXBDlHFB+epAo&}_g~P3PX{r>!Np42&Hy9%aujk|+M}aP!o|aaL zcL!Ce^cfeF5}hP&7$g|PqvX5nwPGbnqZK3}8SHv(-~(SO1&nD|U`=_gbx<=Sr`FJ6 zc#2IamY81}M#`hT7fqyBD30JGPkcDId7pL}J4UUq9!lQSq@h%bhVm(%CElf{HTWCn(c7>h)!G-B61k@h*-UD9SaOK#opL3%gu?Bm;G0hY@D>2sKTB%>TyS_?g3L6> z?AS{Wgz~X%mF+5$CW%#><|}xI9kuP@uTp4iV-U-pY$!u0W372PzZ(NfzI7~zQ~8p< zyRCI{)-v^7E=ZhX^PXVXN2!z65v>%ZBcpT#)FR3dDPts0zK(4yS!?Xz+{}M=POy9a zhe~o1g10l>%1|eQ!C)sLSt3$qLDH>0R6ye?ROqnA^H7BUnPJ~syHDANF$E)jU32ak z60dXDO~m3MTDfack7PK!EE?}!z*P?;58q{OsaC{uE4%rm#XQQ?j+tS(5FF26+t-Nk zcuV^;oQ8$5hl?Kb+@#{~6OXk^vcC%E>@m#~Pf~hs%3T{~eMf?*qd{FJRYA)xUsXto 
z`WE39;{7xg|6Y?v^?cKWlwtP6qiW2ehd;dkM}MKMttHP=Yh-+{jm2DeUnNFUN6H{h zAz1u;QhrR;j*+xBt{9;r>OsxNd0DckjuINHT5jDz%%JBjkOlQ$yj!=A34TMjQ}S=d zzcg7q(iE_1!ObrHV`WV$QXZWr%}~LPTc>RL!W&}Y>8_%>gHjfyAxOM`YWD4p+nraQ z2oczg4E+hNydHV;`Ut9UVXwEZm6ZIR4kSA|oXWQ6ww~|UR%8B~Ij z$k$@6K9@7-+&ZcTiK?5Mf3WK8X@bE3ik$!p6#u^Vdw|jmmFnG{r zq^UmvH-sHaVpE81A(Zh+%E|_T+LNyiDhIaS&&$(ucV`ENs|)$@x#LnNGNj(bX~3Fu zX%7#egcNhhPHua9JDbuUSbcT1x3@Rf`l61dzUbaJH?J_R2I`6e)`{DJy8H@SdT>!&$3P1^zOjYv_A4&;Ipx&RK&z1^l_4!zSor(P3g=nQZ zjzE9^cPvun+o9g^>1-W2>46f=F_9V74Gn>NYnVZXcEh5<@%@(>Tn)#E%Sa^BRRzH| zq~v;%bG>zRI!hj>8_hD@!81e^6BDCS5;k)+7`>>V6JU8r2t4ovH=P?+8ZSp)&Z_>q zWp(vMf~9aYS3@Ksown2MaTM6F8mH0x46b8*;|X%WuGLAN97N96RvA~swH_}oFWuws z6D#`lc6UE%q0PE8ze1CKomaPO1}@J;mD;_Ivn!jdYHcA>mcJKRuO7E8}I}`YYx|Uqwc>(B0|9qKV|JE&E+|H8~a>3VyvV*>Y!b zpM#%2i+XiObAnZ1T}yl2wtHec0^d$KWT7I=rl3*hw^!`*7bWmhXZr5rZBxM0!1-RI z1kadYDYAF>Ipx4bRUae9hnLnX2eI(5peFOr5(J6T( z{-yVAoooo>Kp&gS09Vo}@bK}uF4o|)&7lAz@ z4d5_ zcT+B#5TjPT**vd2Uw3S?Tr_lJF({%9KCt=TQfUNBJRz_mYbdb0;inaYRLJ4IVE~W8QeJQN zZoCs%1iKEY3O2N79d+x@=9ni~MJA}eoqN*hG;^$Zi46ClMRSqXCCZ#vy2#{JJzE@K zfNs-0J?>li{zx6vvb43?v#z`i(=+FL7|C+@5)?{wZoZ|OkL|qBoMDK`3cEzQIk?{s z|ACNLTc4;`X$WdR>{FLbsK(f+Bj?15&r0-vik zX5y=pZnyA|{fI_FnLa-6*#{1k=^S6}fSrhbX0A?0K^jkiv5OAXXS>H|B*g`+=u)!x zr$!yCP>+fijpJ-<5w&n=jVb21q2%Qrz1yJBM&`WE0=NM#z-^!kyYwSUU(l6wz8ZsPn%;4 z6*T;}yTu|UY{O^~>CC#?(Z7SJn8@y3PpG?12Yi>k2^zH0Ze?o3|cXyAB=o4NHnUrwed=+we zJ%F!1=Gzkg$zeJ6PtePy>m^p<9TrcGm7)iG63+y$zVYZw=j`&zaz0A3?{z*oEFPvx z7u)SllKAy!XyN0MR1ejDu@6~(#oMVjd`ptO<642O9JoT%W?_4N z$N{T0=vv2x-B_wY6p8QUY(6jZL@atk-uzRxJTnEZV>IoYe>Df$9+H+T2maK-%U zDyVw8C-!0$T!#H~J|_1#-!@0)AQd=aMQe~a-)30AR!&UE2>P3dh}wSZUT`tO&K*c>j?^xS%yv#y3+IgWp+{{Itw=< z(+5S-= zIop8o6J@lvHp_cwb5BfO=F##;<9|$g9HJ6$p|@#RWV)yl5z>7<(yva)#h=+pyCAHV z+a9F{9Ut`dX>PybR6%{~85FX8%9S_Db%&^Q-SqQLRF@d2FzYSpt9!EqKDeJVB3{Kkq0KGd}K) z#F&(bjn%NDw-JBT2SV`Jmow&%e%K<4ZrmWui#2{S9P;>pQ~pu~G?1I4tGjb^mLp7= z1{uPEYVGLBt$UJz5IT@jaMVP*@-t#-kKw!zputN{?S*#nUHd~uC4Yq+j8El}wAu>8c!?vx9aOJ!rIUDs#&XcEe9(Bu#F-Lu>L1&EZ-DXSsB*Fw?WE 
z>fZU#Su&)SkehAR_W5<5S-v*SFC*Rovap5kc6FxRJp=OH-8clMx=m^R{xTEPln_4` zZ+^ik;5u%Lm3NMnuk$FHCEXG}T$4Tmzsrq6PufW+UjNI<5t54IC74pE<)93&nFY;5WnTxn_ZU)z>GQ^aOQWf|})>FRFA zC{iso!;Vzy~<%Df})ejU|B^@;EOmt!{r7BGlOgF&2j8XFqv6|ds&hB=Mf`fJ2 zIWWrCTWHdzFJp)N2Sxt-@dG2W{`*2q)Tq_+WmXaX(nLEMALkHf>Eh7g3WchJ^e>8? z88f;=Tubd{c2rGob8Gr4!}PA_WqV+hrz_+LzP~u`Ma(x34?3N7(B-==R2kE-1-Uwsc?sbL9&F15&jH03dLoyHQ^kFf>g}+2_uCO2m+{PG z5L3-a@@to{Cs+@puand2b--}+o_DSB{VaWR<+wWSzDLFzS%jbDXQeF|^zB21;55U{ zqQe)skVcNkOkbz#hmJ;|@v)O6{qQTH1{ zrTw{Yk&@BpW5+2}^X}WKrasNbw=#LpwZ;} z6K6lavDqgsRav~ukbOZ~bUtUcCo;gtM_>923DllHnol}8qQ$`Fsf8l>Tysh}bErbj zbHu$J3vJrw)m4-yCNRJ>_?t8vKi~RJD~t(we;!)ycBCXmK0uQ~!57S4fSDV{k+&9t z@ZWJOZDsj0o-|{#J}9gL4H##Lo8zLmJ$K#>wE5^KL_NBbWdfF!gVb zNT8M1QWu8HU~nQmqKouZk;DQDsFriZaHTsx`$ewpMdFBHx7QG-L6yjT_U@0jd8U46 z#V_Qy|NM?CyVdz16ZETVlsW<4RbwyiUW=r0HlH4yX;37A9JJFG?D_6m=!-H@4y*Xd#cc6O>4!Q)XEU=V`|?f*57rPA(PNB53?K#{lN`Z zs*u9qI1339sOu6ZMYp{~mTf+}C-+Co)@_=5 zI@M@>M*=xBb+(2}p$EK;4dIcSuHajK9GGEk^LL91XF^ag13#*4-dG$8ZrufmTHg)< zzvq*NPSyQ;jU1B9Vj=q$bVD??y1pJBJG-=Gx26t|wCH_t=*jv_01aezaZBoC4Y}TM zxKuF060p_5d2*0`a-WGhJxB|TWw*i5sFA=9V?P@#@~i7ynqh)v`Mq%3pE#UH)mSQK zg6-n%(9dHxAtU{>Sxb~pTKi@7%-+Z2bWq`xAdr882U?nPKra*?T49K}dR}T)E0lNxVdS*Ncw#D%V z7=CS2&I^#VGho2;`sD9CB8z=ji}h4$06#&<2x^ve}#aS4HsF z`Uzq0*_F0EB9_Y981*NiCcc5^FBi&}Bp&cxL_RJVQi=N-$2QY;z7EWskNbJ>Yio{D z%z{N)D7Onru5oN}Y>WI@1}HeudVuUD2foC;1qRplk#P2jeQZqvnFYXI%SeKoS?}Et z=TmVjBY~iKRGEV~(*;%1;5)a$tG8x0#g$^m#>R-VE8_CD)mhR9B{8+jVKO+Bn34qp zqk9i@1l&Q;F@?%ke^l?&dL<6Qc1}-48l1|fbYWB(ar1rB5)EbWE9H$xsK%pJry%@M z2)JTehYq7zIBi(-mo#7S=w+d_U<~CfUidcF?%D$nau+c1YTDH;$J=^|5*MJHK6h*>ue-EI?rvb>R zdT`)TV2zz~P4ah@?Y%njtI9By7s_eXtW$O zk5xIvv2i!TW7$w;00M(-k=@=W0oCSdx`*+0Zr<;__%~lbGz}&$_ypWHfXouoMw%{x zl*y4L*j5zb1pGmLzy{)rK2n0n5-Gi+aIiZ5H@jNn%mv8X>TdbVJG2F|@oh7kKnC>? zW({@r@i{Qsn{jED2Dh7Np7kJG3SG`OhB1SJiH$gTfX~=XC=v=@ zU|XUPw=6MjoYwl_P-O<3`d}hIi0@q@jxF>PR{d^)*mvnH>tZBZ`IyTI zO?Bc)0(*8U#bXnQ*S~p}zk<2ZBt%vxk~>)@Q;$ z>~wB}&}fSO3s4eM)O>1r!S^rVClgfq`sL0_$=ZV(ds|&yJxq+b;OE(|0R;J$eOFf! 
zuBmy^TQnmL3f6pYIRdD^2Axvusbw_;R+qmOy!fd?)MOc#RqcMcaJj5)*m9WeV>4zR%4Y*NmE#EO2TS~KCl$%-nILMuEXsiOtqzMpxDZBeb^6**^T zHofnD`qf4ZumYblV8n6dflSI+R{X#6Vrjmk+WLp zTdk5#zJ9Mr@*ox1P)bVT*lOzQU+lmCH$?HeF94G5aiKXWSSf3)+y{$%Gm>i9okN zgKIMqKq7=9&R1G2$@FtzVQsT@4HS}t>e#f>nU#q;I%CIoe8xj`PA0d-Mph-Lw(U|4bXlQ9rXF%euX$SIw^7={C^sp@^yvk^e$)IPn1~`7Bz{+55>y;n6(SMoUZ5xl zll$}uYFg}#AvMVWvnEeZQDi_C=@Y=@OikOB6yms*(-wdj_dydc*Prv-LZ5?ybkYJe z1woCHp^L0tqGxT0!DhV{lRsTJc`8o;8>C3k1h5ggNA(R;Q;sm>}u5ud3x9@UAauC1K2Eg{=Q}{7Qp<)!?*zzXZp+E~%XHh5`$Yp5gog?Z^8B5&Yv`5^;$^u1PvDD|grn z2Es;U;73dSoIWM`LNZZJ_Sq}xG9`MI*@4kVn6+Q?4GgH74br(t(_2N?P z4KS@Nty=4O#h$F8Q_gxa&RKyzDM`4&?j`&*iF49USJ)LgA{r=PraT?Nx-Pl$uZ`PP z1 z$qLag*e^Lz2JLTrF?yPmFA6NE{jM#_*@K`s2FvSSarwA#uoAO$=cCn`lmrAK^v!dC zD6ecvkvw+kZGK1!IH7^MbITtXe?$hOJM++WA_vV}3L4$;iDR>^Fc1l6jA=IwBH+T* z0R>r6CVvKFgumu_Q7_yI<_LfR+rSJ)^*}ukx3NWt)@5ULzH0i{?$>qWKjv+LpHM7R z%FEi|6v2Nb8uOA>Y4Xw6hG3nF`~mdPWc9Di;68>c{7d#n*zs{d9~gj!{z5l}Xj!@* z(;kGc3c`b-J$@taC@Qcq$j$9aW3(bL%J^qw0qzq0<~mu*$7(tuNp?M7vJ)`k`lKo# zCOl3%lZ21z)r_^If!%or!9S8 zeUF8iIhMG^K=WA4tUaAm|2bMolDjELSXdih-vgs-EjthLFMW&sOrmQg4OfVfwq5Kw z5Dr_pbEz9%3M!FGtb)3f*W)p}g7Xm5DZnrj@ktv>e2&1sUvI##t*oq+qh;*3hJjOD zktOJ8v+;PjQAg|;rgUr?>z~Z##v3r>nooNga^|zJYE~gSsNpD*_QKDK(T2ZqcBkd*u01@C|8UyEK<*c{L`y{gbVdkfe( zcE#tW<&hZ23WI?~<;D5=`IQwurC3mlhziKvPhA))K%lANx8LNyeB_3bevvijiO2@W zt0(N`u$VsUa|~2;s*CC*(D|Q?WK3CnDD&i{L-(cRI)D5UIp5!OhcwNXILtzqCKKn#1*e*oufKP!JTf*ds?MX1fy0AR!GrfTrj4#vReD)C=$Fteh8 z?V&pU2dD*xn#P%2b3V+d8Uf`xlbt5*bepVjL!#g}^F)oUGq1G}RK~uC=DWc}TekUQ zk+Zxb-qYY<$p~DNE%Dtf9B~~Pm6w4wmbzpltearDNe(9W1?we=Y)V9b)8X;42@w!a zqE)WA4N2s849*RS*wa~ZkL7_5G5J?7GkAtxj_4@{PM&0IwBty`VOH3`yE3rHoK6@O zH24@FkWYa>4bUHt@N>ih_q5O_&_O^`E0AF$I(i0{aR}M3^E_ZO@6>R>p~kwlwgyN! 
z;p61uqHS*q_yYq_VLXxbnCn1!`UV#c%(DuI!2yTc&d~{{-Pr}G_s)}`0cLx#nleBa zJ$X@O+DMy{Ok?J|#{bXcf;ux61hXOiKSs{`Pa(+0g0l`J|O>)tZ*J9d1 z`^W}G!x>S=r_$%{=|}!-ZCBs$T@1L+t6a{>7&-FDt*-?I7jWH`=R|vi41)HoaQCCEh54N1PfTykD7hUKuBG@JK$@s5gq}97cA0iD$=r(L4x5x zc-aF(17N5WXD4TMmq;ZxLqiQZtp7G#gHCho)Nxz`PBgld6godiR6r-ch0+3q0gm(F zMJyX=rbkCdxaA>uG~k{9bHsp4*178^%=rMuPQ3pA?6q++mP({a(AfvgM<*y6`ip%3 svmfZc-F>j}1PGb`TmS# radius-search knn-search +radius-search **Student:** [Haritha Jayasinghe][haritha] diff --git a/gsoc-2020/index.md b/gsoc-2020/index.md index 8b2b2e1..c1eb86b 100644 --- a/gsoc-2020/index.md +++ b/gsoc-2020/index.md @@ -22,7 +22,7 @@ After a long hiatus, PCL is once more participating in the Google Summer of Code Extending PCL's use case by generating bindings for its use with interface languages like Python, for rapid development and maximal speed. The approach makes use of Pybind11 to expose PCL's C++ code and generate bindings in the form of python modules by using necessary type information. It supports automatic regeneration of the bindings when the underlying C++ code changes, to work with PCL's active development cycle. 
-### [Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module](/gsoc-2020-gpu)
+### [Refactoring, Modernisation & Feature Addition with Emphasis on GU Module](/gsoc-2020/gpu)
 
 **Student:** [Haritha Jayasinghe][haritha]
 
@@ -41,7 +41,7 @@ As well as to refactor and modernize the library by means of;
 * Introducing a fluent API for algorithms
 * Modernising the GPU Octree module to align with the it’s CPU counterpart
 
-**Final report:** [url](/gsoc-2020-gpu)
+**Final report:** [url](/gsoc-2020/gpu)
 
 ### Unified API for Algorithms
 

From 19ec188dc138be2a566150a99624afaa42e50cf1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?S=C3=A9rgio=20Agostinho?=
Date: Mon, 31 Aug 2020 11:15:20 +0200
Subject: [PATCH 8/8] Apply suggestions from code review

Co-authored-by: aPonza <39060879+aPonza@users.noreply.github.com>
---
 gsoc-2020/gpu.md   | 32 +++++++++++++++++---------------
 gsoc-2020/index.md |  4 ++--
 2 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/gsoc-2020/gpu.md b/gsoc-2020/gpu.md
index 9fff12a..c31d0af 100644
--- a/gsoc-2020/gpu.md
+++ b/gsoc-2020/gpu.md
@@ -29,7 +29,9 @@ The primary search methods provided by the GPU octree module are listed below
     4. Synchronous (CPU based) Radius Search
 - C. Asynchronous (GPU based) K Nearest Search
 
-While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search (C) and Asynchronous Approximate Nearest Neighbors search(A-1) were both returning incorrect results while one of the implementations of the Radius Search (B-2) was also returning incorrect results. The two 'synchronous' versions of the radius search and approximate nearest search methods listed above (A-2 & B-4) provide CPU based implementations (i.e. non parallelized versions that do not use CUDA kernels) of their GPU based counterparts.
+The two 'synchronous' versions of the radius search and approximate nearest search methods listed above (A-2 & B-4) provide CPU based implementations (i.e. non parallelized versions that do not use CUDA kernels) of their GPU based counterparts.
+
+While the initial plan was not to spend an extensive amount of time on the GPU octree module, upon closer inspection it was discovered that there were many irregularities and errors within the GPU octree module. Specifically, two of the three primary methods offered by the GPU octree module, namely K Nearest Neighbours search (C) and Asynchronous Approximate Nearest Neighbors search(A-1) were both returning incorrect results. Furthermore, one of of Radius Search's variants (B-2) was also returning incorrect results.
 
 All of these functions were utilizing outdated CUDA primitives and idioms, risking deprecation in the near future. When diving into the code, it was also discovered that the GPU approximate nearest neighbours algorithm used a completely different traversal methodology from it's CPU counterpart.
 
@@ -40,12 +42,12 @@ Due to these discoveries, the scope of the GPU modernization effort was expande
 Related PRs: [[4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)
 
 After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause:
- 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with warp level primitives introduced in CUDA 9.0 detailed [here.](https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/)
+ 1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with [warp level primitives introduced in CUDA 9.0](https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/).
  2. In radius search, the correct radius was not shared between warp threads. Thus the search was being conducted for incorrect radius values. Synchronizing the radius values across the threads fixed this issue.
 
-Since much of the code inside the above functions utilized an outdated concept of using volatile memory for sharing data between threads, they were also replaced by utilizing warp primitives to synchronize thread data.
+Since much of the code inside the above functions utilized an outdated concept of using volatile memory for sharing data between threads, they were also replaced by newer warp primitives to synchronize thread data.
 
-### Implementation of new traversal mechanism of approximate nearest search
+### Implementation of a new traversal mechanism for approximate nearest search
 
 Related PRs: [[4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)
 
@@ -106,8 +108,8 @@ This flexibility can be offered to the user by transitioning the PCL library’s
 Related PRs: [[4166]](https://github.com/PointCloudLibrary/pcl/pull/4166)
 
 CMake options were added to allow users to select:
-- Type of index (signed / unsigned – signed by default);
-- Sign of index (8 / 16 / 32 / 64 – 32 by default);
+- Signedness of index (signed / unsigned – signed by default);
+- Size of index (8 / 16 / 32 / 64 – 32 by default);
 at compile-time, from PCL 1.12 onwards.
 
 ### Adding a CI job for testing 64bit unsigned index type
@@ -125,17 +127,17 @@ A set of fundamental classes such as `pcl::PointCloud` lie at the core of PCL. T
 For situations where unsigned indices were required, a new type called `uindex_t` was also introduced, which acts as an unsigned version of the `index_t`.
 
 This transition was carried out for the following classes:
-- PointCloud
-- PCLPointCloud2
-- PCLBase
-- PCLPointField
-- Correspondences
-- Vertices
-- PCLImage
+- `PointCloud`
+- `PCLPointCloud2`
+- `PCLBase`
+- `PCLPointField`
+- `Correspondences`
+- `Vertices`
+- `PCLImage`
 
 During the above transition process, it was discovered that significant additional work was required to address the numerous sign comparison warnings and other errors that arose from the transition in some of the above classes, which took up considerable time.
 
-Furthermore, any changes beyond transitioning the above fundamental classes would have required additional workarounds to carry on, if they were to be carried out before the changes to the fundamental classes have been merged. (These features were planned to be merged in in PCL 1.12). Thus, work was shifted to the GPU module at this point.
+Furthermore, any changes beyond transitioning the above fundamental classes would have required additional workarounds to carry on, if they were to be carried out before the changes to the fundamental classes have been merged. (These features were planned to be merged in PCL 1.12). Thus, work was shifted to the GPU module at this point.
 
 In addition, while the common module had already been modified to make it compatible with `index_t`, the tests for this module had not been modified. This was achieved with a very straightforward replacement of integer vectors with `index_t` vectors.
 
@@ -159,4 +161,4 @@ The work carried out during the period ensure that the GPU octree search functio
 
 [haritha]: https://github.com/haritha-j
 [sergio]: https://github.com/SergioRAgostinho
-[lars]: https://github.com/larshg
\ No newline at end of file
+[lars]: https://github.com/larshg
diff --git a/gsoc-2020/index.md b/gsoc-2020/index.md
index c1eb86b..1fa8742 100644
--- a/gsoc-2020/index.md
+++ b/gsoc-2020/index.md
@@ -22,7 +22,7 @@ After a long hiatus, PCL is once more participating in the Google Summer of Code
 
 Extending PCL's use case by generating bindings for its use with interface languages like Python, for rapid development and maximal speed. The approach makes use of Pybind11 to expose PCL's C++ code and generate bindings in the form of python modules by using necessary type information. It supports automatic regeneration of the bindings when the underlying C++ code changes, to work with PCL's active development cycle.
 
-### [Refactoring, Modernisation & Feature Addition with Emphasis on GU Module](/gsoc-2020/gpu)
+### [Refactoring, Modernisation & Feature Addition with Emphasis on GPU Module](/gsoc-2020/gpu)
 
 **Student:** [Haritha Jayasinghe][haritha]
 
@@ -71,4 +71,4 @@ This project aims to transition the existing API to forward-compatible unified A
 [aponza]: https://github.com/aPonza
 [kunal]: https://github.com/kunaltyagi
 [sergio]: https://github.com/SergioRAgostinho
-[lars]: https://github.com/larshg
\ No newline at end of file
+[lars]: https://github.com/larshg