Description
pandas version:
'0.19.2'
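import pandas
import requests
from bs4 import BeautifulSoup

# The enclosing function was not shown in the original snippet; this name is a placeholder.
def fetch_board_data():
    url = "http://www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/eisdeqty_c.htm"
    response = requests.get(url)
    if response.status_code == 200:
        soup = BeautifulSoup(response.content, "lxml")
        # Select the stock listing table by its CSS class
        table = soup.find("table", class_="table_grey_border")
        # read_html returns a list of DataFrames; the first one is the listing
        board_data = pandas.read_html(table.prettify(), header=0, flavor="bs4")
        return board_data[0]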
Problem description
股份代號 股份名稱 買賣單位 附註 Unnamed: 4 Unnamed: 5 Unnamed: 6
0 1 長和 500 # H O F
1 2 中電控股 500 # H O F
2 3 香港中華煤氣 1000 # H O F
3 4 九龍倉集團 1000 # H O F
4 5 匯豐控股 400 # H O F
5 6 電能實業 500 # H O F
6 7 凱富能源 2000 # NaN NaN NaN
The 股份代號 (stock code) column should keep its leading zeros: 1 should be 00001, 2 should be 00002, and so on.
datatype:
股份代號 int64
股份名稱 object
買賣單位 int64
附註 object
Unnamed: 4 object
Unnamed: 5 object
Unnamed: 6 object
dtype: object
<class 'pandas.core.frame.DataFrame'>
Why are the leading zeros missing from this column?
The 股份代號 column is parsed as int64, but its dtype should be object (string) so that values such as 00001 are preserved.
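
One possible workaround, sketched here under assumptions rather than taken from the report: convert the column back to zero-padded strings after parsing. The helper name `restore_stock_codes` is hypothetical, and the width of 5 is inferred from the 1 -> 00001 example above. The small DataFrame stands in for the parsed table, so the sketch runs without fetching the page.

```python
import pandas as pd

# Stand-in for the parsed table above: stock codes arrive as int64.
df = pd.DataFrame({"股份代號": [1, 2, 3], "買賣單位": [500, 500, 1000]})


def restore_stock_codes(frame, column="股份代號", width=5):
    """Return a copy of `frame` with `column` as zero-padded strings.

    `width=5` is an assumption based on the expected 00001 format.
    """
    out = frame.copy()
    # Cast the integers to text, then left-pad with zeros to the fixed width.
    out[column] = out[column].astype(str).str.zfill(width)
    return out


fixed = restore_stock_codes(df)
print(fixed["股份代號"].tolist())  # ['00001', '00002', '00003']
```

Padding after the fact only works when every code has a known fixed width. `pandas.read_html` also accepts a `converters` argument (a dict mapping column names to functions), so passing something like `converters={"股份代號": str}` may keep the original text at parse time; that has not been verified against this page.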