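1. Introduction to HTML page structure: how do web pages actually send data? A must for scraping
2. Data transfer: GET vs. POST, and how their network packets differ
3. HTML scraping GET tutorial: pulling Yahoo stock data, the first step toward algorithmic trading
4. HTML scraping GET in practice: scraping FoodPanda, Taiwan's largest food platform, and bringing the panda home
5. Data analysis in practice, the FoodPanda popular-food series | Spotting each area's hottest cuisines
6. JSON scraping tutorial: Google Trends search | Tracking the hottest keywords
7. JSON scraping in practice: scraping PChome 24h e-commerce | I'm no mathematician, but this sounds pretty good
8. HTML scraping POST tutorial: Taiwan stock market data | A lifesaver for retail investors
9. HTML scraping POST in practice: scraping UberEat, the global food platform
10. Pandas scraping tutorial: Yahoo stock scraping | No more watching the ticker
11. Pandas scraping in practice: scraping historical weather forecast data for every region of Taiwan
12. Data analysis in practice: visualizing weather forecasts | One chart to neatly dodge the rainy season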
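You may be wondering why we do not scrape the website of Taiwan's weather bureau directly. The reason is that its site offers no way to look up "historical" weather. We often need to investigate the weather in different areas: whether the work is agriculture or land and buildings, weather matters a great deal. That is why I chose to look the data up on the 2345 Weather site (tianqi.2345.com) instead.
Using the JSON scraping approach covered in "JSON scraping tutorial: Google Trends search" and "JSON scraping in practice: PChome", we can find the API that serves the site's historical data for an area. Press F12 on the site, then reload the page (F5), and the API request appears (as shown in the figure below).
Looking at the API's URL, there are four main parameters: the ID of the area, the code for Taiwan (always 2), the year to query, and the month to query. To get data for a different time or place, these are the values to change.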
http://tianqi.2345.com/Pc/GetHistory?areaInfo%5BareaId%5D=54511&areaInfo%5BareaType%5D=2&date%5Byear%5D=2021&date%5Bmonth%5D=8
Matching each parameter to its place in the URL:
http://tianqi.2345.com/Pc/GetHistory?areaInfo%5BareaId%5D=「area ID」&areaInfo%5BareaType%5D=「code for Taiwan」&date%5Byear%5D=「year to query」&date%5Bmonth%5D=「month to query」
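The `%5B` and `%5D` in the URL above are simply the percent-encoded forms of `[` and `]`, so the parameter names are really `areaInfo[areaId]`, `areaInfo[areaType]`, `date[year]` and `date[month]`. As a minimal sketch, the URL could be assembled from those four values with the standard library instead of by hand. The function name `build_history_url` is made up for this illustration and is not part of the original lesson.

```python
from urllib.parse import urlencode

BASE_URL = "http://tianqi.2345.com/Pc/GetHistory"


def build_history_url(area_id, area_type, year, month):
    """Return the history URL for one area and one month (hypothetical helper)."""
    params = {
        "areaInfo[areaId]": area_id,      # ID of the area
        "areaInfo[areaType]": area_type,  # code for Taiwan, 2 in the article
        "date[year]": year,               # year to query
        "date[month]": month,             # month to query
    }
    # urlencode percent-encodes '[' and ']' as %5B and %5D
    return BASE_URL + "?" + urlencode(params)


print(build_history_url(54511, 2, 2021, 8))
```

With the values from the example URL (54511, 2, 2021, 8), this prints the same address shown earlier. Passing the same dictionary to `requests.get(BASE_URL, params=params)` is another option, since `requests` encodes query parameters itself.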
Next, send the request with the requests package, then parse the result with the json package.
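import io
import json
import pandas as pd
import requests

url = 'http://tianqi.2345.com/Pc/GetHistory?areaInfo%5BareaId%5D=54511&areaInfo%5BareaType%5D=2&date%5Byear%5D=2021&date%5Bmonth%5D=8'
# Send the request to the API
list_req = requests.get(url)
# Parse the JSON response; its 'data' field holds an HTML table
getJson = json.loads(list_req.content)
getTable = pd.read_html(io.StringIO(getJson['data']), header=0)
getTable[0]  # show the data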
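Judging from the snippet above, the response is JSON whose `data` field is itself a string of HTML containing a table, which is why parsing takes two steps. The sketch below walks through those two steps offline, under stated assumptions: the payload is a made-up sample (its column names and values are invented, not real output from the site), and a small standard-library parser stands in for `pd.read_html` so it runs without a network connection or extra packages. `TableRows` and `parse_history` are hypothetical names.

```python
import json
from html.parser import HTMLParser

# Made-up payload that only mimics the shape implied by the code above:
# a JSON object whose 'data' field is an HTML string containing a table.
sample_response = json.dumps({
    "data": "<table><tr><th>date</th><th>high</th><th>low</th></tr>"
            "<tr><td>2021-08-01</td><td>33</td><td>26</td></tr>"
            "<tr><td>2021-08-02</td><td>34</td><td>27</td></tr></table>"
})


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag in ("th", "td") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")      # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data     # append text to the current cell


def parse_history(raw_json):
    """Step 1: decode the JSON. Step 2: turn the HTML table into a list of dicts."""
    html = json.loads(raw_json)["data"]
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows           # first row is treated as the header
    return [dict(zip(header, row)) for row in body]


records = parse_history(sample_response)
print(records[0])
```

In the lesson's own code, `pd.read_html` does the second step and returns DataFrames directly; the hand-written parser here is only meant to make the "HTML inside JSON" structure visible.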
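Real-world analysis usually calls for at least a year of data, so the example code takes Taipei and scrapes 12 months of weather data. Each month's data, once fetched, is merged with the pd.concat method taught in the "Pandas: merging data | concat, merge" lesson, which is how the complete dataset is built up.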
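#--- Fetch a larger dataset: 12 months for one area
import datetime
import io
import json
import random
import time

import pandas as pd
import requests
import tqdm
from dateutil.relativedelta import relativedelta

today = datetime.datetime.today()
areaId = '71294'  # Taipei
areaType = '2'
containar = pd.DataFrame()  # empty container for the merged data
for i in tqdm.tqdm(range(12)):
    # Step back i months from today
    countDay = today - relativedelta(months=i)
    year = countDay.year
    month = countDay.month
    # URL to fetch
    url = 'http://tianqi.2345.com/Pc/GetHistory?areaInfo%5BareaId%5D='+ str(areaId) +'&areaInfo%5BareaType%5D='+ str(areaType) +'&date%5Byear%5D='+ str(year) +'&date%5Bmonth%5D='+ str(month)
    # Send the request
    print(str(i) + ' request started')
    list_req = requests.get(url)
    print(str(i) + ' request finished')
    # Parse the JSON response; its 'data' field holds an HTML table
    getJson = json.loads(list_req.content)
    getTable = pd.read_html(io.StringIO(getJson['data']), header=0)
    # Merge this month's table into the container
    containar = pd.concat([containar, getTable[0]])
    # Pause 45 to 70 seconds between requests
    time.sleep(random.randint(45, 70))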
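The loop above leans on `relativedelta(months=i)` to step back one month at a time, including across the year boundary. To make it easier to see which year and month values end up in the 12 URLs, here is a small sketch that produces the same (year, month) pairs with only the standard library. `last_n_months` is a hypothetical helper name, and the fixed date is just an example input.

```python
import datetime


def last_n_months(today, n=12):
    """Return (year, month) pairs for the n months ending with today's month, newest first."""
    current = today.year * 12 + (today.month - 1)  # months counted from year 0
    pairs = []
    for i in range(n):
        year, month_index = divmod(current - i, 12)
        pairs.append((year, month_index + 1))      # back to a 1-based month
    return pairs


months = last_n_months(datetime.date(2021, 8, 15))
print(months[0], months[-1])  # (2021, 8) (2020, 9)
```

Run on a date in August 2021, the twelve requests therefore cover August 2021 back to September 2020.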
Author: 楊超霆, founder of 行銷搬進大程式