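This lesson continues with the salelist.csv file used in the earlier lessons. The file contains many product sales records. If you were the boss, wouldn't you want to see the total sales amount for each product and find the company's cash cow? The groupby() method can do that for you.
Here we import a set of simulated sales data. Data analysis is generally done with CSV files, and the xls and xlsx formats are avoided where possible. The reason is that xls and xlsx files also record Excel's cell formatting, including colors, sizes, and borders, so the files are considerably larger, yet none of these extra settings are used when working with pandas.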
import pandas as pd
#--- Import the data
salelist = pd.read_csv('salelist.csv')  # read_excel can also be used
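First, pass groupby() the column you want to group by, then chain an aggregation method after it to produce the summary. The available aggregation methods include max(), min(), count(), sum(), and others; the examples here use mean().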
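# numeric_only=True skips text columns such as tid; recent pandas versions raise an error on them otherwise
salelist.groupby("product").mean(numeric_only=True)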
Result of the code above:
              key       uid  quantity  price
product
apple    5.500000  3.000000  5.000000   10.0
banana   4.000000  3.500000  8.000000   34.0
cherry   4.000000  3.000000  3.000000   60.0
guava    6.500000  2.500000  2.000000   20.0
orange   6.333333  3.666667  4.333333    5.0
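The other aggregation methods mentioned above can be swapped in the same way. The following is a minimal sketch, assuming a small hand-made table (the `sales` DataFrame and its values are hypothetical, not the course's salelist.csv); it also shows `agg()`, which lets each column use a different aggregation.

```python
import pandas as pd

# Hypothetical sample rows standing in for salelist.csv (not the course data).
sales = pd.DataFrame({
    "product":  ["apple", "apple", "banana", "orange", "orange"],
    "quantity": [4, 6, 8, 3, 5],
    "price":    [10, 10, 34, 5, 5],
})

# Same pattern as mean(): group by a column, then chain the aggregation.
totals = sales.groupby("product").sum()    # column sums per product
counts = sales.groupby("product").count()  # number of non-null rows per product

# agg() with a dict applies a different aggregation to each column.
mixed = sales.groupby("product").agg({"quantity": "sum", "price": "max"})
print(mixed)
```

Summing the price column is rarely meaningful, which is why the dict form of `agg()` is often the more practical choice.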
salelist[['product','quantity']].groupby("product").mean()
Result of the code above:
         quantity
product
apple    5.000000
banana   8.000000
cherry   3.000000
guava    2.000000
orange   4.333333
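An equivalent way to limit the result to one column is to select the column after grouping rather than before. The sketch below assumes a hypothetical `sales` table (not the course data) and compares the two forms; single brackets return a Series, double brackets a DataFrame.

```python
import pandas as pd

# Hypothetical sample rows standing in for salelist.csv (not the course data).
sales = pd.DataFrame({
    "product":  ["apple", "apple", "banana", "orange", "orange"],
    "quantity": [4, 6, 8, 3, 5],
    "price":    [10, 10, 34, 5, 5],
})

# Select columns first, then group (the form used in this lesson).
select_then_group = sales[["product", "quantity"]].groupby("product").mean()

# Group first, then select the column to aggregate.
as_series = sales.groupby("product")["quantity"].mean()    # returns a Series
as_frame = sales.groupby("product")[["quantity"]].mean()   # returns a DataFrame

print(as_series)
```

Selecting after grouping avoids having to repeat the grouping column in the selection list.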
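# Amount of each sales row (金額 = quantity * price), then averaged per order ID (tid);
# use sum() instead of mean() to get each order's total
salelist['金額'] = salelist['quantity'] * salelist['price']
salelist[['tid','金額']].groupby("tid").mean()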
Result of the code above:
          金額
tid
T0001   50.0
T0002  204.0
T0003   80.0
T0004  135.0
T0005   35.0
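To answer the opening question about total sales per product, the same amount column can be summed instead of averaged. This is a minimal sketch under stated assumptions: the `sales` table, its values, and the column name `amount` are hypothetical stand-ins for salelist.csv and the 金額 column.

```python
import pandas as pd

# Hypothetical sample rows standing in for salelist.csv (not the course data).
sales = pd.DataFrame({
    "tid":      ["T0001", "T0001", "T0002", "T0003"],
    "product":  ["apple", "orange", "banana", "apple"],
    "quantity": [5, 2, 6, 3],
    "price":    [10, 5, 34, 10],
})

# Amount of each row: quantity times unit price.
sales["amount"] = sales["quantity"] * sales["price"]

# sum() gives the total per order; mean() gives the average row amount per order.
order_total = sales.groupby("tid")["amount"].sum()
order_mean = sales.groupby("tid")["amount"].mean()

# Total sales per product, largest first, to spot the best seller.
product_total = sales.groupby("product")["amount"].sum().sort_values(ascending=False)
print(product_total)
```

When an order contains more than one row, `sum()` and `mean()` give different numbers, so the choice should match the question being asked.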
Author: 楊超霆, founder of 行銷搬進大程式