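This lesson introduces several of the most commonly used methods in the Pandas package: read_csv for loading files, sort_values for sorting data, drop for deleting data, astype for converting data types, replace for substituting values, and renaming columns. Remember to import the pandas package before running any of the code below: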
import pandas as pd
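Here we load a simulated sales dataset. For data analysis, data is usually stored and read as csv files, and the xls and xlsx formats are best avoided where possible. The reason is that xls and xlsx files also record Excel cell formatting such as colors, sizes, and borders, so the files are considerably larger, and none of that extra formatting is used by pandas.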
# --- Load the data
salelist = pd.read_csv('salelist.csv')  # read_excel can also be used
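If you do not have salelist.csv on hand, the following minimal sketch reads the same kind of data from an in-memory string. The column names and their order (key, tid, date, uid, product, quantity, price) are inferred from the outputs shown later in this lesson, and `csv_text` and `salelist_demo` are hypothetical names used only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for salelist.csv (first three rows only).
# Column names and order are inferred from the outputs in this lesson.
csv_text = """key,tid,date,uid,product,quantity,price
1,T0001,109-1-15,1,apple,5,10
2,T0002,109-1-18,5,banana,6,34
3,T0003,109-2-28,4,orange,4,5
"""

# read_csv accepts a file path or any file-like object
salelist_demo = pd.read_csv(io.StringIO(csv_text))

print(salelist_demo.shape)   # (number of rows, number of columns)
print(salelist_demo.dtypes)  # column types inferred by read_csv
```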
The sort_values() method sorts the data in a column. Sorting can be descending or ascending, and the "ascending" parameter controls which one is used.
# Sort (descending)
salelist['quantity'].sort_values(ascending=False)  # False = descending, True = ascending
Output of the code above:
5 10
1 6
0 5
6 5
9 5
2 4
8 4
3 3
4 2
7 2
Name: quantity, dtype: int64
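The call above sorts a single column and returns only that column. As a small sketch, the whole table can be sorted by one column by calling sort_values on the DataFrame with the `by` parameter; `demo` is a hypothetical three-row table, not the lesson's dataset.

```python
import pandas as pd

# Hypothetical three-row table used only for illustration
demo = pd.DataFrame({
    "product": ["apple", "banana", "orange"],
    "quantity": [5, 6, 4],
})

# Sort every row of the table by the quantity column, largest first
sorted_demo = demo.sort_values(by="quantity", ascending=False)
print(sorted_demo)
```

The original row labels travel with the rows, which is why the index in the output above is no longer in order.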
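In the earlier lesson on understanding your data with Pandas, we covered how to select an entire column. To do a column-by-column calculation, simply select the columns and apply the arithmetic operator to them directly.
# Column arithmetic: + - * /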
salelist['quantity'] * salelist['price']
Output of the code above:
0 50
1 204
2 20
3 180
4 40
5 340
6 25
7 40
8 20
9 50
dtype: int64
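The result of column arithmetic is a new Series and is not saved anywhere by itself. A minimal sketch of storing it as a new column follows; the column name `total` and the table `demo` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical table with the same quantity and price values as the first three rows
demo = pd.DataFrame({
    "quantity": [5, 6, 4],
    "price": [10, 34, 5],
})

# Element-wise multiplication, stored as a new column named "total" (name is an assumption)
demo["total"] = demo["quantity"] * demo["price"]
print(demo)
```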
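Sometimes columns need to be deleted, and that is what drop() is for. drop() can delete several columns at once, but before deleting anything it is a good idea to make a copy of the variable with copy(), so that a wrong deletion can still be undone.
# Remove tid and uid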
salelist.drop(columns=['tid','uid'])
Output of the code above:
key date product quantity price
0 1 109-1-15 apple 5 10
1 2 109-1-18 banana 6 34
2 3 109-2-28 orange 4 5
3 4 109-2-28 cherry 3 60
4 5 109-2-28 guava 2 20
5 6 109-9-1 banana 10 34
6 7 109-9-1 orange 5 5
7 8 109-9-1 guava 2 20
8 9 110-1-15 orange 4 5
9 10 110-1-15 apple 5 10
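The copy-then-drop advice can be sketched as follows. Note that drop() returns a new DataFrame by default and leaves the original untouched unless the result is assigned back. The names `demo`, `backup`, and `trimmed` are hypothetical.

```python
import pandas as pd

# Hypothetical two-row table used only for illustration
demo = pd.DataFrame({
    "key": [1, 2],
    "tid": ["T0001", "T0002"],
    "uid": [1, 5],
    "product": ["apple", "banana"],
})

backup = demo.copy()                           # independent copy kept as a safety net
trimmed = demo.drop(columns=["tid", "uid"])    # returns a new DataFrame

print(list(trimmed.columns))  # columns that remain after the drop
print(list(demo.columns))     # the original still has all of its columns
```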
The astype() method converts the data type of a column. It is mainly used with three types: int, float, and str.
salelist['price'].astype('float64')  # 'int' and 'str' also work
Output of the code above:
0 10.0
1 34.0
2 5.0
3 60.0
4 20.0
5 34.0
6 5.0
7 20.0
8 5.0
9 10.0
Name: price, dtype: float64
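Like drop(), astype() returns a converted copy instead of changing the column in place. A short sketch of keeping the conversion by assigning it back, using a hypothetical `demo` table:

```python
import pandas as pd

# Hypothetical price column holding integers
demo = pd.DataFrame({"price": [10, 34, 5]})

# Assign the converted Series back to keep the new type
demo["price"] = demo["price"].astype("float64")
print(demo["price"].dtype)

# Converting to str turns each number into its text form
as_text = demo["price"].astype(str)
print(as_text.tolist())
```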
To convert a DataFrame into a list, use the tolist() method.
salelist.values  # convert to an array
salelist.values.tolist()  # convert to a list
Output of the code above:
[[1, 'T0001', '109-1-15', 1, 'apple', 5, 10.0],
[2, 'T0002', '109-1-18', 5, 'banana', 6, 34.0],
[3, 'T0003', '109-2-28', 4, 'orange', 4, 5.0],
[4, 'T0003', '109-2-28', 3, 'cherry', 3, 60.0],
[5, 'T0003', '109-2-28', 3, 'guava', 2, 20.0],
[6, 'T0004', '109-9-1', 2, 'banana', 10, 34.0],
[7, 'T0004', '109-9-1', 2, 'orange', 5, 5.0],
[8, 'T0004', '109-9-1', 2, 'guava', 2, 20.0],
[9, 'T0005', '110-1-15', 5, 'orange', 4, 5.0],
[10, 'T0005', '110-1-15', 5, 'apple', 5, 10.0]]
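The same approach works on a single column, where tolist() can be called on the Series directly and gives a flat list. A minimal sketch with a hypothetical `demo` table:

```python
import pandas as pd

# Hypothetical two-row table used only for illustration
demo = pd.DataFrame({
    "product": ["apple", "banana"],
    "quantity": [5, 6],
})

products = demo["product"].tolist()  # one column -> flat list
rows = demo.values.tolist()          # whole table -> list of row lists

print(products)
print(rows)
```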
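The replace method is very handy, but it behaves differently depending on how it is called. First, its arguments: the first is the string you want to replace, and the second is the string you want it replaced with. When replace() is called through str, it replaces matching substrings inside each value of the column.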
salelist['product'].str.replace('an','@')
Output of the code above:
0 apple
1 b@@a
2 or@ge
3 cherry
4 guava
5 b@@a
6 or@ge
7 guava
8 or@ge
9 apple
Name: product, dtype: object
When replace() is called directly on the column, a value is replaced only if the entire value matches the given string. That is why the example below leaves the column unchanged.
salelist['product'].replace('an','@')
Output of the code above:
0 apple
1 banana
2 orange
3 cherry
4 guava
5 banana
6 orange
7 guava
8 orange
9 apple
Name: product, dtype: object
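To see both behaviors side by side, here is a small sketch on a hypothetical Series. The replacement value "kiwi" is made up for the example, and `regex=False` is passed so that the pattern is treated as literal text.

```python
import pandas as pd

# Hypothetical product column used only for illustration
products = pd.Series(["apple", "banana", "orange"])

# Whole-value match: only entries equal to "banana" are replaced
whole = products.replace("banana", "kiwi")

# Substring match: every "an" inside each value is replaced
partial = products.str.replace("an", "@", regex=False)

print(whole.tolist())
print(partial.tolist())
```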
To rename all of the columns, simply assign a list to columns. Note that the length of the list must be the same as the number of columns.
# New names: record no., transaction no., transaction date, customer no., product, quantity, price
salelist.columns = ['資料編號','交易序號','交易日期','顧客編號','產品','數量','價格']
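When only some of the columns need new names, a different technique from assigning to columns is rename() with a dictionary, which avoids listing every column. A minimal sketch follows; the new name `qty` and the table `demo` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical one-row table used only for illustration
demo = pd.DataFrame({
    "key": [1],
    "product": ["apple"],
    "quantity": [5],
})

# Map old name -> new name; columns not listed keep their names
renamed = demo.rename(columns={"quantity": "qty"})

print(list(renamed.columns))
print(list(demo.columns))  # rename returns a new DataFrame; the original is unchanged
```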
Author: 楊超霆, founder of 行銷搬進大程式