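Before analyzing any data, we need a rough picture of what it contains; without one, the analysis has little to stand on. You might ask, "Why not just open the file in Excel and take a look?" That is indeed quick and convenient, but in some situations Excel does more harm than good, for example when the file is too large for it to open.
This article therefore introduces several pandas methods that are especially useful for getting to know a dataset in the early stage of an analysis.
Since this is a first look at pandas, a complicated dataset would only get in the way, so we start by building a small table by hand. That makes it easy to compare each command's result against the original table and see how the code works. Data created with the pandas package has its own format, the DataFrame.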
import pandas as pd
# --- Build the sample data
data = {'顧客編號': [1, 2, 3, 4, 5, 6, 7],            # customer ID
        '姓名': ['Jacky', 'Lily', 'Kevin',
                 'Bob', 'Harry', 'Bill', 'Harry'],   # name
        '年齡': [21, 21, 35, 18, 15, 49, 7]}          # age
member = pd.DataFrame(data)
Show the first five rows. Files are often too large for Excel to open; once the data is loaded with pandas, this method gives a quick peek at its contents.
member.head()  # show the first five rows
Output of the code above:
顧客編號 姓名 年齡
0 1 Jacky 21
1 2 Lily 21
2 3 Kevin 35
3 4 Bob 18
4 5 Harry 15
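As a small sketch that goes slightly beyond the call above: `head()` also accepts the number of rows to return, and `tail()` is its counterpart for the end of the table. The variable names `first_two` and `last_two` are only illustrative.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

first_two = member.head(2)  # first two rows (Jacky, Lily)
last_two = member.tail(2)   # last two rows (Bill, Harry)
print(first_two)
print(last_two)
```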
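This method provides two very important pieces of information: the number of non-null values in each column and each column's data type.
Mismatched data types can cause errors in later processing, so this is a good place to check them first.
member.info()  # summary of the DataFrame
Output of the code above: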
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 顧客編號 7 non-null int64
1 姓名 7 non-null object
2 年齡 7 non-null int64
dtypes: int64(2), object(1)
memory usage: 296.0+ bytes
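If a column turns out to have an unexpected type, one possible follow-up is to convert it with `astype`. The sketch below is not from the original article: the `raw` table, with ages loaded as text, is a made-up case for illustration.

```python
import pandas as pd

# Hypothetical case: ages were loaded as text instead of numbers
raw = pd.DataFrame({'年齡': ['21', '35', '18']})
was_numeric = pd.api.types.is_numeric_dtype(raw['年齡'])  # False: a text column

# Convert the text values to integers, then check again
raw['年齡'] = raw['年齡'].astype(int)
is_numeric = pd.api.types.is_numeric_dtype(raw['年齡'])   # True after conversion

print(raw.dtypes)
print(raw['年齡'].sum())  # numeric operations now behave as expected
```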
Check how many rows and columns the data has.
member.shape  # (number of rows, number of columns)
Output of the code above:
(7, 3)
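Because `shape` is an ordinary tuple, it can be unpacked into two variables. A minimal sketch, with `n_rows` and `n_cols` as illustrative names:

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

n_rows, n_cols = member.shape  # unpack (rows, columns)
print(f"{n_rows} rows, {n_cols} columns")  # 7 rows, 3 columns
print(len(member))                          # 7, the row count only
```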
List all the column names. This is very handy: selecting a column later requires typing its name, and names that are long or complicated are easy to mistype. Run this first, then copy the name you need.
member.columns  # list all columns
Output of the code above:
Index(['顧客編號', '姓名', '年齡'], dtype='object')
List the index of every row. Unless the index has been modified, the rows are numbered with a range.
member.index  # list the row index
Output of the code above:
RangeIndex(start=0, stop=7, step=1)
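To see what a modified index looks like, here is a small sketch that uses `set_index` to turn the customer ID column into the index. Choosing 顧客編號 as the index is only an example, not something the article prescribes.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

default_index = member.index         # the default RangeIndex
by_id = member.set_index('顧客編號')  # returns a new DataFrame; member is unchanged

print(by_id.index.tolist())          # [1, 2, 3, 4, 5, 6, 7]
print(by_id.loc[3, '姓名'])          # Kevin (looked up by customer ID)
```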
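Get all the data in one column by putting the column name, written as a string, inside square brackets. Remember that the name must match exactly; otherwise pandas cannot find the column and raises a KeyError.
member['年齡']  # select one column
Output of the code above: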
0 21
1 21
2 35
3 18
4 15
5 49
6 7
Name: 年齡, dtype: int64
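The exact-match requirement can be seen in a short sketch. The misspelled name below, with a trailing space, is a deliberate typo for illustration.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

try:
    member['年齡 ']  # trailing space: not an exact match
    found = True
except KeyError:
    found = False    # pandas raises KeyError for an unknown column

print(found)                      # False
print('年齡' in member.columns)   # True: the exact name exists
```

Checking `in member.columns` first is one way to avoid the exception when the name comes from user input.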
Get several columns at once by putting a list inside the square brackets; the list holds the names of the columns you want.
# select the 顧客編號 and 姓名 columns
member[['顧客編號', '姓名']]
Output of the code above:
顧客編號 姓名
0 1 Jacky
1 2 Lily
2 3 Kevin
3 4 Bob
4 5 Harry
5 6 Bill
6 7 Harry
Sometimes we need to filter the rows by the values in a column. To do that, put a condition inside the square brackets, in this form:
variable_name[condition]
As an example, pick out the people named Harry from the 姓名 (name) column of the variable member.
# show only the rows whose 姓名 is 'Harry'
member[member['姓名'] == 'Harry']
Output of the code above:
顧客編號 姓名 年齡
4 5 Harry 15
6 7 Harry 7
You may be curious about the member['姓名'] == 'Harry' part. It can actually be run on its own. The output is a Series of Boolean values, and the rows marked True are the ones that satisfy the condition and get selected:
member['姓名'] == 'Harry'
Output of the code above:
0 False
1 False
2 False
3 False
4 True
5 False
6 True
Name: 姓名, dtype: bool
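The previous section showed that a DataFrame can hold Boolean values, which means we can test and filter on several conditions at once, much like the if statements covered earlier in the course. DataFrames are a little special, though: instead of Python's and, or, and not, their logical operators are the symbols & (and), | (or), and ~ (not).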
step1 = member['å§å'] == 'Harry'
step2 = member['年齡'] < 10
member[(step1 & step2)]
Output of the code above:
顧客編號 姓名 年齡
6 7 Harry 7
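The other logical operators follow the same pattern. The sketch below tries `|` and `~` on the sample table; the mask names `is_harry` and `is_young` and the age cutoff of 20 are illustrative choices.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

is_harry = member['姓名'] == 'Harry'
is_young = member['年齡'] < 20

harry_or_young = member[is_harry | is_young]  # rows matching either condition
not_harry = member[~is_harry]                 # rows that do NOT match

# Inline conditions need parentheses, because & and | bind
# more tightly than comparisons such as == and <
young_harry = member[(member['姓名'] == 'Harry') & (member['年齡'] < 10)]

print(harry_or_young)
print(not_harry)
print(young_harry)
```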
Return the largest value in the column. The column should be numeric.
member['年齡'].max()  # maximum
Output of the code above:
49
Return the smallest value in the column. The column should be numeric.
member['年齡'].min()  # minimum
Output of the code above:
7
Return the mean of the column. The column must be numeric.
member['年齡'].mean()  # mean
Output of the code above:
23.714285714285715
Return the standard deviation of the column. The column must be numeric.
member['年齡'].std()  # standard deviation
Output of the code above:
13.96082955646841
Return the number of values in the column (missing values are not counted).
member['年齡'].count()  # count
Output of the code above:
7
Return summary statistics for the column. The column should be numeric.
member['年齡'].describe()  # summary statistics
Output of the code above:
count 7.000000
mean 23.714286
std 13.960830
min 7.000000
25% 16.500000
50% 21.000000
75% 28.000000
max 49.000000
Name: 年齡, dtype: float64
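The 25%, 50%, and 75% rows are quartiles. As a rough sketch, the same numbers can be reproduced one at a time with `quantile`, assuming pandas' default linear interpolation; `q1`, `median`, and `q3` are illustrative names.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

age = member['年齡']
q1 = age.quantile(0.25)      # first quartile
median = age.quantile(0.50)  # same value as age.median()
q3 = age.quantile(0.75)      # third quartile

print(q1, median, q3)  # 16.5 21.0 28.0
```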
Sum the column. Besides adding up numbers, this method also works on text data, in which case it concatenates all the strings in the column.
member['年齡'].sum()  # sum
Output of the code above:
166
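The string behavior can be tried on the name column. This sketch assumes the column holds plain Python strings with the `object` dtype (as in the `info()` output earlier); the `astype(object)` call is added here only to make that assumption explicit.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

# Summing an object-dtype column of strings joins them end to end
joined = member['姓名'].astype(object).sum()
print(joined)  # JackyLilyKevinBobHarryBillHarry
```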
value_counts() deserves a special mention because it is so convenient for inspecting data in pandas. As the example below shows, it counts how many times each value appears in a column and sorts the counts from largest to smallest, so you can quickly see how the column's values are distributed.
member['姓名'].value_counts()  # count each value, sorted from most to least frequent
Output of the code above:
Harry 2
Jacky 1
Lily 1
Kevin 1
Bob 1
Bill 1
Name: 姓名, dtype: int64
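To get proportions instead of raw counts, `value_counts` accepts `normalize=True`. A minimal sketch, with `counts` and `shares` as illustrative names:

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
member = pd.DataFrame({
    '顧客編號': [1, 2, 3, 4, 5, 6, 7],
    '姓名': ['Jacky', 'Lily', 'Kevin', 'Bob', 'Harry', 'Bill', 'Harry'],
    '年齡': [21, 21, 35, 18, 15, 49, 7],
})

counts = member['姓名'].value_counts()                # occurrences of each name
shares = member['姓名'].value_counts(normalize=True)  # fraction of all rows

print(counts['Harry'])  # 2
print(shares.round(3))  # Harry accounts for 2 of the 7 rows
```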
Author: 楊超霆, founder of 行銷搬進大程式