Create a Pandas DataFrame From a List of Dictionaries
Luqman Khan
May 16, 2022
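A dictionary is a compact, flexible Python container that stores key-value mappings. A dictionary is written inside curly braces ({}): the key-value pairs are separated by commas (,), and a colon (:) separates each key from its value.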
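As a quick illustration of that syntax, here is a minimal sketch. The variable name `roll` and its values are made up for this illustration only.

```python
# A dictionary literal: key-value pairs separated by commas,
# with a colon between each key and its value.
roll = {"Harry": 1, "Josh": 3, "dices": "first dice"}

# Look up a value by its key.
print(roll["Harry"])

# List the keys in insertion order.
print(list(roll.keys()))
```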
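The example below uses a dice game. Two players, Harry and Josh, each roll six dice, and every roll is stored in its own dictionary under the corresponding player's name. This gives six dictionaries with three keys each.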
import pandas as pd

# Create the dataset from multiple dictionaries
dataset_list = [
    {"Harry": 1, "Josh": 3, "dices": "first dice"},
    {"Harry": 5, "Josh": 1, "dices": "second dice"},
    {"Harry": 6, "Josh": 2, "dices": "third dice"},
    {"Harry": 2, "Josh": 3, "dices": "fourth dice"},
    {"Harry": 6, "Josh": 6, "dices": "fifth dice"},
    {"Harry": 4, "Josh": 3, "dices": "sixth dice"},
]

# Each dictionary becomes one row; its keys become the column names
df = pd.DataFrame(dataset_list)
print(df)
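We built the dataset as a list of dictionaries. pd.DataFrame accepts this form directly: each dictionary becomes one row, and its keys become the column names.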
Output:
   Harry  Josh        dices
0      1     3   first dice
1      5     1  second dice
2      6     2   third dice
3      2     3  fourth dice
4      6     6   fifth dice
5      4     3   sixth dice
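To make the key-to-column mapping more concrete, the following sketch shows what happens when the dictionaries do not all share the same keys. The names `records` and `df_partial` and the data are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical records: the second dictionary has no "Josh" key.
records = [
    {"Harry": 1, "Josh": 3},
    {"Harry": 5},
]

df_partial = pd.DataFrame(records)

# Column names are taken from the dictionary keys.
print(list(df_partial.columns))

# The missing "Josh" value in the second row is filled with NaN.
print(df_partial["Josh"].isna().tolist())
```

Because the columns are matched by key, the order of the keys inside each dictionary does not need to be the same from one row to the next.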
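In the previous example we set the dice values by hand. Now we use the randint function from the numpy.random module instead. We first create two empty lists named harry and josh. Then a for loop over range(6) uses the append() method to add one random number to each list on every iteration, as shown below.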
import pandas as pd
from numpy.random import randint

harry = []
josh = []

# Roll six dice for each player
for i in range(6):
    harry.append(randint(1, 7))
    josh.append(randint(1, 7))

# Create the dataset from multiple dictionaries
dataset_list = [
    {"Harry": harry[0], "Josh": josh[0], "dices": "first dice"},
    {"Harry": harry[1], "Josh": josh[1], "dices": "second dice"},
    {"Harry": harry[2], "Josh": josh[2], "dices": "third dice"},
    {"Harry": harry[3], "Josh": josh[3], "dices": "fourth dice"},
    {"Harry": harry[4], "Josh": josh[4], "dices": "fifth dice"},
    {"Harry": harry[5], "Josh": josh[5], "dices": "sixth dice"},
]

df = pd.DataFrame(dataset_list)
print(df)
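Remember that randint(low, high) returns integers from low up to high - 1, because the upper bound is exclusive. When called with a single argument, randint(n) returns integers from 0 to n - 1. That is why we call randint(1, 7) to get dice values from 1 to 6.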
Output (the values are random, so they will differ on each run):
   Harry  Josh        dices
0      4     1   first dice
1      4     2  second dice
2      3     4   third dice
3      1     1  fourth dice
4      4     5   fifth dice
5      4     4   sixth dice
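The bounds of randint described above can be checked with a small experiment. This is only a sketch: the names `rolls` and `zero_based` and the sample size of 1000 are arbitrary choices made for this illustration.

```python
from numpy.random import randint

# Draw many values at once; the upper bound 7 is exclusive.
rolls = randint(1, 7, size=1000)

# Every value lies between 1 and 6.
print(rolls.min() >= 1)
print(rolls.max() <= 6)

# With a single argument, the values start at 0 instead.
zero_based = randint(6, size=1000)
print(zero_based.min() >= 0 and zero_based.max() <= 5)
```

The size argument returns a NumPy array of random integers in a single call, which can stand in for the explicit loop when many rolls are needed.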
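Now we shorten the code with a for loop that appends each complete dictionary to a list. The same loop also appends a label for each roll to a second list named index, which is then passed to pd.DataFrame as the index.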
import pandas as pd
from numpy.random import randint

dataset_list = []
index = []

# Build one dictionary and one index label per roll
for i in range(1, 7):
    dataset_list.append({"Harry": randint(1, 7), "Josh": randint(1, 7)})
    index.append("dice " + str(i))

# Use the labels as the row index instead of the default 0..5
df = pd.DataFrame(dataset_list, index=index)
print(df)
Output (the values are random, so they will differ on each run):
        Harry  Josh
dice 1      2     4
dice 2      2     3
dice 3      6     5
dice 4      5     2
dice 5      4     2
dice 6      1     1
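The loop can be condensed further with list comprehensions. This variant is not part of the original walkthrough. It is a sketch of one possible alternative, and the names `rows`, `labels`, and `df_short` are hypothetical.

```python
import pandas as pd
from numpy.random import randint

# One dictionary per roll, built with a list comprehension.
rows = [{"Harry": randint(1, 7), "Josh": randint(1, 7)} for _ in range(6)]

# Index labels "dice 1" through "dice 6".
labels = [f"dice {i}" for i in range(1, 7)]

df_short = pd.DataFrame(rows, index=labels)
print(df_short)
```

Whether this reads better than the explicit loop is a matter of taste; both produce a DataFrame with the same shape and index.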
Complete example:
import pandas as pd
from numpy.random import randint

# Create the dataset from multiple dictionaries
dataset_list = [
    {"Harry": 1, "Josh": 3, "dices": "first dice"},
    {"Harry": 5, "Josh": 1, "dices": "second dice"},
    {"Harry": 6, "Josh": 2, "dices": "third dice"},
    {"Harry": 2, "Josh": 3, "dices": "fourth dice"},
    {"Harry": 6, "Josh": 6, "dices": "fifth dice"},
    {"Harry": 4, "Josh": 3, "dices": "sixth dice"},
]
df = pd.DataFrame(dataset_list)
print(df)
print()

harry = []
josh = []

# Roll six dice for each player
for i in range(6):
    harry.append(randint(1, 7))
    josh.append(randint(1, 7))

# Create the dataset from multiple dictionaries
dataset_list = [
    {"Harry": harry[0], "Josh": josh[0], "dices": "first dice"},
    {"Harry": harry[1], "Josh": josh[1], "dices": "second dice"},
    {"Harry": harry[2], "Josh": josh[2], "dices": "third dice"},
    {"Harry": harry[3], "Josh": josh[3], "dices": "fourth dice"},
    {"Harry": harry[4], "Josh": josh[4], "dices": "fifth dice"},
    {"Harry": harry[5], "Josh": josh[5], "dices": "sixth dice"},
]
df = pd.DataFrame(dataset_list)
print(df)

dataset_list = []
index = []

# Build one dictionary and one index label per roll
for i in range(1, 7):
    dataset_list.append({"Harry": randint(1, 7), "Josh": randint(1, 7)})
    index.append("dice " + str(i))

print("\nAfter reducing the code\n")
df = pd.DataFrame(dataset_list, index=index)
print(df)
Output (the second and third tables contain random values, so they will differ on each run):
   Harry  Josh        dices
0      1     3   first dice
1      5     1  second dice
2      6     2   third dice
3      2     3  fourth dice
4      6     6   fifth dice
5      4     3   sixth dice

   Harry  Josh        dices
0      4     1   first dice
1      4     2  second dice
2      3     4   third dice
3      1     1  fourth dice
4      4     5   fifth dice
5      4     4   sixth dice

After reducing the code

        Harry  Josh
dice 1      2     4
dice 2      2     3
dice 3      6     5
dice 4      5     2
dice 5      4     2
dice 6      1     1