A detailed look at to_dict in pandas
- 2020-11-03 22:31:19
- OfStack
Summary: to_dict in pandas converts a DataFrame into a dictionary.
Six conversions are available, one for each value of the orient parameter: 'dict', 'list', 'series', 'split', 'records', and 'index'. Each one is illustrated below.
Help on method to_dict in module pandas.core.frame:

to_dict(orient='dict') method of pandas.core.frame.DataFrame instance
    Convert DataFrame to dictionary.

    Parameters
    ----------
    orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
        Determines the type of the values of the dictionary.

        - dict (default) : dict like {column -> {index -> value}}
        - list : dict like {column -> [values]}
        - series : dict like {column -> Series(values)}
        - split : dict like
          {index -> [index], columns -> [columns], data -> [values]}
        - records : list like
          [{column -> value}, ... , {column -> value}]
        - index : dict like {index -> {column -> value}}

          .. versionadded:: 0.17.0

        Abbreviations are allowed. `s` indicates `series` and `sp`
        indicates `split`.

    Returns
    -------
    result : dict like {column -> {index -> value}}
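To make the six return shapes concrete, the sketch below calls to_dict once per orient value on a two-row frame. The names df and results are illustrative and not part of the original session; the two rows reuse values from the sample data shown later in this article.

```python
import pandas as pd

# Illustrative two-row frame; values are borrowed from the article's sample data.
df = pd.DataFrame(
    {"pclass": ["3rd", "2nd"], "age": [32.0, 48.0]},
    index=[833, 437],
)

# Call to_dict once for each of the six orient values and record the result.
results = {}
for orient in ["dict", "list", "series", "split", "records", "index"]:
    results[orient] = df.to_dict(orient=orient)
    print(orient, type(results[orient]).__name__)

# Only 'records' yields a list; the other five yield a dict.
```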
1. The parameter orient='dict'
'dict' is the default value of orient. The variable data below is a DataFrame; the conversion produces a dictionary of the form {column -> {index -> value}}, which can be thought of as a two-level (nested) dictionary:
- For each column, the values are paired with their index labels to form an inner dictionary
- The column name then serves as the key of the outer dictionary, with that inner dictionary as its value
A value is looked up with data_dict[key1][key2]:
- data_dict is the name given to the result of the conversion with orient='dict'
- key1 is the column name (the key of the outer dictionary)
- key2 is the index label (the key of the inner dictionary)
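The two-key lookup described above can be sketched as follows. The three-row frame df is an illustrative stand-in for the article's data, built from a few of its rows.

```python
import pandas as pd

# Illustrative subset of the article's sample data.
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_dict = df.to_dict(orient="dict")

# key1 is a column name, key2 is an index label of the original frame.
print(data_dict["age"][437])  # 48.0
print(data_dict["sex"][669])  # male

# The inner dictionary for one column maps every index label to its value.
print(sorted(data_dict["age"].keys()))  # [437, 669, 833]
```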
In[9]: data
Out[9]:
      pclass        age     embarked                      home.dest     sex
1086     3rd  31.194181      UNKNOWN                        UNKNOWN    male
12       1st  31.194181    Cherbourg                  Paris, France  female
1036     3rd  31.194181      UNKNOWN                        UNKNOWN    male
833      3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male
1108     3rd  31.194181      UNKNOWN                        UNKNOWN    male
562      2nd  41.000000    Cherbourg                   New York, NY    male
437      2nd  48.000000  Southampton   Somerset / Bernardsville, NJ  female
663      3rd  26.000000  Southampton                        UNKNOWN    male
669      3rd  19.000000  Southampton                        England    male
507      2nd  31.194181  Southampton               Petworth, Sussex    male
In[10]: data_dict=data.to_dict(orient= 'dict')
In[11]: data_dict
Out[11]:
{'age': {12: 31.19418104265403,
437: 48.0,
507: 31.19418104265403,
562: 41.0,
663: 26.0,
669: 19.0,
833: 32.0,
1036: 31.19418104265403,
1086: 31.19418104265403,
1108: 31.19418104265403},
'embarked': {12: 'Cherbourg',
437: 'Southampton',
507: 'Southampton',
562: 'Cherbourg',
663: 'Southampton',
669: 'Southampton',
833: 'Southampton',
1036: 'UNKNOWN',
1086: 'UNKNOWN',
1108: 'UNKNOWN'},
'home.dest': {12: 'Paris, France',
437: 'Somerset / Bernardsville, NJ',
507: 'Petworth, Sussex',
562: 'New York, NY',
663: 'UNKNOWN',
669: 'England',
833: 'Foresvik, Norway Portland, ND',
1036: 'UNKNOWN',
1086: 'UNKNOWN',
1108: 'UNKNOWN'},
'pclass': {12: '1st',
437: '2nd',
507: '2nd',
562: '2nd',
663: '3rd',
669: '3rd',
833: '3rd',
1036: '3rd',
1086: '3rd',
1108: '3rd'},
'sex': {12: 'female',
437: 'female',
507: 'male',
562: 'male',
663: 'male',
669: 'male',
833: 'male',
1036: 'male',
1086: 'male',
1108: 'male'}}
2. The parameter orient='list'
Similar to 1, except that the inner layer becomes a list, giving the structure {column -> [values]}
A value is looked up with data_list[keys][index]:
data_list is the name given to the result of the conversion with orient='list'
keys is the column name, such as 'age' or 'embarked' in this example
index is an integer position, starting at 0 and ending at the number of rows minus 1
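A minimal sketch of the positional lookup, assuming an illustrative three-row frame df taken from the article's data. Note that the original index labels are not kept; only the row order is.

```python
import pandas as pd

# Illustrative subset of the article's sample data.
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_list = df.to_dict(orient="list")

# The inner layer is a plain list, so it is addressed by position, not by label.
print(data_list["age"][0])            # 32.0  (first row)
print(data_list["age"][len(df) - 1])  # 19.0  (last row)
print(data_list["sex"])               # ['male', 'female', 'male']
```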
In[19]: data_list=data.to_dict(orient='list')
In[20]: data_list
Out[20]:
{'age': [31.19418104265403,
31.19418104265403,
31.19418104265403,
32.0,
31.19418104265403,
41.0,
48.0,
26.0,
19.0,
31.19418104265403],
'embarked': ['UNKNOWN',
'Cherbourg',
'UNKNOWN',
'Southampton',
'UNKNOWN',
'Cherbourg',
'Southampton',
'Southampton',
'Southampton',
'Southampton'],
'home.dest': ['UNKNOWN',
'Paris, France',
'UNKNOWN',
'Foresvik, Norway Portland, ND',
'UNKNOWN',
'New York, NY',
'Somerset / Bernardsville, NJ',
'UNKNOWN',
'England',
'Petworth, Sussex'],
'pclass': ['3rd',
'1st',
'3rd',
'3rd',
'3rd',
'2nd',
'2nd',
'3rd',
'3rd',
'2nd'],
'sex': ['male',
'female',
'male',
'male',
'male',
'male',
'female',
'male',
'male',
'male']}
3. The parameter orient='series'
This forms the structure {column -> Series(values)}
A value is looked up with data_series[key1][key2]; data_series[key1] alone returns the whole column as a Series
data_series is the name given to the result of the conversion
key1 is the column name, such as 'age' or 'embarked' in this example
key2 is an index label from the original data (optional)
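The sketch below shows both forms of access on an illustrative three-row frame df. Using .loc for the label lookup is a choice made here to keep label-based access explicit; the article's data_series[key1][key2] form works the same way for this integer index.

```python
import pandas as pd

# Illustrative subset of the article's sample data.
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_series = df.to_dict(orient="series")

# key1 alone returns the whole column as a pandas Series.
ages = data_series["age"]
print(type(ages).__name__)  # Series

# key2 is an index label of the original frame.
print(ages.loc[437])        # 48.0

# Because the value is a Series, its methods remain available.
print(ages.max())           # 48.0
```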
In[21]: data_series=data.to_dict(orient='series')
In[22]: data_series
Out[22]:
{'age': 1086 31.194181
12 31.194181
1036 31.194181
833 32.000000
1108 31.194181
562 41.000000
437 48.000000
663 26.000000
669 19.000000
507 31.194181
Name: age, dtype: float64, 'embarked': 1086 UNKNOWN
12 Cherbourg
1036 UNKNOWN
833 Southampton
1108 UNKNOWN
562 Cherbourg
437 Southampton
663 Southampton
669 Southampton
507 Southampton
Name: embarked, dtype: object, 'home.dest': 1086 UNKNOWN
12 Paris, France
1036 UNKNOWN
833 Foresvik, Norway Portland, ND
1108 UNKNOWN
562 New York, NY
437 Somerset / Bernardsville, NJ
663 UNKNOWN
669 England
507 Petworth, Sussex
Name: home.dest, dtype: object, 'pclass': 1086 3rd
12 1st
1036 3rd
833 3rd
1108 3rd
562 2nd
437 2nd
663 3rd
669 3rd
507 2nd
Name: pclass, dtype: object, 'sex': 1086 male
12 female
1036 male
833 male
1108 male
562 male
437 female
663 male
669 male
507 male
Name: sex, dtype: object}
4. The parameter orient='split'
This forms the structure {index -> [index], columns -> [columns], data -> [values]}: the index labels, the column names, and the data are separated into three entries of one dictionary
The parts are accessed with data_split['index'], data_split['data'], and data_split['columns']
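As a sketch of how the three parts fit together, the example below splits an illustrative frame df and then passes the parts back to the DataFrame constructor. The name rebuilt is hypothetical and used only for this illustration.

```python
import pandas as pd

# Illustrative subset of the article's sample data.
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_split = df.to_dict(orient="split")

print(data_split["columns"])  # ['age', 'sex']
print(data_split["index"])    # [833, 437, 669]
print(data_split["data"][1])  # the second row as a list of values

# The three parts are exactly what the DataFrame constructor needs
# to put the table back together.
rebuilt = pd.DataFrame(
    data_split["data"],
    index=data_split["index"],
    columns=data_split["columns"],
)
```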
data_split=data.to_dict(orient='split')
data_split
Out[38]:
{'columns': ['pclass', 'age', 'embarked', 'home.dest', 'sex'],
'data': [['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['1st', 31.19418104265403, 'Cherbourg', 'Paris, France', 'female'],
['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['3rd', 32.0, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],
['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['2nd', 41.0, 'Cherbourg', 'New York, NY', 'male'],
['2nd', 48.0, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],
['3rd', 26.0, 'Southampton', 'UNKNOWN', 'male'],
['3rd', 19.0, 'Southampton', 'England', 'male'],
['2nd', 31.19418104265403, 'Southampton', 'Petworth, Sussex', 'male']],
'index': [1086, 12, 1036, 833, 1108, 562, 437, 663, 669, 507]}
5. The parameter orient='records'
This forms the structure [{column -> value}, ... , {column -> value}]
The result as a whole is a list, and each element is a dictionary built from one row of the original data
A value is looked up with data_records[index][key1], where index is the integer position of the row and key1 is a column name
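A short sketch of the row-wise lookup on an illustrative frame df. The index labels are not part of the records; the reset_index step at the end is one possible way to carry them along and is an addition made for this example, not something the original session does.

```python
import pandas as pd

# Illustrative subset of the article's sample data.
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_records = df.to_dict(orient="records")

print(len(data_records))       # 3, one dictionary per row
print(data_records[0]["age"])  # 32.0
print(data_records[1]["sex"])  # female

# The index labels are dropped. Turning the index into a column first
# keeps them under the key 'index'.
with_index = df.reset_index().to_dict(orient="records")
print(with_index[0]["index"])  # 833
```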
data_records=data.to_dict(orient='records')
data_records
Out[41]:
[{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'Cherbourg',
'home.dest': 'Paris, France',
'pclass': '1st',
'sex': 'female'},
{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 32.0,
'embarked': 'Southampton',
'home.dest': 'Foresvik, Norway Portland, ND',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 41.0,
'embarked': 'Cherbourg',
'home.dest': 'New York, NY',
'pclass': '2nd',
'sex': 'male'},
{'age': 48.0,
'embarked': 'Southampton',
'home.dest': 'Somerset / Bernardsville, NJ',
'pclass': '2nd',
'sex': 'female'},
{'age': 26.0,
'embarked': 'Southampton',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 19.0,
'embarked': 'Southampton',
'home.dest': 'England',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'Southampton',
'home.dest': 'Petworth, Sussex',
'pclass': '2nd',
'sex': 'male'}]
6. The parameter orient='index'
This forms the structure {index -> {column -> value}}. The lookup order is the reverse of 'dict': data_index[key2][key1], with the index label first and the column name second
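The reversed key order can be checked against orient='dict' directly. The sketch below uses an illustrative three-row frame df; note that pandas raises a ValueError for orient='index' when the index labels are not unique, so the labels here are distinct.

```python
import pandas as pd

# Illustrative subset of the article's sample data (unique index labels).
df = pd.DataFrame(
    {"age": [32.0, 48.0, 19.0], "sex": ["male", "female", "male"]},
    index=[833, 437, 669],
)

data_index = df.to_dict(orient="index")
data_dict = df.to_dict(orient="dict")

# Index label first, column name second.
print(data_index[437]["age"])  # 48.0

# The same cell reached through orient='dict' uses the keys in the other order.
print(data_index[437]["age"] == data_dict["age"][437])  # True
```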
data_index=data.to_dict(orient='index')
data_index
Out[43]:
{12: {'age': 31.19418104265403,
'embarked': 'Cherbourg',
'home.dest': 'Paris, France',
'pclass': '1st',
'sex': 'female'},
437: {'age': 48.0,
'embarked': 'Southampton',
'home.dest': 'Somerset / Bernardsville, NJ',
'pclass': '2nd',
'sex': 'female'},
507: {'age': 31.19418104265403,
'embarked': 'Southampton',
'home.dest': 'Petworth, Sussex',
'pclass': '2nd',
'sex': 'male'},
562: {'age': 41.0,
'embarked': 'Cherbourg',
'home.dest': 'New York, NY',
'pclass': '2nd',
'sex': 'male'},
663: {'age': 26.0,
'embarked': 'Southampton',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
669: {'age': 19.0,
'embarked': 'Southampton',
'home.dest': 'England',
'pclass': '3rd',
'sex': 'male'},
833: {'age': 32.0,
'embarked': 'Southampton',
'home.dest': 'Foresvik, Norway Portland, ND',
'pclass': '3rd',
'sex': 'male'},
1036: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
1086: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
1108: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'}}