Reordering and Adding Columns in pandas
- 2021-10-16 02:07:55
- OfStack
In Excel, reordering columns and adding a column are common operations. Next, we use pandas to do the same.
1. Adjust the order of columns
>>> import pandas as pd
>>> df = pd.read_excel(r'D:/myExcel/1.xlsx')
>>> df
        A   B   C   D
0     bob  12  78  87
1  millor  15  92  21
>>> df.columns
Index(['A', 'B', 'C', 'D'], dtype='object')
# This is the simplest and most commonly used method: pass the column names
# in the desired order and pandas selects them from df in that order
>>> df[['A', 'D', 'C', 'B']]
        A   D   C   B
0     bob  87  78  12
1  millor  21  92  15
# Selecting the same column several times also works
>>> df[['A', 'A', 'A', 'A']]
        A       A       A       A
0     bob     bob     bob     bob
1  millor  millor  millor  millor
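When a frame has many columns, typing out the full list gets tedious. The following is a minimal sketch of moving one column to the front, assuming an in-memory frame with the same values as the Excel example (the file itself is not reproduced here); the names `front` and `new_order` are only illustrative.

```python
import pandas as pd

# Stand-in for the Excel data used above (values copied from the printed output).
df = pd.DataFrame({'A': ['bob', 'millor'], 'B': [12, 15],
                   'C': [78, 92], 'D': [87, 21]})

# Move column 'D' to the front and keep the remaining columns in their current order.
front = ['D']
new_order = front + [c for c in df.columns if c not in front]
reordered = df[new_order]

# reindex(columns=...) gives the same result when every label already exists.
same = df.reindex(columns=new_order)

print(reordered.columns.tolist())
```

Both forms return a new DataFrame, so df itself keeps its original column order unless the result is assigned back to it.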
2. Add a column or columns
(1) Add directly
>>> df['E'] = [1, 2]
>>> df
        A   B   C   D  E
0     bob  12  78  87  1
1  millor  15  92  21  2
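Direct assignment always appends the new column at the end. If the position matters, `DataFrame.insert` may be a better fit. The sketch below assumes the same sample values as above, and the column name `rank` is made up for illustration.

```python
import pandas as pd

# Stand-in for the Excel data used above (values copied from the printed output).
df = pd.DataFrame({'A': ['bob', 'millor'], 'B': [12, 15],
                   'C': [78, 92], 'D': [87, 21]})

# Plain assignment appends the new column at the end.
df['E'] = [1, 2]

# DataFrame.insert places a new column at a given position and modifies df in place.
# Position 1 means "right after the first column".
df.insert(1, 'rank', [10, 20])

print(df.columns.tolist())
```

Unlike column selection, `insert` changes the frame in place and returns nothing, so there is no result to assign back.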
(2) Call the assign method. It is well suited to deriving new columns from existing ones, either through basic arithmetic or by calling functions. assign returns a new DataFrame and leaves the original unchanged. The examples below start again from the original df, without column E.
>>> df
        A   B   C   D
0     bob  12  78  87
1  millor  15  92  21
# E is the name of the new column; its values are column B minus column C
>>> df.assign(E=df['B'] - df['C'])
        A   B   C   D   E
0     bob  12  78  87 -66
1  millor  15  92  21 -77
# You can also add several columns in one call
>>> df.assign(E=df['B'] - df['C'], F=df['B'] * df['C'])
        A   B   C   D   E     F
0     bob  12  78  87 -66   936
1  millor  15  92  21 -77  1380
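assign also accepts callables, which is one way to read "calling functions" above. The sketch below assumes the same sample values and a reasonably recent pandas (0.23 or later on Python 3.6 or later), where a later keyword can refer to a column created earlier in the same call.

```python
import pandas as pd

# Stand-in for the Excel data used above (values copied from the printed output).
df = pd.DataFrame({'A': ['bob', 'millor'], 'B': [12, 15],
                   'C': [78, 92], 'D': [87, 21]})

# Each callable receives the DataFrame being built, so F can use the
# freshly created column E.
result = df.assign(
    E=lambda d: d['B'] - d['C'],
    F=lambda d: d['E'].abs(),
)

print(result)
print(df.columns.tolist())  # the original frame still has no E or F
```

Because assign returns a new frame, it chains well with other methods; assign the result to a name if the new columns should be kept.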
That covers adjusting the order of columns and adding new columns in pandas.
Supplement: renaming DataFrame columns and adjusting column order in pandas
Rename columns:
Call the interface directly:
df.rename()
First, look at how the interface is defined:
def rename(self, *args, **kwargs):
    """
    Alter axes labels.

    Function / dict values must be unique (1-to-1). Labels not contained in
    a dict / Series will be left as-is. Extra labels listed don't throw an
    error.

    See the :ref:`user guide <basics.rename>` for more.

    Parameters
    ----------
    mapper, index, columns : dict-like or function, optional
        dict-like or functions transformations to apply to
        that axis' values. Use either ``mapper`` and ``axis`` to
        specify the axis to target with ``mapper``, or ``index`` and
        ``columns``.
    axis : int or str, optional
        Axis to target with ``mapper``. Can be either the axis name
        ('index', 'columns') or number (0, 1). The default is 'index'.
    copy : boolean, default True
        Also copy underlying data
    inplace : boolean, default False
        Whether to return a new DataFrame. If True then value of copy is
        ignored.
    level : int or level name, default None
        In case of a MultiIndex, only rename labels in the specified
        level.

    Returns
    -------
    renamed : DataFrame

    See Also
    --------
    pandas.DataFrame.rename_axis

    Examples
    --------
    ``DataFrame.rename`` supports two calling conventions

    * ``(index=index_mapper, columns=columns_mapper, ...)``
    * ``(mapper, axis={'index', 'columns'}, ...)``

    We *highly* recommend using keyword arguments to clarify your
    intent.

    >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    >>> df.rename(index=str, columns={"A": "a", "B": "c"})
       a  c
    0  1  4
    1  2  5
    2  3  6

    >>> df.rename(index=str, columns={"A": "a", "C": "c"})
       a  B
    0  1  4
    1  2  5
    2  3  6

    Using axis-style parameters

    >>> df.rename(str.lower, axis='columns')
       a  b
    0  1  4
    1  2  5
    2  3  6

    >>> df.rename({1: 2, 2: 4}, axis='index')
       A  B
    0  1  4
    2  2  5
    4  3  6
    """
    axes = validate_axis_style_args(self, args, kwargs, 'mapper', 'rename')
    kwargs.update(axes)
    # Pop these, since the values are in `kwargs` under different names
    kwargs.pop('axis', None)
    kwargs.pop('mapper', None)
    return super(DataFrame, self).rename(**kwargs)
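Note:
One asterisk (*args): the function accepts any number of positional arguments and receives them as a tuple. At the call site, a list or tuple prefixed with * is unpacked into individual arguments.
Two asterisks (**kwargs): the function accepts any number of keyword arguments and receives them as a dictionary. At the call site, a dictionary prefixed with ** is unpacked into keyword arguments.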
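The note above can be illustrated with a small sketch in plain Python. The function `show` is a hypothetical helper written for this example and is not part of pandas.

```python
def show(*args, **kwargs):
    # Positional arguments arrive as a tuple, keyword arguments as a dict.
    return args, kwargs

positional, keyword = show(1, 2, axis='columns')

# A leading * unpacks a list or tuple into positional arguments;
# a leading ** unpacks a dict into keyword arguments.
values = [1, 2]
options = {'axis': 'columns'}
unpacked = show(*values, **options)

print(positional, keyword)
print(unpacked)
```

This is why rename can be called either with a positional mapper plus axis, or with the index and columns keywords.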
Example:
>>> import pandas as pd
>>> a = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
>>> a
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
# Rename column A to a, B to b, and C to c
>>> a.rename(columns={'A': 'a', 'B': 'b', 'C': 'c'}, inplace=True)
>>> a
   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9
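Without inplace=True, rename leaves the original frame alone and returns a renamed copy. The sketch below shows that behaviour along with the function form of the mapper; the variable names `b` and `c` are only illustrative.

```python
import pandas as pd

a = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Without inplace=True, rename returns a new DataFrame and leaves `a` untouched.
b = a.rename(columns={'A': 'a', 'B': 'b', 'C': 'c'})

# A function is applied to every label on the chosen axis.
c = a.rename(columns=str.lower)

print(a.columns.tolist())
print(b.columns.tolist())
print(c.columns.tolist())
```

Assigning the result to a name tends to be easier to follow than mutating in place, especially when several steps are chained.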
Adjust the order of columns:
For example:
>>> import pandas
>>> dict_a = {'user_id':['webbang','webbang','webbang'],'book_id':['3713327','4074636','26873486'],'rating':['4','4','4'],
...           'mark_date':['2017-03-07','2017-03-07','2017-03-07']}
>>> df = pandas.DataFrame(dict_a)  # Create a DataFrame from a dictionary
# In the output below, the columns are sorted alphabetically rather than in the dictionary's order
# ('user_id', 'book_id', 'rating', 'mark_date'). Older pandas releases (before 0.23, or any release
# running on Python older than 3.6) behave this way; newer ones keep the dictionary's insertion order.
>>> df
    book_id   mark_date rating  user_id
0   3713327  2017-03-07      4  webbang
1   4074636  2017-03-07      4  webbang
2  26873486  2017-03-07      4  webbang
Reorder the columns by selecting them in the desired order:
>>> df = df[['user_id', 'book_id', 'rating', 'mark_date']]  # Adjust the column order to 'user_id', 'book_id', 'rating', 'mark_date'
>>> df
   user_id   book_id rating   mark_date
0  webbang   3713327      4  2017-03-07
1  webbang   4074636      4  2017-03-07
2  webbang  26873486      4  2017-03-07
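If the desired order is known up front, it can also be requested when the frame is built. This is a sketch using the `columns` argument of the DataFrame constructor with the same dictionary; the name `wanted` is only illustrative.

```python
import pandas

dict_a = {'user_id': ['webbang', 'webbang', 'webbang'],
          'book_id': ['3713327', '4074636', '26873486'],
          'rating': ['4', '4', '4'],
          'mark_date': ['2017-03-07', '2017-03-07', '2017-03-07']}

# Passing columns= fixes the column order at construction time,
# regardless of how the pandas version in use orders dictionary keys.
wanted = ['user_id', 'book_id', 'rating', 'mark_date']
df = pandas.DataFrame(dict_a, columns=wanted)

print(df.columns.tolist())
```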
That is all there is to it.