Merging, straightening, and reshaping arrays in numpy and pandas: examples
- 2021-07-06 11:06:43
- OfStack
Merge
Merging two arrays in numpy
In numpy, arrays are merged with np.concatenate, which takes a list of arrays and an axis parameter. For 2-dimensional arrays, axis=0 merges the arrays in the vertical direction (rows are appended), which is equivalent to np.vstack; axis=1 merges them in the horizontal direction (columns are appended), which is equivalent to np.hstack.
Vertical direction:
np.concatenate([arr1,arr2],axis=0)
np.vstack([arr1,arr2])
Horizontal direction:
np.concatenate([arr1,arr2],axis=1)
np.hstack([arr1,arr2])
import numpy as np
import pandas as pd
arr1=np.ones((3,5))
arr1
Out[5]:
array([[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
arr2=np.random.randn(15).reshape(arr1.shape)
arr2
Out[8]:
array([[-0.09666833, 1.47064828, -1.94608976, 0.2651279 , -0.32894787],
[ 1.01187699, 0.39171167, 1.49607091, 0.79216196, 0.33246644],
[ 1.71266238, 0.86650837, 0.77830394, -0.90519422, 1.55410056]])
np.concatenate([arr1,arr2],axis=0) # Merge on the vertical axis
Out[9]:
array([[ 1. , 1. , 1. , 1. , 1. ],
[ 1. , 1. , 1. , 1. , 1. ],
[ 1. , 1. , 1. , 1. , 1. ],
[-0.09666833, 1.47064828, -1.94608976, 0.2651279 , -0.32894787],
[ 1.01187699, 0.39171167, 1.49607091, 0.79216196, 0.33246644],
[ 1.71266238, 0.86650837, 0.77830394, -0.90519422, 1.55410056]])
np.concatenate([arr1,arr2],axis=1) # Merge on the horizontal axis
Out[10]:
array([[ 1. , 1. , 1. , ..., -1.94608976,
0.2651279 , -0.32894787],
[ 1. , 1. , 1. , ..., 1.49607091,
0.79216196, 0.33246644],
[ 1. , 1. , 1. , ..., 0.77830394,
-0.90519422, 1.55410056]])
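The equivalences described above can be checked directly. The following is a minimal sketch; the arrays a and b are illustrative stand-ins and are not the arr1 and arr2 of the session above.

```python
import numpy as np

# Illustrative inputs (assumed for this sketch, not taken from the session above).
a = np.ones((3, 5))
b = np.arange(15, dtype=float).reshape(3, 5)

# For 2-D inputs, axis=0 appends rows and gives the same result as np.vstack.
vertical = np.concatenate([a, b], axis=0)
same_as_vstack = np.array_equal(vertical, np.vstack([a, b]))

# For 2-D inputs, axis=1 appends columns and gives the same result as np.hstack.
horizontal = np.concatenate([a, b], axis=1)
same_as_hstack = np.array_equal(horizontal, np.hstack([a, b]))

print(vertical.shape, same_as_vstack)    # (6, 5) True
print(horizontal.shape, same_as_hstack)  # (3, 10) True
```

Note that this correspondence holds for 2-dimensional inputs. For 1-dimensional inputs, np.hstack concatenates along the only axis, and np.vstack first promotes each input to a row.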
Merging two DataFrames in pandas
In pandas, merging is done with the concat function. The parameter axis=0 (the default) merges two DataFrames along the vertical axis, and axis=1 merges them along the horizontal axis. As with np.concatenate, the two DataFrames are passed together in one list, namely [frame1, frame2].
from pandas import DataFrame
frame1=DataFrame([[1,2,3],[4,5,6]])
frame2=DataFrame([[7,8,9],[10,11,12]])
pd.concat([frame1,frame2], ignore_index=True) # The frames to merge are passed as one list; ignore_index=True renumbers the rows.
Out[25]:
    0   1   2
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
pd.concat([frame1,frame2], axis=1, ignore_index=True)
Out[27]:
   0  1  2   3   4   5
0  1  2  3   7   8   9
1  4  5  6  10  11  12
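The effect of ignore_index is easiest to see by comparing the labels with and without it. A short sketch, reusing the same two frames as above (the variable names kept, renumbered and side_by_side are only for illustration):

```python
import pandas as pd

frame1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
frame2 = pd.DataFrame([[7, 8, 9], [10, 11, 12]])

# Without ignore_index, each frame keeps its own row labels.
kept = pd.concat([frame1, frame2])
print(kept.index.tolist())            # [0, 1, 0, 1]

# With ignore_index=True, the rows of the result are renumbered from 0.
renumbered = pd.concat([frame1, frame2], ignore_index=True)
print(renumbered.index.tolist())      # [0, 1, 2, 3]

# With axis=1 the same applies to the column labels.
side_by_side = pd.concat([frame1, frame2], axis=1)
print(side_by_side.columns.tolist())  # [0, 1, 2, 0, 1, 2]
```

Duplicate labels are legal in pandas, but they make label-based selection ambiguous, so ignore_index=True is often the safer choice when the original labels carry no meaning.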
Straightening and reshaping
Straightening (also called flattening) means changing a multidimensional array, such as a 2-dimensional one, into a 1-dimensional array. By default, numpy arrays are created in row-major order. In terms of memory layout, this means that for a 2-dimensional array, the data items in each row are stored in adjacent memory locations. The other possible order is column-major, in which the items of each column are adjacent.
For historical reasons, row-major and column-major order are also referred to as C order and Fortran order, respectively. In numpy, the order is selected with the keyword parameter order='C' (row-major) or order='F' (column-major).
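The two layouts can be inspected through an array's flags and strides. This is a minimal sketch; the dtype is fixed to int64 only so that the stride values are the same on every platform, and the names c_arr and f_arr are illustrative.

```python
import numpy as np

# Row-major (C order) is the default layout for a newly created array.
c_arr = np.arange(15, dtype=np.int64).reshape(3, 5)

# np.asfortranarray returns the same values laid out column-major (F order).
f_arr = np.asfortranarray(c_arr)

# strides = bytes to step to reach the next row / the next column.
print(c_arr.flags['C_CONTIGUOUS'], c_arr.strides)  # True (40, 8)
print(f_arr.flags['F_CONTIGUOUS'], f_arr.strides)  # True (8, 24)

# The layout changes how the data sits in memory, not the values seen by indexing.
print(np.array_equal(c_arr, f_arr))                # True
```

In the C-ordered array, moving one column to the right skips 8 bytes (one item), so each row is contiguous. In the F-ordered array, moving one row down skips 8 bytes, so each column is contiguous.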
Straighten:
arr=np.arange(15).reshape(3,-1)
arr
Out[29]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
arr.ravel('F') # Column-major: flatten the array column by column.
Out[30]: array([ 0, 5, 10, ..., 4, 9, 14])
arr.ravel('C') # Row-major (the default order): flatten the array row by row.
Out[31]: array([ 0, 1, 2, ..., 12, 13, 14])
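Printing the flattened arrays as plain lists shows the complete element order for each setting. A small sketch using the same arr as above (by_col and by_row are illustrative names):

```python
import numpy as np

arr = np.arange(15).reshape(3, -1)

by_col = arr.ravel(order='F')  # read down each column in turn
by_row = arr.ravel(order='C')  # read along each row in turn (default)

print(by_col.tolist())  # [0, 5, 10, 1, 6, 11, 2, 7, 12, 3, 8, 13, 4, 9, 14]
print(by_row.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

# Flattening column by column is the same as flattening the transpose row by row.
print(np.array_equal(by_col, arr.T.ravel()))  # True
```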
Reshaping:
With Fortran order, reshape reads the elements of the original array column by column and fills the new array column by column.
arr.reshape((5,3),order='F')
Out[32]:
array([[ 0, 11, 8],
[ 5, 2, 13],
[10, 7, 4],
[ 1, 12, 9],
[ 6, 3, 14]])
With C order, reshape reads the elements of the original array row by row and fills the new array row by row.
arr.reshape((5,3),order='C')
Out[33]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
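One way to see what order='F' does in reshape is to reproduce it by hand: flatten the array column by column, then fill the new array column by column. The sketch below does this for the same arr; the loop and the name manual are only for illustration and are not how numpy implements reshape.

```python
import numpy as np

arr = np.arange(15).reshape(3, -1)
reshaped_f = arr.reshape((5, 3), order='F')

# Step 1: read the original data column by column.
flat = arr.ravel(order='F')

# Step 2: fill a 5x3 array column by column from that sequence.
manual = np.empty((5, 3), dtype=arr.dtype)
for col in range(3):
    manual[:, col] = flat[col * 5:(col + 1) * 5]

print(np.array_equal(reshaped_f, manual))                 # True

# The same result expressed with transposes and a default (C order) reshape.
print(np.array_equal(reshaped_f, arr.T.reshape(3, 5).T))  # True
```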