Pandas apply returns multiple columns
Originally, I wanted to process the DataFrame row by row with np.vectorize() and return several new fields, but this raised an error: ValueError: setting an array element with a sequence
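import numpy as np
import pandas as pd

def test():
    arr = np.random.randn(4, 4)
    cols = ['a', 'b', 'c']
    df = pd.DataFrame(data=arr, columns=['e', 'f', 'g', 'h'])

    def func(a, b, c):
        output1 = a + 1
        output2 = b * 2
        output3 = c - 4
        # Returning a Series for every element is what triggers the error
        return pd.Series([output1, output2, output3])

    vfunc = np.vectorize(func)
    # Raises ValueError: setting an array element with a sequence
    df[cols] = vfunc(df['e'], df['f'], df['g'])
    print(df)

test()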
The error occurs because np.vectorize expects func to return a scalar (or a tuple of scalars) for each element. Here every call returns a pd.Series, so the results cannot be packed into a numeric array, and their shape does not match what df[cols] expects. Using apply solves this. The parameter result_type="expand" converts list-like results into columns, so each returned value becomes the value in one column of the resulting DataFrame. In apply(func), the number of values returned by func must equal the number of columns in df[cols].
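import numpy as np
import pandas as pd

def test():
    arr = np.random.randn(4, 4)
    cols = ['a', 'b', 'c']
    df = pd.DataFrame(data=arr, columns=['e', 'f', 'g', 'h'])

    def func(row):
        # Each call receives one row as a Series
        a, b, c = row['e'], row['f'], row['g']
        output1 = a + 1
        output2 = b * 2
        output3 = c - 4
        return output1, output2, output3

    # axis=1 applies func per row; "expand" turns the tuple into columns
    df[cols] = df.apply(func, axis=1, result_type="expand")
    print(df)

test()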
Output:
          e         f         g         h         a         b         c
0  0.493280 -0.092513 -3.014135 -0.361842  1.493280 -0.185027 -7.014135
1  0.300695 -0.745392  0.591653 -1.752471  1.300695 -1.490785 -3.408347
2 -0.033944 -1.556307 -0.359979  1.808213  0.966056 -3.112615 -4.359979
3  0.701741 -0.272337  0.041114  0.150049  1.701741 -0.544674 -3.958886
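As a possible alternative to apply, np.vectorize can still be used if func returns a plain tuple of scalars rather than a pd.Series. In that case np.vectorize returns one array per output, and each array can be assigned to its own column. The following is a minimal sketch under that assumption. It reuses the column names from the example above. The .to_numpy() calls and the per-column assignment loop are choices made here for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

def func(a, b, c):
    # Return a tuple of scalars (not a pd.Series) so that
    # np.vectorize can build one output array per value.
    return a + 1, b * 2, c - 4

arr = np.random.randn(4, 4)
cols = ['a', 'b', 'c']
df = pd.DataFrame(data=arr, columns=['e', 'f', 'g', 'h'])

vfunc = np.vectorize(func)
# The result is a tuple of three 1-D arrays, one per returned value.
outputs = vfunc(df['e'].to_numpy(), df['f'].to_numpy(), df['g'].to_numpy())

# Assign each output array to its own new column.
for name, values in zip(cols, outputs):
    df[name] = values

print(df)
```

For simple arithmetic like this, operating on whole columns directly (for example df['e'] + 1) would give the same values without calling a Python function per element.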
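Note that for a single column, selecting with
df['id']
and with
ID = ['id']
df[ID]
gives different results. The former returns a one-dimensional Series with values [1, 2, 3, 4], while the latter returns a two-dimensional DataFrame with values [[1], [2], [3], [4]].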
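The difference can be checked directly by inspecting the type and shape of each selection. This is a small sketch; the sample DataFrame with an 'id' column holding 1 to 4 is assumed here to match the values quoted above.

```python
import pandas as pd

# Hypothetical sample data matching the values quoted above.
df = pd.DataFrame({'id': [1, 2, 3, 4]})

single = df['id']   # string key: returns a Series
ID = ['id']
framed = df[ID]     # list key: returns a DataFrame

print(type(single).__name__, single.shape)  # Series (4,)
print(type(framed).__name__, framed.shape)  # DataFrame (4, 1)
print(single.tolist())                      # [1, 2, 3, 4]
print(framed.values.tolist())               # [[1], [2], [3], [4]]
```

This matters when assigning results: a list of column names on the left-hand side expects two-dimensional data with one column per name.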
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html