Problem description:
When working with a pandas DataFrame, run the following operation:
df[df.line.str.contains('G')]
The goal is to select all rows of df whose line column contains the character 'G'. Instead, it raises:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-10f8503f73f2> in <module>()
----> df.line.str.contains('G')
D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2983
2984 # Do we have a (boolean) 1d indexer?
-> 2985 if com.is_bool_indexer(key):
2986 return self._getitem_bool_array(key)
2987
D:\Anaconda3\lib\site-packages\pandas\core\common.py in is_bool_indexer(key)
128 if not lib.is_bool_array(key):
129 if isna(key).any():
--> 130 raise ValueError(na_msg)
131 return False
132 return True
ValueError: cannot index with vector containing NA / NaN values
At first glance the message suggests that the line column contains NA / NaN values, and a search (for example on Baidu) turns up many posts that explain how to drop the NA / NaN values from that column.
However, dropping the rows with NA / NaN values in the line column still does not solve the problem. What now?
Solution:
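It is quite simple. Most likely the elements of the line column are not all of type str; some may be int or another type. For those non-string elements, .str.contains returns NaN instead of True or False, so the boolean mask itself contains NaN even when the column has no missing values.
So you just need to convert the whole line column to str.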
The operation is as follows:
df['line'] = df['line'].apply(str)  # Convert every element of the line column to str
df[df.line.str.contains('G')]  # Now run the original statement
Problem solved!
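One caveat with converting to str is that a real missing value turns into the literal string 'nan', which can then match patterns such as 'n' or 'a'. As a possible alternative, .str.contains accepts an na argument that fills in the missing entries of the mask. The sketch below compares both approaches on made-up sample data (not the original dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: strings, an int, and a genuine missing value.
df = pd.DataFrame({"line": ["G1", 7, np.nan, "xG"]})

# Option 1: convert every element to str, as described above.
# The missing value becomes the literal string 'nan'.
as_str = df["line"].apply(str)
result_str = df[as_str.str.contains("G")]

# Option 2: leave the column unchanged and treat NaN in the mask as False.
result_na = df[df["line"].str.contains("G", na=False)]

print(as_str.tolist())
print(result_str)
print(result_na)
```

Both options select the same rows here. Option 2 keeps the original element types in the column, which may be preferable if the non-string values are still needed later.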