[Solved] str.contains() problem: ValueError: cannot index with vector containing NA / NaN values

Problem description:
When working with a DataFrame, run the following operation:

df[df.line.str.contains('G')]

The goal is to select all rows of df whose line column contains the character 'G'. Instead, it raises:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-10f8503f73f2> in <module>()
---->  df.line.str.contains('G')

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2983 
   2984         # Do we have a (boolean) 1d indexer?
-> 2985         if com.is_bool_indexer(key):
   2986             return self._getitem_bool_array(key)
   2987 

D:\Anaconda3\lib\site-packages\pandas\core\common.py in is_bool_indexer(key)
    128             if not lib.is_bool_array(key):
    129                 if isna(key).any():
--> 130                     raise ValueError(na_msg)
    131                 return False
    132             return True

ValueError: cannot index with vector containing NA / NaN values

The message suggests that the line column contains NA or NaN values, and a search (on Baidu, for example) turns up many posts that show how to delete the NA / NaN values in the line column.

However, deleting the rows with NA / NaN values in the line column may still not solve the problem. What then?
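One way to see why dropping the missing rows is not enough is to check what the mask looks like afterwards. The sketch below uses a small made-up frame (the data is an assumption, not the original dataset) whose line column mixes strings, an int, and a missing value:

```python
import pandas as pd

# Hypothetical data standing in for the real frame:
# strings, one int, and one missing value in the 'line' column.
df = pd.DataFrame({"line": ["G1", "B2", 7, None, "G3"]})

# Drop the rows whose 'line' value is missing.
cleaned = df.dropna(subset=["line"])

# The int 7 survives dropna, and .str.contains returns NaN for it,
# so the mask still is not purely boolean.
mask = cleaned["line"].str.contains("G")
nan_in_mask = int(mask.isna().sum())

# Count the Python types present in the column, by type name.
type_counts = cleaned["line"].map(lambda v: type(v).__name__).value_counts()

print(nan_in_mask)
print(type_counts)
```

If the type count shows anything other than str, the NaN in the mask comes from those elements rather than from missing data.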

Solution:
It is simple. Most likely the elements of the line column are not all of type str; some may be int or another type. For such elements str.contains() returns NaN, which is why the resulting mask cannot be used for indexing.
So you just need to convert the whole line column to str.
The operation is as follows:

df['line'] = df['line'].apply(str) #Change the format of the line column to str

df[df.line.str.contains('G')] #Execute your corresponding statement

Problem solved!
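For comparison, here is a minimal sketch of the conversion fix next to an alternative that the post does not use: the na parameter of str.contains(), which sets the value used for elements that cannot be matched. The sample frame is made up for illustration.

```python
import pandas as pd

# Hypothetical data: the 'line' column mixes strings, an int, and a missing value.
df = pd.DataFrame({"line": ["G1", "B2", 7, None, "G3"]})

# Fix from the post: convert every element to str, then filter.
converted = df.copy()
converted["line"] = converted["line"].apply(str)
result_converted = converted[converted["line"].str.contains("G")]

# Alternative (not from the post): leave the column untouched and
# treat elements that cannot be matched as False.
result_na_false = df[df["line"].str.contains("G", na=False)]

print(result_converted)
print(result_na_false)
```

Note that the conversion turns a missing value into the text 'None' or 'nan', which a pattern such as 'n' would then match; na=False avoids this and leaves the original values unchanged.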
