Reading a CSV file with pandas fails because the file contains Chinese characters. The error reports that the 'utf-8' codec can't decode byte 0xc4 in position 0.
Solution:
When reading the file, pass encoding='GBK' to read_csv,
for example: pddata = pd.read_csv('felipe.csv', encoding='GBK')
If you are interested in the reason, read on.
As you know, the default encoding we use in Python is UTF-8, and read_csv assumes it unless told otherwise. For an introduction to character encodings, I recommend the "Strings and Encoding" chapter of Liao Da's Python tutorial. UTF-8 itself can represent Chinese, but this CSV file was saved in a different encoding, so its bytes are not valid UTF-8 and the read fails. We need to choose an encoding that matches how the file was written.
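To make the failure concrete, here is a minimal sketch that reproduces it without any file. The sample text is an assumption chosen for illustration (it is not the content of felipe.csv); it is encoded as GBK and then decoded as UTF-8, which is what happens when a GBK file is read with the default encoding.

```python
# Sample text is hypothetical; any Chinese text saved as GBK behaves similarly.
text = "你好,世界\n"
raw = text.encode("gbk")  # the bytes as they would sit in a GBK-encoded file

error = None
try:
    raw.decode("utf-8")  # what a UTF-8 reader attempts
except UnicodeDecodeError as exc:
    error = exc
    print("UTF-8 failed:", exc)

# Decoding with the encoding the bytes were written in recovers the text.
decoded = raw.decode("gbk")
print(decoded)
```

The first byte of the sample is 0xc4, the same byte named in the error above, because the GBK bytes for this text are not a valid UTF-8 sequence.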
So which encodings can handle Chinese characters? Open the official Python 3 documentation and find the "Standard Encodings" section of the codecs module. The third column of that table, Languages, shows which languages each encoding supports, so let's look through it.
I won't reproduce the table here; if you're interested, see the documentation. After a careful search, I found seven encodings that may support Chinese: big5, big5hkscs, gb2312, gbk, gb18030, hz, and iso2022_jp_2. In my tests, gb2312, gbk, and gb18030 read the CSV file with Chinese characters without problems. (Since all three work, let's just go with GBK.)
It works!
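The kind of test described above can be sketched roughly as follows. The helper name try_encodings and the sample bytes are assumptions made for illustration; the candidate list is the one from the text, with utf-8 added for comparison. With a real file you would read the raw bytes with open(path, 'rb').read() instead.

```python
# Candidate codec names from the Standard Encodings table, plus utf-8.
CANDIDATES = ["utf-8", "big5", "big5hkscs", "gb2312", "gbk",
              "gb18030", "hz", "iso2022_jp_2"]


def try_encodings(raw, candidates=CANDIDATES):
    """Hypothetical helper: report which codecs decode raw without error."""
    results = {}
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            results[name] = False
        else:
            results[name] = True
    return results


# Stand-in for the bytes of a GBK-encoded CSV file (illustrative only).
sample = "你好,世界\n张三,北京\n".encode("gbk")
results = try_encodings(sample)
for name, ok in results.items():
    print(f"{name:14s} {'ok' if ok else 'failed'}")
```

A codec that decodes without an error is not necessarily the right one, since some byte sequences are valid in more than one encoding and simply produce the wrong characters, so it is worth checking the decoded text by eye. Once a suitable name is found, it can be passed to read_csv as in pd.read_csv('felipe.csv', encoding='gbk').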