Python CSV Write To File Unreadable In Excel (Chinese Characters)
I am trying to perform text analysis on Chinese texts. The program is provided below. The result contains unreadable characters such as 浜烘皯鏃ユ姤绀捐. And if I chan…
Solution 1:
For UTF-8 encoding, Excel requires a BOM (byte order mark) codepoint written at the start of the file or it will assume an ANSI encoding, which is locale-dependent. U+FEFF is the Unicode BOM. Here's an example that will open in Excel correctly:
#!python2
#coding:utf8
import csv

data = [[u'American', u'美国人'],
        [u'Chinese', u'中国人']]

with open('results.csv', 'wb') as f:
    # Write the UTF-8 BOM first so Excel detects the encoding.
    f.write(u'\ufeff'.encode('utf8'))
    w = csv.writer(f)
    for row in data:
        w.writerow([item.encode('utf8') for item in row])
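To confirm the BOM actually made it into the file, a quick check is to look at the first three bytes, which should be EF BB BF, the UTF-8 encoding of U+FEFF (a minimal sketch that reuses the results.csv written above):
#!python2
# Print the first three bytes of the file; the UTF-8 BOM is '\xef\xbb\xbf'.
with open('results.csv', 'rb') as f:
    print repr(f.read(3))    # expected: '\xef\xbb\xbf'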
Python 3 makes this easier. Open the file with 'w', newline='', encoding='utf-8-sig' instead of 'wb'; the writer then accepts Unicode strings directly, and the encoding writes the BOM automatically:
#!python3
#coding:utf8
import csv

data = [['American', '美国人'],
        ['Chinese', '中国人']]

# 'utf-8-sig' writes the BOM automatically.
with open('results.csv', 'w', newline='', encoding='utf-8-sig') as f:
    w = csv.writer(f)
    w.writerows(data)
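To read the file back in Python 3, open it with the same utf-8-sig encoding so the BOM is stripped automatically (a minimal sketch, assuming the results.csv written above):
#!python3
import csv

# 'utf-8-sig' removes the BOM on read, so the first field comes back clean.
with open('results.csv', 'r', newline='', encoding='utf-8-sig') as f:
    rows = list(csv.reader(f))
# rows[0] == ['American', '美国人']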
There is also a third-party unicodecsv module that makes this easier in Python 2 as well:
#!python2
#coding:utf8
import unicodecsv

data = [[u'American', u'美国人'],
        [u'Chinese', u'中国人']]

with open('results.csv', 'wb') as f:
    w = unicodecsv.writer(f, encoding='utf-8-sig')
    w.writerows(data)
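unicodecsv also provides a matching reader. Here is a minimal read-back sketch, assuming the results.csv written above; decoding each field with 'utf-8-sig' simply drops a leading BOM if one is present:
#!python2
#coding:utf8
import unicodecsv

with open('results.csv', 'rb') as f:
    rows = list(unicodecsv.reader(f, encoding='utf-8-sig'))
# rows are lists of unicode strings, e.g. [u'American', u'美国人']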
Solution 2:
Here is another way that is a bit tricky:
#!python2
#coding:utf8
import csv

data = [[u'American', u'美国人'],
        [u'Chinese', u'中国人']]

with open('results.csv', 'wb') as f:
    f.write(u'\ufeff'.encode('utf8'))
    w = csv.writer(f)
    for row in data:
        w.writerow([item.encode('utf8') for item in row])
This code block generates a CSV file encoded as UTF-8. Then:
- Open the file with Notepad++ (or another editor with an encoding-conversion feature)
- Encoding -> Convert to ANSI
- Save
Open the file with Excel; it displays correctly. (A scripted alternative to the Notepad++ conversion is sketched below.)
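If you would rather not do the Notepad++ step by hand, the same conversion can be scripted. This is only a sketch that assumes a Simplified Chinese Windows locale, where "ANSI" means code page 936 (GBK), and the results_ansi.csv output name is just for illustration; other locales need their own code page:
#!python2
import io

# Read the UTF-8 CSV ('utf-8-sig' strips the BOM if present), then rewrite it
# in the locale's ANSI code page (GBK/cp936 assumed here).
with io.open('results.csv', 'r', encoding='utf-8-sig', newline='') as src:
    text = src.read()
with io.open('results_ansi.csv', 'w', encoding='gbk', newline='') as dst:
    dst.write(text)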