How to Detect the Encoding of a Text File with Python

Created: November-22, 2018

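Python has a useful package, chardet, that helps detect the encoding used in a file. No program can say with 100% certainty which encoding was used, which is why chardet reports the encoding it considers most probable, together with a confidence value. chardet can detect many common encodings, including ASCII and windows-1252.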
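
To see why detection can only ever be a best guess, here is a minimal sketch using only the standard library. The same bytes decode without error under two different encodings, so nothing in the bytes themselves says which one was intended. The sample string is an arbitrary choice for illustration.

```python
# Arbitrary sample text for illustration (not from the original article)
raw = "café".encode("utf-8")

# Both decodes succeed without raising; only one yields the intended text
as_utf8 = raw.decode("utf-8")
as_cp1252 = raw.decode("windows-1252")

print(repr(as_utf8))
print(repr(as_cp1252))
```

A detector therefore has to weigh how plausible each candidate decoding looks, rather than prove that one of them is correct.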

You can install chardet with pip:

pip install chardet

After that, you can use chardet from the command line:

% chardetect somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0

Or in Python:

import chardet

# Read the file as raw bytes; chardet.detect() expects bytes, not str
with open(file, "rb") as f:
    rawdata = f.read()
result = chardet.detect(rawdata)
charenc = result['encoding']
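
Once an encoding has been detected, the natural next step is to decode the file with it. The following is a minimal sketch of that step, not a definitive implementation. The helper name `read_text_guessing_encoding`, the `min_confidence` threshold of 0.5, and the UTF-8 fallback are all assumptions made for illustration; none of them come from chardet itself.

```python
import chardet


def read_text_guessing_encoding(path, min_confidence=0.5, fallback="utf-8"):
    """Hypothetical helper: decode a file using chardet's best guess."""
    with open(path, "rb") as f:
        raw = f.read()

    # chardet.detect() returns a dict with 'encoding' and 'confidence' keys
    result = chardet.detect(raw)
    encoding = result["encoding"]

    # 'encoding' is None when chardet cannot make a guess; fall back then,
    # or when the confidence is below the (assumed) threshold.
    if encoding is None or result["confidence"] < min_confidence:
        encoding = fallback

    try:
        # errors="replace" avoids a crash if the guess turns out to be wrong
        return raw.decode(encoding, errors="replace")
    except LookupError:
        # The detected name might not be a codec Python knows about
        return raw.decode(fallback, errors="replace")
```

Because the result is only a guess, treat the returned text with some caution when the reported confidence is low, as with the windows-1252 result at confidence 0.5 in the command-line output above.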