프로그래밍/Python
[python] 문자열 인코딩 확인하기. chardet
홍반장水_
2024. 3. 22. 14:25
반응형
https://pypi.org/project/chardet/
Project description
Chardet: The Universal Character Encoding Detector




- ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
- Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
- EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
- EUC-KR, ISO-2022-KR, Johab (Korean)
- KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
- ISO-8859-5, windows-1251 (Bulgarian)
- ISO-8859-1, windows-1252, MacRoman (Western European languages)
- ISO-8859-7, windows-1253 (Greek)
- ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
- TIS-620 (Thai)
Note
Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily disabled until we can retrain the models.
Requires Python 3.7+.
반응형