universal character encoding detector

cChardet

cChardet is a high-speed universal character encoding detector, implemented as a Python binding to libcharsetdetect.

Requires

Cython: http://www.cython.org/

uchardet-enhanced: https://bitbucket.org/medoc/uchardet-enhanced/overview

Install Cython with pip (pip install -U cython) or easy_install (easy_install -U cython).

Benchmark

See tests.TestCchardetSpeed in tests.py.

Sample (shift_jis): testdata/wikipediaJa_One_Thousand_and_One_Nights.txt

Result

chardet: 4.009999990463257s, shift_jis

cchardet: 0.0009999275207519531s, shift_jis
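A comparison like the one above (cchardet finishing roughly 4,000 times faster than chardet on the same sample) can be approximated with a small timing harness. The sketch below is not the code in tests.TestCchardetSpeed. The helper names time_detector and load_detectors are hypothetical, and it assumes that cchardet exposes a detect(bytes) function in the same way chardet does. The sample text in the demo is a stand-in for the testdata file.

```python
import time


def time_detector(detect, data, repeat=1):
    """Call detect(data) `repeat` times; return (best_seconds, last_result)."""
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = detect(data)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


def load_detectors():
    """Collect whichever detectors are importable; both are optional."""
    detectors = {}
    for name in ("chardet", "cchardet"):
        try:
            module = __import__(name)
        except ImportError:
            continue
        # Assumption: the module exposes detect(bytes). This is true for
        # chardet and is only assumed here for cchardet.
        detect = getattr(module, "detect", None)
        if callable(detect):
            detectors[name] = detect
    return detectors


if __name__ == "__main__":
    # Stand-in Shift_JIS sample; replace with the bytes of the testdata file.
    sample = "千夜一夜物語".encode("shift_jis") * 1000
    for name, detect in load_detectors().items():
        seconds, result = time_detector(detect, sample)
        print(f"{name}: {seconds}s, {result}")
```

Passing the detector in as a plain callable keeps the timing code independent of either library, so the same harness measures both under identical conditions.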

Contact

My blog

Sorry for my poor English :)