diff --git a/readme.md b/readme.md index 85fe794..b32fe4e 100644 --- a/readme.md +++ b/readme.md @@ -1,3 +1,5 @@ + + # cChardet This library is high speed universal character encoding detector. - binding to [charsetdetect](https://bitbucket.org/medoc/uchardet-enhanced/overview). diff --git a/readme.rst b/readme.rst new file mode 100644 index 0000000..65e1240 --- /dev/null +++ b/readme.rst @@ -0,0 +1,127 @@ +cChardet +======== + +This library is high speed universal character encoding detector. - +binding to `charsetdetect`_. + +This library is faster than `chardet`_. + +Support codecs +============== + +- Big5 +- EUC-JP +- EUC-KR +- GB18030 +- gb18030 +- HZ-GB-2312 +- IBM855 +- IBM866 +- ISO-2022-CN +- ISO-2022-JP +- ISO-2022-KR +- ISO-8859-2 +- ISO-8859-5 +- ISO-8859-7 +- ISO-8859-8 +- KOI8-R +- Shift\_JIS +- TIS-620 +- UTF-8 +- UTF-16BE +- UTF-16LE +- UTF-32BE +- UTF-32LE +- windows-1250 +- windows-1251 +- windows-1252 +- windows-1253 +- windows-1255 +- x-euc-tw +- X-ISO-10646-UCS-4-2143 +- X-ISO-10646-UCS-4-3412 +- x-mac-cyrillic + +Requires +======== + +- Cython: `http://www.cython.org/`_ + +Install +======= + +Build uchardet-enhanced +~~~~~~~~~~~~~~~~~~~~~~~ + +1. $cd /tmp + +2. $git clone git://github.com/PyYoshi/cChardet.git + +3. $cd cChardet + +4. $python setup.py build + +5. $sudo python setup.py install + +Test +==== + +- $sudo easy\_install or pip install -U chardet nose + +- $nosetests –nocapture tests.py + +Benchmark +========= + +see `tests.TestCchardetSpeed`_ + +Sample(shift\_jis): +~~~~~~~~~~~~~~~~~~~ + +- `test/testdata/wikipediaJa\_One\_Thousand\_and\_One\_Nights\_SJIS.txt`_ + +PC Spec.: +~~~~~~~~~ + +- CPU: Intel Core i7 860 2.8GHz + +- RAM: DDR3-1333 16GB + +- Platform: Windows 7 HP x64, Python 2.7.3 32-bit + +Result: +~~~~~~~ + +- chardet: 4.009999990463257s, shift\_jis + +- cchardet: 0.0009999275207519531s, shift\_jis + +License +======= + +- This library files(“cchardet.pyx”,“setup.py”,“tests.py”) are “The MIT + License”. + +- Other Library License: Please, look at the “ext” directory. + +Thanks +====== + +- `https://bitbucket.org/medoc/uchardet-enhanced/overview`_ + +- `http://www.cython.org/`_ + +Contact +======= + +`My blog`_ + +Sorry for my poor English :) + +.. _charsetdetect: https://bitbucket.org/medoc/uchardet-enhanced/overview +.. _chardet: http://pypi.python.org/pypi/chardet +.. _`http://www.cython.org/`: http://www.cython.org/ +.. _tests.TestCchardetSpeed: https://github.com/PyYoshi/cChardet/blob/master/test/tests.py#L415 +.. _test/testdata/wikipediaJa\_One\_Thousand\_and\_One\_Nights\_SJIS.txt: https://github.com/PyYoshi/cChardet/blob/master/test/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt +.. _`https://bitbucket.org/medoc/uchardet-enhanced/overview`: https://bitbucket.org/medoc/uchardet-enhanced/overview +.. _My blog: http://blog.remu.biz