PyYoshi 2012-07-07 12:22:34 +09:00
parent d841345327
commit ffb55ca55b

@@ -4,7 +4,6 @@
This library is a high-speed universal character encoding detector, a binding to [charsetdetect](https://bitbucket.org/medoc/uchardet-enhanced/overview).
This library is faster than [chardet](http://pypi.python.org/pypi/chardet).
# Supported codecs
* Big5
* EUC-JP
@@ -38,14 +37,12 @@ This library is faster than [chardet](http://pypi.python.org/pypi/chardet).
* X-ISO-10646-UCS-4-2143
* X-ISO-10646-UCS-4-3412
* x-mac-cyrillic
# Requires
* Cython: [http://www.cython.org/](http://www.cython.org/)
e.g.) Ubuntu 12.04
$sudo apt-get install build-essential python-dev cython
# Installation
$cd /tmp
@@ -60,7 +57,6 @@ e.g.) Ubuntu 12.04
or
$sudo easy_install cchardet
# Example
```python
@@ -72,26 +68,22 @@ print(result)
result2 = cchardet.detect_with_confidence(msg)
print(result2)
```
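As a hedged sketch of what one might do with a detection result, the helper below decodes a byte string using whatever encoding name a detector returns. It assumes the detector is a callable that returns an encoding name (or `None`) for a byte string; check what `cchardet.detect` actually returns in your installed version before passing it in. `decode_with_detector` and `fake_detect` are hypothetical names used only for illustration.

```python
def decode_with_detector(data, detector, fallback="utf-8"):
    """Decode bytes using the encoding name returned by detector(data)."""
    encoding = detector(data) or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back and replace bad bytes.
        return data.decode(fallback, errors="replace")


def fake_detect(data):
    # Hypothetical stand-in for a real detector; always answers "shift_jis".
    return "shift_jis"


msg = "こんにちは".encode("shift_jis")
text = decode_with_detector(msg, fake_detect)
print(text)
```

Passing the detector in as an argument keeps the helper independent of any particular library, so the same code works with a stub in tests and a real detector in use.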
# Test
$sudo pip install -U chardet nose    (or: $sudo easy_install -U chardet nose)
$cd test
$nosetests --nocapture tests.py
# Benchmark
code: [tests.TestCchardetSpeed](https://github.com/PyYoshi/cChardet/blob/master/test/tests.py#L415)
sample: [test/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt](https://github.com/PyYoshi/cChardet/blob/master/test/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt)
### Performance:
CPU: Intel Core i7 860 2.8GHz
RAM: DDR3-1333 16GB
Platform: Windows 7 HP x64, Python 2.7.3 32-bit
### Result:
<table>
@@ -105,17 +97,14 @@ Platform: Windows 7 HP x64, Python 2.7.3 32-bit
<td>cchardet</td><td>500.03</td><td>shift_jis</td>
</tr>
</table>
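For readers who want to reproduce a comparison like this on their own machine, here is a minimal timing sketch. It assumes the numeric column is a calls-per-second figure (the header row is not shown here) and it is not the code in `tests.TestCchardetSpeed`. `calls_per_second` and `dummy_detect` are hypothetical names; swap in a real detector function to measure it.

```python
import time


def calls_per_second(func, data, repeat=1000):
    """Call func(data) `repeat` times and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    return repeat / elapsed if elapsed > 0 else float("inf")


def dummy_detect(data):
    # Hypothetical stand-in for a real detector, used only so the sketch runs.
    return "shift_jis" if b"\x82" in data else "ascii"


# Arbitrary sample bytes; a real benchmark would read the sample file instead.
sample = b"\x82\xb1\x82\xf1" * 1000
rate = calls_per_second(dummy_detect, sample)
print("%.2f calls/s" % rate)
```

Absolute numbers depend heavily on hardware, Python version, and sample size, so only ratios measured on the same machine are meaningful.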
# License
* The files of this library ("cchardet.pyx", "setup.py", "tests.py") are released under the MIT License.
* Licenses of other libraries: please see the [ext](https://github.com/PyYoshi/cChardet/tree/master/src/ext) directory.
# Thanks
* [https://bitbucket.org/medoc/uchardet-enhanced/overview](https://bitbucket.org/medoc/uchardet-enhanced/overview)
* [http://www.cython.org/](http://www.cython.org/)
# Contact
[My blog](http://blog.remu.biz)