update readme

PyYoshi 2016-10-17 14:02:06 +09:00
parent b9f5a14ef9
commit 696f9b3449


@@ -3,46 +3,42 @@ cChardet
 cChardet is high speed universal character encoding detector. - binding to [charsetdetect](https://bitbucket.org/medoc/uchardet-enhanced/overview).
 ## Support codecs
-* Big5
-* EUC-JP
-* EUC-KR
-* GB18030
-* HZ-GB-2312
-* IBM855
-* IBM866
-* ISO-2022-CN
-* ISO-2022-JP
-* ISO-2022-KR
-* ISO-8859-2
-* ISO-8859-5
-* ISO-8859-7
-* ISO-8859-8
-* KOI8-R
-* Shift_JIS
-* TIS-620
-* UTF-8
-* UTF-16BE
-* UTF-16LE
-* UTF-32BE
-* UTF-32LE
-* WINDOWS-1250
-* WINDOWS-1251
-* WINDOWS-1252
-* WINDOWS-1253
-* WINDOWS-1255
-* EUC-TW
-* X-ISO-10646-UCS-4-2143
-* X-ISO-10646-UCS-4-3412
-* x-mac-cyrillic
+- Big5
+- EUC-JP
+- EUC-KR
+- GB18030
+- HZ-GB-2312
+- IBM855
+- IBM866
+- ISO-2022-CN
+- ISO-2022-JP
+- ISO-2022-KR
+- ISO-8859-2
+- ISO-8859-5
+- ISO-8859-7
+- ISO-8859-8
+- KOI8-R
+- Shift_JIS
+- TIS-620
+- UTF-8
+- UTF-16BE
+- UTF-16LE
+- UTF-32BE
+- UTF-32LE
+- WINDOWS-1250
+- WINDOWS-1251
+- WINDOWS-1252
+- WINDOWS-1253
+- WINDOWS-1255
+- EUC-TW
+- X-ISO-10646-UCS-4-2143
+- X-ISO-10646-UCS-4-3412
+- x-mac-cyrillic
 ## Requires
-* Cython: [http://www.cython.org/](http://www.cython.org/)
-e.g.) Ubuntu 12.04
-```bash
-$ sudo apt-get install build-essential python-dev cython
-```
+- Cython: [http://www.cython.org/](http://www.cython.org/)
 ## Installation
@@ -50,7 +46,6 @@ $ sudo apt-get install build-essential python-dev cython
 $ cd /tmp
 $ git clone git://github.com/PyYoshi/cChardet.git
 $ cd cChardet
-$ python setup.py build
 $ python setup.py install
 ```
@@ -65,35 +60,53 @@ $ pip install -U cchardet
 ```python
 # -*- coding: utf-8 -*-
 import cchardet as chardet
-with open(r"tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
+with open(r"src/tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
     msg = f.read()
 result = chardet.detect(msg)
 print(result)
 ```
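A note on the usage example above: `chardet.detect(msg)` returns a dict with `encoding` and `confidence` keys, and a common next step is decoding the bytes with the detected encoding. The sketch below shows one possible way to do that and is not part of this commit. `decode_detected`, its `fallback` parameter, and `fake_detect` are hypothetical names introduced here for illustration; the detector is passed in as an argument so the block runs without cchardet installed.

```python
def decode_detected(data, detect, fallback="utf-8"):
    """Decode bytes using the encoding reported by a detect() callable.

    `detect` is assumed to behave like cchardet.detect: it takes bytes and
    returns a dict with "encoding" and "confidence" keys.
    """
    result = detect(data)
    # The detector may report None when it cannot decide.
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # LookupError: Python has no codec under the reported name.
        # UnicodeDecodeError: the guess was wrong for these bytes.
        return data.decode(fallback, errors="replace")


def fake_detect(data):
    # Hypothetical stand-in for cchardet.detect, used only for this demo.
    return {"encoding": "SHIFT_JIS", "confidence": 0.99}


text = decode_detected("千夜一夜物語".encode("shift_jis"), fake_detect)
print(text)
```

With cchardet installed, `decode_detected(msg, chardet.detect)` would apply the same logic to the bytes read in the usage example. The `LookupError` branch matters because some names in the supported codec list, such as `X-ISO-10646-UCS-4-2143`, are not Python codec names.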
 ## Benchmark
-code: [tests.TestCchardetSpeed](https://github.com/PyYoshi/cChardet/blob/master/src/tests/bench.py)
-sample: [tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt](https://github.com/PyYoshi/cChardet/blob/master/src/tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt)
+```bash
+$ cd src/
+$ pip install chardet
+$ python tests/bench.py
+```
-### Performance:
-CPU: Intel Core i7 860 2.8GHz
-RAM: DDR3-1333 16GB
-Platform: Kubuntu 12.04 amd64, Python 2.7.3 64-bit
-### Result:
+### Performance
+CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz
+RAM: DDR3 1600Mhz 16GB
+Platform: Ubuntu 16.04 amd64
+#### Python 2.7.12
 <table>
 <tr>
 <th></th><th>Request (call/s)</th>
 </tr>
 <tr>
-<td>chardet</td><td>0.32</td>
+<td>chardet</td><td>0.26</td>
 </tr>
 <tr>
-<td>cchardet</td><td>975.46</td>
+<td>cchardet</td><td>1408.73</td>
+</tr>
+</table>
+#### Python 3.5.2
+<table>
+<tr>
+<th></th><th>Request (call/s)</th>
+</tr>
+<tr>
+<td>chardet</td><td>0.28</td>
+</tr>
+<tr>
+<td>cchardet</td><td>1380.40</td>
 </tr>
 </table>
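The tables report throughput in calls per second. As a rough illustration of what that metric means, the sketch below times repeated calls of a function on the same input and divides the call count by the elapsed time. It is not the code in `tests/bench.py`; `calls_per_second`, `repeat`, and `stand_in_detect` are assumptions made here for illustration, and `time.perf_counter` requires Python 3.3 or later.

```python
import time


def calls_per_second(func, data, repeat=100):
    """Return how many times per second func(data) completes."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return repeat / elapsed if elapsed > 0 else float("inf")


def stand_in_detect(data):
    # Hypothetical workload standing in for a real detector call.
    return {"encoding": "ascii", "confidence": 1.0}


rate = calls_per_second(stand_in_detect, b"sample bytes" * 1000)
print("%.2f call/s" % rate)
```

Passing `chardet.detect` or `cchardet.detect` as `func`, with the sample file's bytes as `data`, would give figures in the same unit as the tables, though not necessarily the same values.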