Jun Da's WebCentral



Chinese text computing

(This is the 1998 version. An updated 2004 version is now available)


Character Statistics Page

Chinese character frequency lists and digrams
(Last updated: 2000-01-29)

This page provides two kinds of lists: character frequency lists and digram lists. You can look up individual characters using the Search form.

Note: A GB-enabled web browser is needed to view this page and the frequency lists properly. See this simple tutorial for more information. Please read the Technical Notes page for details on data collection and computation.
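For readers who want to process GB-encoded text outside a browser, here is a minimal Python sketch. It assumes the data is in GB2312, which Python exposes as the `gb2312` codec. The helper name `decode_gb` is hypothetical and is not part of this site.

```python
def decode_gb(raw: bytes) -> str:
    """Decode GB2312 bytes to a Python string.

    Assumption: the input is GB2312. Undecodable bytes are replaced
    with U+FFFD rather than raising an error.
    """
    return raw.decode("gb2312", errors="replace")


# Round trip on a short sample: each Chinese character takes two bytes in GB2312.
raw = "汉字单字字频列表".encode("gb2312")
text = decode_gb(raw)
```

If a file mixes in characters outside GB2312, Python's `gbk` or `gb18030` codecs cover a larger repertoire and may be a better fit.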

1. Character Frequency List 汉字单字字频列表

Corpus | Total number of characters | Number of distinct characters | Frequency list
CHISA 海外学人 | 3,799,731 | 5,229 | chisa.html
CW 计算机世界 | 1,857,538 | 3,027 | cw.html
FHY 枫华园 | 4,718,131 | 5,376 | fhy.html
HDTX 华德通讯 | 2,317,618 | 4,946 | hdtx.html
HXWZ 华夏文摘 | 11,578,283 | 5,764 | hxwz.html
XYS 新语丝 | 1,883,289 | 5,130 | xys.html
EBOOKS 新语丝电子书籍收藏 | 19,221,863 | 6,267 | ebook.html
Total 整个语料库 |  | 6,538 | total.html
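As an illustration of the two counts in the table above (total characters and distinct characters), here is a minimal Python sketch. It assumes that only code points in the CJK Unified Ideographs block (U+4E00 to U+9FFF) count as characters. The actual counting rules are described in the Technical Notes and may differ. The names `is_han` and `char_stats` are hypothetical.

```python
from collections import Counter


def is_han(ch: str) -> bool:
    # Assumption: a "character" is a code point in U+4E00..U+9FFF.
    return "\u4e00" <= ch <= "\u9fff"


def char_stats(text: str):
    """Return (total characters, distinct characters, frequency-ranked list)."""
    counts = Counter(ch for ch in text if is_han(ch))
    total = sum(counts.values())
    return total, len(counts), counts.most_common()


# The full-width comma is skipped because it is not a Han character.
sample = "汉字的字频，字频列表"
total, distinct, ranked = char_stats(sample)
```

For this sample, `total` is 9, `distinct` is 6, and the top-ranked entry is `("字", 3)`.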

2. Digram List 两字符串频率列表

See this page, which contains digram lists based on both raw frequency and mutual information scores.
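The two scores mentioned above can be sketched in Python as follows. The sketch assumes that a digram is a pair of adjacent Han characters, and that mutual information is the common pointwise form, log2(P(xy) / (P(x) P(y))), with probabilities estimated from raw counts. The formula actually used for the lists on this site is given in the Technical Notes and may differ. The names `is_han` and `digram_stats` are hypothetical.

```python
import math
from collections import Counter


def is_han(ch: str) -> bool:
    # Assumption: a "character" is a code point in U+4E00..U+9FFF.
    return "\u4e00" <= ch <= "\u9fff"


def digram_stats(text: str) -> dict:
    """Map each digram to (raw frequency, pointwise mutual information in bits)."""
    unigrams = Counter(ch for ch in text if is_han(ch))
    # Pair only characters that are adjacent in the original text, so that
    # no digram spans punctuation or other non-Han characters.
    digrams = Counter(
        a + b for a, b in zip(text, text[1:]) if is_han(a) and is_han(b)
    )
    n_uni = sum(unigrams.values())
    n_di = sum(digrams.values())
    scores = {}
    for pair, count in digrams.items():
        p_xy = count / n_di
        p_x = unigrams[pair[0]] / n_uni
        p_y = unigrams[pair[1]] / n_uni
        scores[pair] = (count, math.log2(p_xy / (p_x * p_y)))
    return scores


scores = digram_stats("中国人民，中国")
```

In this tiny sample, 中国 has the highest raw frequency (2), while 人民 has the higher mutual information score because its two characters never occur apart. This is why the two rankings can disagree.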



Copyright. 1998-2000. Jun Da. jda@mtsu.edu