Chinese text computing

Frequency statistics 频率统计

Bigram frequencies and mutual information in Modern Chinese

Note: A bigram may be a nonsense combination of characters.
Pick a corpus: 请选择语料库

Bigram frequency is equal to or greater than:
(Enter a number between 1 and 60,000. For example: 50)


Mutual information value is equal to or greater than:
(Enter a number between 0 and 30. For example: 3.5)
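
For readers who want to reproduce this kind of filtering on their own text, the following is a minimal sketch, not the method used to build this page's data. It assumes that mutual information means pointwise mutual information with a base-2 logarithm, MI(xy) = log2(p(xy) / (p(x) p(y))), which this page does not state. The function names `bigram_stats` and `filter_bigrams` and the parameters `min_freq` and `min_mi` are hypothetical, chosen to mirror the two thresholds above.

```python
import math
from collections import Counter


def bigram_stats(text):
    """Map each adjacent character pair to (frequency, mutual information).

    Assumption: MI is pointwise mutual information in bits. Whitespace is
    dropped first, so characters separated only by whitespace count as
    adjacent; punctuation is not treated specially.
    """
    chars = [c for c in text if not c.isspace()]
    char_counts = Counter(chars)
    bigram_counts = Counter(a + b for a, b in zip(chars, chars[1:]))
    n_chars = len(chars)
    n_bigrams = max(n_chars - 1, 1)  # avoid division by zero on tiny input

    stats = {}
    for bigram, freq in bigram_counts.items():
        p_xy = freq / n_bigrams
        p_x = char_counts[bigram[0]] / n_chars
        p_y = char_counts[bigram[1]] / n_chars
        stats[bigram] = (freq, math.log2(p_xy / (p_x * p_y)))
    return stats


def filter_bigrams(stats, min_freq=1, min_mi=0.0):
    """Keep bigrams whose frequency and MI both meet the given thresholds."""
    return {
        bigram: (freq, mi)
        for bigram, (freq, mi) in stats.items()
        if freq >= min_freq and mi >= min_mi
    }
```

Calling `filter_bigrams(bigram_stats(text), min_freq=50, min_mi=3.5)` would correspond to entering the two example values in the form. On a small sample the MI estimates are unreliable, so the thresholds are only meaningful on a large corpus.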



  • JavaScript must be enabled in your browser if you want to download the data. The download button on this page relies on JavaScript to redirect the browser to the download page.
  • At this moment, bigram frequency information is only available for two sub-corpora: the general fiction sub-corpus and the news sub-corpus.
  • There are 973,338 bigrams in the general fiction sub-corpus and 730,067 in the news sub-corpus. It may take some time for the results to display (depending on the display criteria you set). Please wait patiently after you click on the Submit button.
  • To search for individual bigram information, click here!


Copyright. 1998-2024. Jun Da. Page last updated: 2010-09-16