Please refer to http://www.umiacs.umd.edu/users/resnik/nlstat_tutorial_summer1998/Lab_ngrams.html for the computing procedure and formula.
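As a rough illustration only, the sketch below computes MI for adjacent token pairs using the common pointwise definition, MI(x, y) = log2( P(x, y) / (P(x) P(y)) ). The function name bigram_mi and the way probabilities are estimated from raw counts are assumptions made for this example; the tutorial linked above may normalize counts differently, so treat its formula as authoritative.

```python
import math
from collections import Counter


def bigram_mi(tokens):
    """Return {(w1, w2): MI score} for every adjacent token pair.

    Assumption: probabilities are plain relative frequencies, with
    unigrams normalized by the token count and bigrams by the number
    of adjacent pairs. No smoothing or frequency cutoff is applied.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # Base-2 log of observed vs. expected-under-independence.
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores
```

For Chinese text, tokens could be a list of single characters, so that each bigram is a candidate two-character word. Very rare pairs tend to receive inflated MI scores, so a minimum-frequency filter is usually worth adding before interpreting the results.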
The following guidelines can be used:
A high or medium MI score indicates strong collocation. If MI is below 1, the two tokens are unlikely to be related. MI scores between 1 and 3 fall in a gray area. In my intuitive judgement, the bigrams with an MI score above 2.5 appear to be bisyllabic words in Chinese, though this intuition needs to be verified.
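The guidelines above can be expressed as a small helper. This is a sketch, not a calibrated rule: the function name interpret_mi, the label strings, and the handling of the boundary values (exactly 1 and exactly 3 are counted as gray area) are assumptions, as is reading "above 3" as strong collocation.

```python
def interpret_mi(score, low=1.0, high=3.0):
    """Map an MI score to a rough collocation-strength label.

    Assumed cutoffs: below `low` is unlikely related, `low` to `high`
    inclusive is the gray area, and anything above `high` is strong.
    """
    if score < low:
        return "unlikely related"
    if score <= high:
        return "gray area"
    return "strong"
```

The cutoffs are exposed as parameters so they can be adjusted, for example to test the 2.5 threshold suggested for Chinese bisyllabic words.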
Other statistical measures such as t-score, likelihood ratio, chi-square and Yule's Y are often used to measure collocation strength. For an introduction and comparison of those measures, please refer to, among others: