Academia Sinica Balanced Corpus

(706 words)

Author(s): Keh-Jiann CHEN | Chu-Ren HUANG
1. The Sinica Corpus
The Academia Sinica Balanced Corpus (Sinica Corpus) is the first proportionally sampled Chinese corpus with part-of-speech tagging. The first version (Sinica 1.0), comprising two million words, was compiled and opened to the research community through direct license in 1995 (Huang et al. 1995). After 10 years of further development, it was upgraded in 2005 to Sinica 5.0, with ten million words. The corpus is available through an on-line web service and can also be accessed through direct licensing from the ROCLING Society (…
Date: 2017-03-02