Tools

In the course of my research and teaching, I have created and continue to maintain a number of online digital tools and platforms related to Chinese, including the following:

Chinese Text Project (ctext.org)

A full-text digital library for pre-modern Chinese works, now the largest such resource in the world. Leveraging crowdsourcing and specially developed OCR technology, the system provides access to tens of thousands of works, many of which have never before been digitized. The collection continues to grow through collaborations with major academic libraries and scanning centers, and new functionality is continually being added. Recent additions include crowdsourced semantic annotation and a crowdsourced knowledge graph of historical data exposed in RDF and other formats as Linked Open Data.

Text Tools for ctext.org

A set of digital research tools for online analysis of textual materials (primarily but not exclusively Chinese materials). Functionality includes n-grams, regular expressions, text reuse identification, and network visualization.
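As a rough illustration of two of the techniques named above, character n-grams and text reuse identification, the following sketch counts overlapping n-grams and lists the n-grams two passages have in common. It is a minimal standalone example, not the Text Tools implementation; the function names `char_ngrams` and `shared_ngrams` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count every overlapping run of n characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def shared_ngrams(text_a, text_b, n):
    """Return the set of n-grams that occur in both texts.

    Shared n-grams are a simple signal of possible text reuse;
    this is an assumption-level sketch, not a full alignment method.
    """
    return set(char_ngrams(text_a, n)) & set(char_ngrams(text_b, n))


passage_a = "學而時習之不亦說乎"
passage_b = "子曰學而時習之"

# Bigrams common to both passages
common = shared_ngrams(passage_a, passage_b, 2)
print(sorted(common))
```

Character-level n-grams are used here because pre-modern Chinese text is normally written without word boundaries, so no tokenizer is needed for this kind of comparison.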

TextRef.org

A simple but effective distributed catalog system designed to collate data on online editions of pre-modern Chinese texts held in many different systems. Each group publishes its data online in a standard format; the data is then aggregated into the TextRef system for cross-catalog search, so that many different groups can maintain parts of the data without prior coordination. The tool is designed to be independent of many details of the data itself; a second installation of the same software, BiogRef.org, provides the same functionality for biographical databases of historical Chinese individuals.
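The aggregation idea described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields (`title`, `url`), the source names, the example.org URLs, and the functions `aggregate` and `search` are all hypothetical, and the actual TextRef.org data format is not shown here.

```python
from collections import defaultdict


def aggregate(catalogs):
    """Merge independently maintained catalogs into one index keyed by title.

    catalogs maps a source name to a list of records; each record is
    assumed (for this sketch) to carry a "title" and a "url".
    """
    index = defaultdict(list)
    for source, records in catalogs.items():
        for record in records:
            index[record["title"]].append(
                {"source": source, "url": record["url"]}
            )
    return dict(index)


def search(index, query):
    """Return all titles containing the query, with editions from every source."""
    return {title: eds for title, eds in index.items() if query in title}


# Hypothetical data from two groups that never coordinated with each other
catalogs = {
    "library-a": [{"title": "論語", "url": "https://example.org/a/lunyu"}],
    "library-b": [
        {"title": "論語", "url": "https://example.org/b/analects"},
        {"title": "孟子", "url": "https://example.org/b/mengzi"},
    ],
}

index = aggregate(catalogs)
print(search(index, "論語"))
```

Because the aggregator depends only on a small set of shared fields, the same code could in principle serve a different kind of catalog, which mirrors the way the same software is reused for BiogRef.org.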
