Author Archives: dsturgeon

Unsupervised Extraction of Training Data for Pre-Modern Chinese OCR

Published in the Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference (FLAIRS-30), 2017. Abstract: Many mainstream OCR techniques involve training a character recognition model using labeled exemplary images of each individual character to be recognized. For modern …


Text Tools for ctext.org

This tutorial introduces some of the main functionality of the “Text Tools” plugin for the Chinese Text Project database and digital library along with suggested example tasks and use cases. [Online version of this tutorial: https://dsturgeon.net/texttools (English); https://dsturgeon.net/texttools-ja (Japanese)] 1 …


Harvard-Yenching Library East Asian Digital Humanities Series

Looking forward to discussing the Chinese Text Project at the second meeting of this exciting new series! Introducing the Chinese Text Project: The Chinese Text Project is an online open-access digital library that makes pre-modern Chinese texts available to readers …


Practical introduction to ctext.org

This tutorial briefly summarizes some of the most common tasks on the Chinese Text Project database and digital library from a user perspective, with suggested example tasks intended to introduce core functionality of the system. Online versions of this tutorial: …


Towards a sustainable digital infrastructure for historical Chinese texts

Paper presented at the Open Conference on Digital Infrastructures for Global Philology, Leipzig University, 21 February 2017. [Download slides] This paper describes the current status and initial results of an ongoing project to create a scalable and sustainable infrastructure for …


Deep Dive into Digital and Data Methods for Chinese Studies

I’m really looking forward to taking part in the University of Michigan’s “Deep Dive into Digital and Data Methods for Chinese Studies” series later this month, where I’ll be leading the following sessions: Text Reuse in Early Chinese Texts: A …


Classical Chinese Literature in a Digital Age

I’m very excited to be visiting Tsukuba University in Japan next week, where I will be giving a talk titled “Classical Chinese Literature in a Digital Age” (December 15), and also presenting a paper on “Optical Character Recognition for pre-modern …


Towards a dynamic, scalable digital library of pre-modern Chinese

Paper to be presented at the 7th International Conference of Digital Archives and Digital Humanities, December 2016, National Taiwan University. This paper contrasts two radically different approaches to full-text digital library design and implementation: firstly, the “static database approach”, in …


Harvard Yenching Library Chinese materials added to ctext.org

Update to the CTP: Thanks to the support of Harvard Yenching Library, over 5 million pages of scanned materials from the Yenching Library collection have been added to the Library section of the site, including high-quality images from the …


Digitizing Millions of Pages of Chinese History

A poster presented at the 60th anniversary celebration of Harvard’s Fairbank Center.
