Category Archives: Digital Humanities
JADH Poster: DH research and teaching with digital library APIs
At this year’s Japanese Association for Digital Humanities conference, in addition to giving a keynote on digital infrastructure, I presented this poster on the specific example of full-text digital library APIs being used in ctext.org and for teaching at … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers
Collaboration at scale: emerging infrastructures for digital scholarship
Keynote lecture, Japanese Association for Digital Humanities (JADH 2017), Kyoto. Abstract: Modern technological society is possible only as a result of collaborations constantly taking place between countless individuals and groups working on tasks which at first glance may seem independent … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers
Digital humanities and the digital library
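Subtitled “OCR, crowdsourcing, and text mining of Chinese historical texts”. Paper to be presented at the CADAL Project Work Conference on Digital Resources Sharing and Application, Zhejiang University, 16 June 2017.

Digital humanities and the digital library: OCR, crowdsourcing, and text mining of Chinese historical texts

This talk introduces the main technologies used in the Chinese Text Project. The Chinese Text Project is one of the world's largest digital libraries of pre-modern transmitted Chinese texts, and its public interface is currently used by more than 25,000 users each day. Its main original technologies fall into three categories: (1) optical character recognition (OCR) for pre-modern Chinese materials; (2) a crowdsourcing interface that draws on the work of a large number of users; (3) an open application programming interface (API) that both enables integration with other online tools and provides a route to text mining.

The first is an OCR technique designed specifically for pre-modern Chinese documents. It uses writing and printing features common in pre-modern documents, together with a large body of already digitized texts, to achieve an accurate and scalable OCR system. The system has processed more than 25 million pages of material, and the results have been made publicly available online.

Second, through a purpose-built crowdsourcing interface, users around the world can correct OCR errors and add metadata, taking part in the digitization process as it happens and actively helping to expand the content. Users worldwide contribute over a hundred corrections each day, which the system immediately stores in a version-controlled database.

Third, the system's API can be used for text mining and for extending the functionality of the standard user interface, making effective use of the library's growing body of text for digital humanities research and teaching. Through this API, dedicated modules developed for programming languages such as Python can be used in digital humanities teaching, while JavaScript modules make it easier for others to build easy-to-use online tools that directly read and manipulate the contents of the digital library.

In this talk … Continue reading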
Posted in Chinese, Digital Humanities, Talks and conference papers
Crowdsourcing a digital library of pre-modern Chinese
Seminar in the Digital Classicist London 2017 series at the Institute of Classical Studies, University of London, 9 June 2017. Traditional digital libraries, including those in the field of pre-modern Chinese, have typically followed top-down, centralized, and static models of … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers, Video
Unsupervised Extraction of Training Data for Pre-Modern Chinese OCR
Published in the Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference (FLAIRS-30), 2017. Abstract: Many mainstream OCR techniques involve training a character recognition model using labeled exemplary images of each individual character to be recognized. For modern … Continue reading
Posted in Chinese, Digital Humanities
Text Tools for ctext.org
This tutorial introduces some of the main functionality of the “Text Tools” plugin for the Chinese Text Project database and digital library along with suggested example tasks and use cases. [Online version of this tutorial: https://dsturgeon.net/texttools (English); https://dsturgeon.net/texttools-ja (Japanese)] 1 … Continue reading
Posted in Chinese, Digital Humanities
Harvard-Yenching Library East Asian Digital Humanities Series
Looking forward to discussing the Chinese Text Project at the second meeting of this exciting new series!

Introducing the Chinese Text Project

The Chinese Text Project is an online open-access digital library that makes pre-modern Chinese texts available to readers … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers
Practical introduction to ctext.org
This tutorial briefly summarizes some of the most common tasks on the Chinese Text Project database and digital library from a user perspective, with suggested example tasks intended to introduce core functionality of the system. Online versions of this tutorial: … Continue reading
Posted in Chinese, Digital Humanities
Towards a sustainable digital infrastructure for historical Chinese texts
Paper presented at the Open Conference on Digital Infrastructures for Global Philology, Leipzig University, 21 February 2017. [Download slides] This paper describes the current status and initial results of an ongoing project to create a scalable and sustainable infrastructure for … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers, Video
Deep Dive into Digital and Data Methods for Chinese Studies
I’m really looking forward to taking part in the University of Michigan’s “Deep Dive into Digital and Data Methods for Chinese Studies” series later this month, where I’ll be leading the following sessions: Text Reuse in Early Chinese Texts: A … Continue reading
Posted in Chinese, Digital Humanities, Talks and conference papers