A workshop held as part of Charting the European D-SEA: Digital Scholarship in East Asian Studies.
Setup:
- Recommended web browsers: Firefox or Chrome; Safari and Edge should also work for most tasks, but have not been fully tested.
- Create a ctext account and log in.
- Check your e-mail (including your spam folder) for a message from the system, and click the link in it to validate your account.
- Go to “Settings” at the bottom left, enter the API key specified in the live session in the box under “API key”, and click “Save”.
- Install the Text Tools plugin by opening this link, and then clicking “Install”.
- Install the Annotation plugin by opening this link, and then clicking “Install”.
Some of the material covered in the session is also available as step-by-step tutorials, which include further details and examples and may be useful if you want to return to the material later:
- Practical introduction to ctext.org – interactive guide to core functionality of the Chinese Text Project.
- Text Tools for ctext.org – interactive guide to using the Text Tools plugin for the Chinese Text Project for text mining and data visualization.
- Data Wiki tutorial – interactive guide to using the Data Wiki.
- The posts on text reuse and regular expressions on Digital Sinology.
- SPARQL querying for ctext.org data – an introduction to querying ctext data using the industry-standard SPARQL query language.
- Classical Chinese Digital Humanities (on Digital Sinology) – step-by-step guide to getting started with programming in Python, and accessing the CTP API for simple text mining.
These parts of the ctext.org instructions should also be useful:
Some additional details and examples of the techniques used are available in the slides for “PKU Workshop 2023” (Chinese).
Lastly, some of these papers may be of interest. Please consider citing one or more of them if you use the system or its contents in your work (note that, as the creator of ctext.org, I get no academic credit for any of the work that went into this project when it is cited solely by its URL):
- Digitizing Premodern Text with the Chinese Text Project, Journal of Chinese History 2020, 4(2).
- Chinese Text Project: a dynamic digital library of premodern Chinese, Digital Scholarship in the Humanities 2019.
- Digital Approaches to Text Reuse in the Early Chinese Corpus, Journal of Chinese Literature and Culture 2018, 5(2).
- Large-scale Optical Character Recognition of Pre-modern Chinese Texts, International Journal of Buddhist Thought and Culture 2018, 28(2).
- Unsupervised Identification of Text Reuse in Early Chinese Literature, Digital Scholarship in the Humanities 2018.