Draft – This is a preliminary draft specification. Please note that some implementation details will change before publication. Last updated: 22 March 2019.
Overview
Transformations of textual data are central to many natural language processing and text analysis workflows. Examples include tokenization, lemmatization, and part-of-speech tagging, as well as many other (often language-specific) procedures. In this specification, a text transformation is any operation that takes a sequence of Unicode characters as input and produces a sequence of Unicode characters as output. The Text Transformation API (TTA) specifies how to negotiate, request, and deliver text transformations over HTTP.
A TTA server is a system which both: 1) publishes a TTA service manifest, and 2) provides or references at least one TTA transformation service endpoint.
Service manifest
A service manifest is a valid JSON file containing a list of transformation services. Each service is described using the following key-value pairs:
| Key | Value |
|---|---|
| endpoint | The URL of the transformation service endpoint described by this entry. |
| languages | A list of ISO 639-1 language codes for which the endpoint is relevant or recommended. |
| title | A human-readable description of the service the endpoint provides. |
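As a non-normative illustration, the sketch below parses a manifest and checks that every entry carries the three keys from the table above. It assumes the top level of the manifest is a JSON array of objects, which this draft does not state explicitly. The function name `parse_manifest`, the constant names, and the example URL are hypothetical.

```python
import json

REQUIRED_KEYS = ("endpoint", "languages", "title")


def parse_manifest(text):
    """Parse manifest JSON and return its list of service entries.

    Assumption: the top level is a JSON array of objects. The draft only
    says the manifest contains "a list of transformation services".
    """
    services = json.loads(text)
    if not isinstance(services, list):
        raise ValueError("expected a JSON array of services")
    for i, service in enumerate(services):
        if not isinstance(service, dict):
            raise ValueError(f"service {i} is not a JSON object")
        missing = [k for k in REQUIRED_KEYS if k not in service]
        if missing:
            raise ValueError(f"service {i} is missing keys: {missing}")
        if not isinstance(service["languages"], list):
            raise ValueError(f"service {i}: 'languages' must be a list")
    return services


# Hypothetical manifest with a single service entry.
EXAMPLE_MANIFEST = """
[
  {
    "endpoint": "https://example.org/tta/tokenize",
    "languages": ["zh"],
    "title": "Example tokenizer (hypothetical)"
  }
]
"""

if __name__ == "__main__":
    for service in parse_manifest(EXAMPLE_MANIFEST):
        print(service["title"], "->", service["endpoint"])
```

Rejecting malformed entries up front keeps later client code free of per-key checks. A more lenient client might instead skip invalid entries and keep the rest.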
Transformation service endpoint
A transformation endpoint is an HTTP or HTTPS URL which accepts a string of text sent to it via the HTTP POST method using the “application/x-www-form-urlencoded” content type. The text must be supplied in the “data” parameter of the request, encoded as UTF-8.
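A minimal sketch of how a client might construct such a request with Python's standard library follows. It only builds the request object and does not send it. The helper name `build_request` and the endpoint URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, text):
    """Build (but do not send) a POST request carrying `text` in "data"."""
    # urlencode percent-encodes the UTF-8 bytes of the text, so the
    # resulting body is pure ASCII.
    body = urlencode({"data": text}, encoding="utf-8").encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint URL, used for illustration only.
    request = build_request("https://example.org/tta/tokenize", "學而")
    print(request.get_method(), request.full_url)
    print(request.data)
```

The request could then be passed to `urllib.request.urlopen` to obtain the JSON response.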
The response to any valid request must be a JSON file containing exactly one of the following key-value pairs:
| Key | Value |
|---|---|
| output | The contents of the “data” parameter transformed according to the service provided by the requested endpoint. |
| error | A string explaining why the request failed. |
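The "exactly one of" rule above can be checked on the client side roughly as follows. This sketch assumes the response body is a single JSON object and ignores any keys other than “output” and “error”; neither point is settled by this draft. The names `parse_response` and `TransformationError` are hypothetical.

```python
import json


class TransformationError(Exception):
    """Raised when the endpoint reports a failure via the "error" key."""


def parse_response(text):
    """Return the transformed text, or raise if the endpoint reported an error."""
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ValueError("response must be a JSON object")
    has_output = "output" in payload
    has_error = "error" in payload
    # Both present or both absent violates the "exactly one" rule.
    if has_output == has_error:
        raise ValueError("response must contain exactly one of 'output' or 'error'")
    if has_error:
        raise TransformationError(payload["error"])
    return payload["output"]


if __name__ == "__main__":
    print(parse_response('{"output": "a b c"}'))
```

Raising a dedicated exception for the “error” case lets callers distinguish a failure reported by the service from a malformed response.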
Transformation client
A transformation client is any software which 1) requests TTA service manifests, specified by their URL; 2) provides a user with a means of viewing the “title” descriptions of the endpoints from any conformant TTA manifest; and 3) provides a user with a means of transforming texts using any conformant endpoint.
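For the second requirement, a client might present the available services for a given language along these lines. The sketch assumes the manifest has already been parsed into a list of dictionaries with the keys described under "Service manifest". The function name `titles_for_language` and the sample entries are hypothetical.

```python
def titles_for_language(services, language):
    """Return (title, endpoint) pairs for services that list `language`."""
    return [
        (service["title"], service["endpoint"])
        for service in services
        if language in service.get("languages", [])
    ]


# Hypothetical parsed manifest entries, for illustration only.
SAMPLE_SERVICES = [
    {
        "endpoint": "https://example.org/tta/tokenize",
        "languages": ["zh"],
        "title": "Example tokenizer (hypothetical)",
    },
    {
        "endpoint": "https://example.org/tta/lemmatize",
        "languages": ["en", "de"],
        "title": "Example lemmatizer (hypothetical)",
    },
]

if __name__ == "__main__":
    for title, endpoint in titles_for_language(SAMPLE_SERVICES, "zh"):
        print(f"{title}: {endpoint}")
```

Filtering by language is a presentation choice rather than a requirement of this specification; a client could equally list every service and show the language codes alongside each title.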
Examples
A non-normative example of a TTA service manifest (containing references to example TTA service endpoints) is: https://txt.ctext.org/services.pl
A non-normative example of a TTA client is accessible here.










