The website for Asia Scholars and Asia Studies in Europe



Two CATS/HRA projects published in conference volume



27 June 2020

The conference E-Science-Tage 2019 "Data to Knowledge," which took place at Heidelberg University in March 2019, brought together leading experts in research data management to create a platform for discussion on how knowledge is obtained, shared, and preserved through data. In the course of the conference, two joint projects from the Center for Asian and Transcultural Studies (CATS) and the Heidelberg Research Architecture (HRA) were presented: Early Chinese Periodicals Online (ECPO) and the Open Digital Archive for Chinese Studies (OpenDACHS). Both projects feature in the conference volume "E-Science-Tage 2019: Data to Knowledge," recently published on heiBOOKS and available Open Access.

The ECPO project was originally created by the Heidelberg Research Architecture and the Cluster’s Digital Humanities Unit, in collaboration with Taiwan’s Academia Sinica. It brings together several important digital collections of the early Chinese press within a single overarching framework, combining extensive and intensive approaches. It differs from many existing databases of Chinese periodicals in that it preserves materials often excluded from reprint, microfilm, or digital editions, such as advertising inserts and illustrations. In addition, it incorporates a sophisticated body of metadata in both English and Chinese, including keywords and biographical information on editors, authors, and other individuals.

In the paper "Transforming data silos into knowledge: Early Chinese Periodicals Online (ECPO)," Matthias Arnold and Lena Hessel present the results of the ECPO research and illustrate the database. They focus on the project's efforts to open ECPO up to a broader user group and to data re-use via APIs. They explain how the cross-database agent service makes it possible to manage the almost 50,000 names currently recorded in ECPO and to connect them to national and international authority files such as GND and VIAF. They also introduce the different approaches to full text, from crowdsourcing to encoding text in TEI XML.

OpenDACHS is a collaborative initiative maintained at the East Asian Department of the CATS Library and funded by the Institute of Chinese Studies and the Heidelberg Centre for Transcultural Studies (HCTS), continuing and expanding DACHS. It is a citation repository comprising services and workflows for researchers that allow cited resources to be archived, DOIs to be generated for use in research publications, and library catalog records to be created for the cited resources. The platform uses the WARC format (.warc), published as ISO standard 28500:2017. As its web crawler, it uses the open-source software Heritrix, developed by the Internet Archive. Individual archived records can be viewed with tools like the Wayback Machine.
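
For readers unfamiliar with WARC, the following minimal Python sketch shows the general shape of a single record as laid out in ISO 28500: a version line, a set of named header fields, a blank line, the archived content block, and two closing line breaks. It is an illustration only and not a description of how OpenDACHS or Heritrix write their files; the function name build_warc_resource_record and the example URL are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "text/html") -> bytes:
    """Serialise one WARC 'resource' record (illustrative sketch only)."""
    # Timestamp in UTC, formatted as required for the WARC-Date field.
    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.1",                                    # version line
        "WARC-Type: resource",                         # kind of record
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",  # globally unique ID
        f"WARC-Date: {warc_date}",                     # capture time
        f"WARC-Target-URI: {target_uri}",              # what was archived
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",             # size of the block in bytes
    ]
    # Header lines end with CRLF; an empty line separates header and block.
    header = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    # Every record is terminated by two CRLF sequences.
    return header + payload + b"\r\n\r\n"


# Hypothetical usage: wrap a small HTML snippet in a record.
record = build_warc_resource_record("http://example.org/cited-page",
                                    b"<html><body>cited text</body></html>")
```

A .warc file is essentially a concatenation of such records, which is why replay tools can locate and display each archived resource individually.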

The poster "OpenDACHS: a Citation Repository for the Sustainable Archiving of Cited Online Sources," presented by Matthias Arnold, Hanno Lecher, and Sebastian Vogt, illustrates how OpenDACHS is working to develop solutions to problems related to the preservation of dynamic online content. Today, an increasing number of sources and publications are published solely on the internet. These materials are a modern form of research data, and researchers cite and reference them in their own research output. However, within just five years, about 50% of these webpages either can no longer be found (HTTP 404, “link rot”) or have been modified and now show different content (HTTP 200, “reference rot”). The aim of DACHS is to collect and preserve website resources relevant to Chinese Studies, in order to make data available to future generations of researchers and scholars.
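
The distinction between the two failure modes can be made concrete with a short Python sketch. It assumes that a copy of the page as originally cited is available and that the HTTP status and body of the live page have already been fetched; the function name classify_citation and the returned labels are hypothetical, and the byte-for-byte comparison is a deliberate simplification, not the method used by OpenDACHS.

```python
import hashlib
from typing import Optional


def classify_citation(status_code: int, archived_body: bytes,
                      live_body: Optional[bytes]) -> str:
    """Classify a cited URL from its current HTTP status and content."""
    if status_code == 404:
        # The page is gone: the citation can no longer be followed.
        return "link rot"
    if status_code == 200 and live_body is not None:
        # The page still answers; compare it with the copy that was cited.
        archived_hash = hashlib.sha256(archived_body).hexdigest()
        live_hash = hashlib.sha256(live_body).hexdigest()
        return "intact" if archived_hash == live_hash else "reference rot"
    # Redirects, server errors, and similar cases are left undecided here.
    return "unknown"


print(classify_citation(404, b"original text", None))
print(classify_citation(200, b"original text", b"rewritten text"))
print(classify_citation(200, b"original text", b"original text"))
```

In practice, dynamic pages often change in trivial ways (timestamps, advertisements), so a real check would need a more tolerant comparison than an exact hash. The sketch also shows why an archived copy is needed in the first place: without it, a modified page that still returns HTTP 200 cannot be told apart from an unchanged one.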