DiBiLit is a corpus being created in the BMBF-funded project CLARIAH-DE by homogenising various derivatives of texts from the "Digital Library" and extensively enriching them with (bibliographical) metadata. The more than 2,000 texts by renowned authors are DTABf-encoded and have been made accessible within the DTA infrastructure under a Creative Commons licence. As a result, the text collection originally published by DirectMedia Publishing can be explored using the DDC search engine integrated in the DTA as well as other DTA tools for linguistic analysis.
The repository contains the following directories:
- data
  [contains all texts, assigned to genre-based subdirectories]
  - drama
  - erzaehlungen
  - essays
  - fabel
  - libretti
  - lyrik
  - prosa
  - roman
  - sagen_maerchen
  - wissenschaft
- metadata
  [contains two subdirectories related to metadata]
  - bibl
    [contains the bibliographical metadata that form the basis of the DTABf headers]
  - headers
    [contains the DTABf headers of all texts]
- bibl
- publications
  [contains the documentation of the workflow]
- Deutsches Textarchiv (DTA)/DiBiLit corpus: https://deutschestextarchiv.de/dibilit/
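
The directory layout described above could be traversed with a short script, for example to count the texts per genre. The following is only a minimal sketch: it assumes that the repository has been cloned locally, that the genre subdirectories sit directly under `data`, and that the texts are stored as files ending in `.xml` (an assumption, not something this README states). The function name `count_texts_per_genre` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Genre subdirectories of "data", as listed in this README.
GENRES = [
    "drama", "erzaehlungen", "essays", "fabel", "libretti",
    "lyrik", "prosa", "roman", "sagen_maerchen", "wissenschaft",
]


def count_texts_per_genre(data_dir, pattern="*.xml"):
    """Count files matching `pattern` in each genre subdirectory.

    `pattern` defaults to "*.xml", which is an assumption about how
    the texts are stored; adjust it if the files are named differently.
    Genre directories that do not exist are skipped.
    """
    data_dir = Path(data_dir)
    counts = {}
    for genre in GENRES:
        genre_dir = data_dir / genre
        if genre_dir.is_dir():
            # rglob also descends into nested folders, if there are any.
            counts[genre] = sum(
                1 for path in genre_dir.rglob(pattern) if path.is_file()
            )
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repository.
    for genre, number in count_texts_per_genre("data").items():
        print(f"{genre}: {number}")
```

Run from the repository root, this would print one line per existing genre directory; if `data` is not found, it prints nothing.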