
Welcome!
Here is the Text+ newsletter for March 2025 - we look forward to providing you with insights into our current project activities. Your feedback, questions or suggestions are always welcome - the easiest way to reach us is via the Text+ Office (office@text-plus.org).
The newsletter is currently published approximately every three months and is distributed via two channels - on our website and by e-mail to the community who would like to be informed about Text+ news. If you would also like to receive the newsletter by e-mail, please contact us.
Text+ internal
Coordination Committees for the 2025/2026 term of office
The Coordination Committees elected in November 2024 have been in operation since the beginning of 2025. The first meetings of the respective committees took place in January. At these meetings, the Text+ project and its structure, the committees' role and tasks within Text+ and the roadmap for 2025 were presented. The chairs of the committees were also elected.
Prof. Dr Berenike Herrmann was confirmed as Chair of the Scientific Coordination Committee Collections. Prof. Dr Ingrid Schröder was re-elected as Chair of the Lexical Resources Coordination Committee and Prof. Dr Vivien Petras as Chair of the Operations Coordination Committee. All three performed their duties as chairs with great commitment and success in the previous term of office. Prof. Dr Stefanie Acquavella-Rauch was elected as the new Chair of the Scientific Coordination Committee Editions.
We would like to thank all the chairs for taking on these important positions in the new term of office and look forward to working with them over the next two years!
Text+ at the NFDI Interim Report Symposium 2025 in Bonn
On 11 February 2025, the NFDI Interim Report Symposium took place in Bonn, at which the consortia of the second funding phase presented their results of the last three years as well as their plans for the remaining funding period. All NFDI consortia that have been funded since 2021 were invited to this symposium. In four parallel sessions, after a joint welcome and introduction, the consortia presented their progress and future plans. The consortia from the humanities and social sciences were bundled into one session, so Text+ presented together with BERD@NFDI from the social sciences.
After the presentation of BERD@NFDI, the five-member Text+ team answered questions and engaged in discussion with the other participants. Text+ was represented by Ingrid Schröder (Spokesperson SCC Lexical Resources), Andreas Speer (TA-Lead Editions), Elke Teich (Deputy Scientific Spokesperson Text+), Philipp Wieder (Operations Speaker) and Andreas Witt (Scientific Spokesperson of the Text+ consortium). A 15-minute presentation, which provided a detailed insight into the current status of the project, was followed by a 45-minute discussion with the experts, which yielded valuable suggestions and feedback. In the subsequent poster session, all ten consortia from the second round presented their projects in a larger room. With eight prepared use cases alongside the poster, Text+ was able to demonstrate concrete applications and developments. The response to the presentation and the poster was consistently positive. A lively discussion ensued in which the questions posed could be addressed in detail and to the satisfaction of the expert panel, not least thanks to the careful preparation by the consortium. Text+ took away important impulses from this discussion for further work.
We would like to take this opportunity to thank everyone involved in the preparation and realisation of the interim report symposium. Special thanks go to the representatives mentioned above, who successfully presented Text+ with great commitment and expertise.
Newsletter survey
The Text+ newsletter has been around for almost a year now. In order to evaluate and further improve this format, we would like to invite you to take part in a short survey (6 questions, 2 optional free text fields). Your feedback is very important to us in order to better customise the newsletter to your interests and needs. We would therefore be delighted if you would take a few moments to give us your feedback.
Thank you in advance for your support!
Highlights from the blog
In this section, we present interesting articles from the Text+ blog. The blog provides information about Text+ and, in addition to the website, also presents work in progress or allows a more detailed look at individual topics. All contributions are provided with DOIs and can be cited.
Contributions from guest authors on topics of interest to the Text+ community are very welcome! Write to us if you have a topic.
Is AI training covered by the TDM limitations? A legal debate with implications for the Text+ community
The blog article addresses the legal uncertainty about whether the training of AI language models is covered by the copyright limitations for text and data mining (TDM) introduced by the 2019 EU Directive on Copyright in the Digital Single Market. These limitations were incorporated into German law in 2021 (Section 60d UrhG) and allow research organisations to perform TDM on content to which they have lawful access.
In September 2024, Tim W. Dornis and Sebastian Stober concluded in an extensive study on copyright and the training of generative AI models that AI training was not covered by the TDM limitations. Shortly afterwards, however, the Hamburg Regional Court ruled to the contrary, holding that the use of works for AI training may well be covered by the TDM limitations.
The Text+ working group for legal and ethical issues has analysed these contradictory interpretations and published an official statement. The statement concludes that the training of AI language models is covered by the TDM limitations and that therefore, in principle, no consent from copyright holders is required. The full statement is available on the Text+ website, and the working group invites discussion and feedback (please email office@text-plus.org).
Text+ cooperation projects
Text+ cooperation projects funding 2024
The second funding round for cooperation projects in Text+ took place in 2024, in which four cooperation projects were successfully supported. Funding began in January 2024 and ran until the end of the year. The funded projects covered all four work areas of Text+:
- Project ‘The Beria Collection in the Language Archive Cologne: Expansion, revision and evaluation of a data collection of an under-described African language’ submitted by Prof. Dr Birgit Hellwig (University of Cologne) and Dr Isabel Compes (Institute of Linguistics) in the Collections Task Area
- Project ‘Thesaurus Linguae Aegyptiae - More Fair with APIs’ submitted by Dr. Daniel Werning (Berlin-Brandenburg Academy of Sciences and Humanities, Centre for Basic Research on the Ancient World) in the Task Area Lexical Resources
- Project ‘Das älteste Görlitzer Stadtbuch 1305-1416: Transformation, Kuratierung und doppelte digitale Publikation (Daten, Webanwendung) einer außergewöhnlichen Buchedition für die historischen Disziplinen’ submitted by Prof. Dr Patrick Sahle (Bergische Universität Wuppertal) and Dr Christian Speer (Martin Luther University Halle-Wittenberg) in the Task Area Editions
- Project ‘Tool support for the automatic extraction of tabular data from historical newspapers’ submitted by Prof. Dr.-Ing. Frank Krüger (Wismar University of Applied Sciences) in the Infrastructure/Operations task area
We would like to thank all projects for their constructive and successful collaboration during the funding period. The diversity of the topics and the milestones achieved reflect the great commitment and expertise of all those involved. We look forward to integrating the results of these projects into the Text+ research data infrastructure and sharing the knowledge gained with the scientific community.
Text+ cooperation projects funding 2025 launched
On 1 January 2025, the third Text+ funding round was launched with five promising cooperation projects. They were chosen in spring 2024 from a large number of innovative, high-quality project applications. As the funding requested exceeded the available funds, a careful selection process was necessary. In the end, the following five projects were selected for funding:
- Project ‘Glossarium Graeco-Arabicum - Open Data’ submitted by Dr Rüdiger Arnzen (Friedrich-Alexander-Universität Erlangen-Nürnberg) in the Task Area Lexical Resources
- Project ‘HAdW GND-based web services - Beaconizer & Discoverer’ submitted by Dr Frank Grieshaber (Heidelberg Academy of Sciences and Humanities) in the Task Area Infrastructure/Operation
- Project ‘Text+ interfaces to the interview collections in Oral-History.Digital’ submitted by Dr Cord Pagenstecher (University Library of the Freie Universität Berlin) in the Task Area Collections
- Project ‘Building an open digital collection of historical music theory texts from German-speaking countries using examples from the 19th century’ submitted by Prof. Dr Fabian C. Moss (Julius-Maximilians-Universität Würzburg) in the Task Area Collections
- Project ‘LOD-Rollen-Modellierungen aus den Registern von Regestenwerken zum Mittelalter’ submitted by Prof. Dr Andreas Kuczera (Akademie der Wissenschaften und der Literatur Mainz) in the Task Area Editions
We warmly welcome the new cooperation projects, wish them every success and look forward to a stimulating collaboration within the framework of Text+.
Call for proposals for the funding of cooperation projects through Text+
Text+ is part of the National Research Data Infrastructure (NFDI) and aims to establish a geographically distributed infrastructure for research data in the humanities. The focus is on the data areas of digital collections, lexical resources and editions.
The aim of this call is to continuously expand the data and services offered by Text+ and make them available to the research community in the long term. Alternatively, applications can be submitted that utilise the data and services available in Text+ specifically for innovative research questions. It is also possible to integrate your own institutionalised and sustainable data centres with interfaces into the technical infrastructure of Text+ in order to make the relevant data permanently available in the Text+ infrastructure.
Applications for funding can be submitted for one of the data domains Collections, Lexical Resources or Editions, or for the Infrastructure/Operation Task Area. Further information on the application process and the projects approved to date can be found at https://text-plus.org/vernetzung/kooperationsprojekte/.
Workshop reports
Text+ and the use of large language models (LLMs)
With their research data, the text- and language-based humanities offer a wide range of use cases for large language models (LLMs). Text+ extends access to such research data via the registry, the Federated Content Search (FCS) and the data centres and repositories of the contributing partners. A new addition is a web service for the open source LLMs (Meta) LLaMA, Mixtral, Qwen and Codestral as well as ChatGPT from OpenAI. This was made possible by the GWDG, which, as a national high-performance computing and AI centre, supports the development and testing of AI use cases in Text+. The service is initially available to all those directly involved in the project after registering via the Academic Cloud. An expansion of the user base is planned, but is currently subject to licence restrictions. Further information can be found at https://text-plus.org/daten-dienste/llm_service/.
Text+ Registry: Data model Editions, Approach of Text+ towards Ontology Harmonisation
In the NFDI, the interoperability of research data is of great importance. The guiding idea of One NFDI is not only about making research data from different disciplines visible and reusable, but also about linking the research data with each other, e.g. to make them jointly searchable. Data enrichment also plays an important role in this context.
Against this background, many NFDI consortia are working on (sometimes joint) graph-based solutions. Text+ stands out somewhat from this field: We are not working on our own knowledge graph, but rather developing the Text+ Registry as a centralised discovery service for research data. The registry, in turn, can be harvested as a resource for graphs, i.e. Text+ contributes to other graphs in this way.
Why this is the case, what the infrastructural approach of Text+ is and how research data is labelled and proven was the subject of a presentation in the Ontology Harmonisation working group of the Memorandum of Understanding-NFDI consortia (NFDI4Culture, NFDI4Memory, NFDI4Objects, Text+) on 10 February 2025. In addition to an overview of the Text+ architecture, the registry was explained in technical terms and the Editions data model was introduced as a specific implementation. A number of questions were answered and discussed during and after the presentation.
The slide deck of the presentation is available on Zenodo.
Text+ simplifies the integration of research data
Text+ is the central point of contact for text- and language-based research data, providing the scientific community with the opportunity to securely archive data and make it available for further reuse – all while adhering to data protection and licensing regulations. To facilitate the data integration process, Text+ has developed a structured form. This form captures the essential information about each resource and supports the decision on which Text+ center will take over and sustainably archive the data. Based on this information, the necessary steps for data integration, including exploration and curation, are coordinated to ensure that the research data is securely stored and its reusability is guaranteed in a trusted, discipline-specific environment.
Events & Reports
FAIR February 2025
In the third edition of the virtual event series FAIR February, the Task Area Editions invited participants this year to four workshop-based sessions, each dedicated to one of the four FAIR principles.
In the first session, the “Guidelines for Quality Assessment and Assurance for Digital Editions” developed within the Task Area were presented, focusing on the implementation of quality criteria whose compliance is a crucial contribution to the discoverability and interoperability of digital editions. The second session focused on usability. The starting point was the presentation of a concept for an interview guide. Interviews with users of digital editions based on this guide aim to contribute to the usability guide for digital editions, developed in collaboration with the Service Center for Digital Humanities Münster (SCDH).
In the third session, participants learned how to create a BEACON and RDF file from the data of a digital edition using controlled vocabularies, and how to set up a corresponding interface for the edition. The concept of the GND-BEACON-Hub was also presented, which searches and aggregates such files from various digital resources, linking them together.
The fourth session offered an opportunity for smaller groups to practice documentation related to digital editions and exchange strategies and useful tools. Participants were able to develop an understanding of how to document effectively and meaningfully in ongoing projects to make information and data as traceable and reusable as possible.
More information about the event and workshop materials: https://events.gwdg.de/event/948/
DHd25 – Workshop on high-quality metadata in digital editions, a consortium-wide collaboration
For this year’s DHd conference, held in early March in Bielefeld under the motto “Under Construction,” the humanities NFDI consortia Text+, NFDI4Memory, and NFDI4Objects joined forces. Supported by two colleagues from the GBV consortium central office and the Institute for Museum Research, they developed a workshop concept focused on metadata. The workshop’s goal was to provide practical knowledge on assessing and improving the quality of metadata in digital editions.
The course, designed for 25 participants, was fully booked. The range of experience brought by the participants demonstrated a significant need to gain more confidence in working with metadata in everyday academic practice.
Starting with concrete case examples, such as different transmission situations of letters (new discovery, transcription, indirect reference), the workshop addressed the challenges that can arise when describing various types of metadata. The workshop was then divided into three larger exercise blocks, each preceded by brief thematic introductions to key topics such as metadata, controlled vocabularies, Dublin Core, LIDO, and data quality and curation. In the practical exercises, participants worked in groups or independently with metadata standards such as Dublin Core, LIDO, and TEI-XML. The exercises aimed to test the creation of metadata sets, including using metadata editors, and to develop methods for assessing and cleaning up one’s own or existing metadata sets based on personal experience.
DHd25 - The SSH Open Marketplace as an information hub for the DH community: Introduction to research and use cases
In collaboration with the Task Area Infrastructure/Operations, DARIAH-EU offered a workshop during DHd25 to introduce the SSH Open Marketplace (SSHOMP). The collaboration between Text+ and the Marketplace to provide services and training materials is a proven and valuable division of labor for the consortium. An important aspect of communicating this cooperation is to emphasize that the Marketplace is an excellent solution for addressing community-specific scenarios.
The use case involves providing information about services, software, training materials, tutorials, or workflows that do not require a dedicated website but can be made available in the Marketplace through tagging, for example, with a specific keyword. A key advantage of this solution is that users are not only presented with their own resources, but also with other resources that may be either an alternative or a valuable complement to the sought-after solution. The Marketplace is accessible with common credentials – for example, through one’s university, research institution, or via Google or ORCID – for data curation or the creation of new resources.
The workshop slide deck is available on Zenodo.
Dates
All events – both upcoming and already held – can also be found in our Event roll in the Text+ Portal.
Workshop Series: Standardization of Research Data
Text+, the NFDI consortium for text and language sciences, invites you to a new series focused on the exchange of ideas related to the standardization of research data.
Using concrete application examples, participants will gain insights into the use of standards and standard-based tools and can benefit from the experiences of the speakers in their respective projects. The goal is to facilitate the planning and implementation of their own endeavors.
Additionally, the workshop series aims to lay the groundwork for future data integrations into the Text+ infrastructure, promote internal reflection on service development, infrastructure, and interfaces, and highlight opportunities for participation for data providers.
The series is a collaborative activity of all Task Areas within Text+ in cooperation with colleagues from the community.
The application examples:
- March: DTABf, Marius Hug, Frank Wiegand
- April: correspSearch, Stefan Dumont
- May: edition humboldt digital, Christian Thomas, Stefan Dumont
- June: INSeRT, Felix Helfer, N.N.
- July: PROPYLÄEN. Goethes Biographica, Martin Prell
- September: DeReKo, Jennifer Ecker, Pia Schwarz, Rebecca Wilm
- October: Klaus Mollenhauer Ausgabe, Max Zeterberg
- November: BERIA Collection, Isabel Compes
Target audience: The series targets a broad audience with a focus on language and text sciences. It is suitable for newcomers seeking an introduction to the topic (e.g., PhD students, researchers without infrastructure connections), as well as for those experienced in using standards and tools who wish to benefit from or share experiences from similar endeavors.
Registration for all individual events in the series is available here: https://events.gwdg.de/category/284/
Text+ Plenary 2025
The 4th Text+ Plenary will take place on June 16th and 17th, 2025, at the Niedersächsische Staats- und Universitätsbibliothek Göttingen. As always, subject-matter communities, representatives of other NFDI consortia, and all interested parties are warmly invited to join for an inspiring and enriching exchange.
This year’s Text+ Plenary is themed Connecting Infrastructures – Connections between European Research Infrastructures. A special highlight is that it will be held just before the European DARIAH Annual Event (June 17-20, 2025). Both events offer a unique platform for exchange among European scholars and related stakeholders. While Text+ focuses on text- and language-based research, the thematic framework extends to the entire range of humanities, arts, and cultural sciences within the DARIAH consortium.
The Plenary provides an excellent opportunity to exchange ideas on current developments and research findings, forge new connections, and strengthen collaboration within the scientific community. More information and the possibility to register will be available shortly on the Text+ Consortium’s website.
Date | Event | Location |
---|---|---|
3 April 2025 | Text+ Research Rendezvous | virtual |
4 April 2025 | KI und Large Language Models: neue Impulse für die Hochschullehre in der Romanistik | virtual |
11 April 2025 | Workshop Series Standardization: correspSearch | virtual |
15 April 2025 | Text+ Research Rendezvous | virtual |
13 May 2025 | Text+ Research Rendezvous | virtual |
22 May 2025 | Workshop Series Standardization: edition humboldt digital | virtual |
25 May 2025 | Open Access as a Business Model: Practical Insights and Disciplinary Comparisons | virtual |
5 June 2025 | Workshop Series Standardization: INSeRT | virtual |
16-17 June 2025 | Text+ Plenary | Göttingen |
17-20 June 2025 | DARIAH Annual Event 2025 | Göttingen |
17 July 2025 | Workshop Series Standardization: PROPYLÄEN. Goethes Biographica | virtual |