WP7 – REVER – ITSERR

ABSTRACT

REVER is the tool developed within ITSERR for the automated extraction, classification, and semantic organization of information from medieval Latin documents. It is designed to support the creation of structured regesta and to facilitate large-scale analysis of documentary corpora. By combining natural language processing and machine learning techniques, REVER enables the transformation of documentary textual sources into structured, interoperable data. The project focuses in particular on medieval Latin papal documents, translating the traditional methodology of pontifical diplomatics into scalable digital workflows for regesta production.

RESULTS AND TOOLS

REVER developed an advanced pipeline for the automatic generation and enrichment of regesta, integrating text processing, entity recognition, and semantic annotation. Its main prototype, RegeXta, is an AI system that produces structured regesta from papal documents according to a predefined diplomatic information schema. The system supports automatic extraction of key information from texts (dates, places, actors, events), generation of structured summaries (regesta), semantic annotation and linking of entities, and integration with external resources. Users can upload or manually enter texts and, through the “Extract Regesta” function, obtain structured outputs via an inference service based on large language models, while authenticated users can also manage collections, notes, and exports within a dedicated web interface. The platform enables scalable processing of large documentary collections, improving both accessibility and research efficiency.
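
As an illustration of what a "predefined diplomatic information schema" could look like in practice, the following minimal Python sketch models a single regestum as a structured record. It is not RegeXta's actual schema or code: the class name `Regestum`, its field names (taken loosely from the categories listed above: dates, places, actors, events, summary), and the helpers `missing_fields` and `to_json` are all assumptions made for this example.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import List, Optional
import json


@dataclass
class Regestum:
    """Hypothetical structured regestum; field names are illustrative only."""
    date: Optional[str] = None          # e.g. an ISO-style date string
    place: Optional[str] = None         # place of issue
    actors: List[str] = field(default_factory=list)   # issuer, recipients, etc.
    events: List[str] = field(default_factory=list)   # legal acts or dispositions
    summary: str = ""                   # the generated structured summary


def missing_fields(record: Regestum) -> List[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def to_json(record: Regestum) -> str:
    """Serialise the record to JSON so other tools can consume it."""
    return json.dumps(asdict(record), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Invented placeholder values, not taken from the REVERINO datasets.
    example = Regestum(
        date="1200-01-01",
        place="Lateran",
        actors=["issuing pope", "recipient monastery"],
        events=["confirmation of privileges"],
        summary="The pope confirms the privileges of the monastery.",
    )
    print(to_json(example))
    print(missing_fields(example))
```

A fixed record of this kind makes the output of an extraction step checkable: a helper such as `missing_fields` can flag regesta in which the model left a slot empty, and the JSON form can be exported or linked to external resources.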

CASE STUDIES

REVERINO and REVERINO II datasets

TEAM

WP7 was coordinated by Alberto Melloni (University of Modena and Reggio Emilia - FSCIRE), with Laura Righi (University of Modena and Reggio Emilia) as scientific coordinator. The team combined expertise in medieval history, Church history, digital humanities, and computer science, with contributions from Ilaria Sabbatini (University of Palermo), Arianna Pavone (University of Palermo), Andrea Esuli (ISTI-CNR), Giovanni Puccetti (ISTI-CNR), Silvia Cascianelli (University of Modena and Reggio Emilia), and Marcella Cornia (University of Modena and Reggio Emilia).

This interdisciplinary configuration made it possible to formalise diplomatic standards as machine-readable structures and to integrate them into AI-supported language processing.

BEYOND ITSERR

RegeXta and the datasets developed within the project can be applied to the digitization, description, and organization of archival collections, both historical and contemporary.

The approach is particularly relevant for large-scale archival digitization projects requiring automated summarization, metadata extraction, and semantic indexing. The methodology can be extended to other documentary contexts, both public and private, where automatic linking of documents, summaries, and metadata, as well as structured indexing processes, are required. More broadly, REVER shows how well-established historiographical and diplomatic conventions can be modelled as computational logic, opening the way to reproducible and scalable analysis of documentary corpora beyond papal materials.