DIATHESIS is an information system for the documentation, management and promotion of historical documents. It supports both digital library functionality and archival management of the original documents. It includes OCR-based page analysis and subject clipping, subject-level metadata generation, semantic indexing, and multifaceted classification of subjects using built-in thesauri.

The data produced by OCR processing of the scanned material drive a highly flexible annotation interface. Users can make hybrid annotations on the digitized material, assigning semantic properties to specific regions of text that represent a subject. The goal of the documentation process is a coherent semantic backbone that can be easily enriched with semantic relations; it is not meant to be a complete semantic structure covering all the semantic relationships and entities (Actors, Places) described in the text.

The query interface lets users search at both the document level and the subject level, combining full-text and metadata search. Document-level queries are based on conventional metadata assigned automatically to the whole document during the import phase, while subject-level queries exploit the semantic relationships that emerged during the documentation phase. Combining the query modes provides a semantic filter that greatly improves search precision.

Subject metadata are based on a robust top-level domain ontology (CIDOC-CRM, ISO 21127), so that the knowledge produced can be exchanged between institutions. The query result presentation mechanism allows partial download of the digitized material, which reduces download time and improves the overall user experience.
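The way the document-level and subject-level query modes narrow each other can be illustrated with a minimal Python sketch. It is not DIATHESIS code: the `Document` and `Subject` classes, the `search` function, and all field names are hypothetical, and exact-match filtering stands in for whatever indexing the real system uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subject:
    """A clipped region of OCR text with semantic properties (hypothetical shape)."""
    text: str
    properties: dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    """A scanned document with import-time metadata (hypothetical shape)."""
    metadata: dict[str, str]
    subjects: list[Subject] = field(default_factory=list)


def search(documents, doc_filter=None, subject_filter=None, text=None):
    """Return (document, subject) pairs that satisfy every supplied criterion."""
    doc_filter = doc_filter or {}
    subject_filter = subject_filter or {}
    hits = []
    for doc in documents:
        # Document level: every requested metadata field must match exactly.
        if any(doc.metadata.get(k) != v for k, v in doc_filter.items()):
            continue
        for subj in doc.subjects:
            # Subject level: semantic properties act as a further filter.
            if any(subj.properties.get(k) != v for k, v in subject_filter.items()):
                continue
            # Full text: case-insensitive substring match on the subject's text.
            if text is not None and text.lower() not in subj.text.lower():
                continue
            hits.append((doc, subj))
    return hits
```

Calling `search(docs, doc_filter={"year": "1905"}, subject_filter={"place": "Chania"}, text="harbour")` on invented sample data would return only those subjects that pass all three checks, which is the sense in which combining the modes acts as a filter on precision.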
DIATHESIS leaflet
DIATHESIS presentation
For more information, contact Maria Theodoridou <mariaΑΤics.forth.gr>.