CulDiLe for the Digitization of Cultural Assets
CULtural DImensions of deep LEarning
An integrated intelligent system for the digital recording, processing, and understanding of cultural documents.
THE NEED FOR INNOVATION IN THE DIGITIZATION AND DOCUMENTATION OF CULTURAL ASSETS
Despite funding through the NSRF and other programs, most archive and library holdings remain in analog form.
One of the main reasons is that digitization is time-consuming and requires specialized, expensive equipment. Moreover, scientific documentation is a slow and demanding process, especially for material such as historical documents, rare books, and manuscripts, and it often exceeds the digitization itself in both duration and complexity.
The documentation of these artifacts includes the recording of critical characteristics, such as layout, typesetting, and content, so that search and access become possible. To demonstrate how necessary technological support is in this field, we present two examples of projects that were carried out in the traditional way.
Example 1: Microfilm digitization
In a recent project, 20,000 microfilms were converted into document images. Each microfilm required at least 8 hours of processing: isolating the document from the frame, trimming margins, and enhancing the image. A new method based on layout techniques, piloted on 10% of the material, showed that processing time could be reduced by approximately 95%, a saving of nearly 72 person-years, while also producing higher-quality images.
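The figures in this example can be checked with a short calculation. The sketch below is only a back-of-the-envelope estimate: the number of working hours in a person-year is not stated in the text, so the value used here (2,080 hours, i.e. 52 weeks of 40 hours) is an assumption, and the function name is illustrative.

```python
# Back-of-the-envelope check of the microfilm figures quoted above.
MICROFILMS = 20_000        # from the text
HOURS_PER_MICROFILM = 8    # from the text ("at least 8 hours", so a lower bound)
TIME_REDUCTION = 0.95      # from the text (approximate)

# Assumption, not from the source: working hours in one person-year.
HOURS_PER_PERSON_YEAR = 2_080  # 52 weeks x 40 hours


def estimate_savings(items, hours_per_item, reduction, hours_per_person_year):
    """Return (manual_hours, saved_hours, saved_person_years)."""
    manual_hours = items * hours_per_item
    saved_hours = manual_hours * reduction
    return manual_hours, saved_hours, saved_hours / hours_per_person_year


manual, saved, person_years = estimate_savings(
    MICROFILMS, HOURS_PER_MICROFILM, TIME_REDUCTION, HOURS_PER_PERSON_YEAR
)
print(f"manual effort: {manual:,} hours")
print(f"saved effort:  {saved:,.0f} hours (about {person_years:.0f} person-years)")
```

Under this assumption the estimate lands in the low seventies of person-years, in line with the figure quoted above; a slightly longer working year brings it closer to 72.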
Example 2: Documentation of manuscripts at the Benaki Museum
The documentation of 117 manuscripts required a total of 12 person-years. Although full automation of documentation is not feasible, a significant reduction in time can be achieved (even to less than 1/10 of the conventional time) using semi-automated methodologies that recognize and suggest key documentation elements.
OUR INNOVATIVE APPROACH
Within the framework of this project, we are developing a complete and innovative software system, which combines:
- Image capture and optimization using low-cost scanners,
- Document understanding using artificial intelligence,
- Documentation assistance based on neural networks.
SMART CONTENT ANALYSIS
Expert-assisted documentation includes the automatic recognition and annotation of significant elements, such as:
- Initial letters
- Typographic ornaments
- Titles and dedication notes
- Keywords per document type (e.g., in a Gospel: “Τω καιρώ εκείνω”, "At that time")
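As an illustration of keyword spotting per document type, here is a minimal sketch that looks for characteristic phrases in a transcription while ignoring accents, breathings, and letter case. It is not the project's recognition method, which the text describes as neural-network based. The table `KEYWORDS_BY_TYPE` and the functions `normalize` and `find_keywords` are hypothetical names introduced for this example, and the Gospel phrase is written in its usual polytonic spelling.

```python
import unicodedata

# Hypothetical keyword table: document type -> characteristic phrases.
# Only the Gospel phrase comes from the text; the structure is an assumption.
KEYWORDS_BY_TYPE = {
    "gospel": ["Τῷ καιρῷ ἐκείνῳ"],  # "At that time"
}


def normalize(text: str) -> str:
    """Strip diacritics, fold case, and collapse whitespace.

    After this step, polytonic, monotonic, and unaccented spellings
    of the same Greek phrase compare equal.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def find_keywords(doc_type: str, transcription: str) -> list[str]:
    """Return the keywords of doc_type that occur in the transcription."""
    haystack = normalize(transcription)
    return [
        keyword
        for keyword in KEYWORDS_BY_TYPE.get(doc_type, [])
        if normalize(keyword) in haystack
    ]


# Usage: an all-caps, unaccented transcription still matches the keyword.
print(find_keywords("gospel", "ΤΩ ΚΑΙΡΩ ΕΚΕΙΝΩ, ἦλθεν ὁ Ἰησοῦς"))
```

Normalizing both sides before comparison is a design choice that suits manuscripts and early printed books, where accentuation and capitalization vary from one source to another.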
A UNIQUE APPLICATION OF GLOBAL REACH
The result will be pioneering software for assisting scientific documentation, unique worldwide, which:
- Fully aligns with priority 2.1.8 of the relevant program,
- Focuses on improving the quality of digitization,
- Drastically reduces the time and cost of documentation.
Our platform builds on the existing HDOC+ technology by Honest Partners and ensures full compatibility with open-source repositories, for both images and documentation data.
Our goal is to offer archives, museums, and libraries an accessible, efficient, and technologically advanced tool that will substantially improve access to cultural heritage, as well as its management and preservation.
CulDiLe — HIGH-QUALITY DIGITAL COPIES
Our goal is the production of high-quality digital copies through advanced image pre-processing. This improvement is fundamental to the success of the subsequent stages, such as segmentation and recognition.
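To show why pre-processing matters for later stages, the sketch below applies two generic, widely used steps to a grayscale page: linear contrast stretching followed by global binarization with Otsu's method. These are stand-ins chosen for illustration, not the CulDiLe pipeline, whose actual techniques are not described here. The image is assumed to be a list of rows of integer pixel values in the range 0 to 255, and all function names are hypothetical.

```python
def stretch_contrast(image):
    """Linearly rescale pixels so the darkest becomes 0 and the brightest 255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


def otsu_threshold(image):
    """Return the threshold that maximizes between-class variance (Otsu).

    Pixels <= threshold form the dark class, the rest the light class.
    """
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, threshold):
    """Map dark pixels (ink) to 0 and light pixels (paper) to 255."""
    return [[0 if p <= threshold else 255 for p in row] for row in image]


# Usage: a faded page where ink (120) is only slightly darker than paper (200).
faded = [
    [200, 200, 200, 200],
    [200, 120, 120, 200],
    [200, 200, 200, 200],
]
stretched = stretch_contrast(faded)
clean = binarize(stretched, otsu_threshold(stretched))
for row in clean:
    print(row)
```

After these steps the ink and paper are cleanly separated, which is the kind of input that segmentation and recognition stages generally rely on.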