International Dunhuang Project

Technical Infrastructure

page mounted: 1/12/05 page last updated: 12/6/12

Digitisation Equipment



Workflow Chart

IDP Workflow Chart

Any additional information regarding the item can be included on the 4D database; this includes conservation records, catalogue entries, research etc.

Download Workflow Chart (PDF 72KB).

Database and Website Overview

The IDP website gives access to a client/server image and content management system and cataloguing database written in 4D (4th Dimension) and served by Active 4D with QPix used for images.

Catalogues and bibliographical data are encoded in XML using a slightly modified form of the TEI (Text Encoding Initiative) specifications. Templates are available on the technical resources page. The data are then imported and stored as a BLOB (Binary Large Object) in the database.
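As a rough illustration of the import step, the following Python sketch checks that a catalogue file is well-formed XML and then stores its raw bytes as a BLOB. It is not IDP's implementation: SQLite stands in for the 4D database, and the table name, column names, function names and the heavily simplified TEI-like sample are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI-like record; the real IDP templates differ.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample catalogue entry</title></titleStmt>
    </fileDesc>
  </teiHeader>
</TEI>
"""


def store_catalogue(conn, pressmark, xml_bytes):
    """Reject malformed XML, then store the untouched bytes as a BLOB."""
    ET.fromstring(xml_bytes)  # raises xml.etree.ElementTree.ParseError if malformed
    conn.execute(
        "INSERT INTO catalogue (pressmark, tei) VALUES (?, ?)",
        (pressmark, xml_bytes),  # Python bytes are stored by sqlite3 as a BLOB
    )


def load_title(conn, pressmark):
    """Read the BLOB back and pull the title out of the parsed XML."""
    row = conn.execute(
        "SELECT tei FROM catalogue WHERE pressmark = ?", (pressmark,)
    ).fetchone()
    root = ET.fromstring(row[0])
    return root.findtext("./teiHeader/fileDesc/titleStmt/title")


# SQLite in memory is only a stand-in for the 4D database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalogue (pressmark TEXT PRIMARY KEY, tei BLOB)")
store_catalogue(conn, "Or.8210/S.395", SAMPLE_XML)
title = load_title(conn, "Or.8210/S.395")
```

Storing the document whole keeps the encoded catalogue exactly as it was authored; anything that needs to be searched has to be parsed out again on retrieval, as load_title does here.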

Remote Site Management and Synchronisation

IDP is an international collaboration with a directorate and technical support team based in the British Library, London, UK and with digitisation, cataloguing and research centres at libraries and museums worldwide. All these institutions host their own IDP database and website in their local language.

In addition, there are several other organisations that do not host their own server but whose data are hosted by one of the IDP hosts. For example, data from the British Museum, the Victoria and Albert Museum, the Chester Beatty Library and others are hosted on the British Library IDP server, and IDP China holds data on other Chinese collections. In this way, the IDP Centres act as local hubs.

Each IDP Centre has read-write access to its own data and to the data it hosts, and read-only access to data from other Centres. Changes and additions to the data are automatically synchronised to the other servers as soon as they are made, and changes and additions made on the other servers are synchronised in. In this way, each Centre has a complete and up-to-date dataset.
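The access and synchronisation rules can be sketched in a few lines of Python. This is a toy in-memory model, not a description of how the IDP servers actually synchronise: the Centre class, its fields and methods, and the record identifier are all hypothetical, and the real mechanism (conflict handling, networking, queuing) is not described on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Centre:
    """Toy model of one IDP Centre (hypothetical structure)."""
    name: str
    hosted: set                                   # institutions this Centre may write for
    records: dict = field(default_factory=dict)   # record id -> (owner, value)
    peers: list = field(default_factory=list)     # other Centres to push changes to

    def write(self, record_id, owner, value):
        # Read-write only for the Centre's own and hosted data.
        if owner not in self.hosted:
            raise PermissionError(f"{self.name} has read-only access to {owner} data")
        self.records[record_id] = (owner, value)
        # Push the change to every other Centre straight away.
        for peer in self.peers:
            peer.receive(record_id, owner, value)

    def receive(self, record_id, owner, value):
        # Incoming changes are stored locally so the dataset stays complete.
        self.records[record_id] = (owner, value)


uk = Centre("IDP UK", {"British Library", "British Museum"})
china = Centre("IDP China", {"IDP China"})
uk.peers.append(china)
china.peers.append(uk)

# "rec1" and its value are illustrative placeholders.
uk.write("rec1", "British Museum", "catalogue entry v1")
```

After the write, china.records contains the same entry as uk.records, while an attempt by the china object to write British Museum data raises PermissionError.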

The technical team at IDP UK in conjunction with local technical staff maintains the database web server and the database software through the cross-platform remote access software Timbuktu Pro.

Metadata and Digital Image Naming

IDP data are stored in three formats: 1) in a structured content management system; 2) in XML using a standard DTD (TEI); 3) with images ('implicit' metadata). In all cases IDP captures a standard set of basic metadata that conforms to, and can be mapped to, international standards. For example, IDP captures the basic set of fifteen core elements defined by Dublin Core (Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, and Rights).
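One way to picture the mapping to Dublin Core is a simple completeness check against the fifteen element names. The Python sketch below is illustrative only: the function name and the sample record values are hypothetical and do not reflect IDP's actual field names or data.

```python
# The fifteen Dublin Core elements, in the order listed above.
DUBLIN_CORE_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def missing_elements(record):
    """Return the Dublin Core elements that are absent or empty in a record."""
    return [name for name in DUBLIN_CORE_ELEMENTS if not record.get(name)]


# Illustrative placeholder values, not real catalogue data.
sample_record = {
    "Title": "Sample scroll",
    "Publisher": "International Dunhuang Project",
    "Type": "Image",
    "Format": "image/tiff",
    "Identifier": "Or.8210/S.395",
}

gaps = missing_elements(sample_record)
```

For this partial record, gaps lists the ten elements still to be supplied, in Dublin Core order.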

It should, however, be noted that IDP data cover a wide range of subject areas, each of which has its own defined metadata standards. IDP captures the basic metadata demanded by all of these. So, for example, IDP captures the main recognised specialised descriptive elements relevant to manuscripts, to artworks (such as Dimensions, Condition, Inscriptions, Conservation Treatment, and Exhibition/Loan History), and to geospatial data.

The RAW and processed large TIFF files created by the digital camera software contain embedded technical metadata. This comprises three types of metadata: file properties (file size, dimensions etc.); camera data (EXIF (Exchangeable Image File Format) data); and copyright (IPTC (International Press Telecommunications Council) data). In addition, the digital image name (DIN) is used to embed core information linking the digital surrogate to the original and to the location of the digital surrogate.

The DIN is made from the following elements:

The prefix (e.g. BLX1) is identical to the DVD name and folder on the RAID in which the digital image is stored. It is composed of three parts:

The prefix is always followed by an underscore.

The suffix denotes the type of image, as follows:

It is always preceded by an underscore.

EXAMPLE: The RAW image of the first shot of the recto of the British Library scroll with pressmark Or.8210/S.395 would therefore be named: BLX1_OR8210S395R1_1_R.tif

Here the DVD/folder number (BLX1) is followed by the library manuscript number (Or.8210/S.395) and the section (R1, indicating the first text on the recto of the scroll). These are always entered without any separators such as hyphens or slashes, or other non-essential characters such as full stops. Following this is the number of the shot (here 1, the first shot of this text), and then the suffix denoting the type of image.
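The naming rule in this example can be expressed as a short Python sketch. The function names are hypothetical, and two details are inferred from the single example above rather than stated as rules: that letters are upper-cased (Or becomes OR) and that the file extension is .tif. The suffix is passed through as given, since only the R suffix appears in the example.

```python
import re


def normalise(text):
    """Strip separators (full stops, slashes, hyphens, spaces) and upper-case.

    Upper-casing is inferred from the example name, not a documented rule.
    """
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()


def build_din(prefix, pressmark, section, shot, suffix, extension="tif"):
    """Assemble prefix, item (pressmark + section), shot number and suffix."""
    item = normalise(pressmark) + normalise(section)
    return f"{prefix}_{item}_{shot}_{suffix}.{extension}"


def split_din(name):
    """Split a DIN back into its underscore-separated parts."""
    stem, _, extension = name.rpartition(".")
    prefix, item, shot, suffix = stem.split("_")
    return {
        "prefix": prefix,
        "item": item,
        "shot": int(shot),
        "suffix": suffix,
        "extension": extension,
    }


din = build_din("BLX1", "Or.8210/S.395", "R1", 1, "R")
parts = split_din(din)
```

With these inputs, din is BLX1_OR8210S395R1_1_R.tif, matching the example. Because separators are removed from the pressmark and section, the underscore is the only delimiter left in the name, which is what makes splitting it back into parts unambiguous.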

For more resources on metadata see the Technical Links page.