Page Delivery Service (PDS)

The Page Delivery Service (PDS) delivers to a web browser scanned page images of books, diaries, reports, journals and other multi-page documents from the collections of the Harvard libraries.

Documents delivered by PDS can be used in ways similar to their print counterparts, for example, browsed through a table of contents or viewed page-by-page. PDS also offers tools to manipulate pages (zoom, rotate, pan), a full-text keyword search of document contents, and an option to create a PDF of the document for printing.

Who can use PDS?

Any Harvard organizational entity that is registered to use DRS is eligible to use PDS. All of the files associated with digital objects to be delivered in PDS must be stored in DRS. To register for DRS, consult the Planning section of the DRS website. For more information on how to take advantage of the Page Delivery Service, send an inquiry to the LTS Support Center.

What are PDS file requirements?

The Page Delivery Service can deliver page content as page image files, as plain text files, or both page images and plain text. The use of PDS requires that these image and text files, along with a structural metadata file, be stored in the DRS.

PDS requires a separate digital file for each page of a page-turned object. For instance, if a page-turned object consists of images and text, each page needs its own image file and its own plain text, UTF-8 encoded OCR file.

A PDS document can consist of text only, images only, or text and images together. A page can have more than one image file as long as the files represent the same page image and are in different formats. For instance, one page can have a TIFF file as a master copy, a JPEG 2000 file as a deliverable copy, and a plain text, UTF-8 encoded OCR file.
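The per-page rules above can be checked before deposit. The following is a minimal sketch, not part of PDS or DRS: the function name `check_pages`, the dictionary keys `image_formats` and `text_file`, and the wording of the messages are all hypothetical and chosen only for illustration.

```python
def check_pages(pages, has_images=True, has_text=True):
    """Return a list of problems found in a page-turned object.

    pages: one dict per page, in page order, with two hypothetical keys:
        "image_formats": list of format names of the image files for the page
        "text_file":     name of the plain text file for the page, or None
    has_images / has_text: whether the document is meant to contain
        page images and/or plain text.
    """
    problems = []
    for number, page in enumerate(pages, start=1):
        image_formats = page.get("image_formats", [])
        # A document with images needs at least one image file per page.
        if has_images and not image_formats:
            problems.append(f"page {number}: no image file")
        # Several image files per page are fine only in different formats.
        if len(set(image_formats)) != len(image_formats):
            problems.append(
                f"page {number}: more than one image file in the same format"
            )
        # A document with text needs one text file per page.
        if has_text and not page.get("text_file"):
            problems.append(f"page {number}: no text file")
    return problems
```

An empty list means every page has the files the document type calls for. The check says nothing about whether the files themselves are valid.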

Page images of a document can be bitonal (black and white), grayscale, or color. PDS delivers page images in JPEG or GIF format, and can create delivery images from JPEG2000 JP2 or TIFF masters. The following table describes these options:

Page image deposited as:    PDS will deliver as:
JPEG2000 JP2                JPEG
JPEG                        JPEG
GIF                         GIF
TIFF**                      GIF

If more than one of these formats is available for a document, PDS will use this order of preference: (1) JPEG/GIF (whichever is listed first for that page in the METS file), (2) JPEG2000 JP2, (3) TIFF.
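The format table and the order of preference can be expressed as a small lookup. This is an illustrative sketch only: the names `DELIVERY_FORMAT` and `choose_source_format` are hypothetical, and the input is assumed to be the list of formats available for one page, in the order they appear in the METS file.

```python
# Delivery format for each deposited format, as given in the table.
DELIVERY_FORMAT = {
    "JPEG2000 JP2": "JPEG",
    "JPEG": "JPEG",
    "GIF": "GIF",
    "TIFF": "GIF",
}


def choose_source_format(formats_in_mets_order):
    """Pick the deposited format to deliver from, or None if none is usable."""
    # (1) JPEG or GIF, whichever is listed first for the page.
    for fmt in formats_in_mets_order:
        if fmt in ("JPEG", "GIF"):
            return fmt
    # (2) JPEG2000 JP2, then (3) TIFF.
    for fmt in ("JPEG2000 JP2", "TIFF"):
        if fmt in formats_in_mets_order:
            return fmt
    return None
```

For a page deposited as TIFF and JPEG2000 JP2, `choose_source_format(["TIFF", "JPEG2000 JP2"])` returns `"JPEG2000 JP2"`, and `DELIVERY_FORMAT` then gives `"JPEG"` as the delivered format.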

**If the PDS TIFF-to-GIF service is used, TIFF images must be bitonal and use CCITT Group 4 fax compression. The required TIFF tag values are:

  • Compression = 4 (CCITT Group 4)
  • PhotometricInterpretation = 0 (WhiteIsZero)
  • BitsPerSample = 1
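The three required values lend themselves to an automated pre-deposit check. The sketch below assumes the tag values have already been read from the TIFF header by some TIFF library and passed in as a dictionary keyed by tag name. The function `tiff_tag_problems` is hypothetical and is not a PDS tool.

```python
# Tag values required for the TIFF-to-GIF service, as listed above.
REQUIRED_TIFF_TAGS = {
    "Compression": 4,                # CCITT Group 4
    "PhotometricInterpretation": 0,  # WhiteIsZero
    "BitsPerSample": 1,              # bitonal
}


def tiff_tag_problems(tags):
    """Return a list of mismatches; an empty list means the tags qualify.

    tags: dict mapping TIFF tag names to the values found in the file.
    A missing tag is reported with the value None.
    """
    problems = []
    for name, required in REQUIRED_TIFF_TAGS.items():
        actual = tags.get(name)
        if actual != required:
            problems.append(f"{name} is {actual!r}, expected {required}")
    return problems
```

A grayscale or LZW-compressed TIFF would be reported here. Per the table, such a file would need a JPEG2000 JP2, JPEG, or GIF copy to be delivered.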

For the text of a document to be searchable in PDS, plain UTF-8 encoded text must be deposited in DRS. Plain text files are created by OCR (optical character recognition) during page scanning or by manually rekeying the text. These files can be used solely to support keyword searching in PDS, or the plain text can also be offered as a display option. It is also possible to have PDS offer plain text as the only display option for a document.
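Because the text files must be UTF-8 encoded, it can be worth confirming the encoding of OCR output before deposit. The following is a minimal sketch of such a check. The function name `is_valid_utf8` is hypothetical, and DRS may apply its own validation on deposit.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

To check a file, read it in binary mode and pass the bytes to the function. OCR output saved in a legacy encoding such as Latin-1 will typically fail this check as soon as it contains an accented character.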

Structural metadata identifies all the components of a document, describes its structure, and allows for page-turning navigation. This structural metadata is generated during batch creation and stored within the PDS object's descriptor.xml file. This XML file also contains descriptive metadata about the original source material that is used for the display labels in PDS and to maintain a permanent connection between the digital object and the original source material. The XML metadata file can be produced in-house or by a reformatting vendor. Details about PDS structural metadata are provided in the PDS Workflow document.

Captions for page images

DRS collection curators can request that image captions be applied to the page images stored in the DRS and delivered by the Page Delivery Service. See Image Captions for more information.

Setting maximum delivery size

The default maximum size for digital images delivered by the Page Delivery Service (PDS) is 2400 pixels in the largest dimension. Curators can, however, set a lower maximum delivery size for their page images on a per-billing-code basis. See Image Delivery Size Restriction for more information.
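To see what a limit on the largest dimension means for a given image, the arithmetic can be sketched as follows. This assumes the aspect ratio is preserved and that results are rounded to whole pixels. How PDS actually scales and rounds is not stated here, and the name `fit_within` is hypothetical.

```python
DEFAULT_MAX_DIMENSION = 2400  # pixels, the default stated above


def fit_within(width, height, max_dimension=DEFAULT_MAX_DIMENSION):
    """Scale (width, height) down so the larger side is at most max_dimension.

    Images already within the limit are returned unchanged.
    """
    largest = max(width, height)
    if largest <= max_dimension:
        return width, height
    # Scale both sides by the same factor to keep the aspect ratio.
    new_width = max(1, round(width * max_dimension / largest))
    new_height = max(1, round(height * max_dimension / largest))
    return new_width, new_height
```

Under these assumptions a 4800 x 3600 image would be delivered at no more than 2400 x 1800, and a curator-set limit of 1200 would reduce a 3000 x 6000 image to 600 x 1200.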

Sample Harvard documents using PDS

PDS URLs have this syntax:

  • Production: http://pds.lib.harvard.edu/pds/view/[DRS object ID] 
  • QA/test: http://pdstest.lib.harvard.edu:9005/pds/view/[DRS object ID]
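A link can be assembled from the two patterns above by appending the DRS object ID to the base address. In this sketch the function name `pds_url` and the environment keys `"production"` and `"qa"` are hypothetical, and the object ID used in the usage note is a placeholder, not a real document.

```python
# Base addresses taken from the URL syntax above.
PDS_BASE_URLS = {
    "production": "http://pds.lib.harvard.edu/pds/view/",
    "qa": "http://pdstest.lib.harvard.edu:9005/pds/view/",
}


def pds_url(drs_object_id, environment="production"):
    """Build the PDS view URL for a DRS object ID."""
    return PDS_BASE_URLS[environment] + str(drs_object_id)
```

For example, `pds_url(12345)` yields `http://pds.lib.harvard.edu/pds/view/12345`, and `pds_url(12345, "qa")` points at the QA/test server instead.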

Links to sample documents delivered by PDS: