Digitization Specifications

General Guidance

Items should be scanned at 100% (original size) in 24-bit RGB.
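Because items are scanned at original size, the pixel dimensions and the uncompressed size of a scan follow directly from the item's dimensions and the scanning resolution. The sketch below works through that arithmetic. The function names are illustrative only, and actual TIFF files will differ somewhat because of file headers and any compression.

```python
BYTES_PER_PIXEL = 3  # 24-bit RGB = 3 channels x 8 bits each


def pixel_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Pixel dimensions of an item scanned at 100% (original size)."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in: float, height_in: float, ppi: int) -> int:
    """Approximate size of the raw 24-bit RGB image data, ignoring headers."""
    width_px, height_px = pixel_dimensions(width_in, height_in, ppi)
    return width_px * height_px * BYTES_PER_PIXEL


# An 8 x 10 inch page scanned at 300 ppi
print(pixel_dimensions(8, 10, 300))    # (2400, 3000)
print(uncompressed_bytes(8, 10, 300))  # 21600000 (about 21.6 MB)
```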

File names for scans consist of a sequentially assigned numeric identifier (as tracked/maintained in the CHoM Scanning Log database), with "_pres" appended to indicate a preservation-quality image or "_ref" to indicate a reference/delivery copy.

Reference copies: Document/text files and image files should be batch converted, resized, and renamed into reference copies using a program such as IrfanView.

File storage: Scans are stored in N:\Digital_Collections\ARCHIVAL REPOSITORY\IMAGES\ARCHIVAL COPIES

  • Preservation copies are stored in folders labeled with the corresponding ID range, e.g., ID0000801_0000900
  • Reference copies are stored in sub-folders within those ID range folders, e.g., ID0000801_0000900\Reference Copies
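As an illustration of the folder convention above, the following sketch derives the ID range folder for a given identifier. It assumes that every folder covers a block of 100 consecutive IDs beginning at a number ending in 01, which is inferred from the single example ID0000801_0000900 and may not hold for every folder. The function names are hypothetical.

```python
from pathlib import PureWindowsPath


def range_folder(item_id: int, block_size: int = 100, digits: int = 7) -> str:
    """Name of the ID range folder that would hold item_id.

    Assumption: folders cover fixed blocks of `block_size` IDs (801-900, ...).
    """
    if item_id < 1:
        raise ValueError("item_id must be a positive integer")
    start = ((item_id - 1) // block_size) * block_size + 1
    end = start + block_size - 1
    return f"ID{start:0{digits}d}_{end:0{digits}d}"


def reference_folder(item_id: int) -> str:
    """Relative path of the Reference Copies sub-folder for item_id."""
    return str(PureWindowsPath(range_folder(item_id)) / "Reference Copies")


print(range_folder(823))      # ID0000801_0000900
print(reference_folder(823))  # ID0000801_0000900\Reference Copies
```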

Scanning log: All scanned/digitized items (including images, text, audio, and video) should be tracked in the CHoM Scanning Log database.

  • Location: N:\Collections\07_Collections_Databases_and_Lists
  • Metadata can be entered using either the table view or form view
  • Data entry guidance will appear at the bottom of the database window in the red bar when clicking into each field. For more in-depth data entry guidance, see the "Dublin Core Metadata" section of the Omeka Guidelines.

 

Filename and format quick reference

Text/documents:
Preservation copy = 0000022_dpres.tif 
Reference copy =     0000022_dref.pdf

Images:
Preservation copy = 0000023_pres.tif
Reference copy =    0000023_ref.jpg

Video:
Preservation copy = 0000024_vpres.mp4
Reference copy  =    0000024_vref.mp4

Audio:
Preservation copy = 0000025_spres.wav
Reference copy =     0000025_sref.mp3
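The quick reference above can be expressed as a small lookup, which may be useful when checking or generating names in bulk. This is a minimal sketch: the function name and the category labels ("text", "image", "video", "audio") are illustrative, while the format letters and extensions are taken from the table above.

```python
# Letter inserted before "pres"/"ref" for each category (images take none).
FORMAT_LETTER = {"text": "d", "image": "", "video": "v", "audio": "s"}

# File extension for each (category, copy type) pair.
EXTENSION = {
    ("text", "pres"): "tif", ("text", "ref"): "pdf",
    ("image", "pres"): "tif", ("image", "ref"): "jpg",
    ("video", "pres"): "mp4", ("video", "ref"): "mp4",
    ("audio", "pres"): "wav", ("audio", "ref"): "mp3",
}


def build_filename(item_id: int, category: str, copy: str) -> str:
    """Build a file name such as 0000022_dpres.tif from its parts."""
    if (category, copy) not in EXTENSION:
        raise ValueError(f"unknown category/copy: {category!r}, {copy!r}")
    letter = FORMAT_LETTER[category]
    return f"{item_id:07d}_{letter}{copy}.{EXTENSION[(category, copy)]}"


print(build_filename(22, "text", "pres"))  # 0000022_dpres.tif
print(build_filename(23, "image", "ref"))  # 0000023_ref.jpg
print(build_filename(25, "audio", "ref"))  # 0000025_sref.mp3
```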

 

Format-specific Guidance

Textual Items 

Category includes handwritten and typed documents, pages of books, articles, pamphlets, pages consisting of both text and images that are predominately text (if predominately image, follow the specs under the Images category), etc.

Scanning specs by item dimension (in inches)

    • 8 x 10 or larger: 300 ppi
    • Smaller: 600 ppi 
    • For textual pages that include images:

      • 8 x 10 or larger:  300 ppi
      • 5 x 7:  600 ppi
      • 4 x 5:  800 ppi
      • Smaller:  800-1000 ppi
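For text-only pages, the rule above reduces to a single size threshold. The sketch below assumes that "8 x 10 or larger" means the short side is at least 8 inches and the long side at least 10 inches, regardless of orientation; that reading is an interpretation, not something the specification states. It does not cover textual pages that include images, which follow the finer-grained list.

```python
def text_scan_ppi(width_in: float, height_in: float) -> int:
    """Scanning resolution for a text-only page, by item dimensions in inches.

    Assumption: "8 x 10 or larger" is tested independently of orientation.
    """
    short_side, long_side = sorted((width_in, height_in))
    if short_side >= 8 and long_side >= 10:
        return 300
    return 600


print(text_scan_ppi(8.5, 11))  # 300
print(text_scan_ppi(5, 7))     # 600
```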

Ideally, documents should be captured both as images (to record the document "as it is") and as text using OCR, so that the contents are searchable. OCR is generally not possible with manuscript materials, but it is feasible for printed works and for sets of documents (such as typed pages) in good enough condition to run through a document scanner.

File formats

Preservation copies: TIFF
Reference copy: PDF  (150 ppi)

Filenaming

For textual items/documents, you must add a "d" (for document) in front of "pres" or "ref" in the filename. Both images/visual materials and textual document masters are scanned as TIFFs, so the "d" designation lets us easily tell the two apart when batch creating reference copies (JPGs for images, PDFs for documents).

Example preservation copy: 0000077_dpres.tif
Example reference copy:      0000077_dref.pdf

 

Images

Category includes photographs, negatives, and transparencies, and pages with both text and images that are predominately images.

Scanning specs by dimensions (in inches)

    • 8 x 10 or larger: 600 ppi
    • Smaller (by type): 
      • Prints:
        • 5 x 7:  600 ppi
        • 4 x 5:  800 ppi
        • Smaller: 800–1000 ppi
      • Film Negatives:
        • 5 x 7:  600 ppi
        • 4 x 5:  1200 ppi
      • Glass Plate Negatives:
        • 5 x 7:  800 ppi
        • 4 x 5:  1200 ppi
      • Transparencies:
        • 4 x 5 (copies):  600 ppi
        • 4 x 5 (archival):  1200 ppi
          • i.e., the only original, such as a photograph for which no copy print or negative exists
        • Slides:
          • 35 mm:  2800 ppi
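The image specifications above amount to a lookup by material type and nominal size. The sketch below encodes the list as a table; the key labels (such as "print" or "5x7") and the function name are illustrative choices, and sizes that fall between the listed nominal sizes are deliberately left unresolved because the specification does not say how to treat them.

```python
# (material, nominal size) -> (minimum ppi, maximum ppi), from the list above.
IMAGE_PPI = {
    ("print", "5x7"): (600, 600),
    ("print", "4x5"): (800, 800),
    ("print", "smaller"): (800, 1000),
    ("film negative", "5x7"): (600, 600),
    ("film negative", "4x5"): (1200, 1200),
    ("glass plate negative", "5x7"): (800, 800),
    ("glass plate negative", "4x5"): (1200, 1200),
    ("transparency (copy)", "4x5"): (600, 600),
    ("transparency (archival)", "4x5"): (1200, 1200),
    ("slide", "35mm"): (2800, 2800),
}


def image_scan_ppi(material: str, size: str) -> tuple[int, int]:
    """Return the (min, max) scanning resolution for an image item."""
    if size == "8x10+":  # 8 x 10 or larger: 600 ppi for every material
        return (600, 600)
    try:
        return IMAGE_PPI[(material, size)]
    except KeyError:
        raise ValueError(f"no spec listed for {material!r} at {size!r}") from None


print(image_scan_ppi("print", "smaller"))      # (800, 1000)
print(image_scan_ppi("film negative", "4x5"))  # (1200, 1200)
```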

File formats

Preservation copies: TIFF
Reference copy: JPG  (Long side set to 700 pixels, at 150 ppi)
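To show what "long side set to 700 pixels" means for a given scan, the sketch below computes the reference-copy dimensions while preserving the aspect ratio. The function name is illustrative, and the rounding used by the conversion software may differ by a pixel. The 150 ppi value is stored as metadata and does not change the pixel count.

```python
def reference_size(width_px: int, height_px: int, long_side: int = 700) -> tuple[int, int]:
    """Scale pixel dimensions so the longer side equals long_side."""
    longest = max(width_px, height_px)
    new_width = max(1, round(width_px * long_side / longest))
    new_height = max(1, round(height_px * long_side / longest))
    return new_width, new_height


# An 8 x 10 inch print scanned at 600 ppi is 4800 x 6000 pixels.
print(reference_size(4800, 6000))  # (560, 700)
```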

Filenaming

Image files take no format letter: "_pres" or "_ref" is appended directly to the numeric identifier. Textual document masters, which are also scanned as TIFFs, carry a "d" (for document) in front of "pres"; this lets us easily differentiate the two when batch creating reference copy JPGs and PDFs, respectively.

Example preservation copy: 0000001_pres.tif
Example reference copy:      0000001_ref.jpg

 

Video recordings/moving images

File formats

Preservation copies: MP4   (Wrapper: QuickTime (.mov).  Uncompressed, 10-bit 4:2:2 YUV, 720 x 486 (width x height), 4:3 aspect ratio, 29.97fps, pixel size: Rec. 601)
Reference copies: MP4      (H.264/AVC in an MP4 container, SD, 480p)

Filenaming

All video/moving image files should be named sequentially, receiving the next number available in the scanning log, rather than having collection names, project names, or subject names embedded in the file.  A “v” should be inserted as part of the file name to indicate it is a “video/moving image” file.

Example preservation copy: 0000001_vpres.mp4
Example reference copy: 0000001_vref.mp4

 

Audio/sound recordings

Preferred file formats

Preservation copies: WAV (first preference) or MP3 (second preference). 48 kHz/24-bit PCM
Reference copies: MP3  (44 kHz/16-bit AAC)
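For storage planning, the size of an uncompressed PCM preservation file can be estimated from the sample rate and bit depth given above. The sketch below assumes two channels (stereo), which the specification does not state, and ignores the small WAV header.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * (bit_depth // 8) * channels


def pcm_bytes(duration_s: int, sample_rate_hz: int = 48_000,
              bit_depth: int = 24, channels: int = 2) -> int:
    """Approximate audio data size; channels=2 is an assumption."""
    return duration_s * pcm_bytes_per_second(sample_rate_hz, bit_depth, channels)


print(pcm_bytes_per_second(48_000, 24, 2))  # 288000
print(pcm_bytes(3600))                      # 1036800000 (about 1 GB per hour)
```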

Filenaming

All sound files should be named sequentially, receiving the next number available in the scanning log, rather than having collection names, project names, or subject names embedded in the file.  An “s” should be inserted as part of the file name to indicate it is a “sound” file.

Example preservation copy:  0000001_spres.wav
Example reference copy:  0000001_sref.mp3

 

Batch Conversion of Reference Files

  1. Open IrfanView
  2. Go to File > Batch Conversion/Rename
  3. In the top left, select "Batch conversion - Rename result files"

  4. For Images:
    1. Under Output format, select JPG
    2. Check "Use advanced options (for bulk resize...)"
    3. In the new window:
      1. check "RESIZE"
        1. Select "Set long side to"  =  700 pixels
      2. "Set new DPI value" = 150
      3. Click OK
    4. Name pattern: $N[0-7]_ref
    5. Under "Look in" navigate to the folder where your preservation copies live. Highlight the files you wish to convert and click the "Add" button below. 
      1. Tip: if IrfanView is running slowly, uncheck "Show Preview image" in the bottom left.
    6. Set your Output directory for result files (the corresponding "Reference copies" folder). 
      1. Tip: To save time, first click "Use current (look in) directory", then click Browse to navigate to the Reference copies sub-folder
    7. Click Start Batch

  5. For Documents:
    1. Under Output format, select PDF
    2. Check "Use advanced options (for bulk resize...)"
    3. In the new window:
      1. UNCHECK "RESIZE" (it may be checked if you've recently converted image files)
      2. "Set new DPI value" = 150
      3. Click OK
    4. Name pattern: $N[0-7]_dref
    5. Under "Look in" navigate to the folder where your preservation copies live. Highlight the files you wish to convert and click the "Add" button below. 
      1. Tip: if IrfanView is running slowly, uncheck "Show Preview image" in the bottom left.
    6. Set your Output directory for result files (the corresponding "Reference copies" folder). 
      1. Tip: To save time, first click "Use current (look in) directory", then click Browse to navigate to the Reference copies sub-folder
    7. Click Start Batch
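Before running a batch, it can help to preview which reference file name each preservation TIFF should end up with. The sketch below mirrors the intended result of the two name patterns above (keep the seven-digit ID, then append "_ref" for images or "_dref" for documents). It is a stand-alone check written for illustration, not part of IrfanView, and the function names are hypothetical.

```python
import re

# Preservation TIFFs: seven-digit ID, optional "d" for documents, then "pres".
PRES_TIFF = re.compile(r"^(?P<id>\d{7})_(?P<doc>d?)pres\.tif$", re.IGNORECASE)


def reference_name(pres_name: str) -> str:
    """Expected reference-copy name for a preservation TIFF file name."""
    match = PRES_TIFF.match(pres_name)
    if match is None:
        raise ValueError(f"not a preservation TIFF name: {pres_name!r}")
    if match["doc"]:
        return f"{match['id']}_dref.pdf"  # documents become PDFs
    return f"{match['id']}_ref.jpg"       # images become JPGs


def plan_batch(names: list[str]) -> dict[str, str]:
    """Map each preservation file name to its expected reference name."""
    return {name: reference_name(name) for name in names}


print(plan_batch(["0000022_dpres.tif", "0000023_pres.tif"]))
# {'0000022_dpres.tif': '0000022_dref.pdf', '0000023_pres.tif': '0000023_ref.jpg'}
```

Names that do not match the convention raise an error, which makes stray or misnamed files visible before the batch is run.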

Copyright © 2024 The President and Fellows of Harvard College