Digitization Specifications
General Guidance
Items should be scanned at 100% (original size) in 24-bit RGB.
File names for scans consist of a sequentially assigned numeric identifier (as tracked in the CHoM Scanning Log database), followed by "_pres" to indicate a preservation-quality image or "_ref" to indicate a reference/delivery copy.
Reference copies: Document/text files and image files should be batch converted, resized, and renamed into reference copies using a program like IrfanView.
File storage: Scans are stored in N:\Digital_Collections\ARCHIVAL REPOSITORY\IMAGES\ARCHIVAL COPIES
- Preservation copies are stored in folders labeled with the corresponding ID range, e.g., ID0000801_0000900
- Reference copies are stored in sub-folders within those ID range folders, e.g., ID0000801_0000900\Reference Copies
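As a minimal sketch of how an ID maps to its storage folder, the following assumes that ID ranges are fixed blocks of 100 starting at 1 (0000001 to 0000100, 0000101 to 0000200, and so on), which is inferred from the single example above rather than stated as a rule. The function names are hypothetical.

```python
def id_range_folder(item_id: int, block_size: int = 100) -> str:
    """Return the ID-range folder name for a scan ID.

    Assumption: ranges are consecutive blocks of `block_size` IDs starting at 1.
    """
    if item_id < 1:
        raise ValueError("scan IDs are assumed to start at 1")
    start = ((item_id - 1) // block_size) * block_size + 1
    end = start + block_size - 1
    return f"ID{start:07d}_{end:07d}"


def reference_folder(item_id: int) -> str:
    """Return the reference-copy sub-folder inside the ID-range folder."""
    return id_range_folder(item_id) + "\\Reference Copies"


print(id_range_folder(845))   # ID0000801_0000900
print(reference_folder(845))  # ID0000801_0000900\Reference Copies
```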
Scanning log: All scanned/digitized items (including images, text, audio, and video) should be tracked in the CHoM Scanning Log database.
- Location: N:\Collections\07_Collections_Databases_and_Lists
- Metadata can be entered using either the table view or form view
- Data entry guidance will appear at the bottom of the database window in the red bar when clicking into each field. For more in-depth data entry guidance, see the "Dublin Core Metadata" section of the Omeka Guidelines.
Filename and format quick reference
Text/documents:
Preservation copy = 0000022_dpres.tif
Reference copy = 0000022_dref.pdf
Images:
Preservation copy = 0000023_pres.tif
Reference copy = 0000023_ref.jpg
Video:
Preservation copy = 0000024_vpres.mp4
Reference copy = 0000024_vref.mp4
Audio:
Preservation copy = 0000025_spres.wav
Reference copy = 0000025_sref.mp3
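The quick reference above can be expressed as a small lookup table. This is only an illustrative sketch: the table keys and the function name `build_filename` are hypothetical, and the seven-digit zero padding is taken from the examples above.

```python
# Type letter and file extensions per item type, copied from the quick reference.
FORMATS = {
    "document": {"letter": "d", "pres": "tif", "ref": "pdf"},
    "image": {"letter": "", "pres": "tif", "ref": "jpg"},
    "video": {"letter": "v", "pres": "mp4", "ref": "mp4"},
    "audio": {"letter": "s", "pres": "wav", "ref": "mp3"},
}


def build_filename(item_id: int, kind: str, role: str) -> str:
    """Build a file name such as 0000022_dpres.tif.

    kind is one of the FORMATS keys; role is "pres" or "ref".
    """
    if role not in ("pres", "ref"):
        raise ValueError("role must be 'pres' or 'ref'")
    spec = FORMATS[kind]
    return f"{item_id:07d}_{spec['letter']}{role}.{spec[role]}"


print(build_filename(22, "document", "pres"))  # 0000022_dpres.tif
print(build_filename(25, "audio", "ref"))      # 0000025_sref.mp3
```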
Format-specific Guidance
Textual Items
Category includes handwritten and typed documents, pages of books, articles, pamphlets, pages consisting of both text and images that are predominately text (if predominately image, follow the specs under the Images category), etc.
Scanning specs by item dimension (in inches)
- 8 x 10 or larger: 300 ppi
- Smaller: 600 ppi
For textual pages that include images:
- 8 x 10 or larger: 300 ppi
- 5 x 7: 600 ppi
- 4 x 5: 800 ppi
- Smaller: 800-1000 ppi
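The size rules above can be sketched as a lookup. The guidance does not say how to treat in-between sizes or orientation, so this sketch makes two labeled assumptions: each listed size is treated as a minimum threshold, and width and height are interchangeable. The function name `text_scan_ppi` is hypothetical, and the result is a (minimum, maximum) ppi pair so that the 800-1000 range can be represented.

```python
def text_scan_ppi(width_in: float, height_in: float, has_images: bool = False):
    """Return (min_ppi, max_ppi) for a textual item measured in inches.

    Assumptions: listed sizes are minimum thresholds; orientation is ignored.
    """
    short, long_ = sorted((width_in, height_in))
    if short >= 8 and long_ >= 10:
        return (300, 300)   # 8 x 10 or larger
    if not has_images:
        return (600, 600)   # smaller, text only
    if short >= 5 and long_ >= 7:
        return (600, 600)   # 5 x 7
    if short >= 4 and long_ >= 5:
        return (800, 800)   # 4 x 5
    return (800, 1000)      # smaller: operator chooses within the range


print(text_scan_ppi(8.5, 11))                 # (300, 300)
print(text_scan_ppi(3, 4, has_images=True))   # (800, 1000)
```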
Please note that, ideally, documents should be scanned both as images (to capture the document "as it is") and as text using OCR, so that the documents can be searched. OCR is generally not possible with manuscript materials, but it is feasible for printed works and for sets of documents (such as typed pages) in good enough condition to run through a document scanner.
File formats
Preservation copies: TIFF
Reference copy: PDF (150 ppi)
Filenaming
For textual items/documents, you must add a "d" (for document) in front of "pres" (or "ref") in the filename. Because both image and textual document masters are scanned as TIFFs, the "d" designation lets us easily differentiate the two when batch creating reference copies (JPGs for images, PDFs for documents).
Example preservation copy: 0000077_dpres.tif
Example reference copy: 0000077_dref.pdf
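To illustrate why the "d" matters, here is a minimal sketch that derives the reference copy name from a TIFF preservation copy name. The function name `reference_name` is hypothetical; it only covers the two TIFF cases (images and documents) and rejects anything else.

```python
from pathlib import PureWindowsPath


def reference_name(pres_name: str) -> str:
    """Map a TIFF preservation file name to its reference copy name."""
    stem = PureWindowsPath(pres_name).stem   # e.g. "0000077_dpres"
    item_id, _, tag = stem.partition("_")
    if tag == "dpres":
        return f"{item_id}_dref.pdf"         # documents become PDFs
    if tag == "pres":
        return f"{item_id}_ref.jpg"          # images become JPGs
    raise ValueError(f"not an image or document preservation copy: {pres_name}")


print(reference_name("0000077_dpres.tif"))  # 0000077_dref.pdf
print(reference_name("0000001_pres.tif"))   # 0000001_ref.jpg
```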
Images
Category includes photographs, negatives, and transparencies, and pages with both text and images that are predominately images.
Scanning specs by dimensions (in inches)
- 8 x 10 or larger: 600 ppi
- Smaller (by type):
- Prints:
- 5 x 7: 600 ppi
- 4 x 5: 800 ppi
- Smaller: 800–1000 ppi
- Film Negatives:
- 5 x 7: 600 ppi
- 4 x 5: 1200 ppi
- Glass Plate Negatives:
- 5 x 7: 800 ppi
- 4 x 5: 1200 ppi
- Transparencies:
- 4 x 5 (copies): 600 ppi
- 4 x 5 (archival): 1200 ppi
- i.e., the only original, such as a photograph for which no copy print or negative exists
- Slides:
- 35 mm: 2800 ppi
File formats
Preservation copies: TIFF
Reference copy: JPG (Long side set to 700 pixels, at 150 ppi)
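As a worked example of the "long side set to 700 pixels" rule, the sketch below computes the reference copy's pixel dimensions from the preservation scan's dimensions. It assumes the aspect ratio is preserved and rounds to whole pixels; the function name `reference_size` is hypothetical, and the actual rounding used by the conversion program may differ by a pixel.

```python
def reference_size(width_px: int, height_px: int, long_side: int = 700):
    """Scale dimensions so the longer side equals `long_side` pixels."""
    longest = max(width_px, height_px)
    return (
        round(width_px * long_side / longest),
        round(height_px * long_side / longest),
    )


# An 8 x 10 inch print scanned at 600 ppi is 4800 x 6000 pixels.
print(reference_size(4800, 6000))  # (560, 700)
```

At 150 ppi, a 700-pixel long side corresponds to roughly 4.7 inches when printed.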
Filenaming
Image files do not take a type letter: "pres" or "ref" follows the underscore directly. Only textual items/documents add a "d" (for document). Because both image and textual document masters are scanned as TIFFs, the "d" designation lets us easily differentiate the two when batch creating reference copy JPGs and PDFs, respectively.
Example preservation copy: 0000001_pres.tif
Example reference copy: 0000001_ref.jpg
Video recordings/moving images
File formats
Preservation copies: MP4 (Wrapper: QuickTime (.mov). Uncompressed, 10-bit 4:2:2 YUV, 486 x 720, 4:3 aspect ratio, 29.97fps, pixel size: Rec. 601)
Reference copies: MP4 (H.264/MPEG-4 AVC, SD, 480p)
Filenaming
All video/moving image files should be named sequentially, receiving the next number available in the scanning log, rather than having collection names, project names, or subject names embedded in the file. A “v” should be inserted as part of the file name to indicate it is a “video/moving image” file.
Example preservation copy: 0000001_vpres.mp4
Example reference copy: 0000001_vref.mp4
Audio/sound recordings
Preferred file formats
Preservation copies: WAV (first preference) or MP3 (second preference), 48 kHz/24-bit PCM
Reference copies: MP3 (44 kHz/16-bit AAC)
Filenaming
All sound files should be named sequentially, receiving the next number available in the scanning log, rather than having collection names, project names, or subject names embedded in the file. An “s” should be inserted as part of the file name to indicate it is a “sound” file.
Example preservation copy: 0000001_spres.wav
Example reference copy: 0000001_sref.mp3
Batch Conversion of Reference Files
- Open IrfanView:
- Go to File > Batch Conversion/Rename
- In the top left, select "Batch conversion - Rename result files"
- For Images:
- Under Output format, select JPG
- Check "Use advanced options (for bulk resize...)"
- In the new window:
- check "RESIZE"
- Select "Set long side to" = 700 pixels
- "Set new DPI value" = 150
- Click OK
- Name pattern: $N[0-7]_ref
- Under "Look in" navigate to the folder where your preservation copies live. Highlight the files you wish to convert and click the "Add" button below.
- Tip: If functioning is slow, uncheck "Show Preview image" in the bottom left.
- Set your Output directory for result files (the corresponding "Reference copies" folder).
- Tip: To save time, first click "Use current (look in) directory", then click Browse to navigate to the Reference copies sub-folder
- Click Start Batch
- For Documents:
- Under Output format, select PDF
- Check "Use advanced options (for bulk resize...)"
- In the new window:
- UNCHECK "RESIZE" (it may be checked if you've recently converted image files)
- "Set new DPI value" = 150
- Click OK
- Name pattern: $N[0-7]_dref
- Under "Look in" navigate to the folder where your preservation copies live. Highlight the files you wish to convert and click the "Add" button below.
- Tip: If functioning is slow, uncheck "Show Preview image" in the bottom left.
- Set your Output directory for result files (the corresponding "Reference copies" folder).
- Tip: To save time, first click "Use current (look in) directory", then click Browse to navigate to the Reference copies sub-folder
- Click Start Batch
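Because images and documents need separate batch passes with different settings, it can help to sort a folder listing into the two groups before selecting files. The following is a minimal sketch of that sorting step only; it does not perform any conversion and is not a replacement for the IrfanView procedure. The names `PASSES` and `split_into_passes` are hypothetical, and the settings simply restate the values given in the steps above.

```python
from pathlib import PureWindowsPath

# Settings for each batch pass, restated from the steps above.
PASSES = {
    "images": {"tag": "pres", "output_format": "JPG", "long_side_px": 700, "dpi": 150},
    "documents": {"tag": "dpres", "output_format": "PDF", "long_side_px": None, "dpi": 150},
}


def split_into_passes(filenames):
    """Group TIFF preservation copies by the batch pass they belong to."""
    batches = {name: [] for name in PASSES}
    for filename in filenames:
        path = PureWindowsPath(filename)
        if path.suffix.lower() not in (".tif", ".tiff"):
            continue  # audio and video files are not part of these passes
        tag = path.stem.partition("_")[2]  # text after the first underscore
        for name, settings in PASSES.items():
            if tag == settings["tag"]:
                batches[name].append(filename)
    return batches


files = ["0000022_dpres.tif", "0000023_pres.tif", "0000024_vpres.mp4"]
print(split_into_passes(files))
# {'images': ['0000023_pres.tif'], 'documents': ['0000022_dpres.tif']}
```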
Copyright © 2024 The President and Fellows of Harvard College