What We Do with the Data
Note: we don't accept data directly from patrons.
Workflows We Offer
- Images scanned at local repositories in response to researcher requests
- Content produced from Harvard Library special projects
- Content produced from in-house projects or acquired from a vendor or donor by local repositories
The Procedure to Use this Service
- Contact Imaging Services (see the "Imaging Services Contact" section at the end of this page)
- Determine the workflow with Imaging Services (see the "Workflows We Offer" section above and their detailed descriptions)
- Copyright clearance
- Set up the file delivery method/procedure with Imaging Services (see "File Delivery Method" section below)
- Provide DRS related information to Imaging Services (see the "DRS information" block in the "Check List ..." section below)
- Arrange to run a pilot batch with Imaging Services if necessary
- Confirm the fee, if one applies
- Schedule file deliveries (one time or regular)
Check List before File Delivery to Imaging Services
- Any copyright or license document to be associated with the digital content?
- DRS object type (see the section below)
- Delivery and discovery systems to be used, and who will insert or update the links after depositing
- Any cataloging work needed before depositing?
- Any metadata files needed to go with file deliveries?
- File type/format to be delivered
- File type/format to be deposited
- File and Object name
- DRS information:
- DRS OwnerCode (for example, GSE.GUTMN)
- DRSBillingCode (for example, RAD.SCHL.SKC_0001)
- NRS authorization path and mask (for example, FHCL and {n}, HBS.Baker.GEN and {n}-{yyyy})
- DRS access code (P, R, or N)
- Any other information the repository would like to be associated with deposited objects. For example,
- Initial project note
- Producer of the files
- Future processing note
DRS Object Type We Support
Currently, Imaging Services assembles the following object types for the DRS:
- Still Image Object
- Page Turned Object (List Object if needed)
- Document Object (PDF, Doc, Docx)
- Opaque Object
- Text Object
File Delivery Method and Communication
Delivery methods
- Shared directory on Imaging Services File Server
- Hard disk or flash drive
- Shared Google Drive (managed by Imaging Services)
- Nextcloud space managed by LTS (pilot)
- Repository or vendor managed file sharing platform
- Shared directory using MS or Google (for a small set of files)
- Harvard Secured File Transfer (https://filetransfer.harvard.edu)
Delivery communication
- We recommend that the repository send an email to inform Imaging Services that a batch is ready to be processed.
- The repository should always keep a copy of the data.
Batching
We expect files to be delivered in batches. By "batch" we mean a deposit unit of a reasonable "size". The "size" depends on a combination of several factors:
- Data size: generally speaking, a batch should not be too big (that is, over 50 GB).
- Average file size: the larger the files, the fewer files a batch should contain.
- Total file count: a batch should not contain too many files (say, over 5,000).
- Object type: an object type may require that all the files for an object be in the same batch (for example, a PDS object).
- Object size: an object may include several types of files, which increases the size of the object.
- DRS guidelines, policies, and practical software limits: LTS has a general guideline about how many GB of data a depositing unit can deposit each day.
- The convenience for repositories and Imaging Services in preparing and tracking the "unit".
Usually, we deposit the batch as delivered by its repository. Here are some examples:
- Still image batches averaging 1,000 color TIFF images.
- Still image batches averaging 500 color TIFF images.
- Still image batches averaging 20 color TIFF images, each batch matching a building design.
- Still image batches averaging 200 color JPEG2000 images produced from a box of slides.
- A PDS batch that contains 1,000 color images of a 1,000-page book.
- A PDS batch that contains 500 color images of 50 small pamphlets.
- A PDF document batch which contains about 10 GB of PDF files.
- A PDF document batch which contains about 12 PDF files.
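A repository may want to check a batch against the rough size guidelines in this section before delivery. The following is a minimal sketch in Python, not an Imaging Services tool. The function names are hypothetical, the 50 GB and 5,000-file thresholds are the approximate figures mentioned above (not hard limits), and treating a GB as 1000^3 bytes is an assumption.

```python
from pathlib import Path

# Rough thresholds taken from the guidelines above; they are not hard limits.
MAX_BATCH_BYTES = 50 * 1000**3  # "over 50 GB" (decimal GB is an assumption)
MAX_BATCH_FILES = 5_000         # "over 5,000" files


def summarize_batch(batch_dir):
    """Return (file_count, total_bytes) for all files under batch_dir."""
    files = [p for p in Path(batch_dir).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def batch_warnings(file_count, total_bytes):
    """List the size guidelines that a batch of this size would exceed."""
    warnings = []
    if total_bytes > MAX_BATCH_BYTES:
        warnings.append(f"data size {total_bytes / 1000**3:.1f} GB exceeds 50 GB")
    if file_count > MAX_BATCH_FILES:
        warnings.append(f"file count {file_count} exceeds {MAX_BATCH_FILES}")
    return warnings
```

For example, `batch_warnings(*summarize_batch("path/to/batch"))` returns an empty list when a batch is within both guidelines. The other factors (object type, daily deposit guidelines, convenience) still need to be discussed with Imaging Services.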
Metadata
We may be able to provide the following metadata support. Please tell us your needs during the initial project discussion.
- Structural metadata for PDS objects.
- Descriptive metadata before or after depositing.
Fees
$0 - $50 per batch, depending on the workflow. Please follow the links in the "Workflows We Offer" section above to see the details.
File Format and File Organization
Please refer to this page for file format and file organization recommendations.
Imaging Services Contact
Request to use this service:
- Send a request via our new digitization project inquiry form.
Questions about using this service:
- Bill Comstock (comstock@fas.harvard.edu)
- Mingtao Zhao (mzhao@fas.harvard.edu)
Batch deposit:
- Mingtao Zhao (mzhao@fas.harvard.edu)
- Hilary Kline (kline@fas.harvard.edu)
Billing and other inquiries:
Imaging Services Office,
G81 Widener Library,
Imaging@fas.harvard.edu
617-495-3995
Examples
H. C. Fung Library: Alfred G. Meyer letters from the USSR, 1958
48 letters (133 typewritten pages) acquired by Fung Library. (see details in HOLLIS)
Original files and delivery:
- A PDF file for each letter
- Named with a date, plus a day-part if there is more than one letter for the day (e.g., 8-4-58.pdf, 8-7-58_1 (morning).pdf, 8-7-58_2 (afternoon).pdf)
- The PDF files contain G4 compressed bitonal TIFF images
- A donor agreement text file saved from an email from the donor.
- Files copied to a shared space on Imaging Services file server.
What to deposit:
- A PDS object containing all the letters with searchable OCR files (plain text and ALTO XML)
- Each letter will be a section in the PDS object
Repository processing as advised by Imaging Services:
- An ALMA record was created
- PDF files were renamed to [ALMA_ID]_yyyymmdd#.pdf (e.g., 8-7-58_2 (afternoon).pdf => 99155187868203941_195808072.pdf)
- A structural mapping spreadsheet was created by the repository
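The renaming rule above can be sketched in Python as follows. This is an illustration only, not the script used for the project. The function name and regular expression are hypothetical, the century is assumed to be 1900 because the letters date from 1958, and omitting the sequence digit for a day with only one letter is an assumption, since the source shows only an example with a sequence number.

```python
import re

# Matches names like "8-4-58.pdf" or "8-7-58_2 (afternoon).pdf":
# month-day-two-digit-year, then an optional "_<sequence> (<label>)".
NAME_PATTERN = re.compile(
    r"^(?P<month>\d{1,2})-(?P<day>\d{1,2})-(?P<year>\d{2})"
    r"(?:_(?P<seq>\d+)(?:\s*\(.*\))?)?\.pdf$",
    re.IGNORECASE,
)


def deposit_name(original_name, alma_id, century=1900):
    """Build an [ALMA_ID]_yyyymmdd#.pdf name from a date-based filename."""
    m = NAME_PATTERN.match(original_name)
    if m is None:
        raise ValueError(f"unexpected filename: {original_name!r}")
    year = century + int(m["year"])
    seq = m["seq"] or ""  # assumption: no digit when the day has one letter
    return f"{alma_id}_{year:04d}{int(m['month']):02d}{int(m['day']):02d}{seq}.pdf"


# Reproduces the example given above:
# deposit_name("8-7-58_2 (afternoon).pdf", "99155187868203941")
# returns "99155187868203941_195808072.pdf"
```

Raising an error on unexpected names, rather than guessing, keeps any irregular filenames visible so they can be resolved by hand.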
Imaging Services processing and depositing:
- The donor agreement was saved as a PDF object, and deposited.
- The bitonal TIFF files were exported from each PDF file to a folder named from the PDF filename.
- OCR files were generated from these TIFF files.
- A batch was prepared and deposited.
Linking:
- A NET holding was added in ALMA by the repository
Fee: $50/batch
- The project size was small
- Image conversion/processing was needed
- Filename and directory structure changes were required
- OCR files were desired
Deliverable examples:
PDS object in Harvard Viewer
Alfred G. Meyer letters from the USSR, 1958. Davis Center for Russian and Eurasian Studies Collection, H.C. Fung Library, Harvard University.
Multi-Repository: Scanning Key Content
See the project wiki page (esp. the "Deposit" section)
Fee: $50/batch during the Harvard Library Lab project phase. $0/batch after it was established as a Library-wide project.
Multi-Repository: Deposit color slide images produced by on-site vendor Picturae
Original files and delivery:
- JPEG2000 files created by Picturae saved in specific directories (usually on box level)
- One MD5 checksum file for image files in each folder
- Files were made accessible to Imaging Services on a file server managed by Picturae
What to deposit:
- Still Image objects with JPEG2000 files
Processing and depositing:
- Each directory was treated as a depositing batch
- Moving batches from the Picturae file server to the Imaging Services file server
- Running checksums to verify file integrity
- Generating and uploading DRS batches
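The checksum step above can be illustrated with a short Python sketch that recomputes MD5 values after a folder has been copied. It is not the tool used in the project. The layout of the checksum file is not described on this page, so the sketch assumes the common md5sum layout (one "hash  filename" pair per line), and the manifest name checksums.md5 and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_folder(folder, manifest_name="checksums.md5"):
    """Return the names of files that are missing or whose MD5 differs.

    Assumes an md5sum-style manifest: "<hex digest>  <filename>" per line.
    """
    folder = Path(folder)
    failed = []
    for line in (folder / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # md5sum marks binary mode with "*"
        target = folder / name
        if not target.is_file() or md5_of(target) != expected.lower():
            failed.append(name)
    return failed
```

An empty list from `verify_folder` means every file listed in the manifest arrived intact. Files present in the folder but absent from the manifest are not reported by this sketch.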
Linking:
- The majority of slides had been cataloged in JStorForum before scanning, and files were named using the SharedShelf ID.
- Some repositories asked Imaging Services to create records in JStorForum using AVES after depositing.
Fee: $10/batch
- A large scale project
- Supporting the Library's slide scanning
- Simple image-taking process
- No image processing involved
- Consistent batch size
- No cataloging or manual linking involved
Deliverable examples:
- The Harvard Theatre Collection at Houghton Library: Bernie Gardella Boston Ballet slide collection
- Dumbarton Oaks Research Library: A still image object
- Schlesinger Library: A still image object
Middle Eastern Division, Widener Library: Women's Worlds in Qajar Iran
Original files:
- TIFF or JPEG files with various color profiles and compression types
- Olivia XML files for AVES loading
- Spreadsheets for generating structural METS for PDS objects
What to deposit:
- Still Image objects
- PDS objects
Processing and depositing:
- Converting or assigning sRGB or sGray profile
- Generating lossless JPEG2000 files
- Re-constructing directories and re-naming files
- Batching in BatchBuilder and depositing into DRS
Linking:
- In the project specific system by the repository
Fee: $50 for batch depositing
- Ongoing project, but at an unpredictable pace
- Batches tend to be big
- AVES loading involved
Deliverable examples:
- The project web site: Women's Worlds in Qajar Iran
- A PDS object: Collection of Tales, late 19th/early 20th century.
- A Still Image object: Shukuh ʻUzma with her husband and children
Baker Library: Cases on Microfilms
Original files and delivery:
- Untagged grayscale TIFF images from our microfilm scanning vendor
- Received from the vendor in film-case based directories. The repository will re-organize the files into meaningful item-level directories.
What to deposit:
- PDS objects
- PDS document list objects
Processing and depositing:
- Generating JPEG files for the repository to use for re-organizing files
- Assigning sGray profile to TIFF image files
- Converting TIFF to JPEG2000
- Reconstructing directories and renaming files if necessary
- Creating structural METS files
- Batching in BatchBuilder and depositing into DRS
Linking:
- In ALMA by the repository
Fee: $0
- Images were from microfilm orders fulfilled by Imaging Services (via its scanning vendor)
Deliverable examples:
- A PDS list object: Course outlines on microfilm, 1932-1943.
- A PDS object: Course outlines on microfilm, 1940-1941.
Harvard-Yenching Library: Xin-Jiang Newspapers
Original files and delivery:
- JPEG files with a color target
- Catalog files for each title
- All files on an external hard drive
What to deposit:
- PDS objects with JPEG2000 files
Processing and depositing:
- Check JPEG files
- Rename directories and files
- Process files: target, save as TIFF, convert to lossless JPEG2000
- Generate structure from original directory structures
Linking:
- In ALMA by Imaging Services
Fee: $50/batch
- Cataloging involved
- Files needed to be renamed and regrouped
- Image processing involved
- Error checking with the image-provider involved (corrupted images and file grouping)
Deliverable examples:
- A PDS document list object: Ili geziti
- A PDS object: Ili geziti. [Xinjiang?]: [publisher not identified], 1965.
- Record in HOLLIS: Ili geziti, [Xinjiang?], [publisher not identified], Harvard-Yenching Library.
Harvard Kennedy School Library: Depositing Working Papers
Original files and delivery:
- JPEG2000, bitonal TIFF, and uncorrected OCR plain text files in item-level folders
- Files were copied to a shared space on Imaging Services file server.
What to deposit:
- PDS objects with JPEG2000 files and OCR text files
Processing and depositing:
- Renaming directories and files using the spreadsheet prepared by the Imaging Services cataloging group
- Batching groups of items in BatchBuilder and depositing them
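A spreadsheet-driven rename like the one above could be sketched in Python as follows. This is only an illustration under stated assumptions: the layout of the actual spreadsheet is not described on this page, so the sketch assumes it has been exported to CSV with hypothetical columns old_name and new_name, and the function names are hypothetical as well.

```python
import csv
from pathlib import Path


def load_rename_plan(csv_path):
    """Read (old_name, new_name) pairs from a CSV export of the spreadsheet."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        plan = [(row["old_name"], row["new_name"]) for row in csv.DictReader(f)]
    new_names = [new for _, new in plan]
    if len(set(new_names)) != len(new_names):
        raise ValueError("two rows map to the same new name")
    return plan


def apply_rename_plan(base_dir, plan):
    """Rename files or directories under base_dir after checking every row."""
    base = Path(base_dir)
    # Validate the whole plan first so a bad row does not leave
    # the directory half renamed.
    for old, new in plan:
        if not (base / old).exists():
            raise FileNotFoundError(base / old)
        if (base / new).exists():
            raise FileExistsError(base / new)
    for old, new in plan:
        (base / old).rename(base / new)
```

Checking the full plan before renaming anything is a design choice for this sketch: it keeps the delivered directory unchanged if the spreadsheet and the files disagree.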
Cataloging and Linking:
- Creating record in ALMA for each title
- Adding links to records after depositing
Fee: $50/batch
- Large batch size
- Cataloging involved
Deliverable examples:
- A PDS object: Tourism and economic development : considerations for tribal policy and planning
- and its record in HOLLIS: http://id.lib.harvard.edu/alma/990023595690203941/catalog
Harvard Kennedy School Library: Depositing Student Journals
Original files and delivery:
- PDF files for each journal title
What to deposit:
- Document objects (PDF)
Processing and depositing:
- Convert PDF files to PDF/A using Acrobat DC if possible
- Rename directories and files
- Batch each title in BatchBuilder and deposit
Cataloging and linking:
- Create record for each title in ALMA by Imaging Services
- Add link in ALMA after deposit by Imaging Services
Fee: $50/batch
- Small project
- Cataloging involved
Judaic Division: Depositing images from Dentac
Original files and delivery:
- JPEG2000 files
- Already batched with BatchBuilder
- Files delivered using an Imaging Services-managed Google Drive
What to deposit:
- Still image objects
Processing and depositing:
- No image processing involved
- No BatchBuilder batching needed
Linking:
- AVES loading with files supplied from the vendor
Fee: $15/batch
- Batch size varies
- No processing or BatchBuilder batching needed
- AVES loading involved