What We Do with the Data
Workflows We Offer
- Content Produced by local repositories from in-house projects or acquired by local repositories from a vendor or donor
- Content Produced by local repositories in response to researcher requests
- Content Produced by ongoing Harvard Library special projects
The Procedure to Use this Service
- Contact Imaging Services (see the contact section at the end of this document)
- Determine the workflow with Imaging Services (see the workflows above and their detailed descriptions at the beginning of this document)
- Copyright clearance
- Set up the file delivery method/procedure with Imaging Services (see "File Delivery Method" section below)
- Provide DRS related information to Imaging Services
- Arrange to run a pilot batch with Imaging Services if necessary
- Confirm the fee, if one applies
- Begin one-time or regular file deliveries
Check List before File Delivery to Imaging Services
- Any copyright or license document to be associated with the digital content?
- DRS object type (see the section below)
- Delivery and discovery systems to be used, and who will insert or update the links after depositing
- Any cataloging work needed before depositing?
- Any metadata files needed to go with file deliveries?
- File type/format to be delivered
- File type/format to be deposited
- File and Object name
- DRS information:
  - DRS owner code
  - DRS billing code
  - NRS authorization code and mask
  - DRS access code (P, R, or N)
- Any other information the repository would like to be associated with deposited objects. For example,
  - initial project note
  - producer of the files
  - future processing note
DRS Object Type We Support
Currently, Imaging Services assembles these types of objects for the DRS:
- Still Image Object
- Page Turned Object (List Object if needed)
- Document Object (PDF, Doc, Docx)
- Opaque Object
- Text Object
File Delivery Method
- Shared directory on Imaging Services File Server
- Hard disk or flash drive
- Shared Google Drive (managed by Imaging Services)
- Nextcloud space managed by LTS (pilot)
- Repository or vendor managed file sharing platform
- Shared directory using MS or Google (for a small set of files)
- Harvard Secured File Transfer (https://filetransfer.harvard.edu)
Batching
We expect files to be delivered in batches. A batch is a deposit unit of a reasonable "size". The "size" depends on a combination of several factors:
- Data size: generally, a batch should not be too big (i.e., over 50 GB).
- Average file size: the larger the files, the fewer files a batch should contain.
- Total file count: a batch should not contain too many files (say, over 5,000).
- Object type: some object types require all the files for an object to be in the same batch (for example, a PDS object).
- Object size: an object may include several types of files, which increases the size of the object.
- DRS guidelines, policies, and practical software limits: LTS has a general guideline on how many GB of data a depositing unit can deposit each day.
- Convenience: how easily repositories and Imaging Services can prepare and track the "unit".
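As a rough illustration, the two numeric thresholds above (50 GB and 5,000 files) can be checked before a batch is delivered. The sketch below is not an Imaging Services tool; the names `BatchSummary` and `summarize_batch` are hypothetical, and it assumes that 1 GB means 1024^3 bytes and that every file under the batch directory belongs to the batch.

```python
from dataclasses import dataclass
from pathlib import Path

# Thresholds taken from the factors listed above; treating 1 GB as 1024**3 bytes is an assumption.
MAX_BATCH_BYTES = 50 * 1024**3
MAX_BATCH_FILES = 5_000


@dataclass
class BatchSummary:
    """File count and total size of one candidate batch (hypothetical helper)."""
    file_count: int
    total_bytes: int

    @property
    def average_bytes(self) -> float:
        # Average file size; 0.0 for an empty batch to avoid dividing by zero.
        return self.total_bytes / self.file_count if self.file_count else 0.0

    def warnings(self) -> list[str]:
        # Collect a message for each threshold the batch exceeds.
        found = []
        if self.total_bytes > MAX_BATCH_BYTES:
            found.append("batch is larger than 50 GB")
        if self.file_count > MAX_BATCH_FILES:
            found.append("batch has more than 5,000 files")
        return found


def summarize_batch(batch_dir: Path) -> BatchSummary:
    """Count every regular file under batch_dir, including subdirectories."""
    files = [p for p in batch_dir.rglob("*") if p.is_file()]
    return BatchSummary(len(files), sum(p.stat().st_size for p in files))
```

A call such as `summarize_batch(Path("box_012")).warnings()` (the directory name is made up) returns an empty list when neither threshold is exceeded. The other factors, such as object type and daily deposit guidelines, still need to be discussed with Imaging Services.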
Usually, we deposit the batch as delivered by its repository. Here are some examples:
- Still image batches averaging 1,000 color TIFF images.
- Still image batches averaging 500 color TIFF images.
- Still image batches averaging 20 color TIFF images, each batch corresponding to a building design.
- Still image batches averaging 200 color JPEG2000 images, each batch produced from a box of slides.
- A PDS batch that contains 1,000 color images of a 1,000-page book.
- A PDS batch that contains 500 color images of 50 small pamphlets.
- A PDF document batch which contains about 10 GB of PDF files.
- A PDF document batch which contains about 12 PDF files.
Metadata
We provide some metadata support. Please tell us your needs during the initial project discussion.
- Structural metadata for PDS objects.
- Descriptive metadata before or after depositing.
Fee
$0 - $50 per batch. Please follow the links in the workflows section to see the details.
File Format and File Organization
Please refer to this page for file format and file organization recommendations.
Imaging Services Contact
Request to use this service:
...
Questions about using this service:
- Bill Comstock (comstock@fas.harvard.edu)
- Mingtao Zhao (mzhao@fas.harvard.edu)
Batch deposit:
- Mingtao Zhao (mzhao@fas.harvard.edu)
- Hilary Kline (kline@fas.harvard.edu)
Billing and Others:
Imaging Service Office,
G81 Widener Library,
Imaging@fas.harvard.edu
617-495-3995
Examples:
H. C. Fung Library: Alfred G. Meyer letters from the USSR, 1958
48 letters (133 typewritten pages) acquired by Fung Library. (see details in HOLLIS)
Original files and delivery:
- A PDF file for each letter
- Named with a date, with a day-part suffix if there is more than one letter for the day (e.g., 8-4-58.pdf, 8-7-58_1 (morning).pdf, 8-7-58_2 (afternoon).pdf)
- The PDF files contain G4 compressed bitonal TIFF images
- A donor agreement text file saved from an email from the donor.
- Files copied to a shared space on Imaging Services file server.
What to deposit:
- A PDS object containing all the letters with searchable OCR files (plain text and ALTO XML)
- Each letter will be a section in the PDS object
Repository Processing as advised by Imaging Services:
- An ALMA record was created
- PDF files were renamed to [ALMA_ID]_yyyymmdd#.pdf (e.g., 8-7-58_2 (afternoon).pdf => 99155187868203941_195808072.pdf)
- A structural mapping spreadsheet was created by the repository
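The renaming rule above could be scripted along the following lines. This is only a sketch, not the procedure the repository actually used: the function name `deposit_name` is hypothetical, two-digit years are assumed to fall in the 1900s, and the trailing sequence digit is assumed to be omitted when a date has only one letter (the source shows only an example with a sequence number).

```python
import re

# Matches original names such as "8-4-58.pdf" or "8-7-58_2 (afternoon).pdf":
# month-day-year, an optional "_<sequence>", and an optional "(day-part)" note.
_ORIGINAL_NAME = re.compile(
    r"^(\d{1,2})-(\d{1,2})-(\d{2})(?:_(\d+))?(?:\s*\(.*\))?\.pdf$",
    re.IGNORECASE,
)


def deposit_name(original: str, alma_id: str, century: int = 1900) -> str:
    """Build "[ALMA_ID]_yyyymmdd#.pdf" from a date-based original filename.

    Assumptions: the two-digit year belongs to `century`, and the sequence
    digit is left out when the original name has none.
    """
    match = _ORIGINAL_NAME.match(original)
    if match is None:
        raise ValueError(f"unrecognized filename: {original!r}")
    month, day, year, sequence = match.groups()
    date = f"{century + int(year):04d}{int(month):02d}{int(day):02d}"
    return f"{alma_id}_{date}{sequence or ''}.pdf"
```

With the ALMA ID from the example, `deposit_name("8-7-58_2 (afternoon).pdf", "99155187868203941")` produces `99155187868203941_195808072.pdf`, matching the mapping shown in the list.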
Imaging Services Processing and Depositing:
- The donor agreement was saved as a PDF object, and deposited.
- The bitonal TIFF files were exported from each PDF file to a folder named from the PDF filename.
- OCR files were generated from these TIFF files.
- A batch was prepared and deposited.
Linking:
- A NET holding was added in ALMA by the repository
Fee: $50/batch
- project size was small
- image conversion/processing was needed
- filename and directory structure changes were required
- OCR files were desired
Deliverable Examples:
PDS object in Harvard Viewer
Alfred G. Meyer letters from the USSR, 1958. Davis Center for Russian and Eurasian Studies Collection, H.C. Fung Library, Harvard University.
Multi-Repository: Scanning Key Content
See the project wiki page (esp. the "Deposit" section)
Fee: $50/batch during the Harvard Library Lab project phase; $0/batch after it was established as a Library-wide project.
Multi-Repository: Deposit color slide images produced by on-site vendor Picturae
Original files and delivery:
- JPEG2000 files created by Picturae saved in specific directories (usually on box level)
- One MD5 checksum file for image files in each folder
- Files were made accessible to Imaging Services on a file server managed by Picturae
What to deposit:
- Still Image objects with JPEG2000 files
Processing and Depositing:
- Each directory was treated as a deposit batch.
- Batches were moved from the Picturae file server to the Imaging Services file server.
- Checksums were run to verify file integrity.
- DRS batches were generated and uploaded.
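The checksum step can be illustrated with a short script that compares each file against the MD5 checksum file delivered in its folder. The layout of Picturae's checksum files is not described here, so this sketch assumes the common `md5sum` layout (one `<hex digest>  <filename>` pair per line, filenames relative to the folder); `verify_manifest` is a hypothetical helper, not the tool Imaging Services runs.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a file's MD5 digest in chunks so large images are not loaded whole."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> list[str]:
    """Return the names of files that are missing or whose MD5 does not match.

    Assumes md5sum-style lines: "<hex digest>  <filename>".
    """
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # md5sum marks binary mode with "*"
        target = manifest.parent / name
        if not target.is_file() or md5_of(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty result means every listed file arrived intact; any returned name would be re-copied and checked again before the batch is generated.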
Linking:
- The majority of slides had been cataloged in JStorForum before scanning, and files were named using the SharedShelf ID.
- Some repositories asked Imaging Services to create records in JStorForum using AVES after depositing.
Fee: $10/batch
- A large-scale project
- Supports the Library's slide scanning
- Simple image-taking process
- No image processing involved
- Consistent batch size
- No cataloging or manual linking involved
Deliverable Examples:
- The Harvard Theatre Collection at Houghton Library: Bernie Gardella Boston Ballet slide collection
- Dumbarton Oaks Research Library: A still image object
- Schlesinger Library: A still image object
Middle Eastern Division, Widener Library: Women's Worlds in Qajar Iran
...