Google Scanning Project (FY 19-20)
- Kara Young
- Corinna Baksik
- Emily Kelly
Owned by Kara Young
Overview
Harvard Library is sending a total of 50,000 items to Google for scanning in FY2019, in two major phases of 25,000 items each.
Update: The project has been extended for a limited period, with new batches processed for Google on a roughly weekly basis.
Workflow Summary
- Items are pulled and permanently withdrawn from Harvard Depository (HD) and shipped to Google Ann Arbor for scanning. Iron Mountain is managing pulling and shipping of materials.
- LTS uploads metadata files to Google. Note: Google requires receipt of metadata prior to receiving physical materials for scanning.
- After scanning, items are shipped to and reaccessioned at ReCAP.
Project Timeline
Phase 1 (November-December 2018)
25,000 items over three shipments. Iron Mountain is managing the pulling and shipping of materials from the Harvard Depository to Google.
- Delivery 1: November 12, 2018
- Delivery 2: December 3, 2018
- Delivery 3: December 21, 2018
Phase 2 (January-December 2019)
25,000 additional items, initially projected for January-March 2019 and later extended with additional shipments through December 2019.
- Delivery 4: January 22, 2019
- Delivery 5: January 29, 2019
- Delivery 6: February 13, 2019
- Delivery 7: February 22, 2019
- Delivery 8: February 26, 2019
- Delivery 9: March 1, 2019
- Delivery 10: March 19, 2019
- Delivery 11: March 20, 2019
- Delivery 12: April 9, 2019
- Delivery 13: April 18, 2019
- Delivery 14: April 24, 2019
- Delivery 15: May 2, 2019
- Delivery 16: May 16, 2019
- Delivery 17: June 3, 2019
- Delivery 18: June 11, 2019
- Delivery 19: June 18, 2019
- Delivery 20: June 26, 2019
- Delivery 21: July 3, 2019
- Delivery 22: July 22, 2019
- Delivery 23: August 15, 2019
- Delivery 24: August 22, 2019
- Delivery 25: September 6, 2019
- Delivery 26: September 18, 2019
- Delivery 27: September 23, 2019
- Delivery 28: October 1, 2019
- Delivery 29: October 23, 2019
- Delivery 30: October 29, 2019
- Delivery 31: October 30, 2019
- Delivery 32: November 4, 2019
- Delivery 33: November 12, 2019
- Delivery 34: November 21, 2019
- Delivery 35 (final batch): December 5, 2019
Detailed Workflow
Notes:
- The deaccession of items from HD and the shipment to Google are not simultaneous. Items are deaccessioned weekly from HD by Iron Mountain and then batched together into larger shipments.
- Google requires item metadata in advance of the items' arrival for scanning.
- Items will subsequently be shipped to ReCAP for accession and final storage following the Google processing.
Who is responsible? (color coded)
- Harvard Depository staff
- Iron Mountain
- LTS staff
- Google Ann Arbor staff
- ReCAP staff
Procedures
- Iron Mountain pulls items from HD weekly and permanently withdraws them from HD.
- HD (Pat O'Brien) places weekly item manifest files in /google directory on sherlock. Example: Example_weekly_HD_bookcarts210_214.txt
- Iron Mountain batches items from weekly HD pulls into three shipments for Google according to the schedule indicated.
- LTS processes item manifest files as follows:
- Combine weekly item manifest files into a single batch manifest and confirm/correct file to ensure that it contains only barcodes, one per line, no extra text or spaces. Example: Example_batch_HD_bookcarts_combined_200_242_barcodeonly.txt
- Extract metadata per Google spec and send to Google:
- Create a single-column Excel file: header row "barcode" followed by the list of barcodes from item manifest file. Example: Example_Excel_for_Alma_set_input.xlsx
- In Alma, create a public itemized set of Physical Items using the Excel list above: Alma > Admin > Manage Sets > Add Set+ > Itemized > From File. Example set in Alma: Google metadata - bookcarts 200-242
- Compare number of items in the Alma set with the number of barcode lines in input file to make sure they match.
- In Alma, open the publishing profile "Google@item level" for editing: Alma > Resources > Publishing > Publishing profiles, then select the ellipsis button next to "Google@item level" profile and select "Edit".
- Update the "Google@item level" profile parameters as follows:
- Set name: Select the itemized set you just created.
- File name prefix: Update with current batch information, following the same convention, e.g. "harvard_bookcart_200-242".
- Save the profile.
- Run "Google@item level" profile by selecting "Run" from the profile's ellipsis button. Output files are deposited as *.tar.gz format on almadrop in /dropbox/alma/alma/google.
- SFTP to almadrop as the alma user to download the metadata output files to the local machine, then upload them to the shared Google Drive. Google ingests the files automatically; no notification is needed.
Work with Allison P. to contact bbunnell@google.com for access to the Google Drive. Google contact for metadata is Kurt Groetsch (kgroetsch@google.com).
- Update Alma permanent physical location to RD and set item status to Transit to RD:
- Run Alma Analytics physical items report on batch barcodes from item manifest:
- Open Alma Analytics report "RES-Physical items details with core holdings, bib data - Flexible": Alma > Analytics > Reports > RES-Physical items...
- Paste barcodes from item manifest file into "Barcodes" select window and click OK to run report.
- Compare the number of report rows to the number of barcodes in the manifest to confirm all items are represented.
- Export report as Excel, and post Excel report to this page (see below).
- Create itemized Alma sets for each physical location:
- Method 1 – Filter barcodes by location and add a new itemized set to Alma via barcode upload:
- In Excel, filter RES-Physical items report by physical location (column P) and copy barcodes from that location into a new single-column Excel file, with header row "barcode".
- In Alma, create a public itemized set of Physical Items using the Excel barcode file: Alma > Admin > Manage Sets > Add Set+ > Itemized > From File.
- Method 2 (if someone else is already running an "Add items to set" job in Alma) – filter the existing Alma itemized set by location using Alma combined sets:
- In Alma, create a logical set of physical items with that permanent physical location: Physical Items > Holdings: Permanent physical location = (library > location).
- Click the ellipsis next to this set and select "Combine sets". Combination parameters:
- Set 1 ("Combine"): logical item set with permanent physical location – preselected
- Operation: And
- Set 2 ("With"): The itemized set of all batch barcodes that you used to extract metadata for Google – Example: Google metadata - bookcarts 200-242.
- Hit Submit and confirm.
- Repeat for each physical location found in RES-Physical Items report.
- Run Alma job "Change Physical Items" on each location set: Alma > Admin > Run a Job > (Select set) > Parameters:
- Change type: Permanent
- New library: Same as current owning library (no change)
- New location: Corresponding RD collection for new/updated holdings
- Place items into transit awaiting post-processing ReCAP accession:
- Log onto almadrop as ltsadmin user and run remotetransit.py script on barcode list. (See /wiki/spaces/LibraryTechServices/pages/59098446.) Items will appear to have been put into transit from the owning library's main circ desk (e.g. WID_CIRC). Any existing requests will be queued.
- Iron Mountain physically ships batched materials to Google.
- Google retrieves metadata from shared Google Drive and scans batched materials.
- Iron Mountain physically ships batched materials to ReCAP for accession there.
- After batched materials are accessioned by ReCAP, the regularly scheduled End Of Day (EOD) process will return items to Item In Place status. Any waiting requests will be sent to RD for transfer.
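The manifest clean-up step in the LTS procedure (combining weekly item manifest files into a single barcode-only batch manifest) could be scripted along the following lines. This is only a minimal sketch, not the script LTS actually uses: the function names, the barcode pattern (letters and digits only, no spaces), and the decision to drop duplicate barcodes are all assumptions made for illustration, and the barcodes in the usage note are made up.

```python
import re
from pathlib import Path

# Assumption: a valid barcode is a run of letters and digits with no spaces.
# Adjust the pattern if real barcodes follow a stricter format.
BARCODE_RE = re.compile(r"^[0-9A-Za-z]+$")


def clean_manifest_lines(lines):
    """Return (barcodes, rejected) from raw manifest lines.

    Surrounding whitespace is stripped, blank lines are skipped, lines that
    do not look like a barcode are collected in `rejected` for manual review,
    and duplicate barcodes are dropped while keeping first-seen order.
    """
    barcodes, rejected, seen = [], [], set()
    for raw in lines:
        value = raw.strip()
        if not value:
            continue
        if not BARCODE_RE.match(value):
            rejected.append(value)
            continue
        if value not in seen:
            seen.add(value)
            barcodes.append(value)
    return barcodes, rejected


def combine_manifests(weekly_paths, batch_path):
    """Merge weekly manifest files into one barcode-only batch file."""
    lines = []
    for path in weekly_paths:
        lines.extend(Path(path).read_text(encoding="utf-8").splitlines())
    barcodes, rejected = clean_manifest_lines(lines)
    # One barcode per line, no extra text or spaces.
    Path(batch_path).write_text("\n".join(barcodes) + "\n", encoding="utf-8")
    return barcodes, rejected
```

For example, `clean_manifest_lines(["32044000000001", "cart 210", " 32044000000002 "])` keeps the two barcode-like values and reports `cart 210` as rejected. The length of the returned barcode list is the figure to compare against the item count of the Alma set and the row count of the Analytics report.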
Google Scanning Batch Reports