Data Sets on the HART roadmap
Google Library Inventory - In production August 2023
Self-service reporting on the Google Library Inventory database (GLIB). GLIB stores data from Google scanning projects dating from 2004 to the present. GLIB tables contain:
- Item and bibliographic metadata for inventory of volumes sent to Google for scanning
- URNs that link to full text for scanned materials
- Transaction records that track items through the Google scanning workflow including when an item was scanned or not scanned
- Detailed information about the quality of the scan
- Information about physical conditions of the item that affect the scan, including conditions that prevent scanning
Status: GLIB implementation is part of a larger project outlined in the Charter for Google and HathiTrust Discovery and Data Maintenance. Reporting implementation is currently focused on updates to the LRS Alma holdings load script, which will populate a new table for Google scanned barcodes. The specification for scanned_items is written and the loader changes are in development. In addition, specifications are needed for:
- LRW data load from downloaded HathiTrust spreadsheets listing Harvard's digitized works that are available in full text in HathiTrust
- Table view of Google items scanned successfully. This view will join to the scanned items table and to the Hathi full text items table
- Table view of Google items rejected for scanning. This view will join to the Alma item table
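The two table views above could be prototyped before the specifications are finalized. The following is a minimal sketch only, using Python's built-in sqlite3 module with an in-memory database. Apart from scanned_items, which is named in the status note, every table, column, view name and status value here (glib_items, hathi_fulltext_items, alma_items, scan_status, hathi_id, 'SCANNED', 'REJECTED') is a hypothetical placeholder, and joining on barcode is an assumption rather than the agreed design.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with placeholder tables and the two views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical schema: only scanned_items is named in the roadmap.
        CREATE TABLE glib_items (barcode TEXT PRIMARY KEY, scan_status TEXT);
        CREATE TABLE scanned_items (barcode TEXT PRIMARY KEY);
        CREATE TABLE hathi_fulltext_items (barcode TEXT PRIMARY KEY, hathi_id TEXT);
        CREATE TABLE alma_items (barcode TEXT PRIMARY KEY, title TEXT);

        -- View 1: items scanned successfully, joined to scanned_items and,
        -- where a match exists, to the Hathi full text items table.
        CREATE VIEW google_scanned_v AS
        SELECT g.barcode, h.hathi_id
        FROM glib_items g
        JOIN scanned_items s ON s.barcode = g.barcode
        LEFT JOIN hathi_fulltext_items h ON h.barcode = g.barcode
        WHERE g.scan_status = 'SCANNED';

        -- View 2: items rejected for scanning, joined to the Alma item table.
        CREATE VIEW google_rejected_v AS
        SELECT g.barcode, a.title
        FROM glib_items g
        JOIN alma_items a ON a.barcode = g.barcode
        WHERE g.scan_status = 'REJECTED';
    """)
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO glib_items VALUES (?, ?)",
        [("B1", "SCANNED"), ("B2", "SCANNED"), ("B3", "REJECTED")],
    )
    conn.executemany("INSERT INTO scanned_items VALUES (?)", [("B1",), ("B2",)])
    conn.execute("INSERT INTO hathi_fulltext_items VALUES ('B1', 'hvd.B1')")
    conn.executemany(
        "INSERT INTO alma_items VALUES (?, ?)",
        [("B1", "Title one"), ("B2", "Title two"), ("B3", "Title three")],
    )
    conn.commit()
    return conn


def scanned_rows(conn):
    """Return (barcode, hathi_id) rows from the scanned-items view."""
    return conn.execute(
        "SELECT barcode, hathi_id FROM google_scanned_v ORDER BY barcode"
    ).fetchall()


def rejected_rows(conn):
    """Return (barcode, title) rows from the rejected-items view."""
    return conn.execute(
        "SELECT barcode, title FROM google_rejected_v ORDER BY barcode"
    ).fetchall()
```

In this sketch the Hathi join is a LEFT JOIN, so a scanned item with no Hathi full text record still appears in the view with an empty hathi_id. Whether such items should be included is a question for the view specification.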
Project charter for HART: GLIB - Google Library Inventory project
Publish records for electronic resources, such as ebooks and ejournals, from Alma and ingest them into the Library Reporting Data Warehouse. Implement the HART data model and work with staff who manage eresources to gather requirements and define the information needs that the data set will meet.
...
In Process
Cataloging Statistics dashboard
Status: Data load QA underway. Full load expected in December.
Detailed information regarding accessioned and cataloged materials by cataloger, unit, project and date.
- DRAFT: Houghton library cataloging statistics requirements and usage documentation
- ITS use of 920 documentation
Aeon and ArchivesSpace data
Status: Assessing feasibility of LRW ingest. Meeting with stakeholders. MySQL database.
Data Sets under consideration for HART
- ReCAP Partner bibliographic metadata
Periodically load full bibliographic data from ReCAP partner libraries to support collection analysis.
DRS Billing
Extract and load data from DRS to enable billing by library, object and storage tier. Create invoices and deliver to library billing contacts on a schedule.
Harvard Digital Objects Usage
Extract and load data to enable analyses that measure use of digital objects.
- ACORN
Weissman Preservation Center's primary application for documenting preservation treatments.
Status: Requested access to GitHub to review technical documentation for ACORN.
Cancelled data sets
Gate-Swipe data - Cancelled
This will be implemented in a repository for administrative data.
Formerly known as "Turnstiles", this data set is sourced from the gates at the entrances to the Widener, Cabot, Fine Arts, and Lamont libraries. The data set includes the datetime of entry, the location of the gate, and some anonymized data on the individual entering the library, such as affiliation, role, school, and year of matriculation. A project template for Gates has been started.
Status: Data privacy considerations involving personally identifying information are currently under discussion.
...
Data Sets under consideration for HART
- ACORN
Weissman Preservation Center's primary application for documenting preservation treatments.
Aeon and ArchivesSpace data
A request from Houghton to enable reporting on these data sets.
ReCAP Partner bibliographic metadata
Periodically load full bibliographic data from ReCAP partner libraries to support collection analysis.
DRS Billing
Extract and load data from DRS to enable billing by library, object and storage tier. Create invoices and deliver to library billing contacts on a schedule.
Harvard Digital Objects Usage
...