...

Problem and Value Statements

Problem Statement

Since its founding, Harvard Library has been a guardian of the University’s memory and a gateway to the world's knowledge. We currently host an array of discovery systems that use different design approaches, organizational priorities, and technology standards. Scholars and the public expect to be able to find trustworthy information and discover resources easily regardless of the system that is managing and providing access to it.

Solution Business Value

By enabling rich cross-collection search, this project will offer end users intuitive, contextual discovery of special collections, archives and digital collections, through a mix of conversational interfaces, browsing that emphasizes the visual nature of materials when appropriate, and recommendations for similar or related resources, all informed by ongoing user research.

Alignment with Harvard Library Multi-Year Goals and Objectives

...

| Sprints | Outcome | Responsible Parties |
| --- | --- | --- |
| Sprint 1 | Gained a foundational understanding of the back end and established collaboration practices with each other and with other HUIT and LTS colleagues. Demo was not recorded. | Technical Project Team |
| Sprint 2 | Investigated front end frameworks and decided on React, diagrammed a draft front end architecture, and "made real" step 3 (semantic retrieval) to help begin the front end work. See recording of demo here. | Technical Project Team |
| Sprint 3 | Initialize front end development (big win: working with FastAPI for semantic retrieval), finish deploying semantic retrieval, experiment with one LLM generative feature, and finish indexing the Finding Aids. See recording of demo here. | Technical Project Team |
| Sprint 4 | Continue work on the front end, making it deployable on dev, and finish work on back end generative AI features. Plan for usability testing. See recorded demo here. | Technical Project Team |
| Sprint 5 | Fix the data issues with Finding Aids, add a new set to the index, and investigate adding CURIOSity items to the index. Finalize front end work and create end-to-end testing. By end of sprint, estimate when usability testing can begin. See a recording of the demo here. | Technical Project Team |
| Sprint 6 | Finalize adding Finding Aids to the index and design how to display hierarchical items from Finding Aids and CURIOSity. Add CURIOSity items to the index and add document type to the eyebrow on the front end. Confirm LLM selection and system prompts through team testing. Finalize the front end in preparation for release to QA in Sprint 7. See a recording of the demo here. | Technical Project Team |
| Sprint 7 | Finalize and release Collections Explorer alpha to QA so that usability testing can begin in Sprint 8. See a recording of demo here. | Technical Project Team |
| Sprint 8 | Onboard technical lead, demo confidence score investigation on front end, begin technical approach discussions and research for data pipeline, and conduct usability tests of QA. See recording of demo here. | Technical Project Team |
| Sprint 9 | Solidify designs for the data pipeline and decide on a vector database and scaling considerations based on estimates of metadata records and full text. Begin work on front end components for re-use. Usability analysis will be completed for design changes to "production" Collections Explorer. See recording of demo here. | Technical Project Team |
| Sprint 10 | Set up Airflow locally and deploy code; develop a baseline for testing relevancy in Q3; continue to work on front end components and remediate accessibility. See recording of demo here. | Technical Project Team |
| Sprint 11 | Redesign the "Results" page for Collections Explorer based on usability results. Start migration to Next.js for the front end. Evaluate the two narrowed-down choices for vector database and demo creation of an embedding document and retrieval in one of the vendor products. Deploy Airflow to our development environment. See recording of demo here. | Technical Project Team |

Definition of Done

...

| Team Member | Title | Project Role(s) |
| --- | --- | --- |
| Katie Amaral | Technical Project Lead | Developer, Architecture (LTS) |
| Enrique Diaz | Manager of Library Software Engineering | Product Owner (LTS) |
| Doug Simon | Senior Digital Library Software Engineer | Developer (LTS) |
| JJ Chen | Digital Library Data Engineer | Developer (LTS) |
| Maura Meagher | Associate UX Developer | Developer (LTS) |
| Carolyn Caizzi | Senior IT Project Manager | Project Manager/Scrum Lead (LTS) |
| Meg McMahon | UX Researcher | UX Researcher/Designer (HL) |

Estimated Schedule

Note: The project is managed using the Scrum framework, so these phases and milestones will be adjusted. Below is a tentative schedule.

...


...

high level schedule. See more detailed view of project tasks here.


| Phase | Phase Start | Phase End | Completion Milestone |
| --- | --- | --- | --- |
| 1 | July 2024 | September 2024 | Natural language discovery platform with generative AI features for discovering digitized, special and archival collections is built and released to QA for testing. |
| 2 | October 2024 | December 2024 | Platform is tested by end users and improvements are recommended. Research into scaling the platform for production is completed. Data pipeline is scoped and work begins. Design process for digitized collections (images) component is completed. |
| 3 | January 2025 | March 2025 | Data pipeline and digitized collections components begin to be built. Decision to soft launch the discovery platform is made depending on the data pipeline. |
| 4 | April 2025 | June 2025 | Continue building data pipeline and digitized collections components. Platform is monitored for costs, and analytics are gathered and reviewed to plan for full launch in September 2025. |
| 5-12 | | | Years 2-3 will build out full text search integration, more types of digital collection discovery and access, and continuous improvement of the platform. Investigation into, and possible rollout of, workflows for using AI to improve the quality of metadata. |

...