In 2009 (after the completion of the pilot project to collect women’s blogs) we began collecting web sites created by individuals and organizations whose papers/records are at the library.
While completing the finding aid, the processor should include a file unit description of the web site, even if it has not yet been harvested.
Ideally, by the time the processor has completed the finding aid, a first harvest will have been successful and the web content will have been added to the Schlesinger Library Sites web archiving collection. The Digital Librarian/Archivist will supply Paula Aloisio with a URL for the collection’s archived content, and Paula will create a URN and add the hot link(s) to the finding aid. If the harvested site is not available when the finding aid is complete, the finding aid will still be posted and Paula will add the URN/link later. The processor should, however, include all the information as if the web content were available.
Workflow for Archived Web sites:
- Oftentimes the donor agreement will list the donor's web site for archiving. If so, the site may already be scheduled for archiving. If a web site is not listed for archiving in the deed of gift or purchase agreement and the site is not mentioned in the correspondence file, the processor searches for a web site or web sites for the person or organization.
- The processor checks to see if the web site is already being captured in Archive-It's Schlesinger Library Sites collection by searching for the web site's URL at the collection level. You can use this direct link to the collection: https://archive-it.org/collections/8237. Searching at a higher level in Archive-It will make it more difficult to find your web site, since that search covers all our collections (Schlesinger Library Sites; #Metoo; Capturing Women's Voices, etc.). Searching by URL isn't always foolproof, though. See the "About Archive-It" section below for more information on how to navigate and search Archive-It. An additional clue that at least one web site is already being captured is a NET holding attached to your HOLLIS record indicating an archived web site.
- If the processor finds a web site created by the individual or organization that is not currently being archived by Schlesinger, they will appraise the site for archiving. Appraisal includes assessing the content of the site. Is it worth collecting? Is it currently being maintained by the creator? Is there specific content that we should definitely make sure to archive, such as embedded video or PDFs of newsletters? If the processor determines that a site or sites are indeed worth archiving, they should either contact the curator about requesting permission to archive the site or contact the donor directly for permission.
- When a new site is ready to be archived, the processor sends the web site URL to Laura Peimer to begin harvesting of the site. At this point, the processor should tell Laura whether the site needs a one-time capture or if it should be harvested annually or bi-annually. The schedule default is annual but if the organization doesn’t exist anymore, if the individual has died, or if it's a site that is not actively being updated, we might want to schedule a one-time capture. Because archiving web sites through Archive-It is not always straightforward, especially regarding multi-media content, it is helpful to tell Laura if there is certain content to focus on capturing, such as video, certain images, etc.
- In 2018 a pilot project was successfully completed linking digitized video content from the archived web site of the Women's Encampment to matching Vt-# entries in the Women's Encampment collection’s finding aid (https://hollisarchives.lib.harvard.edu/repositories/8/resources/8359). By matching already digitized content from an archived web site to the finding aid, where the corresponding analog content is listed, we are able to provide historical footage more readily and without the additional cost of reformatting on our end. When surveying, processors should consider identifying any possible 1-to-1 matches between digitized audiovisual content on the organization's or individual's archived web site and physical original tapes in the collection.
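The scheduling guidance in the workflow above (annual by default, with a one-time capture considered when the organization no longer exists, the individual has died, or the site is not actively updated) can be sketched as a small decision helper. This is only an illustration of the rule as described; the class name, field names, and function name are hypothetical, and the result is a suggestion to pass along to Laura, not a final decision.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Appraisal facts about a site (hypothetical field names)."""
    organization_defunct: bool = False
    creator_deceased: bool = False
    actively_updated: bool = True


def suggest_capture_schedule(status: SiteStatus) -> str:
    """Suggest a capture schedule based on the workflow's rule of thumb.

    Returns "one-time" when any of the conditions named in the workflow
    applies; otherwise returns the default, "annual".
    """
    if (
        status.organization_defunct
        or status.creator_deceased
        or not status.actively_updated
    ):
        return "one-time"
    return "annual"


if __name__ == "__main__":
    # A site that is still maintained gets the default schedule.
    print(suggest_capture_schedule(SiteStatus()))
    # A site that is no longer updated is a candidate for a single capture.
    print(suggest_capture_schedule(SiteStatus(actively_updated=False)))
```

The helper deliberately returns only the two cases the workflow spells out; any other frequency would still be requested from Laura directly.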
Finding aid:
- If a web site exists and is being archived, the processor will add the following to the finding aid (see Sonia Fuentes for example: http://nrs.harvard.edu/urn-3:RAD.SCHL:sch01256):
- In the Extent tag include "# archived web site(s)"
- EXAMPLE: 12.93 linear feet (31 file boxes) plus 2 folio+ folders, 10 photograph folders, 3 audiotapes, 4 videotapes, 1 archived web site
- In the Scope and Content: “XX’s web site is being captured periodically as part of Schlesinger Library’s web archiving program.”
EXAMPLE: Also included is Griffin's web site, which is being captured periodically as part of Schlesinger Library's web archiving program.
See Papers of Susan Griffin finding aid.
- In file unit descriptions, use “E” as the container followed by a file unit number. New practice (as of January 2019) is to provide more information in the <unittitle>. This will include specifying that it is an "archived web site" and adding the actual URL of the site.
EXAMPLE: E.1. Susan Brownmiller’s archived web site: http://www.susanbrownmiller.com, 2010-ongoing. [hot link added by Paula]
- If there are multiple web sites, these should be grouped under one E#. This means that the sites will be searchable as a group in Archive-It (see the "About groups" section below for more information). In this case, the URLs should be listed within a scope and content tag for the folder entry. If you want to list your sites separately in the finding aid inventory, such as in different series, discuss this first with Laura or Paula.
GROUP EXAMPLE: E.1. Deanna Booher's archived web sites as Queen Kong and Queen Adrena, 2018-ongoing. [hot link added by Paula]
Scope and Content note: GLAMAZON QUEEN KONG (http://goddessqueenkong.blogspot.com/); Queen Adrena (http://queenadrena.com/); Queen Adrena (http://queenadrena.net/); Matilda the Hun from GLOW; Queen Kong (http://queenkong.com/).
- In the added entries include the <genreform>Web archives</genreform>. In the MARC record, include 655 _7 Web archives $$2 aat
- *Please note: we are no longer adding “electronic records” in the quantity or in the added entries when there are archived web sites. If other born digital material is present, use electronic records.
- The collection, series, and subseries dates should exclude the web site date(s), which should only appear in the item description (e.g., E.1. Web site, 2010-ongoing)
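As a rough illustration of the wording conventions above, the following sketch assembles the extent phrase and a single-site file unit title in the pattern shown in the Brownmiller example. The function names and parameters are hypothetical, the helper does not produce EAD markup, and it does not handle group entries or possessives for names ending in "s"; treat it as a checklist in code form, not as a library tool.

```python
from typing import Optional


def web_site_extent(count: int) -> str:
    """Return the phrase added to the extent, e.g. '1 archived web site'."""
    noun = "archived web site" if count == 1 else "archived web sites"
    return f"{count} {noun}"


def file_unit_title(
    number: int,
    creator: str,
    url: str,
    start_year: int,
    end_year: Optional[int] = None,
) -> str:
    """Build a single-site file unit title in the post-January 2019 pattern.

    The title names the entry as an archived web site and includes the
    actual URL. When no end year is given, the date reads 'YYYY-ongoing'.
    """
    end = str(end_year) if end_year is not None else "ongoing"
    return f"E.{number}. {creator}'s archived web site: {url}, {start_year}-{end}."


if __name__ == "__main__":
    print(web_site_extent(1))
    print(
        file_unit_title(
            1, "Susan Brownmiller", "http://www.susanbrownmiller.com", 2010
        )
    )
```

Note that the web site dates appear only in this item-level title; they are not passed anywhere that would affect collection, series, or subseries dates.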
Note about processing addenda: We include links to archived web sites in the finding aid and HOLLIS record for both the original collection and any addenda.
If you have a web site in your finding aid, please mention it to Paula when you give her the XML document for review. She will need to create a NET holding for the bib record. Processors do not need to do this themselves, but should let her know so she can plan to make the necessary record.
About Archive-It
The library uses the outside service Archive-It to harvest and provide access to archived web sites. We have multiple collections in Archive-It, including collections for the #metoo project, the Long 19th Amendment, and Capturing Women's Voices (our original pilot project to capture blogs). Web sites attached to manuscript collections are collected as part of Schlesinger Library Sites. As mentioned above, be sure to check whether a web site you want to harvest for your collection is already part of the Schlesinger Library Sites collection. In the public display of Archive-It, you can search at the specific collection level by selecting the collection first.
To find Schlesinger Library Sites, use the direct link or go to the home page of Archive-It. If you are on the Archive-It home page, you can enter "Schlesinger Library" in either the Explore Collections or the Explore Collecting Organizations search boxes. Select Schlesinger Library. At this point, you will see the following list of collections:
Scroll down to find Schlesinger Library Sites.
Again, you can avoid drilling down by using the direct link to Schlesinger Library Sites: https://archive-it.org/collections/8237
Click on the name of the collection. The page below is where you should enter your search. To see if your collection web site is already in Schlesinger Library Sites, you can search on a term in the current URL, the full URL of the web site (e.g., http://annpmeredith.com/), or the creator name, using first name, last name, or both (formatted as first name last name). Keep in mind that a web site's URL may have changed at some point, so we may not be capturing the current iteration. Also, when you search on a creator name, not all of the relevant archived sites may show up in your search results because of the limitations of the metadata. You can always check the group listing on the left side to see if there are multiple sites for a creator before you search on the creator's name. If you don't find your site, you may want to try using some of the filters on the left side of the page for groups or creators. Or you can use the title sort function just under the search box at the top of the list, but be aware that titles are listed in alphabetical order by first word (hence Ann Meredith shows up first). If you are unsuccessful in your search, you can let Laura know when you request that the site be harvested; she will double-check the production side of Archive-It before she adds the site for harvesting.
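Because a single search does not always surface a site, it can help to line up several search strings before going to the collection page. The sketch below derives the kinds of terms described above (the full URL, the host name, a distinctive term from the URL, and the creator name) from a site's address. It does not contact Archive-It; the function name is hypothetical, and the choice of which variants to try is an assumption based on the search advice in this section.

```python
from urllib.parse import urlsplit


def candidate_search_terms(url: str, creator_name: str = "") -> list[str]:
    """Return search strings to try in the collection search box, in order.

    Variants: the full URL, the host without a leading 'www.', the first
    label of the host (a distinctive term from the URL), and the creator
    name formatted as 'first name last name'.
    """
    host = urlsplit(url).netloc.lower()
    bare_host = host[4:] if host.startswith("www.") else host
    first_label = bare_host.split(".")[0]

    candidates = [url, bare_host, first_label, creator_name.strip()]

    # Drop empty strings and duplicates while keeping the original order.
    terms: list[str] = []
    for term in candidates:
        if term and term not in terms:
            terms.append(term)
    return terms


if __name__ == "__main__":
    for term in candidate_search_terms("http://annpmeredith.com/", "Ann Meredith"):
        print(term)
```

If none of these terms finds the site, the group and creator filters on the left side of the page remain the next thing to try.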
About groups:
We use the groups feature in Archive-It when harvesting multiple web sites for one person or organization. We do this for a couple of reasons. First, it is a good way for a researcher to find all the web sites associated with a person. In the home page for Schlesinger Library above, you can see that groups are listed along the left side of the page for the convenience of the researcher. Second, it allows the harvester to add separate sites, or "seeds," when they are needed to make the archived site functional. Additionally, we sometimes need to capture a YouTube channel or other social media separately, and this is the best way to associate it with the person or organization.
A processor can see if we are using groups by searching for one of the web sites associated with a person. The site will appear with other sites in the group. The following is an example of the group made for Deanna Booher:
As mentioned in the workflow above, there are several benefits to using the groups feature in Archive-It. Ideally, you would list the group one time in your finding aid under one E number. However, if it seems like the group feature will not work for the arrangement you have in your finding aid, please check with Laura and Paula to discuss options.