In 2009 (after the completion of the pilot project to collect women’s blogs) we began collecting web sites created by individuals and organizations whose papers/records are at the library.
While completing the finding aid, the processor should include a file unit description of the web site, even if it has not yet been harvested.
Ideally, by the time the processor has completed the finding aid, a first harvest will have been successful and the web content will have been added to the Schlesinger Library Sites web archiving collection. The Digital Librarian/Archivist will supply Paula Aloisio with a URL for the collection’s archived content, and Paula will create a URN and add the hot link(s) to the finding aid. If the harvested site is not available when the finding aid is complete, the finding aid will still be posted and Paula will add the URN/link later. The processor should, however, include all the information as if the web content were available.
Workflow for Archived Web sites:
The workflow will take this shape:
- The donor agreement will often list the donor's web site for archiving. If a web site is not listed for archiving in the deed of gift or purchase agreement and the site is not mentioned in the correspondence file, the processor searches for a web site for the person or organization. If the processor finds a web site, he or she will contact the curator about requesting permission to archive the site or will contact the donor directly for permission.
- If a web site is found and will be archived, the processor checks whether the site is already being captured in Archive-It's Schlesinger Library Sites collection by searching for the web site's URL at the collection level. You can use this direct link: https://archive-it.org/collections/8237. Searching at a higher level in Archive-It will make it more difficult to find your web site. Searching by URL isn't always foolproof, though. See below for more information on how to navigate and search Archive-It. An additional clue that at least one web site is being captured is that there may be a NET holding attached to your HOLLIS record indicating an archived web site.
- If the web site isn't in Archive-It, the processor sends the web site URL to Laura Peimer to begin harvesting the site. At this point, the processor should be able to tell Laura whether the site needs a one-time capture or should be harvested annually or bi-annually. The schedule default is annual, but if the organization doesn’t exist anymore or if the individual has died, we might want to schedule a one-time capture. Alternatively, if the site is very active, we can capture it bi-annually. The processor should also indicate if there's anything particularly important on the site that we should ensure the Archive-It crawler captures (e.g., video clips or certain images).
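The scheduling guidance in the last step can be sketched as a small decision rule. This is only an illustration of the defaults described above, not a tool the library uses; the function name and its two parameters are hypothetical, and the final choice remains a judgment call between the processor and Laura.

```python
def suggest_capture_schedule(creator_active=True, site_very_active=False):
    """Suggest a harvest schedule from the guidance above.

    Both parameters are hypothetical labels for facts the processor
    already knows about the person or organization and the site.
    """
    # A defunct organization or deceased individual usually means the
    # site will not change, so a single capture may be enough.
    if not creator_active:
        return "one-time"
    # A very active site may warrant the more frequent schedule.
    if site_very_active:
        return "bi-annual"
    # Annual is the stated default.
    return "annual"


print(suggest_capture_schedule())
print(suggest_capture_schedule(creator_active=False))
print(suggest_capture_schedule(site_very_active=True))
```

The order of the checks is an assumption: the text does not say which rule wins when a site is still being updated after its creator has died, so that case should be discussed rather than decided by rule.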
In 2018 a pilot project was successfully completed linking digitized video content from the archived web site of the Women's Encampment to matching Vt-# entries in the Women's Encampment collection’s finding aid: https://hollisarchives.lib.harvard.edu/repositories/8/resources/8359
By matching already digitized content from an archived web site to the finding aid, where the corresponding analog content is listed, we are able to provide historical footage more readily and without the additional cost of reformatting on our end. When surveying, processors should consider identifying any possible 1-to-1 matches between digitized audiovisual content on the organization's or individual's archived web site and any physical original tapes in the collection.
Finding aid:
- If a web site exists and is being archived, the processor will add the following to the finding aid (see Sonia Fuentes for example: http://nrs.harvard.edu/urn-3:RAD.SCHL:sch01256):
- In the Extent tag include "# archived web site(s)"
- EXAMPLE: 12.93 linear feet (31 file boxes) plus 2 folio+ folders, 10 photograph folders, 3 audiotapes, 4 videotapes, 1 archived web site
- In the Scope and Content: "XX's web site is being captured periodically as part of Schlesinger Library's web archiving program."
EXAMPLE: Also included is Griffin's web site, which is being captured periodically as part of Schlesinger Library's web archiving program.
See Papers of Susan Griffin finding aid.
- In file unit descriptions, use “E” as the container followed by a file unit number. New practice (as of January 2019) is to provide more information in the <unittitle>. This will include specifying that it is an "archived web site" and adding the actual URL of the site.
EXAMPLE: E.1. Susan Brownmiller’s archived web site: http://www.susanbrownmiller.com, 2010-ongoing. [hot link added by Paula]
- If there are multiple web sites, these should be grouped under one E#. This means that the sites will be searchable as a group in Archive-It. In this case, the URLs should be listed within a scope and content tag for the folder entry. If you want to list your sites separately in the finding aid inventory, such as in different series, discuss this first with Laura or Paula. See below for more information on groups.
GROUP EXAMPLE: E.1. Deanna Booher's archived web sites as Queen Kong and Queen Adrena, 2018-ongoing. [hot link added by Paula]
Scope and Content note: GLAMAZON QUEEN KONG (http://goddessqueenkong.blogspot.com/); Queen Adrena (http://queenadrena.com/); Queen Adrena (http://queenadrena.net/); Matilda the Hun from GLOW; Queen Kong (http://queenkong.com/).
- In the added entries include the <genreform>Web archives</genreform>. In the MARC record, include 655 _7 Web archives $$2 aat
- *Please note: we are no longer adding “electronic records” in the quantity or in the added entries when there are archived web sites. If other born digital material is present, use electronic records.
- The collection, series, and subseries dates should exclude the web site date(s), which should only appear in the item description (e.g., E.1. Web site, 2010-ongoing)
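As a quick consistency check, the wording conventions in the list above can be sketched in a few lines of Python. This only assembles the display text for the Extent phrase and a single-site E entry in the pattern of the Susan Brownmiller example; it does not produce EAD markup, and both function names are hypothetical.

```python
def extent_phrase(site_count):
    """Return the archived web site portion of the Extent statement."""
    noun = "archived web site" if site_count == 1 else "archived web sites"
    return f"{site_count} {noun}"


def file_unit_line(number, creator, url, start_year):
    """Return the display text for a single-site E entry.

    Assumes the site is still being captured ("-ongoing") and that the
    creator's name takes a plain 's possessive; adjust by hand otherwise.
    """
    return f"E.{number}. {creator}'s archived web site: {url}, {start_year}-ongoing."


print(extent_phrase(1))
print(file_unit_line(1, "Susan Brownmiller", "http://www.susanbrownmiller.com", 2010))
```

Grouped entries follow a different pattern (one E number, with the URLs listed in the scope and content note), so this sketch deliberately covers only the single-site case.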
Note about processing addenda: We include links to archived web sites in the finding aid and HOLLIS record for both the original collection and any addenda.
If you have a web site in your finding aid, please mention it to Paula when you give her the XML document for review. She will need to create a "NET holdings" record for the bib record. Processors don't need to do this themselves, but should let her know so she can plan to make the necessary record.
About Archive-It
The library uses the outside service Archive-It to harvest and provide access to archived web sites. We have multiple collections in Archive-It, including collections for the #metoo project, the Long 19th Amendment, and Capturing Women's Voices (our original pilot project to capture blogs). Web sites attached to manuscript collections are collected as part of Schlesinger Library Sites. As mentioned above, be sure to check whether a web site you want to harvest for your collection is already part of this specific collection. In the public display of Archive-It, you can search at the specific collection level by selecting the collection first.
To find Schlesinger Library Sites, use the direct link or go to the home page of Archive-It. There, you can enter "Schlesinger Library" in either the Explore Collections or the Explore Collecting Organizations search box. Select Schlesinger Library. You will then see a list of the library's collections; scroll down to find Schlesinger Library Sites.
Again, you can avoid drilling down by using the direct link to Schlesinger Library Sites: https://archive-it.org/collections/8237
Click on the name of the collection. The collection page is where you should enter your search. Searching in Archive-It can sometimes be confusing. The best search term to use in the search box is the URL of the web site (e.g., http://annpmeredith.com/). However, this is not foolproof: the URL may have changed at some point, and we may not be capturing the current URL. If you don't find your site, try using the filters on the left side of the page for groups or creators. You can also use the title sort function just under the search box at the top of the list, but be aware that titles are alphabetized by first word (hence Ann Meredith shows up first). If your search is unsuccessful, let Laura know when you request that the site be harvested; she will double-check the production side of Archive-It before she adds the site for harvesting.
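When a URL search comes up empty, it can help to try a few common spellings of the same address before turning to the filters. The following is a minimal sketch, assuming only that a site may have been entered with or without "www." and under either http or https; the function name is hypothetical, and nothing here queries Archive-It itself.

```python
from urllib.parse import urlsplit


def url_variants(url):
    """Return common spellings of a site URL to try, one at a time, in the search box."""
    url = url.strip()
    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    path = parts.path.rstrip("/")
    variants = []
    for scheme in ("http", "https"):
        for candidate_host in (bare, "www." + bare):
            variants.append(f"{scheme}://{candidate_host}{path}/")
    return variants


for variant in url_variants("http://annpmeredith.com/"):
    print(variant)
```

This covers only differences in scheme and the "www." prefix. A site that has moved to a different domain will not be found this way, which is where the group and creator filters, or a note to Laura, come in.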
About groups:
We use the groups feature in Archive-It when harvesting multiple web sites for one person or organization. We do this for a few reasons. First, it is a good way for a researcher to find all the web sites associated with a person; on the Schlesinger Library home page in Archive-It, groups are listed along the left side of the page for the convenience of the researcher. Second, it allows the harvester to add separate sites, or "seeds," when needed to make the archived site functional. Third, we sometimes need to capture a YouTube channel or other social media separately, and a group is the best way to associate it with the person or organization.
A processor can see if we are using groups by searching for one of the web sites associated with a person. The site will appear with other sites in the group. The group made for Deanna Booher, whose sites are listed in the GROUP EXAMPLE above, is one example.
As mentioned in the workflow above, there are several benefits to using the groups feature in Archive-It. Ideally, you would list the group one time in your finding aid under one E number. However, if it seems like the group feature will not work for the arrangement you have in your finding aid, please check with Laura and Paula to discuss options.