3. Submitting Content to EASi


This page describes how to upload email content to EAS for processing, including client-specific tips on identifying and extracting email from a donor's file system.

A curator submits email content to EAS by uploading a "packet" of content to an SFTP dropbox, then using the Submit Packet option in EASi to define metadata values for content in the packet and trigger the load.

The EASi menu option that controls content submission is called Submit Packet. A packet is a group of email messages and attachments that are submitted together. All content in a packet must come from the same creator client (email client), and that client must be supported by the EAS loader.

EASi is configured to import email content (messages and attachments) from these creator clients:

  • Eudora for Windows/6.2
  • Eudora for Windows/version unknown
  • Mac OS X Mail/2.x
  • Mailman/2.0.5
  • Mailman/2.1.15
  • Outlook for Mac (OLM only/version unknown)
  • Outlook for Windows/version unknown
  • Thunderbird/2.0.0.23
  • Thunderbird/version unknown

If you know the exact creator client and version for your content but these are not listed, please contact LTS Support. If you do not know the exact creator client version, select the “version unknown” option for that client.
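
The selection rule above can be sketched in Python. The labels are transcribed from the list; the function name `creator_client_label` and its "return None means contact LTS Support" convention are hypothetical, used only for illustration, and are not part of EASi.

```python
# Creator client labels, transcribed from the list above.
SUPPORTED_LABELS = [
    "Eudora for Windows/6.2",
    "Eudora for Windows/version unknown",
    "Mac OS X Mail/2.x",
    "Mailman/2.0.5",
    "Mailman/2.1.15",
    "Outlook for Mac (OLM only/version unknown)",
    "Outlook for Windows/version unknown",
    "Thunderbird/2.0.0.23",
    "Thunderbird/version unknown",
]


def creator_client_label(client, version=None):
    """Return the list entry to select, or None if no entry applies.

    Hypothetical helper: None stands for "contact LTS Support".
    """
    if version is not None:
        # Known version: only an exact match on the list is acceptable.
        exact = f"{client}/{version}"
        return exact if exact in SUPPORTED_LABELS else None
    # Unknown version: fall back to the client's "version unknown" entry.
    for label in SUPPORTED_LABELS:
        if label.startswith(client) and label.rstrip(")").endswith("version unknown"):
            return label
    return None
```

For example, a Thunderbird packet of unknown version maps to "Thunderbird/version unknown", while a known but unlisted version returns None.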

Step A: Extract email data files

Email to be archived may come to you (the curator) on a disk or drive, or you may need to copy it from the donor's computer. Either way, it helps to know what email data files to look for and where these are stored on the file system. This section provides tips on how to locate email on the donor's computer and what files to extract for archiving.

For most email clients, you navigate to the top of the email data folder and copy all files and folders nested beneath. In the case of Outlook for Mac, you or the donor will need to export messages from within the email client to get data in a format that can be imported to EAS. 

However the transfer takes place, be sure to follow local procedures for securing sensitive data.

For each client, the entries below list what to extract, followed by the default locations to check.

Eudora

Entire "Eudora" folder and its contents

POP mailboxes:
C:\Documents and Settings\[username]\Application Data\Qualcomm\Eudora
C:\Users\[username]\AppData\Roaming\Qualcomm\Eudora

IMAP mailboxes:
C:\Program Files\Qualcomm\Eudora\Imap\Dominant\
C:\Users\[username]\AppData\Roaming\Qualcomm\Eudora\Imap\Dominant

Mac Mail

Entire “Mail” folder and its contents

v.1: ~/Library/Mail/Mailboxes/inbox.mbox/mbox

v.2+: 
~/Library/Mail/Mailboxes/inbox.mbox/Messages/nnnn.emlx
~/Library/Mail/Mailboxes/IMAP-[email_account]/Mail/inbox.imapmbox/Messages/nnnn.emlx
~/Library/Mail/Mailboxes/POP-[email_account]/Mail/inbox.popmbox/Messages/nnnn.emlx

archived email:  ~/Downloads/archived-mail/inboxold.mbox/mbox

Outlook Mac

.olm export file(s)

To export:

  1. Open Outlook for Mac.

  2. Select File>Export (v2011) or Tools>Export (v2015).

  3. Choose export options.

  4. Choose a location to save the export file.

  5. Outside Outlook, navigate to the folder where you saved the .olm file and copy the file.

Outlook Windows

Entire Outlook data folder and its contents

In XP: c:\Documents and Settings\[your login name]\Local Settings\Application Data\Microsoft\Outlook

In Vista/Windows 7: c:\Users\[your login name]\AppData\Local\Microsoft\Outlook

Thunderbird

Entire mail server folder containing the INBOX mailbox file, and its sub-folders

In Windows:

c:\Documents and Settings\<user name>\Application Data\Thunderbird\Profiles\<Profile name>\<Mail | ImapMail>\<mail server name>\
c:\Users\<user name>\AppData\Roaming\Thunderbird\Profiles\<Profile name>\<Mail | ImapMail>\<mail server name>\

In Mac OS X (~ represents the user's home folder):
~/Library/Thunderbird/Profiles/<Profile name>/<Mail | ImapMail>/<mail server name>/
~/Library/Application Support/Thunderbird/Profiles/<Profile name>/<Mail | ImapMail>/<mail server name>/
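
When working from a copy of a donor's home folder, a short script can check which of the locations above actually exist. The following is a minimal sketch, not an EAS tool: the `CANDIDATES` table and the function `find_email_folders` are hypothetical names, and the paths are transcribed from the entries above, relative to the home folder. The Eudora IMAP location under C:\Program Files is outside the home folder and is not checked, and Outlook for Mac is omitted because it requires an export.

```python
from pathlib import Path

# Locations relative to a donor's home folder, transcribed from the entries above.
CANDIDATES = {
    "Eudora": [
        "Application Data/Qualcomm/Eudora",
        "AppData/Roaming/Qualcomm/Eudora",
    ],
    "Mac Mail": ["Library/Mail"],
    "Outlook Windows": [
        "Local Settings/Application Data/Microsoft/Outlook",
        "AppData/Local/Microsoft/Outlook",
    ],
    "Thunderbird": [
        "Application Data/Thunderbird/Profiles",
        "AppData/Roaming/Thunderbird/Profiles",
        "Library/Thunderbird/Profiles",
        "Library/Application Support/Thunderbird/Profiles",
    ],
}


def find_email_folders(home):
    """Return {client: [existing folders]} for candidate folders under `home`."""
    home = Path(home)
    found = {}
    for client, relative_paths in CANDIDATES.items():
        hits = [home / rel for rel in relative_paths if (home / rel).is_dir()]
        if hits:
            found[client] = hits
    return found
```

A call such as `find_email_folders("/Volumes/donor_disk/Users/jsmith")` (a made-up path) returns only the clients whose folders are present, which narrows down what to copy.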


Step B: Create a staging folder

  1. In a convenient and secure location on your local file system, create a staging folder. 
    • Folder name can contain letters, numbers, underscores and/or hyphens. Avoid spaces, diacritics, and other special characters. 
    • The staging folder name will be the default packet name, but you can override this by changing metadata on the EASi Submit Packet screen (in Step D).

  2. Copy the extracted email content (from Step A) into the staging folder. 

It is possible to submit multiple packets to EAS at one time. Each packet must have a separate staging folder with a unique name.
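
The naming rule and copy step can be sketched as follows. This is an illustrative sketch only: the function `create_staging_folder` is hypothetical, and it assumes the extracted content is a folder (a single .olm file would need a plain file copy instead).

```python
import re
import shutil
from pathlib import Path

# Letters, numbers, underscores and hyphens only (ASCII, so no diacritics or spaces).
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+")


def create_staging_folder(extracted, staging_root, name):
    """Copy the folder `extracted` to a new staging folder `staging_root/name`."""
    if not VALID_NAME.fullmatch(name):
        raise ValueError(f"staging folder name not allowed: {name!r}")
    staging = Path(staging_root) / name
    # copytree creates the staging folder and raises FileExistsError if it
    # already exists, which helps keep each packet's folder name unique.
    shutil.copytree(extracted, staging)
    return staging
```

Rejecting a bad name before copying avoids having to rename the folder after a large copy has finished.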

Step C: Upload packet to dropbox

The next step is to upload your staging folder and its contents to an EAS dropbox.

What you'll need:

  • An EAS dropbox. Each participating repository is assigned its own secure FTP dropbox. 
  • A VPN (virtual private network) account in the secure DRS tunnel. Using VPN to access EAS services is required at all times, even from an on-campus wired computer. 
  • To request an EAS dropbox and/or a secure DRS VPN account, contact Library Technology Services.

Here is the packet upload procedure:

  1. Open your VPN client and log into your secure DRS VPN account. Log in with your HarvardKey login name plus tunnel name (e.g., john_harvard@harvard.edu#vpn_tunnel_name) and your HarvardKey password. 
  2. Point your secure FTP client to the dropbox address: easdrop-secure.library.harvard.edu (production) or easdrop-secure-qa.hul.harvard.edu (test).  Use port 22 for secure FTP (if your client does not set this automatically).
  3. Enter your dropbox login name and password.
  4. Locate the dropbox INCOMING folder.  On your local file system, locate the staging folder.
  5. Copy the staging folder to the INCOMING folder.
  6. Close the secure FTP connection. This step is IMPORTANT! The EAS loader will not process your packet while your secure FTP client is still connected to the dropbox.

When upload is complete and your secure FTP connection is closed, proceed to the packet submission step.
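
If you use a command-line client rather than a graphical one, the procedure above maps onto a short session. The sketch below only builds the command strings and does not connect to anything. It assumes the OpenSSH `sftp` client, assumes you start it from the folder that contains your staging folder, and uses a hypothetical helper name `sftp_session_plan`; the dropbox login is a placeholder you supply.

```python
def sftp_session_plan(host, login, staging_folder, port=22):
    """Return (connect_command, interactive_commands) for an OpenSSH sftp session."""
    # -P sets the port; the client prompts for the dropbox password after connecting.
    connect = f"sftp -P {port} {login}@{host}"
    commands = [
        "cd INCOMING",               # step 4: locate the dropbox INCOMING folder
        f"put -r {staging_folder}",  # step 5: copy the staging folder recursively
        "bye",                       # step 6: close the connection so the loader can run
    ]
    return connect, commands
```

Ending the plan with `bye` reflects step 6: the packet is not processed until the connection is closed.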

Step D: Submit packet to EAS loader

Once your email packet is in the dropbox and the secure FTP connection to the dropbox is closed, connect to EASi and submit the packet.

Tip: The EAS loader runs about every 15 minutes (at :15, :30, :45 and :59 minutes past the hour) from 8am-9pm Monday to Saturday. Your packet must be both uploaded to the dropbox and submitted in EASi before the loader will pick it up.

  1. Open a VPN connection to the secure DRS tunnel. Connect to EASi:

    Production:  https://easi.library.harvard.edu/easi
    Test (qa): https://easi-qa.lib.harvard.edu:9035/easi

  2. Select the "Packets" option on the EASi main menu. The Submit Packet page will display.
  3. Select a packet to process. At top left, select your packet on the Select Packet list. A listing of packet contents will display under Packet Inventory on the right.
  4. Assign metadata to content in the packet. Enter values into the metadata panel on the left.  Required fields are: packet name, depositor email, creator client, DRS access flag, and billing code; the rest are optional.
    • Packet name: By default, the staging folder name is used here, but you can replace this with a name of your choice. This packet name will be associated with every message and attachment in the packet.
    • Depositor email: Address to which the load report will be sent.  The "manager email" value from your EASi account is used by default, but you can substitute another address.
    • Creator client: Email software used to create content in the packet.  If you know the exact creator client and version for your content but these are not listed, please contact Library Technology Services. If you do not know the exact creator client version, select the “version unknown” option for that client.
    • DRS access flag: This flag controls access to messages and attachments once these are stored in the DRS. Flag value is set to N (not accessible). 
    • Billing code: DRS billing code that will be associated with messages and attachments in the packet.
  5. Submit the packet.  Click the Submit button to start the process.  Click "OK" on the pop-up window that displays to confirm the submission.
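
As a pre-submission checklist, the required fields can be expressed as a small check. This is a sketch for the curator's own notes, not EASi code: the dictionary keys and the function `missing_required` are hypothetical names chosen to mirror the field list above.

```python
# Required packet metadata, mirroring the field list above (key names are hypothetical).
REQUIRED_FIELDS = (
    "packet_name",
    "depositor_email",
    "creator_client",
    "drs_access_flag",
    "billing_code",
)


def missing_required(metadata):
    """Return the required fields that are absent or blank in `metadata`."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(metadata.get(field) or "").strip()
    ]
```

An empty result means every required value has been filled in; optional fields are ignored.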

Step E: Review the EAS load report

The EAS loader runs about every 15 minutes (at :15, :30, :45 and :59 minutes past the hour) from 8am-9pm Monday to Saturday. Once a packet is picked up, loader processing may take several minutes to several hours, depending on the size of the packet. Use the EASi Packet Import Queue to check the status of packets in process, or the Packet Summary for packets that have been loaded to EAS. When processing is complete, the EAS loader will email a load report to the address specified in the "Depositor email" field in packet metadata. Once you get the load report, items in your packet should be available in EASi.
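
To estimate when a submitted packet will be picked up, the schedule can be worked through in code. This is a rough sketch under stated assumptions: "8am-9pm" is read as runs falling in the hours 08 through 20 (first run 8:15, last run 20:59), times are local to the loader, and `next_loader_run` is a hypothetical helper. The actual schedule is described only as approximate.

```python
from datetime import datetime, timedelta

RUN_MINUTES = (15, 30, 45, 59)  # minutes past the hour, from the schedule above
FIRST_HOUR, LAST_HOUR = 8, 20   # assumption: "8am-9pm" covers hours 08 through 20
RUN_WEEKDAYS = range(0, 6)      # Monday (0) through Saturday (5)


def next_loader_run(now):
    """Return the first scheduled run strictly after `now`."""
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # Step forward one minute at a time until a scheduled slot is reached.
    while not (
        candidate.weekday() in RUN_WEEKDAYS
        and FIRST_HOUR <= candidate.hour <= LAST_HOUR
        and candidate.minute in RUN_MINUTES
    ):
        candidate += timedelta(minutes=1)
    return candidate
```

Under these assumptions, a packet submitted late on a Saturday evening would wait until the first run on Monday morning.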

The load report email will report the date/time of import, the packet name you supplied, a packet ID number assigned by EASi, and a count of the messages, attachments and inline files that were processed. "Inline files" are usually image files embedded in messages.

Sample load report:

===============================================================================================================================
Subject: EAS notification (qa): Importer
Packet "E-journal_and_format_reg" was imported by EAS (qa account HUL.ARCH) on Tue Sep 13 14:31:03 EDT 2011 with packet ID 161.
The packet contains 756 messages, 0 attachments, and 0 inline files.
===============================================================================================================================
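
If you track many packets, the fields in a load report can be pulled out with a regular expression. The pattern below is a sketch based solely on the single sample above; the real wording may vary (for example, for counts of one), and the names `REPORT_PATTERN` and `parse_load_report` are hypothetical.

```python
import re

# Pattern modeled on the one sample report above; other wordings are not covered.
REPORT_PATTERN = re.compile(
    r'Packet "(?P<name>[^"]+)" was imported by EAS \((?P<account>[^)]*)\) '
    r"on (?P<imported>.+?) with packet ID (?P<packet_id>\d+)\.\s+"
    r"The packet contains (?P<messages>\d+) messages?, "
    r"(?P<attachments>\d+) attachments?, and (?P<inline>\d+) inline files?\."
)


def parse_load_report(body):
    """Return the report fields as a dict, or None if the text does not match."""
    match = REPORT_PATTERN.search(body)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("packet_id", "messages", "attachments", "inline"):
        fields[key] = int(fields[key])
    return fields
```

Returning None for unrecognized text lets a caller set aside reports that need to be read by hand.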

About the EAS import process

Once the curator submits a packet, the EAS loader detects the packet and moves it from the secure dropbox file system to a secure storage file system location designated for EAS. The EAS importer picks up the packet from the secure storage file system and runs a series of processes that convert the original packet, creating a copy for further processing. Both the original packet and the converted copy are kept in the EAS secure storage file system.

The EAS importer process performs the following conversion routines on the email content:

  • Converts email messages to the EML standard format (folders of .eml message files conforming to RFC 2822).
  • Converts email attachments:
    • Copies external attachments.
    • Extracts embedded attachments.
    • Converts file names based on message subject to random alphanumeric strings (for increased security).
    • Renames embedded files with no file name to unknown_EAS-generated_<random number> (previously these were named <random_number>_unknown).
    • Renames embedded files with an invalid name to EAS-substituted_<random number>.
  • Writes metadata supplied by the curator to an Oracle database.
  • Associates the rights metadata restrictions “Rights basis: risk assessment” and “Secure Storage required: unconfirmed” with the content.
  • Stores the metadata plus extracted email message content (headers and bodies) in a Solr index on the secure storage file system.
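
To illustrate what "attachments" and "inline files" mean in an RFC 2822 message, the sketch below builds a small sample message with Python's standard `email` package and counts its parts by Content-Disposition. It is not the EAS importer's logic, which is not described here; the function `count_files` and the sample addresses and file names are hypothetical.

```python
from email import message_from_string
from email.message import EmailMessage


def count_files(eml_text):
    """Count parts marked as attachment or inline in one EML message."""
    counts = {"attachment": 0, "inline": 0}
    for part in message_from_string(eml_text).walk():
        disposition = part.get_content_disposition()  # "attachment", "inline" or None
        if disposition in counts:
            counts[disposition] += 1
    return counts


# Build a sample message: a text body, one attached file, one inline image.
sample = EmailMessage()
sample["Subject"] = "Example"
sample["From"] = "donor@example.org"
sample["To"] = "curator@example.org"
sample.set_content("See the attached report.")
sample.add_attachment(
    b"%PDF-", maintype="application", subtype="pdf", filename="report.pdf"
)
sample.add_attachment(
    b"\x89PNG", maintype="image", subtype="png",
    filename="logo.png", disposition="inline",
)

print(count_files(sample.as_string()))
```

The text body carries no Content-Disposition header, so it is counted as neither an attachment nor an inline file.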