ASpace API

Welcome to the ASpace API Workshop wiki page.

Workshop sessions

Session 1 (Intro, Setup, Python practice) Tue. 6/9, 10-12noon 

Presenter's notebook: .ipynb | .pdf | .html

Session 2 (Working with the ASpace API, part 1) Wed. 6/17, 10-12noon 

Presenter's notebook: .ipynb | .pdf | .html  
Session #2 recording (01:49:00)

Session 3 (Working with the ASpace API, part 2) Wed. 6/24, 1-3pm 

Presenter's notebook: .ipynb | .pdf | .html
Session #3 recording (01:39:27)

Session 4 (API clinic: open discussion & picking of Dave’s brain) Tue. 6/30, 10-12noon 

Presenter's notebook: .ipynb | .pdf | .html
Session #4 recording (01:46:15)

Workshop evaluation survey

Text Blocks for Session 2

ASnake Initialization Code

# code here MUST be run before subsequent examples will work

# here we are importing the ASnakeClient class from the asnake.client module
from asnake.client import ASnakeClient

# here we are creating our ASnake client, which we will use to make API requests
client = ASnakeClient()

Software Agent Template

{
   "jsonmodel_type": "agent_software",
   "names": [
     {
       "jsonmodel_type": "name_software",
       "software_name": "Dave's Script",
       "sort_name": "Dave's Script",
       "version": "1.0",
       "is_display_name": True,
       "rules": "local"
     }
   ],
   "title": "Dave's Script v1.0"
}
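
One way the template above might be used, as a minimal sketch rather than the workshop's own code: the first helper copies a template and swaps in a different name and version, and the second posts the record through a client. The function names fill_agent_template and create_software_agent are made up for illustration, the template is trimmed for the demo, and the 'agents/software' endpoint and client.post(..., json=...) call are assumptions to check against your ArchivesSpace instance and ASnake version.

```python
import copy

# Assumption: a trimmed copy of the template, kept small for the demo
TEMPLATE = {
    "jsonmodel_type": "agent_software",
    "names": [
        {
            "jsonmodel_type": "name_software",
            "software_name": "Dave's Script",
            "sort_name": "Dave's Script",
            "version": "1.0",
            "rules": "local",
        }
    ],
    "title": "Dave's Script v1.0",
}


def fill_agent_template(template, software_name, version):
    """Return a copy of the template with the name and version swapped in."""
    agent = copy.deepcopy(template)  # leave the original template untouched
    for name in agent["names"]:
        name["software_name"] = software_name
        name["sort_name"] = software_name
        name["version"] = version
    agent["title"] = f"{software_name} v{version}"
    return agent


def create_software_agent(client, agent):
    """POST the agent record; `client` is expected to be an ASnakeClient.

    Assumption: 'agents/software' is the create endpoint on your instance.
    """
    response = client.post("agents/software", json=agent)
    return response.json()


# "Report Builder" is a hypothetical script name
agent = fill_agent_template(TEMPLATE, "Report Builder", "2.1")
print(agent["title"])
```

Passing the client in as an argument keeps the template-filling step usable without a server, so it can be tried out before anything is posted to the sandbox.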

Pre-work for session #2: Intro to the API and ArchivesSnake video

Below are two recordings of Dave Mayo's presentation Introduction to the ArchivesSpace API and ArchivesSnake, given at the ArchivesSpace Online Forum in 2020. If you can, watch it before session #2. If you want just the presentation, select the MP4 option; if you also want the Q&A session that followed, choose the YouTube option.

44:10 (Presentation only; MP4)

59:32 (Presentation followed by Q&A; YouTube)

Pre-workshop software setup & video

See the Google Doc.

Command line basics

Action | Win | Mac
Open the console | Win+R > type cmd (or powershell) > Enter/OK | Finder > Applications > Utilities > Terminal
List files and directories | dir | ls
Create a directory | mkdir | mkdir
Move to directory | cd | cd
Go back to previous/parent directory | cd.. | cd ..
Print current directory | cd | pwd
Find home directory | echo %HOMEPATH% | echo ~
Command history | Up arrow | Up arrow
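
Several of these actions can also be done from inside Python, which may help when a script needs to find or create folders itself. The sketch below uses the standard-library pathlib module; the names workshop_demo and notes.txt are made up, and everything is created inside a temporary folder so nothing is left behind.

```python
import tempfile
from pathlib import Path

print(Path.cwd())    # print current directory (pwd on Mac, cd on Windows)
print(Path.home())   # find home directory (echo ~ / echo %HOMEPATH%)

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    demo = base / "workshop_demo"          # hypothetical directory name
    demo.mkdir()                           # create a directory (mkdir)
    (demo / "notes.txt").write_text("hi")  # add a file so there is something to list
    names = sorted(p.name for p in demo.iterdir())  # list files (ls / dir)
    parent = demo.parent                   # path of the parent directory (what cd .. moves to)
    print(names)
```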


Useful Python Libraries to look at next

argparse

https://docs.python.org/3/howto/argparse.html - tutorial (start here)

https://docs.python.org/3/library/argparse.html - full documentation

Handling command line arguments is a valuable step toward making your scripts more useful and reusable. argparse is built into Python, and will:

  • let you define command line arguments for your scripts, and have them converted into data your program can use
  • use those arguments (and their help text) to generate a --help option automatically, which prints a description of what your script does and how to use it

Here's a small example script:

#!/usr/bin/env python3
from argparse import ArgumentParser
parser = ArgumentParser(description='Description of what my script does')
parser.add_argument('input_csv', nargs='?', default='input.csv', help='CSV file to be read!')
parser.add_argument('output_csv', nargs='?', default='report.csv', help='CSV file report gets written to!')

args = parser.parse_args()

print(args.input_csv)
print(args.output_csv)

If you save this in a file called test_argparse.py and run it with the --help flag:

python3 test_argparse.py --help

You will get the following output:

usage: test_argparse.py [-h] [input_csv] [output_csv]

Description of what my script does

positional arguments:
  input_csv   CSV file to be read!
  output_csv  CSV file report gets written to!

optional arguments:
  -h, --help  show this help message and exit
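
A possible next step, sketched under a few assumptions: the --limit and --verbose options below are invented for illustration and are not part of the workshop script. The sketch shows optional arguments with type conversion, and passes an explicit list to parse_args() so the parser can be tried out in a notebook without a real command line.

```python
from argparse import ArgumentParser


def build_parser():
    parser = ArgumentParser(description='Description of what my script does')
    parser.add_argument('input_csv', nargs='?', default='input.csv',
                        help='CSV file to be read!')
    # hypothetical option: argparse converts the text "25" into the int 25
    parser.add_argument('--limit', type=int, default=10,
                        help='maximum number of rows to process')
    # hypothetical flag: False unless --verbose is given
    parser.add_argument('--verbose', action='store_true',
                        help='print extra progress messages')
    return parser


# The list stands in for what would normally be typed after the script name
args = build_parser().parse_args(['records.csv', '--limit', '25', '--verbose'])
print(args.input_csv, args.limit, args.verbose)

# With no arguments at all, the defaults are used
defaults = build_parser().parse_args([])
print(defaults.input_csv, defaults.limit, defaults.verbose)
```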

openpyxl

https://openpyxl.readthedocs.io/en/stable/

There are several libraries in Python for working with Excel. This is the one I have found to be the most useful and least frustrating. In particular, it seems to give reliable access to the "raw" values entered into cells, which has let me work around Excel date-handling issues.

sqlite3

https://pynative.com/python-sqlite/ - tutorial

https://docs.python.org/3/library/sqlite3.html - documentation

sqlite3 is Python's built-in module for SQLite, an SQL database that is stored in a single file. If you're familiar with SQL, it provides a way to get some of the benefits of storing data in a database without needing a server set up. It's best thought of as an intermediate stage between spreadsheets and "full" databases: it's less complex to work with than a database server and doesn't need any IT support, but unlike a spreadsheet it lets you do joins and queries.
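
A minimal sketch of what that looks like in practice. The table names, column names, and sample titles are invented for illustration; only the repository codes come from this page. It uses an in-memory database so it leaves no file behind.

```python
import sqlite3

# ":memory:" keeps the database in RAM; a filename such as "report.db"
# (hypothetical) would store it in a single file instead
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repositories (id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("CREATE TABLE resources (id INTEGER PRIMARY KEY, repo_id INTEGER, title TEXT)")

# "?" placeholders let sqlite3 handle quoting of the values
conn.executemany("INSERT INTO repositories VALUES (?, ?)", [(4, "HUA"), (5, "LAW")])
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?)",
    [(1, 4, "Example papers"), (2, 4, "Example records"), (3, 5, "Example collection")],
)
conn.commit()

# A join plus GROUP BY: count the resources held by each repository
rows = conn.execute(
    """
    SELECT repositories.code, COUNT(resources.id)
    FROM repositories
    JOIN resources ON resources.repo_id = repositories.id
    GROUP BY repositories.code
    ORDER BY repositories.code
    """
).fetchall()
print(rows)
conn.close()
```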

Concepts in Python to self-study

iterator protocol - how do for loops work?

Python has something called "the iterator protocol", which underlies how any object you can loop over works. It's very useful to understand how this works and is applied; in ASpace, a lot of scripting work involves iterating over search results or other lists of records.

An extremely sketchy overview:

  1. if a Python object has the special __iter__() method defined on it, it is an "iterable"
  2. the __iter__() method should return an iterator, a Python object with a special __next__() method
  3. the __next__() method can be called multiple times, and will return each item in the iterable until there are none left
  4. then, it raises a StopIteration exception

for loops use this internally! It is also how skipping the header line in the CSV example worked: the built-in function next() takes an iterator and calls its __next__() method.
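
The steps above can be sketched in a few lines. The Countdown class and the sample CSV text are made up for illustration; the protocol methods and the built-in iter() and next() functions are standard Python.

```python
import csv
import io


class Countdown:
    """An iterable that is also its own iterator: counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # steps 1 and 2: __iter__() returns an object with __next__()

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # step 4: nothing left
        value = self.current
        self.current -= 1
        return value  # step 3: one item per call


# A for loop calls iter() once, then next() until StopIteration is raised
looped = []
for n in Countdown(3):
    looped.append(n)

# The same thing done by hand
it = iter(Countdown(2))
first = next(it)
second = next(it)
try:
    next(it)
    finished = False
except StopIteration:
    finished = True

# Skipping a header row: next() consumes the first line before the loop starts
reader = csv.reader(io.StringIO("title,id\nExample papers,1\nExample records,2\n"))
header = next(reader)
data_rows = list(reader)
print(looped, first, second, finished, header, data_rows)
```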

https://www.pythonlikeyoumeanit.com/Module2_EssentialsOfPython/Iterables.html - complete but somewhat dense tutorial

ASpace Sandbox information

SB staff mode: 
https://arstaff-sb.lib.harvard.edu

SB PUI: 
https://hollisarchives-sb.lib.harvard.edu

Base url for the SB API: 
https://arstaff-sb.lib.harvard.edu:8443

See the Sandbox Info Google Doc for more information.



Repository codes

2>ATK
3>ATKProd
4>HUA
5>LAW
6>PEA
7>DES
8>SCH
9>ART
10>BER
11>BAK
12>DIV
13>AJP
14>MED
15>ARN
16>DDO
17>ECB
18>ENV
19>FAL
20>FAR
21>FUN
22>GRA
23>HFA
24>HOU
25>HYL
26>MCZ
27>MUS
28>ORC
29>TOZ
30>URI
31>WID
32>WOL
33>HSI
34>ORA
35>VIT
36>GUT
36>GUT