archivesspace: a Python library for querying the ArchivesSpace API

archivesspace is a Python module that makes querying ArchivesSpace much easier.

This code lives at: https://github.com/SmithCollegeLibraries/archivesspace-python

Documentation is located here: https://smithcollegelibraries.github.io/archivesspace-python/

Compatibility

As of this writing, archivesspace has only been tested with ArchivesSpace 2.1.2 and Python 3. YMMV with other versions.

Getting started

At the heart of the module is the class ArchivesSpace. To set up a connection, create an ArchivesSpace instance with your login credentials, then call its connect() method.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin

To continue you will first need to familiarize yourself with the ArchivesSpace REST API documentation located here: https://archivesspace.github.io/archivesspace/api/#archivesspace-rest-api

Pro tip: If fields are missing from the API documentation, get them from the horse’s mouth by checking the ArchivesSpace JSON Schemas located here: https://github.com/archivesspace/archivesspace/blob/master/common/schemas

Note that required fields are indicated by “ifmissing” not “required.”
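
As a rough illustration of the note above, the sketch below lists the properties of a schema that carry an "ifmissing" marker. The schema fragment is hypothetical and hand-transcribed into a Python dict (the real schemas are Ruby files with many more properties), and it assumes that the value "error" is what marks a hard requirement. The function name required_fields is made up for this example and is not part of the archivesspace module.

```python
# Hypothetical, simplified schema fragment written as a Python dict.
# It is an illustration only, not a copy of any real ArchivesSpace schema.
subject_schema = {
    "properties": {
        "title": {"type": "string", "readonly": True},
        "source": {"type": "string", "ifmissing": "error"},
        "terms": {"type": "array", "ifmissing": "error"},
        "scope_note": {"type": "string"},
    }
}


def required_fields(schema):
    """Return the names of properties marked with "ifmissing": "error"."""
    return sorted(
        name
        for name, spec in schema["properties"].items()
        if spec.get("ifmissing") == "error"
    )


print(required_fields(subject_schema))
```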

Getting a record

To retrieve a record from ArchivesSpace use the get() method.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> jsonResponse = aspace.get("/users/1")
>>> jsonResponse['username']
'admin'

Posting a record

To post a record to ArchivesSpace use the post() method.

Example:

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> 
>>> data = { "jsonmodel_type":"subject",
...         "external_ids":[],
...         "publish":True,
...         "used_within_repositories":[],
...         "used_within_published_repositories":[],
...         "terms":[{ "jsonmodel_type":"term",
...         "term":"Alberta",
...         "term_type":"geographic",
...         "vocabulary":"/vocabularies/1"}],
...         "external_documents":[],
...         "vocabulary":"/vocabularies/1",
...         "authority_id":"myid114",
...         "source":"local"}
>>> 
>>> response = aspace.post("/subjects", data)
>>> response['uri']
'/subjects/...'
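
If you post many similar records, it can help to build the payload in one place. The helper below is a minimal sketch that reproduces the dictionary from the example above; make_geographic_subject is a hypothetical name and is not part of the archivesspace module, and the default vocabulary URI is simply the one used in the example.

```python
def make_geographic_subject(term, authority_id, vocabulary="/vocabularies/1"):
    """Build a subject payload with a single geographic term.

    Hypothetical helper for illustration; it mirrors the fields used in
    the posting example and makes no claim about which are required.
    """
    return {
        "jsonmodel_type": "subject",
        "external_ids": [],
        "publish": True,
        "used_within_repositories": [],
        "used_within_published_repositories": [],
        "terms": [
            {
                "jsonmodel_type": "term",
                "term": term,
                "term_type": "geographic",
                "vocabulary": vocabulary,
            }
        ],
        "external_documents": [],
        "vocabulary": vocabulary,
        "authority_id": authority_id,
        "source": "local",
    }


data = make_geographic_subject("Alberta", "myid114")
```

The resulting data dictionary can then be passed to post() exactly as in the example above.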

Updating a record

Updating a record in ArchivesSpace is a two-step process. First, retrieve the record, then post the modified version back to ArchivesSpace.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> myrecord = aspace.get('/subjects/1')
>>> myrecord['scope_note'] = "Hello World"
>>> response = aspace.post('/subjects/1', requestData=myrecord)
>>> response['lock_version']
1

Behind the scenes: the retrieved data structure includes a special field called lock_version. ArchivesSpace requires this field when you post the record back, and uses it to ensure that only one agent edits the record at a time.
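
To make the idea concrete, here is a toy, in-memory sketch of how a version counter like lock_version can reject a stale write. It is not how ArchivesSpace is implemented and it does not use the archivesspace module; ToyStore and StaleRecordError are hypothetical names invented for this illustration.

```python
class StaleRecordError(Exception):
    """Raised when a write carries an outdated lock_version."""


class ToyStore:
    """In-memory stand-in for a server that checks a version counter."""

    def __init__(self):
        self._records = {}

    def get(self, uri):
        # Return a copy so callers cannot mutate the stored record directly.
        return dict(self._records[uri])

    def post(self, uri, record):
        current = self._records.get(uri)
        if current is not None and record.get("lock_version") != current["lock_version"]:
            # The caller edited an older copy of the record.
            raise StaleRecordError(uri)
        version = 0 if current is None else current["lock_version"] + 1
        self._records[uri] = dict(record, lock_version=version)
        return {"uri": uri, "lock_version": version}


store = ToyStore()
store.post("/subjects/1", {"title": "Alberta"})  # created with lock_version 0

first = store.get("/subjects/1")
second = store.get("/subjects/1")  # a second agent fetches the same version

first["scope_note"] = "Hello World"
result = store.post("/subjects/1", first)  # accepted, lock_version becomes 1

second["scope_note"] = "Conflicting edit"
try:
    store.post("/subjects/1", second)  # still carries lock_version 0
    conflict = False
except StaleRecordError:
    conflict = True
```

In this toy model the second write fails because it was based on a copy that is no longer current, which is why you should always fetch the record immediately before modifying and posting it.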

Getting listings and search results

ArchivesSpace paginates the responses to queries that would return many items. To run a paginated query, use the getPaged() method.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> response = aspace.getPaged("/subjects")
>>> for subject in response:
...     print(subject['title'])
... 
Alberta
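
For readers curious about what collecting every page involves, the sketch below walks through pages until the last one is reached. It is not the module's actual implementation. It assumes each page is a dictionary with "this_page", "last_page" and "results" keys, as in ArchivesSpace's paginated responses; collect_pages and fake_fetch are hypothetical names, and the titles are made-up sample data standing in for real HTTP responses.

```python
def collect_pages(fetch_page):
    """Call fetch_page(page_number) for each page and yield every result."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["results"]
        if body["this_page"] >= body["last_page"]:
            break
        page += 1


def fake_fetch(page, _titles=("Alberta", "North Pole", "Yukon"), _page_size=2):
    """Stand-in for an HTTP request; returns one page of made-up subjects."""
    start = (page - 1) * _page_size
    chunk = _titles[start:start + _page_size]
    last_page = -(-len(_titles) // _page_size)  # ceiling division
    return {
        "first_page": 1,
        "last_page": last_page,
        "this_page": page,
        "total": len(_titles),
        "results": [{"title": title} for title in chunk],
    }


titles = [subject["title"] for subject in collect_pages(fake_fetch)]
```

Because collect_pages is a generator, results are produced as each page arrives rather than after all pages have been fetched.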

Reference

class archivesspace.ArchivesSpace(protocol, domain, port, username, password)

Base class for establishing a session with an ArchivesSpace repository, and doing API queries against it.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin

connect()

Start a session with ArchivesSpace. This must be done before anything else.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin

get(path, requestData={})

Do a GET request to ArchivesSpace and return the JSON response.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> jsonResponse = aspace.get("/users/1")
>>> jsonResponse['username']
'admin'

getPaged(path, requestData={})

Automatically request all the pages to build a complete data set.

getPagedAllIds(path)

Get a list of all of the IDs.

post(path, requestData={})

Do a POST request to ArchivesSpace and return the JSON response.

>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> 
>>> data = { "jsonmodel_type":"subject",
...         "external_ids":[],
...         "publish":True,
...         "used_within_repositories":[],
...         "used_within_published_repositories":[],
...         "terms":[{ "jsonmodel_type":"term",
...         "term":"North Pole",
...         "term_type":"geographic",
...         "vocabulary":"/vocabularies/1"}],
...         "external_documents":[],
...         "vocabulary":"/vocabularies/1",
...         "authority_id":"myid314",
...         "source":"local"}
>>> 
>>> response = aspace.post("/subjects", requestData=data)
>>> response['uri']
'/subjects/...'
setJsonSerializerDefault(jsonSerializerDefault)

Set an optional custom JSON serializer to be passed to json.dumps. See https://docs.python.org/3/library/json.html#json.JSONEncoder.default

If you don’t know what this is, don’t use it.
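
For reference, this is what a "default" function for json.dumps typically looks like. The sketch uses only the Python standard library; json_default is a hypothetical name, and the date value is made-up sample data. json.dumps calls the function for any object it cannot serialize on its own, and the function must either return a serializable value or raise TypeError.

```python
import json
from datetime import date


def json_default(obj):
    """Convert objects json.dumps cannot handle; raise TypeError otherwise."""
    if isinstance(obj, date):
        # Covers both date and datetime, since datetime subclasses date.
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {"begin": date(1871, 3, 3)}
encoded = json.dumps(payload, default=json_default)
print(encoded)
```

A function of this shape is presumably what you would hand to setJsonSerializerDefault(), for example aspace.setJsonSerializerDefault(json_default), so that records containing such objects can be posted.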

archivesspace.checkStatusCodes(response, data={})

This helper function checks the response from a request for problems and then returns the data if everything is fine.

archivesspace.formatJson(data)

Use this function to make data look nice.

archivesspace.formatResponse(response)

Get the data element of a requests response and format it to be pretty.