archivesspace: a Python library for querying the ArchivesSpace API
archivesspace is a Python module that makes querying ArchivesSpace much easier.
This code lives at: https://github.com/SmithCollegeLibraries/archivesspace-python
Documentation is located here: https://smithcollegelibraries.github.io/archivesspace-python/
Compatibility
As of writing, archivesspace has only been tested with ArchivesSpace 2.1.2 and Python 3. YMMV with other versions.
Getting started
At the heart of the module is the class ArchivesSpace. To set up a connection, create an ArchivesSpace instance with your login credentials and call its connect() method.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin
To continue you will first need to familiarize yourself with the ArchivesSpace REST API documentation located here: https://archivesspace.github.io/archivesspace/api/#archivesspace-rest-api
Pro tip: If fields are missing from the API documentation, get them from the horse’s mouth by checking the ArchivesSpace JSON Schemas located here: https://github.com/archivesspace/archivesspace/blob/master/common/schemas
Note that required fields are indicated by “ifmissing” not “required.”
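To illustrate the note above, here is a minimal sketch that lists the fields marked with "ifmissing" in a schema. The schemas in the repository are Ruby files, so the dictionary below is a hand-written, simplified stand-in (an assumption, not a real schema), and required_fields is a hypothetical helper that is not part of archivesspace.

```python
# Hypothetical, simplified stand-in for an ArchivesSpace schema.
# Real schemas are Ruby files and define many more properties.
EXAMPLE_SCHEMA = {
    "properties": {
        "title": {"type": "string"},
        "source": {"type": "string", "ifmissing": "error"},
        "vocabulary": {"type": "string", "ifmissing": "error"},
        "scope_note": {"type": "string"},
    }
}


def required_fields(schema):
    """Return the sorted names of properties that carry an "ifmissing" key."""
    return sorted(
        name
        for name, definition in schema.get("properties", {}).items()
        if "ifmissing" in definition
    )


print(required_fields(EXAMPLE_SCHEMA))
```

Running the sketch prints only the two properties that carry an "ifmissing" entry.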
Getting a record
To retrieve a record from ArchivesSpace, use the get() method.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> jsonResponse = aspace.get("/users/1")
>>> jsonResponse['username']
'admin'
Posting a record
To post a record to ArchivesSpace, use the post() method.
Example:
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>>
>>> data = { "jsonmodel_type":"subject",
... "external_ids":[],
... "publish":True,
... "used_within_repositories":[],
... "used_within_published_repositories":[],
... "terms":[{ "jsonmodel_type":"term",
... "term":"Alberta",
... "term_type":"geographic",
... "vocabulary":"/vocabularies/1"}],
... "external_documents":[],
... "vocabulary":"/vocabularies/1",
... "authority_id":"myid114",
... "source":"local"}
>>>
>>> response = aspace.post("/subjects", data)
>>> response['uri']
'/subjects/...'
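When many similar records need to be posted, the dictionary can be built by a small function. The sketch below is one way to do that; build_subject is a hypothetical helper, not part of archivesspace, and its default values simply mirror the example above.

```python
def build_subject(term, term_type, authority_id,
                  vocabulary="/vocabularies/1", source="local"):
    """Build a subject record shaped like the example above (hypothetical helper)."""
    return {
        "jsonmodel_type": "subject",
        "external_ids": [],
        "publish": True,
        "used_within_repositories": [],
        "used_within_published_repositories": [],
        "terms": [{
            "jsonmodel_type": "term",
            "term": term,
            "term_type": term_type,
            "vocabulary": vocabulary,
        }],
        "external_documents": [],
        "vocabulary": vocabulary,
        "authority_id": authority_id,
        "source": source,
    }


data = build_subject("Alberta", "geographic", "myid114")

# With a live connection, the record could then be posted:
# response = aspace.post("/subjects", data)
```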
Updating a record
Updating a record in ArchivesSpace is a two-step process. First, retrieve the record, then post the modified version back to ArchivesSpace.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> myrecord = aspace.get('/subjects/1')
>>> myrecord['scope_note'] = "Hello World"
>>> response = aspace.post('/subjects/1', requestData=myrecord)
>>> response['lock_version']
1
Behind the scenes: the retrieved data structure includes a special field called lock_version. ArchivesSpace requires this field when you post the record back, and uses it to ensure that only one agent edits the record at a time.
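The idea behind lock_version can be sketched with a toy in-memory store. This is a simplified model of version checking, assuming the server rejects a record whose lock_version is out of date; FakeStore and ConflictError are hypothetical names and do not reflect how ArchivesSpace is implemented.

```python
class ConflictError(Exception):
    """Raised when a record is saved with an out-of-date lock_version."""


class FakeStore:
    """In-memory stand-in for a server that checks lock_version on save."""

    def __init__(self):
        self.records = {"/subjects/1": {"title": "Alberta", "lock_version": 0}}

    def get(self, path):
        # Return a copy so callers cannot modify the stored record directly.
        return dict(self.records[path])

    def post(self, path, record):
        current = self.records[path]
        if record["lock_version"] != current["lock_version"]:
            raise ConflictError("record changed since it was retrieved")
        saved = dict(record)
        saved["lock_version"] = current["lock_version"] + 1
        self.records[path] = saved
        return {"lock_version": saved["lock_version"]}


store = FakeStore()
first = store.get("/subjects/1")
second = store.get("/subjects/1")

first["scope_note"] = "Hello World"
result = store.post("/subjects/1", first)  # accepted, version moves to 1

second["scope_note"] = "Conflicting edit"
try:
    store.post("/subjects/1", second)  # stale copy, still at version 0
    stale_rejected = False
except ConflictError:
    stale_rejected = True
```

The practical consequence is to retrieve a fresh copy of the record immediately before modifying it, rather than reusing a copy fetched earlier.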
Getting listings and search results
ArchivesSpace uses paginated responses for queries that would return many items. To do a paginated query, use the getPaged() method.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> response = aspace.getPaged("/subjects")
>>> for subject in response:
...     print(subject['title'])
...
Alberta
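For readers curious about what collecting pages involves, here is a minimal sketch of the general technique. It assumes each page is a dictionary with "results" and "last_page" keys, which is how ArchivesSpace paginated responses are usually shaped; fetch_all and fake_fetch_page are hypothetical names, and this is not necessarily how getPaged() is implemented.

```python
def fetch_all(fetch_page):
    """Call fetch_page(page_number) until the last page and collect the results."""
    results = []
    page = 1
    while True:
        response = fetch_page(page)
        results.extend(response["results"])
        if page >= response["last_page"]:
            break
        page += 1
    return results


# Hypothetical stand-in for the server: three subjects, two per page.
SUBJECTS = [{"title": "Alberta"}, {"title": "North Pole"}, {"title": "Yukon"}]
PAGE_SIZE = 2


def fake_fetch_page(page):
    start = (page - 1) * PAGE_SIZE
    last_page = max(1, -(-len(SUBJECTS) // PAGE_SIZE))  # ceiling division
    return {
        "first_page": 1,
        "last_page": last_page,
        "this_page": page,
        "total": len(SUBJECTS),
        "results": SUBJECTS[start:start + PAGE_SIZE],
    }


titles = [subject["title"] for subject in fetch_all(fake_fetch_page)]
print(titles)
```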
Reference
class archivesspace.ArchivesSpace(protocol, domain, port, username, password)
Base class for establishing a session with an ArchivesSpace repository, and doing API queries against it.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin
connect()
Start a session with ArchivesSpace. This must be done before anything else.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> print(aspace.connection['user']['username'])
admin
get(path, requestData={})
Do a GET request to ArchivesSpace and return the JSON response.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>> jsonResponse = aspace.get("/users/1")
>>> jsonResponse['username']
'admin'
getPaged(path, requestData={})
Automatically request all the pages to build a complete data set.
getPagedAllIds(path)
Get a list of all of the IDs.
post(path, requestData={})
Do a POST request to ArchivesSpace and return the JSON response.
>>> from archivesspace import ArchivesSpace
>>> aspace = ArchivesSpace('http', 'localhost', '8089', 'admin', 'admin')
>>> aspace.connect()
>>>
>>> data = { "jsonmodel_type":"subject",
... "external_ids":[],
... "publish":True,
... "used_within_repositories":[],
... "used_within_published_repositories":[],
... "terms":[{ "jsonmodel_type":"term",
... "term":"North Pole",
... "term_type":"geographic",
... "vocabulary":"/vocabularies/1"}],
... "external_documents":[],
... "vocabulary":"/vocabularies/1",
... "authority_id":"myid314",
... "source":"local"}
>>>
>>> response = aspace.post("/subjects", requestData=data)
>>> response['uri']
'/subjects/...'
setJsonSerializerDefault(jsonSerializerDefault)
Set an optional custom JSON serializer to be passed to json.dumps. See https://docs.python.org/3/library/json.html#json.JSONEncoder.default
If you don’t know what this is, don’t use it.
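As an illustration of what such a serializer looks like, the sketch below defines a function suitable for the default argument of json.dumps. It converts dates to ISO 8601 strings; serialize_extra_types is a hypothetical name, and the commented-out last line assumes an already connected aspace object.

```python
import datetime
import json


def serialize_extra_types(obj):
    """Convert values that json.dumps cannot handle by itself."""
    if isinstance(obj, (datetime.date, datetime.datetime)):
        return obj.isoformat()
    # Anything else is still an error, as json.dumps expects.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


record = {"title": "Alberta", "created": datetime.date(2018, 1, 31)}
encoded = json.dumps(record, default=serialize_extra_types)
print(encoded)

# With a connected ArchivesSpace instance, the function would be registered with:
# aspace.setJsonSerializerDefault(serialize_extra_types)
```

Without the default function, json.dumps raises a TypeError for the date value.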
archivesspace.checkStatusCodes(response, data={})
This helper function checks the response from a request for problems and then returns the data if everything is fine.
archivesspace.formatJson(data)
Use this function to make data look nice.
archivesspace.formatResponse(response)
Get the data element of a requests response and format it to be pretty.