Resources-specific Handler¶
A handler is a python class that is plugged into the generic TimeGate to fit any specific technique a web server has to manage its Original Resources and Mementos. Its role is simple: to retrieve the list of URI-Ms (with their archival dates) given a URI-R. It typically does so by connecting to an API.
Alternatives¶
- If no API is present: The list can be retrieved from many different ways. Page scraping, rule-based or even in a static manner. Anything will do.
- If the history cannot be retrieved entirely: The handler can implement an alternative function that returns one single URI-M and its archival datetime given both URI-R and the datetime the user requested.
- If the TimeGate’s algorithms that select the best Memento for a requested date do not apply to the system: Implementing the alternative function could also be used to bypass these algorithms. This is particularly useful if there are performance concerns, special cases or access restriction for Mementos.
Requirements¶
A handler require to have the following:
- It must a python file placed in the
core.handler
module (which is thecore/handler/
folder). And it must be unique. If several classes are needed, or to switch quickly between handlers, consider adding the handler module path manually in the configuration file. (See Configuring the server.) - A handler must extend the
core.handler_baseclass.Handler
base-class. - Implement at least one of the following:
get_all_mementos(uri_r)
class function: This function is called by the TimeGate to retrieve the history an original resourceuri_r
. The parameteruri_r
is a Python string representing the requested URI-R. The return value must be a list of 2-tuples:[(uri_m1, date1), (uri_m2, date2), ...]
. Each pair(uri_m, date)
contains the URI of an archived version of Ruri_m
, and the date at which it was archiveddate
.get_memento(uri_r, requested_date)
class function (alternative): This function will be called by the TimeGate to retrieve the best Memento foruri_
at the datedate
. Use it if the API cannot return the entire history for a resource efficiently or to bypass the TimeGate’s best Memento selection. The parameteruri_r
is a Python string representing the requested URI-R. The parameterdate
is a Pythondatetime.DateTime
object. In this case, the return value will contain only one 2-tuple:(uri_m, date)
which is the best memento that the handler could provide taking into account the limits of the API.
- Input parameters:
- All parameter values
uri_r
are Python strings representing the user’s requested URI-R. - All parameter values
requested_date
aredatetime.DateTime
objects representing the user’s requested datetime.
- All parameter values
- Output return values:
- All return values
uri_m
must be strings. - All return values
date
must be strings representing dates. Prefer the ISO 8601 format for the dates.
- All return values
- Note that:
- If both functions are implemented,
get_memento(uri_r, requested_date)
will always be used for TimeGate requests. - If the TimeMap advanced feature (see TimeMaps) is enabled,
get_all_mementos(uri_r)
must be implemented.
- If both functions are implemented,
Example¶
A simple example handler is provided incore/handler/
and can be
edited to match your web server’s requirements: - See
example.py
Which returns static lists.
Other handlers examples are provided for real world APIs in
core/handler_examples/
for instance:
- arXiv.py Where the Original Resources are the e-prints of http://arxiv.org/ -
- wikipedia.py Where the Original Resources are the articles of https://www.wikipedia.org/
- github.py Where the Original Resources are the repositories, trees (branches and directories), files and raw files.
Other scraping Handlers examples are provided for real world resources without any API:
- can.py Where the Original Resources are the archives stored in http://www.collectionscanada.gc.ca/webarchives/