Traversal
In addition to regular requests that fetch entries one by one, there are requests that fetch several entries and/or their metadata at once. A search request is one possibility; another is a graph traversal based on the statements in the entry or metadata graphs. This page explains how to traverse graphs using the REST API.
Traversal of metadata graphs
Parameters:
- The recursive parameter triggers the traversal and may contain a comma-separated list of URL-encoded URIs and/or traversal profiles (see below for how to configure profiles). The most widely used namespaces (dc, dct, foaf, vcard) are expanded automatically.
- The repository parameter makes the traversal ignore context boundaries when retrieving metadata. Without this parameter, the traversal algorithm fetches metadata only from the context of the starting entry.
- The format parameter is optional (normal content negotiation through the Accept header also works) and behaves the same as in normal metadata requests.
- The depth parameter is optional and determines how many levels the algorithm traverses. The default and maximum depth is 10; this can be changed through the configuration.
The traversal algorithm works as follows:
- The metadata of the requested entry is the starting point.
  a. If the requesting user does not have sufficient rights to read the metadata, the request fails immediately.
- The recursive parameter is parsed and resolved into a list of URIs. This list contains the predicates that are matched to detect which objects to follow. If a blacklist of predicate/object combinations is configured, it is loaded from the configuration as well.
- The entry's metadata graph is loaded and checked against the blacklist. If at least one tuple of the blacklist matches the graph (it is a simple string match, so it does not matter whether the object is a literal or a resource), the traversal is aborted for this entry and the entry is not included in the result. The traversal continues with the remaining entries of the same traversal level (if level > 0).
- The statements in the metadata graph are iterated over and matched against the list of predicates.
- If a predicate matches and the object is a URI that starts with the EntryStore instance's base URI, the object is fetched as an entry and its metadata is traversed. This happens recursively.
  a. The object URI is first tried as a resource URI; if no entry matches, it is tried as an entry URI.
- Circular dependencies are detected, and at most 10 levels are traversed unless the maximum depth is configured otherwise.
- A merged graph containing the metadata (local and external) of all matching and traversed entries is returned.
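The steps above can be sketched in Python as follows. This is an illustrative model only, not EntryStore's implementation: the in-memory graphs dictionary, the traverse function, and the can_read callback are hypothetical stand-ins for the repository, the resource-versus-entry URI lookup is collapsed into a single dictionary lookup, and counting the starting entry as level 0 is an assumption.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def traverse(
    start: str,
    graphs: Dict[str, List[Triple]],   # hypothetical store: entry URI -> metadata triples
    predicates: Set[str],              # resolved from the recursive parameter
    blacklist: Set[Tuple[str, str]],   # (predicate, object) pairs from the configuration
    base_uri: str,
    can_read: Callable[[str], bool] = lambda uri: True,
    max_depth: int = 10,
) -> List[Triple]:
    """Return the merged metadata of all entries reachable from start."""
    if not can_read(start):
        # Missing read rights on the starting point fail the whole request.
        raise PermissionError(start)

    merged: List[Triple] = []
    visited: Set[str] = set()  # guards against circular references

    def visit(uri: str, level: int) -> None:
        if uri in visited or level > max_depth or uri not in graphs:
            return
        visited.add(uri)
        graph = graphs[uri]
        # Blacklist check: a plain string match on predicate and object.
        if any((p, o) in blacklist for _, p, o in graph):
            return  # the entry is excluded and not followed any further
        merged.extend(graph)
        for _, p, o in graph:
            # Only objects inside this instance are followed.
            if p in predicates and o.startswith(base_uri):
                visit(o, level + 1)

    visit(start, 0)
    return merged
```

The visited set is what keeps two entries that point at each other from being traversed forever; the depth limit is a second, independent safeguard.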
Example:
{base-uri}/1/metadata/99?recursive=dcat,foaf:knows&format=text/turtle
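A request URL like this could also be assembled programmatically. The following is a minimal sketch using Python's standard library; the helper name traversal_url and the base URI https://example.org/store are placeholders, not part of EntryStore. Each list item is percent-encoded individually and the items are then joined with commas.

```python
from urllib.parse import quote


def traversal_url(base_uri, context_id, entry_id, predicates, fmt=None, depth=None):
    """Build a metadata traversal URL (hypothetical helper for illustration)."""
    # Profile names, prefixed names, or full URIs; each one is encoded on its own.
    recursive = ",".join(quote(p, safe="") for p in predicates)
    query = [f"recursive={recursive}"]
    if fmt is not None:
        query.append(f"format={quote(fmt, safe='')}")
    if depth is not None:
        query.append(f"depth={int(depth)}")
    return f"{base_uri.rstrip('/')}/{context_id}/metadata/{entry_id}?" + "&".join(query)


url = traversal_url(
    "https://example.org/store", 1, 99,
    ["dcat", "http://purl.org/dc/terms/publisher"],
    fmt="text/turtle", depth=3,
)
print(url)
```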
Traversal of entry graphs
This functionality remains to be implemented.
Configuration of traversal profiles
In addition to sending a list of URIs as a URL parameter of the request, so-called traversal profiles can be predefined in entrystore.properties. A profile basically consists of a name and a list of URIs; see below for DCAT as an example:
entrystore.traversal.dcat.1=http://www.w3.org/ns/dcat#contactPoint
entrystore.traversal.dcat.2=http://purl.org/dc/terms/publisher
entrystore.traversal.dcat.3=http://www.w3.org/ns/dcat#dataset
entrystore.traversal.dcat.4=http://www.w3.org/ns/dcat#distribution
entrystore.traversal.dcat.5=http://schema.theodi.org/odrs#copyrightHolder
The name is given implicitly by the key segment that follows entrystore.traversal (dcat in the example above). There is no upper limit on the number of traversal profiles or on the number of predicate URIs a profile contains.
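To illustrate how the key layout maps to profiles, the sketch below groups such properties into named predicate lists. It is a simplified Python model, not EntryStore's configuration loader: the function load_profiles is hypothetical, and it assumes keys of the form entrystore.traversal.<name>.<index> with an integer index.

```python
from collections import defaultdict

PREFIX = "entrystore.traversal."


def load_profiles(properties_text):
    """Group entrystore.traversal.<name>.<index> keys into {name: [URIs]}."""
    indexed = defaultdict(list)
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if not key.startswith(PREFIX):
            continue
        parts = key[len(PREFIX):].split(".")
        # Accept exactly <name>.<index>; other keys under the prefix are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            indexed[parts[0]].append((int(parts[1]), value.strip()))
    # Order each profile's URIs by their numeric index.
    return {name: [uri for _, uri in sorted(items)] for name, items in indexed.items()}
```

Calling load_profiles on the DCAT example above would yield a single profile named dcat with five predicate URIs, which is the name a client passes in the recursive parameter.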
Blacklisting of predicate/object combinations
Blacklists are defined per traversal profile. However, a restriction of the current implementation is that all blacklists are applied globally, regardless of the profile they are configured under. See the algorithm description above for information on how blacklists work.
A blacklist is configured as a list of tuples, each written as predicate,object with a comma separating the two parts. Common namespaces (see class org.entrystore.repository.util.NS) are expanded.
Example:
entrystore.traversal.dcat.blacklist.1=dcterms:accrualPeriodicity,http://publications.europa.eu/resource/authority/frequency/BIMONTHLY
entrystore.traversal.dcat.blacklist.2=dcterms:format,text/html
This excludes all entries whose metadata graphs contain any of the above predicate/object combinations, and it stops the traversal from the matching entry's metadata downwards.
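The matching rule can be sketched as follows. This Python model is illustrative only: the functions expand, parse_blacklist, and is_blacklisted are hypothetical, the namespace table is a small stand-in for the prefixes defined in org.entrystore.repository.util.NS (only the well-known dcterms namespace is included), and applying the expansion to both parts of the tuple is an assumption.

```python
# Stand-in for the NS class; only one well-known prefix is listed here.
NAMESPACES = {"dcterms": "http://purl.org/dc/terms/"}


def expand(term):
    """Expand prefix:local to a full URI if the prefix is known."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in NAMESPACES:
        return NAMESPACES[prefix] + local
    return term


def parse_blacklist(values):
    """Turn 'predicate,object' strings into a set of (predicate, object) tuples."""
    pairs = set()
    for value in values:
        predicate, _, obj = value.partition(",")
        pairs.add((expand(predicate.strip()), expand(obj.strip())))
    return pairs


def is_blacklisted(graph, blacklist):
    """Plain string comparison, so literals and resources are treated alike."""
    return any((p, str(o)) in blacklist for _, p, o in graph)


blacklist = parse_blacklist([
    "dcterms:accrualPeriodicity,http://publications.europa.eu/resource/authority/frequency/BIMONTHLY",
    "dcterms:format,text/html",
])
```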
Configuration of max depth
The default maximum traversal depth of 10 can be changed with the following setting:
entrystore.traversal.max-depth=20