Backup

Overview

The internal backup takes a snapshot of the quad store. All triples (including named graph information) are saved to a single RDF file with a timestamped name in the configured format. Only formats that support named graphs are available.

While the backup is running, the REST interface does not accept modifying requests (PUT, DELETE, POST). Read-only access using HEAD and GET remains possible.

By default (and if maintenance is not configured, see below), each backup run creates a new folder named all inside the configured backup directory. The old backup is removed at the beginning of the backup process. This order can be changed so that the previous backup is deleted only after the new backup has been created (see entrystore.backup.delete-after).

Configuration

All backup settings are configured directly in the entrystore.properties file, including the folder in which backups are stored (entrystore.backup.folder).

The parameters are:

entrystore.backup.scheduler=on|off (default: off)
entrystore.backup.folder=/path/to/backup/
entrystore.backup.cronexp=the time when the backup should be run, in Quartz cron format (see below)
entrystore.backup.gzip=on|off (default: off)
entrystore.backup.format=n-quads|trig|trix|binaryrdf (default: n-quads)
entrystore.backup.delete-after=on|off (default: off; only used if maintenance is disabled)
entrystore.backup.include-files=on|off (default: on)

The Quartz cron expression consists of 6 fields:

  • Seconds
  • Minutes
  • Hours
  • Day Of Month
  • Month
  • Day Of Week

The first three fields may be expressed using a randomizing function rnd() (this is non-standard and a feature of EntryStore). The function takes either an asterisk * for any value or an integer range as its argument. Depending on the position, it generates a value between 0 and 59 (seconds, minutes) or between 0 and 23 (hours). Ranges include both boundaries. The cron expression is evaluated on every startup of the EntryStore instance.

The randomization feature is particularly useful when many EntryStore instances are hosted in the same environment: it avoids overloading the host, which could happen if all instances started their backup process at the same time.

Example: the expression 0 rnd(*) rnd(1-3) * * ? starts the backup process sometime between 1:00 am and 3:59 am.
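
To illustrate how the parameters above fit together, here is a minimal sketch of a backup section in entrystore.properties. The folder path and the chosen on/off values are assumptions for illustration only; the cron expression is the one from the example above.

```properties
# Enable the scheduled backup
entrystore.backup.scheduler=on
# Hypothetical backup directory; adjust to your environment
entrystore.backup.folder=/srv/entrystore/backup/
# Start at a random minute of a random hour between 1 and 3 (inclusive)
entrystore.backup.cronexp=0 rnd(*) rnd(1-3) * * ?
# Compress the RDF dump
entrystore.backup.gzip=on
# Default format, listed explicitly for clarity
entrystore.backup.format=n-quads
# Delete the previous backup only after the new one was created
# (only used if maintenance is disabled)
entrystore.backup.delete-after=on
entrystore.backup.include-files=on
```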

History and Maintenance

EntryStore can keep more than one backup and perform maintenance by removing outdated backups. This is configured through the following settings, in addition to the main backup settings above:

entrystore.backup.maintenance=on|off (default: off) - defines whether maintenance should be performed
entrystore.backup.maintenance.upper-limit=maximum number of stored backups
entrystore.backup.maintenance.lower-limit=minimum number of stored backups
entrystore.backup.maintenance.expires-after-days=maximum number of days a backup is kept

Maintenance is automatically turned off if none of upper-limit, lower-limit and expires-after-days are configured with values higher than 1.

When maintenance is active, a new folder named after the current date and time is created for each backup, instead of the default folder name all.
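
As a rough sketch, a maintenance configuration could look like the following. The numeric values are hypothetical and chosen only to show the shape of the settings; they are added to the main backup settings in entrystore.properties.

```properties
# Keep a history of backups and remove outdated ones
entrystore.backup.maintenance=on
# Hypothetical limits; pick values that match your retention needs
entrystore.backup.maintenance.upper-limit=14
entrystore.backup.maintenance.lower-limit=7
entrystore.backup.maintenance.expires-after-days=30
```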

Restore

Follow these steps to restore a backup:

  • Shut down EntryStore.
  • Restore the files in the files folder using a normal file copy operation.
  • Restore the store directory by importing the serialized RDF to a Sesame/RDF4J store. This can be done using EntryStore Tools (see the Git repo on Bitbucket).
  • Start EntryStore.