Longleaf basic usage tutorial
This tutorial provides an introduction to the basic functionality of Longleaf by demonstrating how to configure and execute a small set of preservation tasks, using an example data directory as a local sandbox on your own computer.
Longleaf basic usage tasks covered in this tutorial:
- Validate the mandatory Longleaf configuration file that contains the storage locations and preservation activities you will use
- Register example data files
- Validate metadata for registered files
- Run a Preservation action on example data files
- Alter an example data file to cause an error; catch the error by re-running the Preservation action
- Re-register the altered file, and preserve the file in its new, altered state
System Requirements for this tutorial
Longleaf scripts rely on common UNIX programs. In Mac OS X and Linux operating systems, these programs will likely already be installed, but some of them, such as rsync, may be missing from a Windows system unless you have installed them.
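A quick way to confirm that a required program is available is to look for it on your PATH. The following is a minimal sketch, not part of Longleaf; the tool list is an assumption that contains only rsync, the one program this tutorial names, and the function name is hypothetical.

```python
import shutil

# Assumption: rsync is the only tool named in this tutorial; extend the list as needed.
REQUIRED_TOOLS = ["rsync"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

If the script reports a missing tool, install it before continuing with the tutorial.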
Example data directory for this tutorial
The example data directory for this tutorial is named 'll-example'; it is located in the 'docs' folder of the longleaf-preservation repository.
The 'll-example' directory contains all the materials required for completing this tutorial:
- an example configuration file, 'config-example-relative.yml', that is pre-configured for tutorial tasks
- a folder of example data files to be preserved
- empty scaffolding folders that will be used for storing: 1) metadata files about the original data files, 2) the replicated data files created from the originals, and 3) metadata files about the replicated data files.
folder structure of 'll-example' directory: materials for tutorial tasks
└───ll-example
    │   config-example-relative.yml
    │
    ├───files-dir
    │       LLexample-PDF.pdf
    │       LLexample-TOCHANGE.txt
    │       LLexample-tokeep.txt
    │
    ├───metadata-dir
    │
    ├───replica-files
    │
    └───replica-metadata
Subdirectories in 'll-example' are used as follows:
'files-dir'
Contains three example data files that will be preserved and replicated.
'metadata-dir'
Empty at start of tutorial; metadata files about the data files in 'files-dir' will be stored and automatically updated in this directory during Register and Preserve actions.
'replica-files'
Empty at start of tutorial; replicated data files (copies) created during Preserve actions run on the data files in 'files-dir' will be stored in this directory.
'replica-metadata'
Empty at start of tutorial; metadata files about the replicated data files in 'replica-files' will be stored in this directory.
Set up your local sandbox: copy the example data directory to your Desktop
Copy the 'll-example' directory to a location where it will be easy to access and work with. For example, it might be convenient to copy 'll-example' to your 'Desktop' location.
Note: All paths in the commands in this tutorial are relative to the 'll-example' folder itself, so if you execute commands from within 'll-example', you can use the paths as shown with no alterations.
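After copying, you may want to confirm that the sandbox contains everything the tutorial expects. The sketch below is an optional helper, not part of Longleaf; the function name is hypothetical, and the expected names are taken from the folder structure described earlier.

```python
from pathlib import Path

# Names taken from the 'll-example' folder structure described in this tutorial.
EXPECTED_DIRS = ["files-dir", "metadata-dir", "replica-files", "replica-metadata"]
EXPECTED_FILES = ["config-example-relative.yml"]


def missing_entries(sandbox):
    """Return the expected folders and files that are absent from `sandbox`."""
    root = Path(sandbox)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from inside 'll-example', or replace "." with the path to your copy.
    print(missing_entries(".") or "Sandbox layout looks complete.")
```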
Review the example Longleaf configuration file
Locations
The example configuration file references storage locations based on the subdirectory names in 'll-example':
- The 'data-directory' location points to 'files-dir' as the location of the original data files to be preserved, and 'metadata-dir' for storing metadata files created about these original data files.
- The 'backup-directory' location refers to the 'replica-files' folder as the storage location where the copies of the original files will be stored, and 'replica-metadata' for metadata files.
Services
The 'services' area in the configuration file defines characteristics of work scripts, such as the target storage location for the replication script, and the checksum algorithm for the fixity script.
Service Mappings
The 'service-mappings' area specifies which services will run at a particular storage location. In the example configuration file, the services 'example_replication' and 'example_fixity' are assigned to run at location 'data-directory'. At location 'backup-directory', however, only 'example_fixity' is assigned to run, so files there will have fixity checks, but will not get replicated.
- locations: data-directory
  services:
    - example_replication
    - example_fixity
- locations: backup-directory
  services:
    - example_fixity
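To make the effect of the mapping concrete, the sketch below looks up which services apply to a given location. It is an illustration only and does not reflect how Longleaf reads its configuration internally; the in-memory data structure and the function name are assumptions, while the location and service names come from the example configuration.

```python
# Hypothetical in-memory form of the service mapping shown above.
SERVICE_MAPPINGS = [
    {"locations": "data-directory",
     "services": ["example_replication", "example_fixity"]},
    {"locations": "backup-directory",
     "services": ["example_fixity"]},
]


def services_for(location, mappings=SERVICE_MAPPINGS):
    """Collect every service mapped to `location`, in order of appearance."""
    found = []
    for mapping in mappings:
        locations = mapping["locations"]
        # Accept either a single location name or a list of names.
        if isinstance(locations, str):
            locations = [locations]
        if location in locations:
            found.extend(mapping["services"])
    return found


print(services_for("data-directory"))    # ['example_replication', 'example_fixity']
print(services_for("backup-directory"))  # ['example_fixity']
```

The second lookup returns only the fixity service, which is why files in the backup location are checked but not replicated again.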
Relative Paths in the Configuration File
Storage locations in this configuration file are specified as relative paths. When Longleaf commands are run, these paths are evaluated relative to the location of the configuration file itself, not relative to the directory you run the command from. In this tutorial, the configuration file is located inside 'll-example', at the same level as the data and metadata folders.
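The general idea of resolving a path against the configuration file's folder can be sketched as follows. This is a generic illustration using Python's pathlib, not Longleaf's own code, and the function name is hypothetical.

```python
from pathlib import Path


def resolve_from_config(config_file, relative_path):
    """Resolve `relative_path` against the folder that holds `config_file`."""
    config_dir = Path(config_file).resolve().parent
    return (config_dir / relative_path).resolve()


if __name__ == "__main__":
    # Prints the absolute path of 'files-dir' next to the configuration file.
    print(resolve_from_config("config-example-relative.yml", "files-dir"))
```

Because the base is the configuration file's folder, the result is the same no matter which directory the script is started from, provided the path to the configuration file is given correctly.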
Check that you can run Longleaf, by viewing the help text page
Once you have copied the 'll-example' directory to an accessible working location, and have familiarized yourself with the YAML configuration file, you are ready to start using Longleaf!
First confirm that you can run Longleaf, as you did in the "Installation Instructions" page of this site. Open a terminal window and cd into the 'll-example' directory, then enter the command 'longleaf', with no arguments:
longleaf
This command will output the help page text, showing all available commands.
Validate the example configuration file
Next, you will validate the Longleaf configuration file, to ensure that storage locations and paths are correctly specified and accessible.
As noted above, all paths in this tutorial are relative to the 'll-example' directory, so make sure that you execute commands from within that directory, or amend paths accordingly.
Command for validating configuration file:
longleaf validate_config -c config-example-relative.yml
Successful validation of the configuration file outputs the following message in terminal:
SUCCESS: Application configuration passed validation: config-example-relative.yml
Register example data files
Once the configuration file has been validated, you can proceed to registering the example data files, so that they are ready for further preservation actions with Longleaf.
Note that Longleaf commands can be run on directories, or individual files. For example, when using the Register command, you can either:
- register a whole directory at once, which will individually register every file in that directory, or
- register individual files.
In this example of using the Register command, we will register an entire directory of files by specifying the directory path.
Use command:
longleaf register -c config-example-relative.yml -f files-dir
If the files are registered successfully, Longleaf will list the registration outcome for each file within the directory:
SUCCESS register /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS register /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS register /data/ll-example/files-dir/LLexample-tokeep.txt
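Because the same command shape works for a directory or a single file, the commands in this tutorial are easy to script. The sketch below is a hypothetical wrapper, not part of Longleaf; it uses only the options that appear in this tutorial (-c, -f and --force), and the function names are assumptions.

```python
import subprocess


def longleaf_command(action, config, target, force=False):
    """Build the argument list for a Longleaf command as used in this tutorial."""
    command = ["longleaf", action, "-c", config, "-f", target]
    if force:
        command.append("--force")
    return command


def run_longleaf(action, config, target, force=False):
    """Run the command and return its exit code (requires Longleaf to be installed)."""
    return subprocess.run(longleaf_command(action, config, target, force)).returncode


if __name__ == "__main__":
    # The target can be a directory or a single file.
    print(longleaf_command("register", "config-example-relative.yml", "files-dir"))
```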
Validate the new metadata files
Validate the metadata files that were created for the files you just registered. Use command:
longleaf validate_metadata -c config-example-relative.yml -f files-dir
Confirmation of validation for the metadata files:
SUCCESS: Metadata for file passed validation: /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS: Metadata for file passed validation: /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS: Metadata for file passed validation: /data/ll-example/files-dir/LLexample-tokeep.txt
Run the Preserve command, to replicate and check fixity on the registered files.
Use command:
longleaf preserve -c config-example-relative.yml -f files-dir
Confirmation that the Preservation action completed successfully on all files in the 'files-dir' directory:
SUCCESS register /data/ll-example/replica-files/LLexample-PDF.pdf
SUCCESS preserve[example_replication] /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS register /data/ll-example/replica-files/LLexample-TOCHANGE.txt
SUCCESS preserve[example_replication] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS register /data/ll-example/replica-files/LLexample-tokeep.txt
SUCCESS preserve[example_replication] /data/ll-example/files-dir/LLexample-tokeep.txt
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-tokeep.txt
Look at the output of the Preserve command
Take a look at the SUCCESS output messages from the preserve command above.
Notice that for each file included in the Preservation action, Longleaf indicates "SUCCESS" for the "register" and "preserve" components of the Preservation action, as well as the individual scripts within the "preserve" component.
Also take a look inside the subfolders in your 'll-example' sandbox folder, using Finder or command-line tools.
You'll notice that there are newly-created data files and metadata files in the 'replica-' directories. The original data files from 'files-dir' have been copied to the 'replica-files' directory, and metadata files for those copies have been created in the 'replica-metadata' directory.
Re-run the Preserve command, to check the integrity of the files that are being preserved.
Since the 'example_fixity' script in our example configuration file is set to frequency: 30 seconds, we can check fixity on the preserved files once 30 seconds have elapsed since the last preservation action.
Use command:
longleaf preserve -c config-example-relative.yml -f files-dir
Success output message:
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-tokeep.txt
Note that the output in the terminal window is much shorter than for the first preserve command. This output only shows "SUCCESS" and preserve[example_fixity] for each file. This time register and replication are not included because those actions already ran, and do not need to run this time. Only fixity checks ran in this Preserve action.
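The 30-second frequency can be thought of as a simple elapsed-time test. The sketch below illustrates that idea only; how Longleaf actually decides when a service is due is not described in this tutorial, so the logic and the function name here are assumptions.

```python
from datetime import datetime, timedelta

# The example configuration sets the fixity frequency to 30 seconds.
FIXITY_FREQUENCY = timedelta(seconds=30)


def fixity_check_due(last_checked, now, frequency=FIXITY_FREQUENCY):
    """Return True when at least `frequency` has elapsed since `last_checked`."""
    return now - last_checked >= frequency


last = datetime(2024, 1, 1, 12, 0, 0)
print(fixity_check_due(last, last + timedelta(seconds=10)))  # False
print(fixity_check_due(last, last + timedelta(seconds=45)))  # True
```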
In the Finder window, notice that the Date Modified is updated on the metadata files in 'metadata-dir'. The metadata files for the original data files are updated to reflect these Preservation actions running.
Cause a change to a file, and catch the change with the Preservation command.
Now make a change to one of the files that is being preserved, which you will then catch by re-running preserve on the storage directory for that file.
Open up the file: 'll-example/files-dir/LLexample-TOCHANGE.txt'. On line 4 of the text file, you'll see "( insert changed text ) ". Insert your cursor near this line, and type something to make a change to the file. Then save the file and close it. In your 'll-example/files-dir' directory, notice that the Date Modified has changed for this file.
Run the preserve command again. This time it will fail, because one of the files under preservation has been altered since the file's last checksums were recorded.
Use command:
longleaf preserve -c config-example-relative.yml -f files-dir
Output showing failed fixity check on the file 'll-example/files-dir/LLexample-TOCHANGE.txt':
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-PDF.pdf
FAILURE preserve[example_fixity] /data/ll-example/files-dir/LLexample-TOCHANGE.txt: Fixity check using algorithm 'sha1' failed for file /data/ll-example/files-dir/LLexample-TOCHANGE.txt: expected 'effa1bfc1b93f1260a36044bcd668240cf14738a', calculated 'cd5e53f6945ceda2d90b003a39af509d7acb068f.'
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-tokeep.txt
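The failure message compares a recorded SHA-1 digest with one freshly calculated from the file. The following minimal sketch shows that kind of comparison using Python's hashlib; it is not Longleaf's implementation, and the function names are hypothetical.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_matches(path, expected_digest):
    """Return True when the file's current SHA-1 equals the recorded digest."""
    return sha1_of_file(path) == expected_digest.lower()
```

Any change to the file's bytes, even a single character, produces a different digest, which is why the edited text file fails while the two untouched files still pass.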
Re-register the altered file.
Now you will re-register the altered file, using the --force option for the register command. Run this registration command at the file level so that only this file is re-registered, not the entire directory.
Re-registration will update the file's preservation metadata to the new state of 'LLexample-TOCHANGE.txt'.
Note, however, that the re-registered altered file (ll-example/files-dir/LLexample-TOCHANGE.txt) will still not match the copy of LLexample-TOCHANGE.txt that is stored in "replica-files" (ll-example/replica-files/LLexample-TOCHANGE.txt), because the replica copy was created earlier, from the unaltered original file.
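You can confirm this mismatch yourself by comparing the two files byte for byte. The sketch below uses Python's standard filecmp module and is independent of Longleaf; the function name is hypothetical.

```python
import filecmp


def replica_in_sync(original, replica):
    """Return True when both files have identical contents, compared byte for byte."""
    return filecmp.cmp(original, replica, shallow=False)


if __name__ == "__main__":
    # Run from inside 'll-example' after editing the original file.
    print(replica_in_sync("files-dir/LLexample-TOCHANGE.txt",
                          "replica-files/LLexample-TOCHANGE.txt"))
```

At this point in the tutorial the comparison should report that the files differ, since the replica still holds the unaltered contents.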
Use command:
longleaf register -c config-example-relative.yml -f files-dir/LLexample-TOCHANGE.txt --force
Success output message:
SUCCESS register /data/ll-example/files-dir/LLexample-TOCHANGE.txt
Run Preserve on the re-registered file.
Running preserve on the re-registered altered file will replicate the new state of this file to the copied data location. As with register --force, run preserve at the file level, not the directory.
Use command:
longleaf preserve -c config-example-relative.yml -f files-dir/LLexample-TOCHANGE.txt
The terminal output shows the new registration of the file, and the replication and fixity scripts running on this file. In Finder, you can see that the Date Modified of the 'LLexample-TOCHANGE.txt' file in the 'replica-files' directory now matches the original file. Open up 'replica-files/LLexample-TOCHANGE.txt' and you'll see the added line of text in this file.
Successful preservation of re-registered changed file:
SUCCESS register /data/ll-example/replica-files/LLexample-TOCHANGE.txt
SUCCESS preserve[example_replication] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
Re-run Preserve at the directory level to check all three files.
After the 30-second fixity check interval has elapsed, re-run the preserve command at the directory level as before. All three files now pass again.
Use command:
longleaf preserve -c config-example-relative.yml -f files-dir
Success output message:
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-PDF.pdf
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-TOCHANGE.txt
SUCCESS preserve[example_fixity] /data/ll-example/files-dir/LLexample-tokeep.txt