
Tools to help process digital files in \\Romeo\processing

Log on with:

ssh railsdev

Creating image derivatives

Creates compressed images (JPG, PNG) and PDFs from master images.

  1. Log on to the processing server or clone the ingest-processing-workflow repo from GitHub.
  2. cd to, or open a command-line shell in, the ingest-processing-workflow directory:
    1. On the server: cd /opt/lib/ingest-processing-workflow
    2. On a local machine: the git repo directory you cloned
  3. Run: python3 <package ID> -i input -o output
    • Examples:
      • python3 apap301_h4fMLPL48CuxFPLpYxTmkL -i tif -o jpg
      • python3 ua950.009_qUQbs7GYhzmB3uL3yjH5uX -i tif -o pdf
      • python3 ua809_JxkK2VWVFu7F8VWaTe72BG -i pdf -o pdf
  4. (Optional) A -p flag with a subpath limits the input to that path, relative to the masters directory:
    • Example:
      • python ua802.011_xMHVAto2AuzLfd2NtP9STY -i tif -o pdf -p TIFFs/edited

      • This will only convert files in: [package]\masters\TIFFs\edited
  5. Files will be created in the \derivatives subfolder.
    1. The directory structure of the masters will be duplicated there.
    2. For PDF outputs, all input images in the same folder will be joined into a single PDF, in filesystem order.
    3. (Server only for now) With PDF inputs and PDF outputs, the PDF files in each folder will be combined, in filesystem order.
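
The way the derivatives folder mirrors the masters folder, and how a -p subpath narrows the input, can be sketched roughly as follows. This is a minimal illustration and not the code in the ingest-processing-workflow repo: the function name plan_derivatives is hypothetical, it only covers one-to-one image outputs (it does not model joining a folder into a single PDF), and it assumes the subpath is kept in the mirrored output path.

```python
from pathlib import PurePosixPath


def plan_derivatives(master_files, output_ext, subpath=None):
    """Pair each master file with the derivative path that mirrors it.

    master_files: paths relative to the package root, using forward
    slashes, e.g. "masters/TIFFs/edited/001.tif".
    subpath: optional path relative to masters/ (like the -p flag).
    """
    # Only files under this directory are considered.
    scope = PurePosixPath("masters")
    if subpath:
        scope = scope / subpath

    plan = []
    for name in sorted(master_files):
        master = PurePosixPath(name)
        try:
            master.relative_to(scope)
        except ValueError:
            continue  # outside masters/ or outside the requested subpath
        # Keep the same folder structure, but under derivatives/.
        mirrored = PurePosixPath("derivatives") / master.relative_to("masters")
        plan.append((str(master), str(mirrored.with_suffix("." + output_ext))))
    return plan


if __name__ == "__main__":
    files = ["masters/TIFFs/edited/001.tif", "masters/TIFFs/raw/002.tif"]
    for master, derivative in plan_derivatives(files, "jpg", subpath="TIFFs/edited"):
        print(master, "->", derivative)
```

With the subpath TIFFs/edited, only the file in masters/TIFFs/edited is planned; without it, both files are.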


Arranging Digitized and Born-Digital Materials in ArchivesSpace

  • Use asInventory (asUpload.exe) to enter initial description in ArchivesSpace
  • Use asInventory (asDownload.exe) to export the same description from ArchivesSpace with the new identifiers
    • Be sure to use the whole spreadsheet, even if you are not adding digital objects for all items; otherwise the order will be altered. Just leave the DAO field blank for these items.
  • Export the changes you made in ArchivesSpace to ArcLight

  • Place a copy of the exported spreadsheet in the package's \metadata directory. It may help to move any other .xlsx files in that folder to a subfolder so they don't interfere, but you can also exclude them with a -f flag.


  • Use to make a .txt file of all derivatives

    python3 apap015_CijY985mDUy6hdLSPPYqRR

  • Use the derivatives.txt file in the package root to copy, paste, and arrange the derivatives' relative paths into the DAO column of the exported asInventory spreadsheet

  • If you need to add lines, run asUpload.exe and asDownload.exe again. This will create ASpace IDs for the new lines, which are needed for the next step.

  • Run with the package ID as an argument to create Hyrax Upload .tsv file.

    sudo python3 ua950.012_Xf5xzeim7n4yE6tjKKHqLM

  • If there are other .xlsx files in the metadata folder, you can move these files to a subfolder or use the optional -f flag to only use specific files

    sudo python3 ua397_cv3E3okEhxKARzZunE4Dom -f "Tower Tribune_exported.xlsx"

  • Add Resource Type, Licenses or Rights Statements to all objects in the Hyrax upload .tsv file created by the script

Resource Types:

      • Audio
      • Bound Volume
      • Dataset
      • Document
      • Image
      • Map
      • Mixed Materials (Avoid)
      • Pamphlet
      • Periodical
      • Slides
      • Video
      • Other (Avoid)


Licenses:

      • BY-NC-ND:
      • Public Domain:
      • Unknown
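
Before moving the .tsv file, it can help to check that every row uses one of the Resource Types listed above. The following is a minimal sketch of such a check, not part of the existing scripts; the function name find_bad_resource_types is hypothetical, and the column header "Resource Type" is an assumption that should be adjusted to match the header in the generated file.

```python
import csv
import io

# Values taken from the Resource Types list in this document.
RESOURCE_TYPES = {
    "Audio", "Bound Volume", "Dataset", "Document", "Image", "Map",
    "Mixed Materials", "Pamphlet", "Periodical", "Slides", "Video", "Other",
}


def find_bad_resource_types(tsv_text, column="Resource Type"):
    """Return (row_number, value) for rows with a missing or unlisted type.

    The column name is an assumption; pass the real header if it differs.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    problems = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        value = (row.get(column) or "").strip()
        if value not in RESOURCE_TYPES:
            problems.append((row_number, value))
    return problems


if __name__ == "__main__":
    sample = "Title\tResource Type\nYearbook\tBound Volume\nPhoto 1\tPhoto\n"
    for row_number, value in find_bad_resource_types(sample):
        print(f"row {row_number}: unexpected resource type {value!r}")
```

To check a real file, read its contents (for example with open(path, encoding="utf-8").read()) and pass the text to the function.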

Rights Statements (if License is "Unknown"):

  • Move a copy of all derivatives to be uploaded to \\Lincoln\Library\ESPYderivatives\files
    • (Optional) Use a collection ID subfolder if convenient: \\Lincoln\Library\ESPYderivatives\files\apap101

  • Move the Hyrax Upload .tsv file to the Hyrax import directory: \\Lincoln\Library\ESPYderivatives\import

  • Upload the files to Hyrax using the Batch Upload to Hyrax documentation (Starting with Step 4)
    Note: All files will be public by default

  • When the import is finished, copy the completed .tsv file back to the package's metadata directory from: \\Lincoln\Library\ESPYderivatives\complete

  • Run with the package ID as an argument to add the correct URIs back into the ASpace export

    python3 ua680_FbBxaYn8Jm9tBxuXsQ6R3L

  • Don't forget to unpublish/republish to make sure the collection is exported!

  • Use asInventory to re-import the ASpace spreadsheet with the correct URIs back into ArchivesSpace

  • Run with the package ID as an argument to combine the processing package and the SIP into an AIP
    • Must be run from the processing server, not a local machine
    • If running without flags, use "packageAIP" function from /etc/profile.d/
      • Will log to \\Romeo\SPE\processing\log\<collection ID>
    • Use a -u flag to use the master files from the processing package instead of the files in the SIP
    • Use a -n flag for no derivatives; this packages only the master files
    • If this is used, you must also delete the SIP manually after examination with

    sudo python3 ua950.012_Xf5xzeim7n4yE6tjKKHqLM

    packageAIP ua950.012_Xf5xzeim7n4yE6tjKKHqLM

    sudo python3 -u ua435_LUUaFPvezhmdcwwnVX3drV

    sudo python3 -n ua950.009_qUQbs7GYhzmB3uL3yjH5uX

    python3 ua435_LUUaFPvezhmdcwwnVX3drV

Replacing /masters with Masters in SIP

The masters directory in /processing is a duplicate of the SIP. If you edit or delete the masters