PDS4 Bundle Generation
The pds4 package builds PDS4-compliant bundles from RMS-NAV’s per-image
navigation metadata and per-pixel backplanes. A bundle is the deliverable the
Ring-Moon Systems node ships to PDS for archive: one collection of data labels
(one per image), one collection of browse PNGs, plus the auxiliary collections
(context, document, xml_schema) and the bundle-level label that wires them
together. This chapter covers the bundle-generation driver, the per-dataset
extension points, the templated label workflow, and the output layout.
The user-facing CLI walkthrough lives at PDS4 Bundle Generation; this chapter is the developer’s reference.
Pipeline overview
Bundle generation is a two-phase process driven by nav_create_bundle:
1. Per-image data labels. For each image in the input batch, generate_bundle_data_files() reads the _metadata.json produced by nav_offset and the _backplane_metadata.json produced by nav_backplanes, populates a pdstemplate rendering context with per-image template variables, and writes the matching <image>_backplanes.lblx file (plus a copy of the browse PNG into the bundle’s browse/ tree). The backplane FITS file itself is copied (or symlinked, depending on the dataset’s preference) from the backplane root into the bundle’s data/ tree.
2. Collections + bundle assembly. After every per-image data label is in place, generate_collection_files() walks the bundle’s data/ tree, collects every _backplanes.lblx it finds, sorts them by image name, writes the collection_data.csv inventory and the matching collection_data.lblx label, and renders the bundle’s other collection labels (context, browse, document, xml_schema) plus the top-level bundle.lblx. generate_global_index_files() writes the per-bundle global_index_bodies.lblx and global_index_rings.lblx summary tables.
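The inventory step of phase 2 can be illustrated with a minimal sketch. It is not the project's implementation: build_data_inventory, the data-product LID layout, and the default version are assumptions made for illustration. The "P,<lidvid>" row shape follows the usual PDS4 collection-inventory convention, where P marks a primary member.

```python
import csv
from pathlib import Path


def build_data_inventory(data_root: Path, bundle_name: str, csv_path: Path,
                         version: str = "1.0") -> list[str]:
    """Walk data_root and write one 'P,<lidvid>' row per backplane label.

    Hypothetical helper; the real generate_collection_files() may differ.
    """
    suffix = "_backplanes.lblx"
    # Sort by image name (not full path) so the sharding does not affect order.
    labels = sorted(data_root.rglob(f"*{suffix}"),
                    key=lambda p: p.name[: -len(suffix)])
    lidvids = []
    for label in labels:
        image = label.name[: -len(suffix)].lower()  # LIDs are lowercase
        # Assumed LID layout, by analogy with the browse LID convention.
        lidvids.append(f"urn:nasa:pds:{bundle_name}:data:{image}::{version}")
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\r\n")
        for lidvid in lidvids:
            writer.writerow(["P", lidvid])  # P = primary member
    return lidvids
```

Sorting on the image name rather than the path keeps the inventory order stable even if the sharding scheme changes.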
The driver runs phase 1 once per image (fan-out friendly — each image is independent) and phase 2 once at the end (sequential — needs every per-image label in place before it can build the inventory).
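The fan-out / sequential split can be sketched as follows. run_bundle and its callable parameters are hypothetical stand-ins, not the nav_create_bundle API, and a thread pool is only one of several ways the fan-out could be done.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bundle(image_names, per_image, assemble, max_workers=4):
    """Run phase 1 for every image, then phase 2 exactly once.

    per_image and assemble are placeholders for the phase-1 and phase-2
    entry points; their real signatures are not shown here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every phase-1 task and re-raises any failure,
        # so phase 2 never starts with a label missing.
        labels = list(pool.map(per_image, image_names))
    # Phase 2 is sequential: it needs the complete set of per-image labels.
    return assemble(labels)
```

Because each image is independent, the same shape works with a process pool or a batch scheduler; only the barrier before phase 2 matters.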
Per-dataset extension points
PDS4 bundle generation is parameterized by the
DataSet subclass. The base class declares the
extension points as non-abstract methods that raise NotImplementedError, so
a dataset that does not need PDS4 support can simply leave them unimplemented;
the bundle drivers then refuse to run for that dataset.
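A minimal sketch of this pattern, using stand-in classes rather than the real DataSet hierarchy. NoPds4DataSet, DemoDataSet, and bundle_name_or_refuse are hypothetical names, and the error messages are invented for illustration.

```python
class DataSet:
    """Stand-in for the real base class; only one hook is shown."""

    def pds4_bundle_name(self) -> str:
        # Non-abstract on purpose: subclasses without PDS4 support still
        # instantiate, and only fail if a bundle driver calls the hook.
        raise NotImplementedError(
            f"{type(self).__name__} has no PDS4 support")


class NoPds4DataSet(DataSet):
    """A dataset that never needs bundles: no overrides required."""


class DemoDataSet(DataSet):
    def pds4_bundle_name(self) -> str:
        return "demo_backplanes"


def bundle_name_or_refuse(dataset: DataSet) -> str:
    """Hypothetical driver-side check that refuses unsupported datasets."""
    try:
        return dataset.pds4_bundle_name()
    except NotImplementedError as exc:
        raise RuntimeError(f"Cannot build bundle: {exc}") from exc
```

Compared with abstract methods, this keeps non-PDS4 datasets free of boilerplate at the cost of deferring the failure to run time.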
The full extension-point set:
- pds4_bundle_template_dir(): absolute path to the directory of pdstemplate .lblx files this dataset uses. Lookups consult config.pds4.<dataset_name>.template_dir first; relative paths resolve under src/pds4/templates/. The reference Cassini ISS Saturn dataset uses cassini_iss_saturn_1.0/.
- pds4_bundle_name(): the bundle’s external name (for example cassini_iss_saturn_backplanes_rsfrench2027). The bundle root is <bundle_results_root>/<bundle_name>/. Lookups consult config.pds4.<dataset_name>.bundle_name.
- pds4_bundle_path_for_image(): maps an image name to its position in the bundle’s data/ directory tree (typically a sharded path like 1234xxxxxx/123456xxxx to keep per-leaf cardinality manageable on filesystems that struggle with very wide directories).
- pds4_path_stub(): full per-image stub including the image name (e.g. 1234xxxxxx/123456xxxx/1234567890w). Builds the per-file paths under data/ and browse/.
- pds4_image_name_to_browse_lid() / pds4_image_name_to_browse_lidvid(): emit the browse-product Logical Identifier (LID) and LID + version (LIDVID) for the given image name. LIDs follow the PDS4 namespace convention urn:nasa:pds:<bundle>:browse:<image>.
- pds4_image_name_to_data_lid() / pds4_image_name_to_data_lidvid(): same, for the data product (the backplane .lblx + .fits pair).
- pds4_template_variables(): returns a dict of template variables consumed by the per-image data.lblx / browse.lblx templates. Inputs are the ImageFile, the navigation metadata dict parsed from <image>_metadata.json, and the backplane metadata dict parsed from <image>_backplane_metadata.json. The dataset is free to derive any per-image quantity the templates reference (target body, observer, mid-time, exposure, filters, navigation offset and confidence, per-backplane min/max/units, and so on).
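The path and identifier hooks can be illustrated with a small sketch that reproduces the examples quoted above. The functions are free-standing illustrations, not the DataSet methods. The shard widths are inferred from the 1234xxxxxx/123456xxxx example, and the 1.0 version default is an assumption.

```python
def shard_path(image_name: str) -> str:
    """Two-level shard built from the first ten digits of the image name."""
    digits = image_name[:10]
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"unexpected image name: {image_name!r}")
    # Mask the trailing digits so each directory holds a bounded range.
    return f"{digits[:4]}xxxxxx/{digits[:6]}xxxx"


def path_stub(image_name: str) -> str:
    """Shard path plus the image name itself."""
    return f"{shard_path(image_name)}/{image_name}"


def browse_lid(bundle: str, image_name: str) -> str:
    """LID in the urn:nasa:pds:<bundle>:browse:<image> convention."""
    return f"urn:nasa:pds:{bundle}:browse:{image_name.lower()}"


def browse_lidvid(bundle: str, image_name: str, version: str = "1.0") -> str:
    """LIDVID = LID, a double colon, and the version identifier."""
    return f"{browse_lid(bundle, image_name)}::{version}"
```

For instance, path_stub("1234567890w") yields 1234xxxxxx/123456xxxx/1234567890w, matching the stub shown in the list.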
Reference implementation:
DataSetPDS3CassiniISS
overrides every PDS4 hook above and serves as the canonical worked example.
Voyager ISS (DataSetPDS3VoyagerISS)
mirrors the same shape for an instrument with different image-naming
conventions.
The pds4 config block
src/nav/config_files/config_950_pds4.yaml populates config.pds4 with
per-dataset bundle metadata: the bundle name, the template directory name,
the LID namespace prefix, and any per-bundle template defaults the
pds4_template_variables hook draws from. See
Config and Static Data for the loader contract; the file
is loaded in the standard numeric-prefix order, at the 9xx “downstream
products” tier.
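How a hook might resolve its template directory from this block is sketched below. The sketch models config.pds4 as a plain dict. The cassini_iss_saturn key, TEMPLATE_ROOT, and resolve_template_dir are illustrative assumptions; only the template_dir and bundle_name keys come from the text above.

```python
from pathlib import Path

# Assumed to be anchored at the repository root by the caller.
TEMPLATE_ROOT = Path("src/pds4/templates")


def resolve_template_dir(pds4_config: dict, dataset_name: str) -> Path:
    """Return the template directory configured for dataset_name."""
    entry = pds4_config[dataset_name]  # KeyError if the dataset is absent
    template_dir = Path(entry["template_dir"])
    if template_dir.is_absolute():
        return template_dir
    # Relative entries resolve under the shared template root.
    return TEMPLATE_ROOT / template_dir


# Illustrative stand-in for the parsed YAML block.
example_config = {
    "cassini_iss_saturn": {
        "bundle_name": "cassini_iss_saturn_backplanes_rsfrench2027",
        "template_dir": "cassini_iss_saturn_1.0",
    },
}
```

With this example, resolve_template_dir(example_config, "cassini_iss_saturn") returns src/pds4/templates/cassini_iss_saturn_1.0.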
Templated label workflow
Labels are rendered via the pdstemplate library — a Python expression
language embedded in PDS4 .lblx files (XML). Each template carries
expressions that resolve against a dictionary of variables; the
pds4_template_variables hook is the contract that connects per-image
metadata to the templates.
A typical render looks like:
from pdstemplate import PdsTemplate

# Build the per-image variable dictionary via the dataset hook.
template_path = template_dir / 'data.lblx'
variables = dataset.pds4_template_variables(
    image_file=image_file,
    nav_metadata=nav_metadata,
    backplane_metadata=backplane_metadata,
)

# Render the label text and write it into the bundle.
template = PdsTemplate(str(template_path))
rendered = template.generate(variables)
destination.write_text(rendered)
The pdstemplate library handles the XML escaping, the expression syntax,
and the per-template error reporting; consumers only supply the variable
dictionary and the destination path.
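The variable dictionary itself might be assembled along these lines. This is a sketch under stated assumptions: the metadata keys (offset, backplanes, min, max, units) and the variable names are invented for illustration and are not the actual JSON schema or template contract.

```python
def template_variables(image_name: str, nav_metadata: dict,
                       backplane_metadata: dict) -> dict:
    """Flatten per-image metadata into a template variable dictionary."""
    offset = nav_metadata.get("offset")  # hypothetical key
    backplanes = [
        {
            "name": name,
            "min": info.get("min"),
            "max": info.get("max"),
            "units": info.get("units"),
        }
        # Sorted so repeated runs emit the backplanes in a stable order.
        for name, info in sorted(
            backplane_metadata.get("backplanes", {}).items())
    ]
    return {
        "IMAGE_NAME": image_name,
        "NAV_OFFSET": offset,
        "NAV_SUCCEEDED": offset is not None,
        "BACKPLANES": backplanes,
    }
```

Keeping the hook a pure function of its three inputs makes it easy to unit-test without rendering a label.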
Template tree
Each dataset’s template directory under src/pds4/templates/ contains the
shipping .lblx files. The Cassini-ISS-Saturn-1.0 set is the reference
layout:
src/pds4/templates/cassini_iss_saturn_1.0/
    bundle.lblx                    # top-level bundle label
    readme.txt                     # bundle-level README
    data.lblx                      # per-image backplane data label
    browse.lblx                    # per-image browse-product label
    collection_data.lblx           # data-collection label (CSV inventory)
    collection_browse.lblx         # browse-collection label
    collection_context.lblx        # context-collection label
    collection_context.csv         # context inventory (static)
    collection_document.lblx       # document-collection label
    collection_document.csv        # document inventory (static)
    collection_xml_schema.lblx     # schema-collection label
    collection_xml_schema.csv      # schema inventory (static)
    global_index_bodies.lblx       # per-bundle bodies summary
    global_index_rings.lblx        # per-bundle rings summary
    cassini-iss-saturn-backplanes-user-guide.lblx    # bundle user-guide doc
The static inventory CSVs are copied verbatim into the bundle; the
per-image and per-bundle .lblx files are rendered fresh on every run.
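The copy-versus-render split can be sketched as a simple classification by file suffix. split_templates is a hypothetical helper, and files that are neither .csv nor .lblx (such as readme.txt) are deliberately left out because the text above does not say how they are handled.

```python
from pathlib import Path


def split_templates(template_dir: Path) -> tuple[list[str], list[str]]:
    """Return (static, rendered) file names found in template_dir."""
    static, rendered = [], []
    for path in sorted(template_dir.iterdir()):
        if path.suffix == ".csv":
            static.append(path.name)    # copied verbatim into the bundle
        elif path.suffix == ".lblx":
            rendered.append(path.name)  # rendered fresh on every run
        # Anything else is left to the caller.
    return static, rendered
```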
Output layout
A finished bundle has the standard PDS4 directory shape:
<bundle_results_root>/<bundle_name>/
    bundle.lblx
    readme.txt
    data/
        <pds4_bundle_path_for_image>/
            <image>_backplanes.lblx
            <image>_backplanes.fits    # copied from backplane_results_root
    browse/
        <pds4_bundle_path_for_image>/
            <image>_browse.lblx
            <image>_browse.png         # copied from nav_results_root
    collection/
        data/
            collection_data.lblx
            collection_data.csv
        browse/
            collection_browse.lblx
            collection_browse.csv
        context/
            collection_context.lblx
            collection_context.csv     # static
        document/
            collection_document.lblx
            collection_document.csv    # static
            <user-guide doc>.lblx
        xml_schema/
            collection_xml_schema.lblx
            collection_xml_schema.csv  # static
    index/
        global_index_bodies.lblx
        global_index_bodies.csv
        global_index_rings.lblx
        global_index_rings.csv
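The per-image portion of this layout follows directly from the path stub. A minimal sketch, assuming the stub has the <shard>/<image> form described earlier; image_output_paths and its dictionary keys are hypothetical names.

```python
from pathlib import Path


def image_output_paths(bundle_root: Path, path_stub: str) -> dict[str, Path]:
    """Map a per-image path stub to its four files in the bundle."""
    stub = Path(path_stub)
    shard, image = stub.parent, stub.name
    data_dir = bundle_root / "data" / shard
    browse_dir = bundle_root / "browse" / shard
    return {
        "data_label": data_dir / f"{image}_backplanes.lblx",
        "data_fits": data_dir / f"{image}_backplanes.fits",
        "browse_label": browse_dir / f"{image}_browse.lblx",
        "browse_png": browse_dir / f"{image}_browse.png",
    }
```

The data/ and browse/ trees share the same shard, so a product and its browse image always sit at parallel paths.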
Adding PDS4 support to a new dataset
The end-to-end checklist:
1. Override every pds4_* method on the new DataSet subclass. Use DataSetPDS3CassiniISS as the reference implementation. The methods that absolutely must work are pds4_bundle_template_dir(), pds4_bundle_name(), pds4_path_stub(), the four pds4_image_name_to_*_lid[vid] methods, and pds4_template_variables().
2. Drop a per-dataset template directory under src/pds4/templates/<dataset>_<version>/ containing the .lblx files and the static inventory CSVs. Copy from cassini_iss_saturn_1.0/ and adapt the field set.
3. Add an entry under pds4.<dataset_name>: in config_950_pds4.yaml that points at the new template directory and sets the bundle name plus any per-bundle defaults the pds4_template_variables hook draws from.
4. Add an integration smoke test that renders one image through nav_create_bundle and asserts the resulting data.lblx validates against the PDS4 schema. The Cassini ISS test under tests/integration/ is the pattern to follow.
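As a first line of defense in such a smoke test, a rendered label can be checked for XML well-formedness and for its root element before full validation is attempted. This sketch uses only the standard library and does not perform schema validation. label_root_name is a hypothetical helper, and Product_Observational as the expected root is an assumption about the data label.

```python
import xml.etree.ElementTree as ET


def label_root_name(label_text: str) -> str:
    """Parse a rendered label and return its root element's local name.

    Raises xml.etree.ElementTree.ParseError if the text is not
    well-formed XML.
    """
    root = ET.fromstring(label_text)
    # Tags parse as '{namespace}LocalName'; keep only the local part.
    return root.tag.rsplit("}", 1)[-1]


# Tiny stand-in for a rendered data.lblx; real labels are far larger.
sample_label = (
    '<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1">'
    "<Identification_Area/>"
    "</Product_Observational>"
)
```

A test would then assert label_root_name(rendered) == "Product_Observational" and hand the file to a real PDS4 validator for the schema and Schematron checks.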
API reference
The pds4 package has no autogenerated entry under
API Reference; the module’s public surface is the three
phase-1 / phase-2 entry points listed below, plus the
DataSet pds4_* extension hooks
documented above.
- generate_bundle_data_files(): phase 1, one image.
- generate_collection_files(): phase 2, collection + bundle assembly.
- generate_global_index_files(): per-bundle bodies / rings global indexes.