Per-Technique Diagnostics (Shared Dataclass Family)

Overview

Per-technique diagnostics are the typed dataclasses every navigation technique returns on its diagnostics field. Each technique declares its own diagnostics dataclass: a frozen, narrow record of the per-fit quantities that the confidence formula consumes and the curator surfaces in the JSON sidecar. Centralising every diagnostics dataclass in one module serves two purposes. The curator’s allow-list discipline catches a programmer who adds a new diagnostic field without updating its JSON schema, and validate_registered_confidence_specs() can verify at config-load time that every YAML-driven confidence formula references only attributes the technique actually emits.

Theory

A diagnostics dataclass is a frozen record whose fields are exactly the per-fit quantities that downstream systems read. Two consumers exist:

The confidence formula

Each ConfidenceTerm references a diagnostic-attribute name; the shared evaluator reads that attribute off the diagnostics object and feeds it through the offset / divisor / cap normalisation before applying the linear coefficient. See Confidence Calibration (Shared Sigmoid-of-Linear Combination) for the sigmoid math. The technique’s confidence_attributes allow-list spans both the diagnostic-attribute names and any side-channel flags the spec is allowed to read (at_edge, spurious, etc., which live on the result rather than the diagnostics object); the validation walk verifies that every term and every hard-zero key falls inside the allow-list.
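The evaluation path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shared evaluator itself: the ConfidenceTerm field names offset, divisor, and cap, the bias parameter, the order of the normalisation steps, and the function names are all hypothetical. Only feature and alpha appear in the YAML stanza in the Examples section.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ConfidenceTerm:
    """Hypothetical shape of one term; only feature/alpha come from the YAML."""
    feature: str                 # diagnostic-attribute name to read
    alpha: float                 # linear coefficient
    offset: float = 0.0          # assumed: subtracted before scaling
    divisor: float = 1.0         # assumed: scale applied after the offset
    cap: Optional[float] = None  # assumed: upper clamp on the scaled value


def normalise(raw: float, term: ConfidenceTerm) -> float:
    """Apply offset, divisor, then cap (this ordering is an assumption)."""
    value = (raw - term.offset) / term.divisor
    if term.cap is not None:
        value = min(value, term.cap)
    return value


def evaluate_confidence(diagnostics: object,
                        terms: Iterable[ConfidenceTerm],
                        bias: float = 0.0) -> float:
    """Sigmoid of a linear combination of normalised diagnostic attributes."""
    z = bias
    for term in terms:
        # Read the named attribute off the diagnostics object.
        raw = float(getattr(diagnostics, term.feature))
        z += term.alpha * normalise(raw, term)
    return 1.0 / (1.0 + math.exp(-z))
```

Reading attributes by name with getattr is what makes the allow-list check worthwhile: a misspelled feature name would otherwise surface only when the first fit is evaluated.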

The curator

The orchestrator’s curator (build_metadata_dict()) walks the CURATOR_FIELDS class attribute of every diagnostics dataclass and emits exactly the listed fields into the per-image JSON sidecar. CURATOR_FIELDS maps each dataclass-field name to a JSON-key name, or to None to skip the field. The mapping format lets the JSON schema use a different name than the Python field (e.g. an internal field named mode could surface as "path" in the JSON), but the conventional usage is identity: the dataclass field name and the JSON key match. assert_diagnostic_fields_present() runs at startup and fails the build when a new dataclass field is added without updating CURATOR_FIELDS.
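A rough sketch of the curator walk and the startup check follows. The class ExampleDiagnostics and the helpers curate() and check_fields_mapped() are hypothetical stand-ins written for this illustration; the real build_metadata_dict() and assert_diagnostic_fields_present() may be structured differently.

```python
from dataclasses import dataclass, fields
from typing import Any, ClassVar, Dict, Optional


@dataclass(frozen=True)
class ExampleDiagnostics:
    """Hypothetical diagnostics record used only for illustration."""
    fit_rms_px: float
    mode: str
    scratch_value: float

    # Field name -> JSON key, or None to keep the field out of the sidecar.
    CURATOR_FIELDS: ClassVar[Dict[str, Optional[str]]] = {
        'fit_rms_px': 'fit_rms_px',   # identity mapping (the usual case)
        'mode': 'path',               # renamed in the JSON
        'scratch_value': None,        # deliberately omitted
    }


def curate(diagnostics: Any) -> Dict[str, Any]:
    """Emit only the allow-listed fields, under their JSON key names."""
    out: Dict[str, Any] = {}
    for field_name, json_key in type(diagnostics).CURATOR_FIELDS.items():
        if json_key is None:
            continue
        out[json_key] = getattr(diagnostics, field_name)
    return out


def check_fields_mapped(cls: type) -> None:
    """Raise AssertionError if a public dataclass field has no mapping entry."""
    # dataclasses.fields() skips ClassVar attributes such as CURATOR_FIELDS.
    declared = {f.name for f in fields(cls) if not f.name.startswith('_')}
    unmapped = sorted(declared - set(cls.CURATOR_FIELDS))
    if unmapped:
        raise AssertionError(
            f'{cls.__name__}: fields missing from CURATOR_FIELDS: {unmapped}')
```

Mapping omitted fields to None, rather than leaving them out of the mapping, is what lets the check distinguish a deliberate omission from a forgotten one.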

Restrictions and assumptions

  • Every diagnostics dataclass is frozen (@dataclass(frozen=True)); the technique builds one instance per fit and the orchestrator passes it on to the curator without mutation.

  • Every dataclass declares a CURATOR_FIELDS typing.ClassVar mapping covering every public field. Fields the curator deliberately omits are mapped to None; every other field maps to its JSON key name. CI fails if any field is unmapped.

  • All numeric fields are plain Python floats / ints. NumPy scalars are coerced before storage so the JSON serialiser does not encounter non-native types.

  • Every per-technique confidence formula references only attributes that exist on the technique’s diagnostics dataclass, plus the side-channel flags carried on the NavTechniqueResult itself (at_edge and spurious, along with the technique’s own internal flags exposed via an adapter object).
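The frozen-record and native-type restrictions above can be illustrated with a small sketch. The names to_native(), ExampleFitDiagnostics, and from_raw() are hypothetical, and the real module may coerce values elsewhere. The example uses fractions.Fraction as a stand-in for a non-native numeric scalar so that it runs without NumPy; NumPy scalar types register with the same numbers ABCs, so the same checks should apply to them.

```python
import dataclasses
import numbers
from dataclasses import dataclass
from fractions import Fraction
from typing import Any


def to_native(value: Any) -> Any:
    """Coerce number-like scalars to plain int/float; pass others through."""
    if isinstance(value, bool):
        return value                      # keep booleans as booleans
    if isinstance(value, numbers.Integral):
        return int(value)
    if isinstance(value, numbers.Real):
        return float(value)
    return value


@dataclass(frozen=True)
class ExampleFitDiagnostics:
    """Hypothetical diagnostics record used only for illustration."""
    fit_rms_px: float
    iterations: int

    @classmethod
    def from_raw(cls, **raw: Any) -> 'ExampleFitDiagnostics':
        """Build one instance per fit, coercing every value before storage."""
        return cls(**{name: to_native(value) for name, value in raw.items()})


diag = ExampleFitDiagnostics.from_raw(fit_rms_px=Fraction(2, 5), iterations=5)

try:
    diag.iterations = 6                   # frozen: assignment is rejected
except dataclasses.FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Coercing in a single constructor path keeps the dataclass body free of conversion logic and means the JSON serialiser only ever sees built-in types.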

Sources of uncertainty

Diagnostics are the outputs of the per-fit numerics: they record what the technique measured, not uncertainty about the measurement. Any uncertainty quoted alongside a diagnostic value (e.g. an LM RMS residual) is computed by the technique itself and stored as-is.

Configuration

Diagnostics carry no YAML configuration of their own. Each technique’s confidence formula — which references diagnostic-attribute names by string — lives under techniques.<TechniqueName> in src/nav/config_files/config_510_techniques.yaml; see Confidence Calibration (Shared Sigmoid-of-Linear Combination) for the YAML schema.

Implementation

Source file: src/nav/nav_technique/diagnostics.py.

Public surface: autodocumented at nav.nav_technique.

The module also exports the NavTechniqueDiagnostics union type spanning every per-technique dataclass; the orchestrator’s curator and NavTechniqueResult both consume this union. Adding a new technique means adding both its diagnostics dataclass and a new entry to the union.
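One way such a union could be declared and consumed is sketched below. The two stub dataclasses, the DiagnosticsUnion alias, and is_known_diagnostics() are hypothetical names for illustration; whether the real NavTechniqueDiagnostics is built with typing.Union is an assumption.

```python
from dataclasses import dataclass
from typing import Union, get_args


@dataclass(frozen=True)
class LimbDiagnosticsStub:
    """Hypothetical stand-in for one technique's diagnostics dataclass."""
    dt_fit_rms_px: float


@dataclass(frozen=True)
class StarDiagnosticsStub:
    """Hypothetical stand-in for a newly added technique's dataclass."""
    matched_star_count: int


# Adding a technique means extending this union with its dataclass.
DiagnosticsUnion = Union[LimbDiagnosticsStub, StarDiagnosticsStub]


def is_known_diagnostics(obj: object) -> bool:
    """True when obj is an instance of one of the union's member classes."""
    return isinstance(obj, get_args(DiagnosticsUnion))
```

A consumer that checks membership this way rejects a diagnostics object whose class was never added to the union, which is the failure the two-step "dataclass plus union entry" rule guards against.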

Examples

Curator allow-list discipline. Each diagnostics dataclass declares a class-level CURATOR_FIELDS mapping that the curator walks at JSON-emit time. For BodyLimbDiagnostics the mapping is:

CURATOR_FIELDS = {
    'visible_limb_arc_fraction': 'visible_limb_arc_fraction',
    'visible_arc_px': 'visible_arc_px',
    'dt_fit_rms_px': 'dt_fit_rms_px',
    'lm_iterations': 'lm_iterations',
    'tukey_inlier_count': 'tukey_inlier_count',
}

A new field added to the dataclass without a corresponding entry trips assert_diagnostic_fields_present() at startup, which raises AssertionError and fails the build before any image is processed.

Confidence-formula reference. The YAML stanza for BodyLimbNav declares:

techniques:
  BodyLimbNav:
    terms:
      - feature: visible_limb_arc_fraction
        alpha: 3.0
      - feature: dt_fit_rms_px
        alpha: -1.5

Each feature value names an attribute on BodyLimbDiagnostics. At config-load time validate_registered_confidence_specs() walks the spec and confirms every name appears in BodyLimbNav’s confidence_attributes allow-list.
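The config-load check can be approximated as below. This is a sketch, not the body of validate_registered_confidence_specs(): the function name validate_confidence_spec(), the hard_zero key, the choice of ValueError, and the contents of the example allow-list are assumptions. The spec dictionary mirrors the YAML stanza above after parsing.

```python
from typing import Any, Dict, Iterable

# Illustrative allow-list: diagnostic attributes plus two side-channel flags.
EXAMPLE_ALLOWED = frozenset({
    'visible_limb_arc_fraction', 'visible_arc_px', 'dt_fit_rms_px',
    'lm_iterations', 'tukey_inlier_count', 'at_edge', 'spurious',
})


def validate_confidence_spec(technique_name: str,
                             spec: Dict[str, Any],
                             allowed: Iterable[str]) -> None:
    """Raise ValueError if the spec references a name outside the allow-list."""
    referenced = [term['feature'] for term in spec.get('terms', [])]
    # 'hard_zero' is a hypothetical key standing in for the hard-zero entries.
    referenced += list(spec.get('hard_zero', []))
    unknown = sorted(set(referenced) - set(allowed))
    if unknown:
        raise ValueError(
            f'{technique_name}: confidence spec references unknown '
            f'attributes: {unknown}')


body_limb_spec = {
    'terms': [
        {'feature': 'visible_limb_arc_fraction', 'alpha': 3.0},
        {'feature': 'dt_fit_rms_px', 'alpha': -1.5},
    ],
}
validate_confidence_spec('BodyLimbNav', body_limb_spec, EXAMPLE_ALLOWED)
```

Running the check once per registered technique at load time turns a typo in a feature name into an immediate configuration error instead of a failure during the first fit.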

JSON sidecar field-by-field. A successful BodyLimbNav fit on a Cassini image produces a per-technique block in the per-image JSON sidecar of the form:

{
  "technique_name": "BodyLimbNav",
  "feature_ids": ["limb_arc:DIONE"],
  "offset_px": [11.0, 29.5],
  "covariance_px2": [[0.0156, 0.0017], [0.0017, 0.0148]],
  "confidence": 0.794,
  "spurious": false,
  "at_edge": false,
  "diagnostics": {
    "visible_limb_arc_fraction": 0.85,
    "visible_arc_px": 120.0,
    "dt_fit_rms_px": 0.4,
    "lm_iterations": 5,
    "tukey_inlier_count": 118
  }
}

Every key under "diagnostics" corresponds to a non-None-valued entry in the CURATOR_FIELDS mapping for BodyLimbDiagnostics; nothing else surfaces in the sidecar.