Ensemble Combine (ensemble + EnsembleConfig)
Overview
ensemble() is the function that reconciles every
per-technique NavTechniqueResult into a
single NavResult. The orchestrator invokes the
ensemble twice per image: once after pass 1 (to derive the pass-2 prior) and once on the
union of pass-1 and pass-2 results (to produce the final answer). The reconciliation
is deliberately conservative: spurious results are dropped, at-edge results are dropped unless
removing them empties the set, the surviving results are grouped by Mahalanobis-distance
agreement, the highest summed-confidence group wins, and the within-group results are
fused via precision-weighted (Kalman-style) merging.
Theory
The ensemble’s reconciliation is a seven-step pipeline.
Step 1 — drop spurious
Every result with spurious set to True is
dropped unconditionally. Spurious is the technique’s self-assessed structural failure
flag; the ensemble does not second-guess it.
Step 2 — drop at-edge
Every result with at_edge set to True is dropped
unless dropping the at-edge cohort would empty the surviving set. The exception
preserves an at-edge result when it is the only signal the orchestrator has — better a
hint at a search-window edge than no answer at all.
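Steps 1 and 2 can be sketched as a two-stage filter. This is a minimal illustration, not the project's code: the Result class is a hypothetical stand-in for NavTechniqueResult that carries only the two flags, and filter_results is an assumed name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for NavTechniqueResult (flags only)."""
    name: str
    spurious: bool = False
    at_edge: bool = False


def filter_results(results):
    # Step 1: spurious results are dropped unconditionally.
    kept = [r for r in results if not r.spurious]
    # Step 2: drop at-edge results, unless that would leave nothing.
    interior = [r for r in kept if not r.at_edge]
    return interior if interior else kept


mixed = [Result("a"), Result("b", at_edge=True), Result("c", spurious=True)]
only_edge = [Result("b", at_edge=True), Result("c", spurious=True)]
print([r.name for r in filter_results(mixed)])
print([r.name for r in filter_results(only_edge)])
```

In the second call the at-edge result survives because it is the only non-spurious signal left.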
Step 3 — single-link Mahalanobis grouping
Surviving results are clustered by single-linkage Mahalanobis-distance agreement. Two results \((\mu_{a}, \Sigma_{a})\) and \((\mu_{b}, \Sigma_{b})\) are linked when
\(d_{ab} = \sqrt{(\mu_{a} - \mu_{b})^{\mathsf{T}} (\Sigma_{a} + \Sigma_{b})^{+} (\mu_{a} - \mu_{b})}\)
is at most agreement_sigma, where the pseudoinverse \((\cdot)^{+}\) uses
scipy.linalg.pinvh() so rank-deficient covariances are handled. A pair whose difference
\((\mu_{a} - \mu_{b})\) has a component in the null space of the summed covariance is
treated as infinite distance: estimates cannot agree along an unobservable axis.
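The grouping step can be sketched with the two SciPy calls the implementation is said to use. The function names, the null-space tolerance, and the isotropic covariances below are assumptions made for illustration; the rtol keyword of pinvh requires SciPy 1.7 or later.

```python
import numpy as np
from scipy.linalg import pinvh
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def mahalanobis(mu_a, cov_a, mu_b, cov_b, rcond=1e-9):
    d = np.asarray(mu_a, float) - np.asarray(mu_b, float)
    s = np.asarray(cov_a, float) + np.asarray(cov_b, float)
    s_pinv = pinvh(s, rtol=rcond)
    # s @ s_pinv projects onto the range of s; whatever is left over
    # lies in the null space, i.e. along an unobservable axis.
    residual = d - s @ (s_pinv @ d)
    if np.linalg.norm(residual) > 1e-9 * max(1.0, np.linalg.norm(d)):
        return np.inf
    return float(np.sqrt(max(d @ s_pinv @ d, 0.0)))


def group_results(mus, covs, agreement_sigma=2.0, rcond=1e-9):
    n = len(mus)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if mahalanobis(mus[i], covs[i], mus[j], covs[j], rcond) <= agreement_sigma:
                adj[i, j] = adj[j, i] = 1
    # Single linkage: clusters are the connected components of the link graph.
    return connected_components(csr_matrix(adj), directed=False)


mus = [(7.0, -18.0), (8.0, -17.5), (11.6, 12.6)]
covs = [np.eye(2) * 0.3**2, np.eye(2) * 0.5**2, np.eye(2) * 0.4**2]
n_groups, labels = group_results(mus, covs)
print(n_groups, labels)
```

Single linkage means a result joins a cluster if it agrees with any one member, so chains of pairwise agreement merge into one group.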
Step 4 — pick the highest summed-confidence group
For each connected component, sum the per-technique confidences and pick the group with
the highest sum. When the runner-up’s summed confidence is within agreement_gap of
the winner’s, the ensemble flags the conflict and returns a status='conflicted'
NavResult instead of fusing.
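A minimal sketch of the group selection follows. The function name pick_group and the strict less-than comparison for "within agreement_gap" are assumptions; the document does not say whether the boundary case counts as a conflict.

```python
from collections import defaultdict


def pick_group(labels, confidences, agreement_gap=0.5):
    # Sum the per-technique confidences inside each connected component.
    sums = defaultdict(float)
    for label, conf in zip(labels, confidences):
        sums[label] += conf
    ranked = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)
    winner, best = ranked[0]
    # A runner-up that is too close to the winner signals a conflict.
    conflicted = len(ranked) > 1 and (best - ranked[1][1]) < agreement_gap
    return winner, conflicted


# Two results in group 0 (summing to 0.49) against one isolated result (0.74).
print(pick_group([0, 0, 1], [0.25, 0.24, 0.74]))
# A clear winner: group 0 sums to 1.1, group 1 to 0.3.
print(pick_group([0, 0, 1], [0.6, 0.5, 0.3]))
```

The per-technique split 0.25 + 0.24 in the first call is invented for the illustration; only the group sums matter to the decision.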
Step 5 — precision-weighted merge
Inside the winning group, fuse the per-technique offsets into one estimate via Kalman-style information addition. The fused information matrix is the sum of the per-technique information matrices \(I_{i} = \Sigma_{i}^{+}\); the fused offset is \(\mu = \Sigma \, \sum_{i} I_{i} \mu_{i}\), where \(\Sigma\) is the pseudo-inverse of the summed information matrix. The pseudoinverse handles rank-deficient inputs (e.g. a flat-ring-only result) gracefully — the unobservable axis carries an unbounded marginal sigma.
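The merge can be sketched directly from the formulas above. The function name fuse is an assumption, and the rtol keyword of pinvh requires SciPy 1.7 or later. One caveat worth keeping in mind: the pseudoinverse of a rank-deficient information matrix reports zero, not infinity, along the unobservable axis, so a caller has to flag that axis separately; how the real code does so is not shown here.

```python
import numpy as np
from scipy.linalg import pinvh


def fuse(mus, covs, rcond=1e-9):
    dim = len(mus[0])
    info_sum = np.zeros((dim, dim))
    info_mu = np.zeros(dim)
    for mu, cov in zip(mus, covs):
        # Per-technique information matrix I_i = pinv(Sigma_i).
        info = pinvh(np.asarray(cov, float), rtol=rcond)
        info_sum += info
        info_mu += info @ np.asarray(mu, float)
    # Fused covariance is the pseudoinverse of the summed information.
    fused_cov = pinvh(info_sum, rtol=rcond)
    return fused_cov @ info_mu, fused_cov


mu, cov = fuse([(0.0, 0.0), (2.0, 2.0)], [np.eye(2), np.eye(2)])
print(mu, np.sqrt(np.diag(cov)))
```

With two equally confident inputs the fused offset is their midpoint and the per-axis variance halves, which is the expected behaviour of information addition.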
Step 6 — disagreement and conflict penalties
When more than one Mahalanobis-distance group survived, the fused confidence is multiplied
by disagreement_penalty (default 0.7). When the conflict branch fired in Step 4 the
status='conflicted' NavResult is returned with a further
conflicted_confidence_multiplier (default 0.3) applied to the runner-up’s summed
confidence so the JSON sidecar reflects the conflict’s severity.
Step 7 — confidence-rank assignment
The fused confidence and the per-axis sigma are mapped to a five-bucket rank
('high' / 'medium' / 'low' / 'conflicted' / 'failed') by
derive_confidence_rank() against the per-rank
min_confidence / max_sigma_px thresholds. Below the min_confidence floor the
ensemble returns status='failed'.
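The bucket assignment can be sketched as a first-match walk over the tiers, using the default thresholds listed under Configuration. The name rank_sketch, its signature, and the handling of the conflicted flag are assumptions; the real derive_confidence_rank() signature is not given in this document.

```python
import math

# (rank, min_confidence, max_sigma_px), strictest first; defaults from Configuration.
TIERS = (
    ("high", 0.8, 0.5),
    ("medium", 0.5, 2.0),
    ("low", 0.2, math.inf),
)


def rank_sketch(confidence, sigma_px, conflicted=False):
    if conflicted:
        return "conflicted"
    for name, min_conf, max_sigma in TIERS:
        if confidence >= min_conf and sigma_px <= max_sigma:
            return name
    # Below the lowest min_confidence floor nothing matches.
    return "failed"


for args in [(0.9, 0.3), (0.9, 1.0), (0.9, 5.0), (0.1, 0.1)]:
    print(args, rank_sketch(*args))
```

A high confidence with a wide sigma falls through to a lower tier, so both conditions have to hold for a rank to be assigned.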
Restrictions and assumptions
Per-technique covariances must be 2x2 (translation-only) or 3x3 (translation + rotation). The ensemble does not handle scale-disagreement or arbitrary-shape parameter spaces.
The Mahalanobis grouping assumes the per-technique covariances are calibrated. An over-confident covariance shrinks the apparent agreement region and may cause a legitimate match to land in its own cluster.
The pseudoinverse cutoff (pinvh_rcond) is global; rank-deficiency detection uses the same threshold for grouping and merging so behaviour is consistent across the two passes.
Sources of uncertainty
The fused covariance is the pseudo-inverse of the summed information matrix; it is the
standard precision-weighted-merge form. When the input set has no full-rank result, the
fused covariance is rank-deficient along the unconstrained axis; the
sigma_along_unobservable_px field
captures the unbounded eigenvalue’s direction. When the disagreement-penalty fires the
fused confidence is reduced multiplicatively.
Configuration
Tunables live on EnsembleConfig. The defaults are
module-level constants in nav.nav_orchestrator.ensemble; the orchestrator’s
constructor accepts an EnsembleConfig override.
- agreement_sigma: float, default 2.0. Mahalanobis-distance threshold for grouping.
- agreement_gap: float, default 0.5. Minimum summed-confidence gap between the best and runner-up groups; a smaller gap is declared a conflict.
- disagreement_penalty: float, default 0.7. Multiplier on combined confidence when more than one group existed.
- conflicted_confidence_multiplier: float, default 0.3. Additional multiplier when the conflicted branch fires.
- min_confidence: float, default 0.2. Final-result threshold below which the ensemble returns failed() instead of success().
- pinvh_rcond: float, default 1.0e-9. Cutoff for scipy.linalg.pinvh().
- tier_thresholds: mapping rank -> {min_confidence, max_sigma_px}. The default thresholds give 'high' for confidence at or above 0.8 with sigma at most 0.5 px, 'medium' for confidence at or above 0.5 with sigma at most 2.0 px, and 'low' for confidence at or above 0.2 with no sigma cap.
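For orientation, a frozen dataclass with these fields and defaults might look like the sketch below. EnsembleConfigSketch is a hypothetical mirror, not the real EnsembleConfig; in particular, representing "no sigma cap" as None is an assumption.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Mapping, Optional


def _default_tiers():
    # Default thresholds as listed above; None stands for "no sigma cap" (assumed).
    return {
        "high": {"min_confidence": 0.8, "max_sigma_px": 0.5},
        "medium": {"min_confidence": 0.5, "max_sigma_px": 2.0},
        "low": {"min_confidence": 0.2, "max_sigma_px": None},
    }


@dataclass(frozen=True)
class EnsembleConfigSketch:
    """Hypothetical mirror of EnsembleConfig for illustration only."""
    agreement_sigma: float = 2.0
    agreement_gap: float = 0.5
    disagreement_penalty: float = 0.7
    conflicted_confidence_multiplier: float = 0.3
    min_confidence: float = 0.2
    pinvh_rcond: float = 1.0e-9
    tier_thresholds: Mapping[str, Mapping[str, Optional[float]]] = field(
        default_factory=_default_tiers
    )


default = EnsembleConfigSketch()
# Frozen instances are overridden by copying, not by mutation.
strict = replace(default, agreement_sigma=1.5)
print(default.agreement_sigma, strict.agreement_sigma)
```

Because the dataclass is frozen, an override is built with dataclasses.replace() and the defaults stay untouched.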
Implementation
Source file: src/nav/nav_orchestrator/ensemble.py —
ensemble(),
derive_confidence_rank(), and
EnsembleConfig.
Public surface (autodocumented at nav.nav_orchestrator):
- ensemble(): the reconciler. Returns one NavResult.
- derive_confidence_rank(): assign the five-bucket rank from a confidence / sigma pair.
- EnsembleConfig: frozen dataclass carrying the seven tunables documented above.
The function uses scipy.sparse.csgraph.connected_components() to find the
Mahalanobis-distance clusters and scipy.linalg.pinvh() for both the per-pair
distance test and the precision-weighted merge.
Examples
Two agreeing techniques. Pass 1 produces
BodyDiscCorrelateNav
(\((6.76, -17.71)\) ± 0.5 px) and
BodyLimbNav
(\((7.00, -18.00)\) ± 0.3 px). The Mahalanobis distance is well below
agreement_sigma=2.0; both end up in the same group. The fused offset is
\((6.94, -17.92)\) px with combined per-axis sigma ~0.26 px. No disagreement
penalty fires (only one group existed) so the fused confidence is the summed per-technique
confidence (capped by the project-wide ceiling).
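The figures in this example can be checked by hand, assuming each ± value is an isotropic per-axis 1-sigma, so that \(\Sigma_{i} = \sigma_{i}^{2} I\); that reading is an assumption, since the example does not state the covariance shape.

\[
d = \sqrt{\frac{(6.76 - 7.00)^{2} + (-17.71 + 18.00)^{2}}{0.5^{2} + 0.3^{2}}} = \sqrt{\frac{0.1417}{0.34}} \approx 0.65
\]
\[
w_{1} = \frac{1}{0.5^{2}} = 4, \qquad w_{2} = \frac{1}{0.3^{2}} \approx 11.11, \qquad \sigma = \frac{1}{\sqrt{w_{1} + w_{2}}} \approx 0.26
\]
\[
\mu_{x} = \frac{4 (6.76) + 11.11 (7.00)}{15.11} \approx 6.936, \qquad \mu_{y} = \frac{4 (-17.71) + 11.11 (-18.00)}{15.11} \approx -17.923
\]

The tighter BodyLimbNav estimate carries almost three times the weight, which is why the fused offset sits close to it.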
Single-link grouping with three techniques. Three techniques report
\((7.0, -18.0)\) ± 0.3, \((8.0, -17.5)\) ± 0.5, and \((11.6, 12.6)\) ± 0.4. The
first two are within agreement_sigma of each other; the third is several sigma off in
both axes. Single-link grouping puts the first two in one cluster and the third in its
own. The first cluster’s summed confidence is 0.49; the isolated third result’s is 0.74. Because the gap
\(0.74 - 0.49 = 0.25\) falls below agreement_gap=0.5, the ensemble flags the
conflict and returns status='conflicted' rather than picking the higher-confidence
isolated wrong answer (this is the documented behaviour of the multi_body test scene).
Rank-deficient ring-edge fit. A flat-ring-only scene produces a
RingEdgeNav result whose covariance is
rank-1, constraining only the radial direction. The ensemble’s pseudoinverse handles the rank deficiency: the
fused covariance has unbounded variance along the edge tangent, and the
sigma_along_unobservable_px field
captures it. When a star or body limb supplies an orthogonal-axis constraint, the fused
result becomes full-rank.
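The effect of an orthogonal constraint can be seen in a small numerical sketch. It assumes the rank-1 constraint is expressed in information form (one constrained direction, zero information along the other); the helper info_along, the axis choice, and the sigma values are invented for the illustration.

```python
import numpy as np
from scipy.linalg import pinvh


def info_along(direction, sigma):
    """Rank-1 information matrix constraining only `direction`."""
    u = np.asarray(direction, float)
    u = u / np.linalg.norm(u)
    return np.outer(u, u) / sigma**2


# Ring edge: information along the radial axis only (taken as x here).
ring = info_along([1.0, 0.0], sigma=0.4)
# A second technique constraining the orthogonal (tangent) axis.
limb = info_along([0.0, 1.0], sigma=0.3)

total = ring + limb
fused_cov = pinvh(total)
print(np.linalg.matrix_rank(ring), np.linalg.matrix_rank(total))
print(np.sqrt(np.diag(fused_cov)))
```

On its own the ring result leaves the tangent axis with zero information; once the second constraint is added, the summed information matrix is invertible and both per-axis sigmas are finite.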