STRATON-LLM

Evaluation & Learning

STRATON-LLM does not stop at generating a candidate mapping. It asks whether the dialogue that produced the agreement was actually reliable, then stores the outcome in a way that improves future system behavior.

Trust-aware agreement evaluation

The framework evaluates the final agreement using dialogue-level heuristics rather than blindly accepting the last proposal. This is crucial because a fast agreement can still be weak, biased, or insufficiently challenged.

Turn Dominance

Checks whether one side controls the conversation without sufficient challenge.

Self-Confirmation Bias

Detects whether an agent keeps validating its own claims without meaningful external support.

Repetition without Evolution

Flags dialogues that repeat the same point without introducing better evidence or refinement.

Weak Counter

Measures whether opposing arguments were too weak to justify strong acceptance.

Strong Acceptance

Rewards agreements that are backed by explicit support and coherent resolution.

Fluctuating Confidence

Tracks unstable confidence shifts that may indicate unreliable convergence.

Too-Fast Agreement

Penalizes suspiciously quick agreements that likely skipped real examination.
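The heuristics above can be combined into a single trust score. The following is a minimal sketch, not the framework's actual implementation: the field names, weights, and thresholds are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical dialogue summary; field names are illustrative, not STRATON-LLM's API.
@dataclass
class DialogueStats:
    turns_by_agent: dict      # agent id -> number of turns taken
    self_confirmations: int   # times an agent validated its own claims
    repeated_points: int      # restatements with no new evidence or refinement
    counter_strength: float   # 0..1 strength of the opposing arguments
    explicit_support: bool    # agreement backed by explicit evidence
    confidence_trace: list    # confidence value after each turn
    total_turns: int

def evaluate_agreement(stats: DialogueStats) -> float:
    """Combine the dialogue-level heuristics into a trust score in [0, 1]."""
    score = 0.5
    # Turn dominance: penalize if one side took most of the turns.
    share = max(stats.turns_by_agent.values()) / max(stats.total_turns, 1)
    if share > 0.7:
        score -= 0.15
    # Self-confirmation bias and repetition without evolution.
    score -= 0.05 * stats.self_confirmations
    score -= 0.05 * stats.repeated_points
    # Weak counter: acceptance is less trustworthy if opposition was weak.
    if stats.counter_strength < 0.3:
        score -= 0.1
    # Strong acceptance: reward explicit support and coherent resolution.
    if stats.explicit_support:
        score += 0.2
    # Fluctuating confidence: penalize large swings between adjacent turns.
    swings = [abs(a - b) for a, b in
              zip(stats.confidence_trace, stats.confidence_trace[1:])]
    if swings and max(swings) > 0.3:
        score -= 0.1
    # Too-fast agreement: very short dialogues likely skipped real examination.
    if stats.total_turns < 4:
        score -= 0.15
    return max(0.0, min(1.0, score))
```

The additive weighting is one simple design choice; the point is that each heuristic contributes an independent penalty or reward, so a fast, one-sided agreement scores low even if its final proposal looks plausible.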

Persistence model

Mapping store

Accepted alignments are saved with source term, target term, confidence, method, evidence, and timestamp so the next negotiation starts from prior knowledge.
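A minimal in-memory sketch of such a store, assuming a record shape that mirrors the fields listed above (the names and class are hypothetical, not the framework's schema):

```python
import time

# Illustrative record shape mirroring the description above; not a fixed schema.
def make_mapping_record(source, target, confidence, method, evidence):
    return {
        "source_term": source,
        "target_term": target,
        "confidence": confidence,
        "method": method,       # e.g. "lexical", "structural", "negotiated"
        "evidence": evidence,   # supporting dialogue excerpts or rationale
        "timestamp": time.time(),
    }

class MappingStore:
    """Minimal in-memory store keyed by (source, target) term pairs."""
    def __init__(self):
        self._records = {}

    def save(self, record):
        # Later negotiations start from this prior knowledge.
        self._records[(record["source_term"], record["target_term"])] = record

    def lookup(self, source, target):
        return self._records.get((source, target))
```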

Trace logger

Each session preserves dialogue acts, decisions, confidence changes, and final outcomes. This enables replay, debugging, and later evaluation.
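An append-only trace with a filterable replay could look like the following sketch; the entry fields and kind labels are assumptions chosen to match the description:

```python
import json
from datetime import datetime, timezone

class TraceLogger:
    """Append-only per-session trace; entry fields are illustrative."""
    def __init__(self, session_id):
        self.session_id = session_id
        self.entries = []

    def log(self, kind, agent, payload):
        # kind: "dialogue_act" | "decision" | "confidence_change" | "outcome"
        self.entries.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "agent": agent,
            "payload": payload,
        })

    def replay(self, kind=None):
        # Filtering by kind supports debugging and later evaluation.
        return [e for e in self.entries if kind is None or e["kind"] == kind]

    def dump(self):
        # Serialize the whole session for offline analysis.
        return json.dumps({"session": self.session_id, "trace": self.entries})
```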

Learning updater

Outcomes of successful and failed negotiations feed back into stored confidence values, unresolved-term reports, and hints that guide future strategy selection.
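One simple way to realize this feedback loop is an exponential-moving-average update on stored confidence, plus bookkeeping for unresolved terms. The update rule, weight, and hint string below are illustrative assumptions, not the framework's actual mechanism:

```python
def update_confidence(prior, outcome_success, weight=0.3):
    """Nudge a stored confidence toward 1.0 on success, 0.0 on failure.
    Exponential moving average is an illustrative choice of update rule."""
    target = 1.0 if outcome_success else 0.0
    return (1 - weight) * prior + weight * target

class LearningUpdater:
    def __init__(self):
        self.unresolved = []      # terms that never reached agreement
        self.strategy_hints = {}  # term -> hint for the next negotiation

    def record(self, term, confidence, success):
        new_conf = update_confidence(confidence, success)
        if not success:
            self.unresolved.append(term)
            # Hypothetical hint label; real hints could be richer.
            self.strategy_hints[term] = "request-more-evidence"
        return new_conf
```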

Ontology updater

Confirmed mappings can later be promoted into versioned ontology updates through a controlled and auditable evolution step.
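A controlled, auditable promotion step might gate on a confidence threshold and bump a version number on every accepted change. The threshold value and versioning scheme here are assumptions for illustration:

```python
class OntologyUpdater:
    """Sketch of a gated promotion step; threshold and version scheme are assumed."""
    def __init__(self, min_confidence=0.85):
        self.min_confidence = min_confidence
        self.version = 0
        self.audit_log = []  # every decision is recorded for auditability
        self.mappings = []

    def promote(self, record):
        """Promote a confirmed mapping into a new ontology version, or reject it."""
        if record["confidence"] < self.min_confidence:
            self.audit_log.append(("rejected", record["source_term"], self.version))
            return False
        self.version += 1
        self.mappings.append(record)
        self.audit_log.append(("promoted", record["source_term"], self.version))
        return True
```

Keeping rejections in the audit log, not just promotions, is what makes the evolution step reviewable after the fact.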

Future roadmap

  • Expand evaluation scenarios with more cross-domain agent pairs.
  • Refine adaptive thresholds and uncertainty propagation across layers.
  • Promote high-confidence mappings into versioned ontology evolution workflows.
  • Benchmark negotiation outcomes against simpler alignment-only baselines.