TTE Agent

Python 3.9+ | Deterministic agent layer | Target trial emulation | Longitudinal mediation guardrails | Version 0.1.0

Overview

What TTE Agent adds

TTE Agent is more than a modeling script. It is a protocol, audit, planning, reporting, and guarded-orchestration layer for observational causal research.

Protocol intelligence

Requires eligibility, time zero, exposure strategies, assignment emulation, censoring, outcome, follow-up, estimand, covariate timing, positivity, and sensitivity analysis before modeling.

Data-support audit

Profiles required variables, missingness, time support, exposure and outcome counts, mediator availability, and temporal ordering.

Scientific guardrails

Flags post-exposure adjustment, mediator/collider mistakes, ambiguous index dates, immortal time risk, outcome leakage, positivity failure, sparse events, unstable weights, and estimand-model mismatch.

Manuscript-grade reporting

Generates target trial tables, STROBE-style checklists, causal assumptions tables, diagnostics summaries, sensitivity registries, and limitations language.

Scientific boundary

TTE Agent does not prove causal identification, does not replace investigator judgment, and does not automatically estimate natural direct or indirect effects. Its value is to make assumptions, design choices, and modeling permissions explicit before estimates are produced.

Installation

Local development workflow

The public package release is planned to coincide with the scientific manuscript. Until then, this documentation describes the current repository workflow.

Locked development environment

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements-dev.lock
python -m pip install -e .
python -m pytest tests -q

Command-line entry point

tte-agent --help

python -m tte.cli profile \
  --data person_interval.csv \
  --output dataset_profile.md

Quick start

Run an audit before modeling

The central workflow is deliberately conservative: draft or load a protocol, profile the data, run deterministic checks, then decide whether modeling is scientifically appropriate.

import pandas as pd
from tte import TTEAgent

# Load the person-interval data.
data = pd.read_csv("person_interval.csv")
agent = TTEAgent()

# Draft a protocol by declaring the role of each column.
spec = agent.draft_protocol(
    analysis_name="ggt_hba1c_t2d",
    id_col="person_id",
    time_col="visit_index",
    exposure_col="ggt_high",
    mediator_col="hba1c_next",
    outcome_col="incident_t2d_next",
    baseline_covariates=["age", "sex"],
    time_varying_covariates=["bmi", "sbp", "hba1c", "ggt"],
)

# Run the deterministic audit, then review the report and suggested next steps.
result = agent.audit(data, spec=spec)
print(result.report)
print(result.next_actions)

# Write the audit artifacts under "audit_output" with a common file prefix.
agent.write_artifacts(result, "audit_output", prefix="ggt_hba1c_t2d")
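The quick start reads a `person_interval.csv` that is not shipped with this page. For trying out the plumbing, a small synthetic file can be generated with the standard library alone. The sketch below assumes one row per person per visit, which is how the column names above read. The distributions and cut-offs are invented for illustration and carry no clinical meaning, and `write_synthetic_panel` is a hypothetical helper, not part of the package.

```python
import csv
import random

# Column names match the quick-start protocol; everything else is illustrative.
COLUMNS = [
    "person_id", "visit_index", "age", "sex", "bmi", "sbp",
    "hba1c", "ggt", "ggt_high", "hba1c_next", "incident_t2d_next",
]


def write_synthetic_panel(path, n_persons=50, n_visits=4, seed=0):
    """Write a toy person-interval CSV and return the number of rows written."""
    rng = random.Random(seed)  # seeded so the file is reproducible
    n_rows = 0
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        for person_id in range(1, n_persons + 1):
            age = rng.randint(30, 70)          # baseline covariate
            sex = rng.choice(["F", "M"])       # baseline covariate
            hba1c = round(rng.gauss(5.6, 0.4), 2)
            for visit_index in range(n_visits):
                ggt = round(rng.lognormvariate(3.3, 0.5), 1)
                hba1c_next = round(hba1c + rng.gauss(0.02, 0.15), 2)
                event = int(hba1c_next >= 6.5)  # arbitrary illustrative rule
                writer.writerow({
                    "person_id": person_id,
                    "visit_index": visit_index,
                    "age": age,
                    "sex": sex,
                    "bmi": round(rng.gauss(24.0, 3.0), 1),
                    "sbp": round(rng.gauss(125.0, 12.0)),
                    "hba1c": hba1c,
                    "ggt": ggt,
                    "ggt_high": int(ggt > 50),  # arbitrary illustrative cut-off
                    "hba1c_next": hba1c_next,
                    "incident_t2d_next": event,
                })
                n_rows += 1
                if event:
                    break  # stop follow-up after the first incident event
                hba1c = hba1c_next
    return n_rows
```

Calling `write_synthetic_panel("person_interval.csv")` produces a file the quick start can load. Whether the audit passes on such data depends on the package's own checks.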

Runnable examples

Try fixed public scenarios in the browser

These examples are safe, deterministic demonstrations of the current agent behavior. They do not upload data and do not run arbitrary code.

Complete protocol audit

A synthetic longitudinal panel with a complete target-trial protocol.


Protocol schema

Questions the agent requires before modeling

The schema forces design elements that are often left implicit in observational analyses.

Protocol element	Why it matters
Eligibility	Defines who could enter the emulated trial and prevents hidden selection rules.
Time zero	Anchors exposure, mediator, outcome, censoring, and follow-up to a common origin.
Treatment strategies	Specifies the exposure regimes being compared.
Assignment emulation	States how observational assignment approximates randomized allocation and what exchangeability assumptions are needed.
Censoring definition	Clarifies loss to follow-up, competing data processes, and censoring assumptions.
Outcome definition	Prevents leakage and makes event timing auditable.
Estimand and contrast	Keeps total, direct, indirect, interventional, and predictive analyses distinct.
Covariate timing	Separates baseline covariates from post-exposure variables and mediators.
Positivity assumptions	Requires support for the strategies in relevant covariate histories.
Sensitivity analyses	Makes unmeasured confounding, measurement error, and modeling choices visible.
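The idea of a protocol gate can be sketched in a few lines: represent each element in the table as a field, and refuse to proceed while any field is empty. This is a simplified illustration under assumed names (`ProtocolDraft`, `missing_elements`, `modeling_permitted`), not the package's actual schema or the output of `create_protocol_template`.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProtocolDraft:
    """Hypothetical container with one free-text field per protocol element."""
    eligibility: Optional[str] = None
    time_zero: Optional[str] = None
    treatment_strategies: Optional[str] = None
    assignment_emulation: Optional[str] = None
    censoring_definition: Optional[str] = None
    outcome_definition: Optional[str] = None
    estimand_and_contrast: Optional[str] = None
    covariate_timing: Optional[str] = None
    positivity_assumptions: Optional[str] = None
    sensitivity_analyses: Optional[str] = None


def missing_elements(draft: ProtocolDraft) -> List[str]:
    """Return the names of elements that are unset or blank."""
    return [
        f.name for f in fields(draft)
        if not (getattr(draft, f.name) or "").strip()
    ]


def modeling_permitted(draft: ProtocolDraft) -> bool:
    """Modeling is allowed only when every element has been answered."""
    return not missing_elements(draft)


draft = ProtocolDraft(
    eligibility="Adults with no prior diabetes diagnosis",
    time_zero="First visit meeting eligibility",
)
print(missing_elements(draft))    # the eight elements still unanswered
print(modeling_permitted(draft))  # False
```

A completeness check like this only confirms that each question was answered, not that the answer is scientifically sound. That judgment stays with the investigator.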

Scientific guardrails

Red flags are first-class outputs

The agent separates blockers from cautions so that investigators know what must be fixed before modeling and what must remain visible during interpretation.

Post-exposure adjustment in total-effect models; mediator or collider role mistakes; ambiguous index dates; immortal time risk; outcome leakage; missing censoring definition; positivity failure; sparse events; unstable weights; estimand-model mismatch.
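One way to picture the blocker/caution split is a small triage step over structured flags. The sketch below is illustrative only: `RedFlag`, `Severity`, and `triage` are hypothetical names, and the severities assigned in the example are assumptions made for the demonstration, since this page does not state which red flags the agent treats as blockers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Severity(Enum):
    BLOCKER = "blocker"   # must be fixed before modeling
    CAUTION = "caution"   # must stay visible during interpretation


@dataclass(frozen=True)
class RedFlag:
    code: str
    severity: Severity
    message: str


def triage(flags: List[RedFlag]) -> Dict[str, object]:
    """Split flags by severity; modeling is allowed only with no blockers."""
    blockers = [f for f in flags if f.severity is Severity.BLOCKER]
    cautions = [f for f in flags if f.severity is Severity.CAUTION]
    return {"can_model": not blockers, "blockers": blockers, "cautions": cautions}


# Example severities are assumed for illustration, not taken from the package.
flags = [
    RedFlag("missing_censoring_definition", Severity.BLOCKER,
            "No censoring definition in the protocol."),
    RedFlag("sparse_events", Severity.CAUTION,
            "Few outcome events in some intervals."),
]
result = triage(flags)
print(result["can_model"])                    # False
print([f.code for f in result["cautions"]])   # ['sparse_events']
```

Keeping cautions in the result, rather than discarding them once modeling is allowed, is what lets them carry through to the report.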

Reporting

Reproducibility appendices for manuscripts

The reporting layer is designed to make manuscript drafting more transparent, not automatic. Investigators must review every generated statement before publication.

Target trial table

Eligibility, strategies, assignment, follow-up, censoring, outcome, and estimand.

STROBE-style checklist

Observational reporting items shaped for longitudinal causal studies.

Causal assumptions table

Exchangeability, consistency, positivity, censoring, and mediation-specific assumptions.

Diagnostics table

Data support, missingness, temporal order, sparse cells, and red flags.
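As a rough illustration of what a data-support diagnostic can look like, the sketch below computes per-column missingness and flags people whose time index is not strictly increasing. It is plain standard-library Python written for this page, not the package's audit code. `profile_rows` and its return keys are hypothetical names.

```python
from collections import defaultdict


def profile_rows(rows, id_col, time_col, required_cols):
    """Summarise missingness and per-person time ordering for person-interval rows.

    `rows` is a list of dicts. None and the empty string count as missing.
    """
    n = len(rows)
    # Share of rows in which each required column is absent or empty.
    missing_share = {
        col: (sum(1 for r in rows if r.get(col) in (None, "")) / n) if n else 0.0
        for col in required_cols
    }
    # Collect each person's time index in the order the rows appear.
    times = defaultdict(list)
    for r in rows:
        times[r[id_col]].append(r[time_col])
    # Flag people whose time index repeats or goes backwards.
    out_of_order_ids = sorted(
        pid for pid, ts in times.items()
        if any(later <= earlier for earlier, later in zip(ts, ts[1:]))
    )
    return {
        "n_rows": n,
        "n_persons": len(times),
        "missing_share": missing_share,
        "out_of_order_ids": out_of_order_ids,
    }


rows = [
    {"person_id": 1, "visit_index": 0, "ggt_high": 1},
    {"person_id": 1, "visit_index": 1, "ggt_high": None},
    {"person_id": 2, "visit_index": 1, "ggt_high": 0},
    {"person_id": 2, "visit_index": 0, "ggt_high": 1},
]
print(profile_rows(rows, "person_id", "visit_index", ["ggt_high"]))
```

In this toy input, one of four exposure values is missing and person 2 has visits recorded in reverse order, so both problems appear in the summary.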

Weight summaries

Distribution summaries and instability warnings when weights are present.
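A weight summary of this kind can be approximated with basic statistics plus Kish's effective sample size, (sum of w)^2 / (sum of w^2). The sketch below is a minimal version under stated assumptions: `summarize_weights` is a hypothetical name, and the two warning thresholds (a maximum weight above 10, an effective sample size below half of n) are arbitrary illustrative defaults rather than the package's rules.

```python
import statistics


def summarize_weights(weights, max_threshold=10.0, ess_fraction=0.5):
    """Return distribution summaries and simple instability warnings."""
    if not weights:
        raise ValueError("weights must be non-empty")
    n = len(weights)
    total = sum(weights)
    # Kish effective sample size: equals n when all weights are equal.
    ess = total ** 2 / sum(w * w for w in weights)
    summary = {
        "n": n,
        "mean": statistics.fmean(weights),
        "min": min(weights),
        "max": max(weights),
        "effective_sample_size": ess,
    }
    warnings = []
    if summary["max"] > max_threshold:
        warnings.append(f"maximum weight {summary['max']:.2f} exceeds {max_threshold}")
    if ess < ess_fraction * n:
        warnings.append(f"effective sample size {ess:.1f} is below {ess_fraction:.0%} of n={n}")
    return summary, warnings


summary, warnings = summarize_weights([1.0, 1.0, 1.0, 20.0])
print(summary["effective_sample_size"])
for w in warnings:
    print("WARNING:", w)
```

With one weight of 20 among four observations, both warnings fire. With equal weights the effective sample size equals n and no warning is raised.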

Sensitivity registry

Planned robustness checks before interpretation begins.

API reference

Core public interfaces

The current API is small by design. It exposes deterministic building blocks that a future conversational agent or web app can call safely.

`TTEAgent`

Facade for protocol drafting, data profiling, audit, next actions, analysis plan, reporting, artifact writing, and approved library-backed runs.

`create_protocol_template`

Creates a conservative protocol template with explicit placeholders for investigator decisions.

`run_protocol_audit`

Runs deterministic data-support checks and returns report-ready audit results.

`run_external_validation_benchmarks`

Runs generic synthetic benchmarks for audit behavior across valid, positivity-failure, and sparse-event settings.
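To make the positivity-failure setting concrete, here is a minimal sketch that counts exposure levels within covariate strata and reports any stratum where a level has too few rows. `positivity_gaps` and `min_count` are illustrative names, not part of the documented API, and the sketch does not reproduce what `run_external_validation_benchmarks` does internally.

```python
from collections import Counter


def positivity_gaps(rows, exposure_col, strata_cols, min_count=1):
    """Return (stratum, exposure level) cells with fewer than `min_count` rows."""
    levels = sorted({r[exposure_col] for r in rows})
    # Count rows in every observed stratum-by-exposure cell.
    counts = Counter(
        (tuple(r[c] for c in strata_cols), r[exposure_col]) for r in rows
    )
    strata = sorted({cell[0] for cell in counts})
    gaps = []
    for stratum in strata:
        for level in levels:
            n = counts[(stratum, level)]  # Counter returns 0 for unseen cells
            if n < min_count:
                gaps.append({
                    "stratum": dict(zip(strata_cols, stratum)),
                    "exposure": level,
                    "n": n,
                })
    return gaps


# Toy positivity failure: nobody in stratum sex == "M" is exposed.
rows = [
    {"sex": "F", "ggt_high": 0},
    {"sex": "F", "ggt_high": 1},
    {"sex": "M", "ggt_high": 0},
    {"sex": "M", "ggt_high": 0},
]
print(positivity_gaps(rows, "ggt_high", ["sex"]))
```

Raising `min_count` turns the same function into a sparse-cell check. An empirical check like this can reveal a lack of support in the data but cannot establish that positivity holds.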

Scientific status

Implemented, conservative, and future work

The current software is already useful as a reproducibility and scientific guardrail layer. The estimator layer should grow slowly, with explicit validation.

Area	Current status	Scientific interpretation
Protocol gate	Implemented	Blocks modeling until key target-trial questions are answered.
Data audit	Implemented	Checks support, timing, missingness, and sparse cells.
Red flags	Implemented	Detects common causal design failures before modeling.
Reporting appendices	Implemented	Produces deterministic manuscript drafting aids.
Library integration	Guarded	Allowed only after audit approval, with signed artifacts.
Natural direct and indirect effects	Conservative boundary	Requires assumptions beyond software execution; not automatically asserted.
Interventional effects	Future estimator work	Important next direction when exposure-induced mediator-outcome confounding is present.

Roadmap

Path toward a mature scientific agent

Documentation release: maintain examples, API references, scientific boundaries, and reproducibility recipes.
Safe web demos: add fixed, parameter-bounded examples that run server-side without arbitrary code execution.
Benchmark expansion: support multiple dataset structures and synthetic causal scenarios beyond JMDC-style panels.
Estimator validation: add estimator modules only after simulation-based checks, diagnostics, and reporting standards are clear.
Agent interface: introduce LLM assistance only around question-asking, protocol drafting, and report explanation, with deterministic tools as the source of truth.

Protocol-first software for defensible longitudinal causal studies.
