TTE Agent documentation
TTE Agent helps investigators design, audit, and report observational longitudinal analyses before modeling begins. It is built for target trial emulation, causal guardrails, reproducible audit trails, and manuscript-grade appendices.
Overview
TTE Agent is not just a modeling script. It is a protocol, audit, planning, reporting, and guarded-orchestration layer around serious observational causal research.
Requires eligibility, time zero, exposure strategies, assignment emulation, censoring, outcome, follow-up, estimand, covariate timing, positivity, and sensitivity analysis before modeling.
Profiles required variables, missingness, time support, exposure and outcome counts, mediator availability, and temporal ordering.
Flags post-exposure adjustment, mediator/collider mistakes, ambiguous index dates, immortal time risk, outcome leakage, positivity failure, sparse events, unstable weights, and estimand-model mismatch.
Generates target trial tables, STROBE-style checklists, causal assumptions tables, diagnostics summaries, sensitivity registries, and limitations language.
TTE Agent does not prove causal identification, does not replace investigator judgment, and does not automatically estimate natural direct or indirect effects. Its value is to make assumptions, design choices, and modeling permissions explicit before estimates are produced.
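As one illustration of the kind of deterministic check listed above, the sketch below flags covariate strata in which an exposure level is missing or rare, a simple proxy for positivity failure. It is not the package's implementation: the function name `positivity_gaps`, the `min_count` threshold, and the toy rows are all assumptions made for this example.

```python
from collections import defaultdict


def positivity_gaps(rows, exposure_key, stratum_keys, min_count=5):
    """Return strata where some exposure level has fewer than min_count rows.

    Hypothetical helper for illustration; min_count is an arbitrary threshold.
    """
    counts = defaultdict(lambda: defaultdict(int))
    levels = set()
    for row in rows:
        stratum = tuple(row[key] for key in stratum_keys)
        counts[stratum][row[exposure_key]] += 1
        levels.add(row[exposure_key])

    gaps = {}
    for stratum, by_level in counts.items():
        # Compare every stratum against all exposure levels seen anywhere in the data.
        sparse = {
            level: by_level.get(level, 0)
            for level in levels
            if by_level.get(level, 0) < min_count
        }
        if sparse:
            gaps[stratum] = sparse
    return gaps


# Toy person-interval rows (entirely synthetic).
rows = (
    [{"sex": "F", "age_band": "40-59", "ggt_high": 1}] * 6
    + [{"sex": "F", "age_band": "40-59", "ggt_high": 0}] * 7
    + [{"sex": "M", "age_band": "60+", "ggt_high": 0}] * 8
    + [{"sex": "M", "age_band": "40-59", "ggt_high": 0}] * 5
    + [{"sex": "M", "age_band": "40-59", "ggt_high": 1}] * 2
)

gaps = positivity_gaps(rows, "ggt_high", ["sex", "age_band"])
for stratum, sparse in gaps.items():
    print(stratum, sparse)
```

A check of this shape only describes observed support. Whether a sparse stratum is a structural positivity violation or a sampling artifact remains an investigator decision.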
Installation
The public package release is planned with the scientific manuscript. Until then, the documentation reflects the current repository workflow.
```shell
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements-dev.lock
python -m pip install -e .
python -m pytest tests -q
tte-agent --help
```

```shell
python -m tte.cli profile \
  --data person_interval.csv \
  --output dataset_profile.md
```
Quick start
The central workflow is deliberately conservative: draft or load a protocol, profile the data, run deterministic checks, then decide whether modeling is scientifically appropriate.
```python
import pandas as pd

from tte import TTEAgent

data = pd.read_csv("person_interval.csv")
agent = TTEAgent()

spec = agent.draft_protocol(
    analysis_name="ggt_hba1c_t2d",
    id_col="person_id",
    time_col="visit_index",
    exposure_col="ggt_high",
    mediator_col="hba1c_next",
    outcome_col="incident_t2d_next",
    baseline_covariates=["age", "sex"],
    time_varying_covariates=["bmi", "sbp", "hba1c", "ggt"],
)

result = agent.audit(data, spec=spec)
print(result.report)
print(result.next_actions)

agent.write_artifacts(result, "audit_output", prefix="ggt_hba1c_t2d")
```
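To try the workflow without real data, a small synthetic person-interval file can be generated first. The sketch below uses only the standard library and reuses the column names from the quick start. Everything else is an assumption for illustration: the helper names, the value ranges, the GGT cut-off of 50, and the rule that marks an event when the next HbA1c reaches 6.5. None of it comes from the package or from a real cohort.

```python
import csv
import random

COLUMNS = [
    "person_id", "visit_index", "age", "sex", "bmi", "sbp",
    "hba1c", "ggt", "ggt_high", "hba1c_next", "incident_t2d_next",
]


def make_synthetic_panel(n_people=50, n_visits=4, seed=0):
    """Build a deterministic toy panel with one row per person-visit."""
    rng = random.Random(seed)
    rows = []
    for person_id in range(1, n_people + 1):
        age = rng.randint(40, 70)          # baseline, fixed across visits
        sex = rng.choice(["F", "M"])       # baseline, fixed across visits
        hba1c = round(rng.uniform(5.0, 6.2), 1)
        for visit_index in range(n_visits):
            ggt = round(rng.uniform(10, 90), 1)
            hba1c_next = round(hba1c + rng.uniform(-0.1, 0.3), 1)
            event = int(hba1c_next >= 6.5)  # illustrative rule only
            rows.append({
                "person_id": person_id,
                "visit_index": visit_index,
                "age": age,
                "sex": sex,
                "bmi": round(rng.uniform(20, 35), 1),
                "sbp": rng.randint(100, 160),
                "hba1c": hba1c,
                "ggt": ggt,
                "ggt_high": int(ggt >= 50),  # arbitrary cut-off
                "hba1c_next": hba1c_next,
                "incident_t2d_next": event,
            })
            if event:
                break  # stop follow-up after the first event
            hba1c = hba1c_next
    return rows


def write_panel(rows, path):
    """Write the panel as CSV with a fixed column order."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


panel = make_synthetic_panel()
print(len(panel), "person-visit rows")

if __name__ == "__main__":
    write_panel(panel, "person_interval.csv")
```

Because the generator is seeded, repeated runs produce the same file, which keeps any audit output comparable between runs.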
Runnable examples
These examples are safe, deterministic demonstrations of the current agent behavior. They do not upload data and do not run arbitrary code.
A synthetic longitudinal panel with a complete target-trial protocol.
Protocol schema
The schema forces design elements that are often left implicit in observational analyses.
| Protocol element | Why it matters |
|---|---|
| Eligibility | Defines who could enter the emulated trial and prevents hidden selection rules. |
| Time zero | Anchors exposure, mediator, outcome, censoring, and follow-up to a common origin. |
| Treatment strategies | Specifies the exposure regimes being compared. |
| Assignment emulation | States how observational assignment approximates randomized allocation and what exchangeability assumptions are needed. |
| Censoring definition | Clarifies loss to follow-up, competing data processes, and censoring assumptions. |
| Outcome definition | Prevents leakage and makes event timing auditable. |
| Estimand and contrast | Keeps total, direct, indirect, interventional, and predictive analyses distinct. |
| Covariate timing | Separates baseline covariates from post-exposure variables and mediators. |
| Positivity assumptions | Requires support for the strategies in relevant covariate histories. |
| Sensitivity analyses | Makes unmeasured confounding, measurement error, and modeling choices visible. |
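The idea of a schema that refuses to leave design elements implicit can be sketched as a simple completeness gate. This is a minimal illustration rather than the package's schema: the class `ProtocolDraft`, the function `missing_elements`, and the example text are hypothetical. Only the field names are taken from the table rows above.

```python
from dataclasses import dataclass, fields


@dataclass
class ProtocolDraft:
    """Hypothetical container with one free-text field per protocol element."""
    eligibility: str = ""
    time_zero: str = ""
    treatment_strategies: str = ""
    assignment_emulation: str = ""
    censoring_definition: str = ""
    outcome_definition: str = ""
    estimand_and_contrast: str = ""
    covariate_timing: str = ""
    positivity_assumptions: str = ""
    sensitivity_analyses: str = ""


def missing_elements(draft):
    """List protocol elements that are still empty or whitespace-only."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name).strip()]


# Illustrative draft with only two elements filled in.
draft = ProtocolDraft(
    eligibility="Adults without diabetes at the index visit",
    time_zero="First visit at which eligibility is met",
)

missing = missing_elements(draft)
modeling_allowed = not missing  # gate stays closed until every element is stated

print("Missing elements:", missing)
print("Modeling allowed:", modeling_allowed)
```

A non-empty field only shows that an element was stated, not that it is scientifically adequate. That judgment stays with the investigator.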
Scientific guardrails
The agent separates blockers from cautions so that investigators know what must be fixed before modeling and what must remain visible during interpretation.
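One way to picture the blocker/caution split is a small triage step over audit findings. The sketch below is an assumption-laden illustration: `Severity`, `Finding`, and `triage` are hypothetical names, and the severity assigned to each example finding is chosen for the demo rather than taken from the package's rules.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BLOCKER = "blocker"   # must be fixed before modeling
    CAUTION = "caution"   # must stay visible during interpretation


@dataclass(frozen=True)
class Finding:
    code: str
    severity: Severity
    message: str


def triage(findings):
    """Split findings by severity; modeling is allowed only with zero blockers."""
    blockers = [f for f in findings if f.severity is Severity.BLOCKER]
    cautions = [f for f in findings if f.severity is Severity.CAUTION]
    return {
        "modeling_allowed": not blockers,
        "blockers": blockers,
        "cautions": cautions,
    }


# Example findings; the severity assignments here are illustrative assumptions.
findings = [
    Finding("ambiguous_index_date", Severity.BLOCKER, "Time zero is not uniquely defined."),
    Finding("sparse_events", Severity.CAUTION, "Few outcome events in some intervals."),
]

decision = triage(findings)
print("Modeling allowed:", decision["modeling_allowed"])
for finding in decision["blockers"] + decision["cautions"]:
    print(f"[{finding.severity.value}] {finding.code}: {finding.message}")
```

Keeping cautions in the returned result, instead of discarding them once modeling is permitted, is what lets them travel into the interpretation and limitations text.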
Reporting
The reporting layer is designed to make manuscript drafting more transparent, not automatic. Investigators must review every generated statement before publication.
Eligibility, strategies, assignment, follow-up, censoring, outcome, and estimand.
Observational reporting items shaped for longitudinal causal studies.
Exchangeability, consistency, positivity, censoring, and mediation-specific assumptions.
Data support, missingness, temporal order, sparse cells, and red flags.
Distribution summaries and instability warnings when weights are present.
Planned robustness checks before interpretation begins.
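For the weight summaries mentioned above, a distribution summary with an instability warning could look like the following sketch. The function name `weight_summary` and the `max_ratio` rule are assumptions made for illustration. The effective sample size uses the standard Kish formula, (sum of weights) squared divided by the sum of squared weights.

```python
import statistics


def weight_summary(weights, max_ratio=10.0):
    """Summarize analysis weights and flag instability.

    Hypothetical helper; the max_ratio rule (largest weight compared with
    max_ratio times the mean weight) is an arbitrary illustrative threshold.
    """
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty sequence of positive numbers")
    total = sum(weights)
    mean = total / len(weights)
    # Kish effective sample size: (sum w)^2 / sum(w^2)
    ess = total ** 2 / sum(w * w for w in weights)
    return {
        "n": len(weights),
        "mean": mean,
        "median": statistics.median(weights),
        "min": min(weights),
        "max": max(weights),
        "effective_sample_size": ess,
        "unstable": max(weights) > max_ratio * mean,
    }


# Nineteen unit weights and one extreme weight (synthetic values).
summary = weight_summary([1.0] * 19 + [41.0])
for name, value in summary.items():
    print(f"{name}: {value}")
```

A large gap between `n` and the effective sample size indicates that a few observations dominate the weighted analysis, which is the situation an instability warning is meant to surface.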
API reference
The current API is small by design. It exposes deterministic building blocks that a future conversational agent or web app can call safely.
TTEAgent: Facade for protocol drafting, data profiling, audit, next actions, analysis plan, reporting, artifact writing, and approved library-backed runs.
create_protocol_template: Creates a conservative protocol template with explicit placeholders for investigator decisions.
run_protocol_audit: Runs deterministic data-support checks and returns report-ready audit results.
run_external_validation_benchmarks: Runs generic synthetic benchmarks for audit behavior across valid, positivity-failure, and sparse-event settings.
Scientific status
The current software is already useful as a reproducibility and scientific guardrail layer. The estimator layer should grow slowly, with explicit validation.
| Area | Current status | Scientific interpretation |
|---|---|---|
| Protocol gate | Implemented | Blocks modeling until key target-trial questions are answered. |
| Data audit | Implemented | Checks support, timing, missingness, and sparse cells. |
| Red flags | Implemented | Detects common causal design failures before modeling. |
| Reporting appendices | Implemented | Produces deterministic manuscript drafting aids. |
| Library integration | Guarded | Allowed only after audit approval, with signed artifacts. |
| Natural direct and indirect effects | Conservative boundary | Requires assumptions beyond software execution; not automatically asserted. |
| Interventional effects | Future estimator work | Important next direction when exposure-induced mediator-outcome confounding is present. |
Roadmap