Skip to content

Test History

Tailtest keeps a persistent record of every file it has tested. This record survives across sessions and gives tailtest context before it writes a single line of tests.


Where the file lives

.tailtest/history.json

Same path for all three platforms (Claude Code, Cursor, Codex). Add .tailtest/ to your .gitignore to keep it out of version control.


What gets tracked

Every time tailtest finishes testing a file, it writes one entry to history.json. Each entry has:

Field What it contains
file Relative path to the source file
status Outcome: passed, fixed, unresolved, or deferred
session_id ID of the session in which this result was recorded
timestamp UTC timestamp
classification How this result compares to prior history (see below)

Classification

Each entry is classified based on the file's history up to that point:

Classification Meaning
gap This file has never been tested before. First time tailtest has seen it.
passed Tests passed this session, and the file has prior history.
fixed Tests were failing but were resolved within the same session.
regression The file was passing in the most recent prior session. Now it is failing.

Classifications are assigned automatically when the session ends. They appear in the history file and in session reports.


Recurring failure detection

If the same file fails across three or more different sessions, tailtest flags it as a recurring problem.

"Different sessions" means distinct session_id values. If a file fails ten times within a single session while tailtest tries to fix it, that counts as one session, not ten. The threshold is met only when the file has failed and remained unresolved across three separate work sessions.

Files that meet this threshold are surfaced at the next session start so the agent knows about them before it begins.


Cross-session context at startup

When you open a new session, tailtest reads history.json and prepends a context note before the agent starts working. This note contains:

  • Recurring failures: files that have failed in three or more sessions
  • Recent regressions: files that were passing before but failed in a recent session

For example, the agent might start with: "Recurring failures across sessions: billing.py. Recent regressions: checkout.py (was passing, now failing)."

This means the agent already knows which files are trouble spots before it writes any tests.


Limits

history.json is capped at 1000 entries. When a new session would push the total over that limit, the oldest entries are dropped first.