Software Requirements Specification¶

Project: tot-agent — Autonomous Browser Testing Agent Version: 1.0 Date: 2026-03-21 Standard: IEEE 830-1998

1. Introduction¶

1.1 Purpose¶

This Software Requirements Specification (SRS) defines the functional and non-functional requirements for tot-agent, an autonomous browser agent that uses Claude AI vision and tool-use to execute scripted test scenarios against web applications. The primary audience is developers extending or maintaining the tool and stakeholders evaluating its capabilities.

1.2 Scope¶

tot-agent is a Python command-line tool that:

Accepts natural-language goals or pre-built scenario commands
Drives a Chromium browser via Playwright to execute those goals
Uses Claude's vision capability to observe the UI state and reason about next actions
Fetches real book cover images to seed A/B testing platforms
Supports multi-user simulation with isolated browser sessions

The tool is initially targeted at the This-or-That A/B book-cover testing platform but is designed to be repurposed for any web GUI through configuration.

1.3 Definitions, acronyms, and abbreviations¶

Term	Definition
Agent	The `BrowserAgent` instance that drives the agentic loop
Goal	A plain-English string describing what the agent should accomplish
Observer	A class implementing `AgentObserver` that receives agent lifecycle events
SIM_USER	A configured simulated user account (`SimUser` dataclass)
Tool	A function exposed to Claude via the `TOOL_DEFINITIONS` schema
Context	A Playwright `BrowserContext` representing one user's isolated session
Cover	A `BookCover` instance with title, author, image URL, and source

1.4 References¶

1.5 Overview¶

Section 2 provides the overall product description. Section 3 specifies detailed functional and non-functional requirements. Section 4 includes supporting information.

2. Overall Description¶

2.1 Product perspective¶

tot-agent is a standalone command-line tool that integrates three external systems:

graph LR
    Tot["tot-agent"]
    Claude["Anthropic Claude API"]
    Playwright["Playwright / Chromium"]
    OL["Open Library API"]
    GB["Google Books API"]
    App["Target Web Application"]

    Tot --> Claude
    Tot --> Playwright
    Tot --> OL
    Tot --> GB
    Playwright --> App

It is not a web service, does not have a database, and does not require a network server. All state is in-memory within a single process execution.

2.2 Product functions¶

The tool provides the following high-level functions:

Browser automation — navigate, click, fill forms, take screenshots in a real browser
AI-driven decision making — use Claude's vision to interpret screenshots and choose the next action
Multi-user simulation — maintain separate browser sessions per simulated user
Cover image fetching — retrieve real book cover images from Open Library and Google Books
Scenario execution — execute pre-built or custom natural-language test scenarios
Reporting — summarise outcomes via console output and/or log files

2.3 User characteristics¶

User class	Technical background	Primary use
Developer / QA engineer	High; comfortable with Python CLIs	Running test scenarios during development
Platform owner	Moderate; can run commands	Seeding demo data for a live platform
CI/CD pipeline	n/a (automated)	Headless regression testing

2.4 Constraints¶

C-1: Requires an Anthropic API key; incurs API usage costs.
C-2: Requires a local web server running the target application.
C-3: Chromium must be installed via playwright install chromium.
C-4: Python ≥ 3.12 required (uses match statements, walrus operator).
C-5: No cross-browser support in the initial release (Chromium only).
C-6: The agent loop is limited to MAX_AGENT_STEPS iterations to prevent runaway API usage.

2.5 Assumptions and dependencies¶

A-1: The target application is accessible at a stable URL during agent execution.
A-2: Simulated user accounts exist in the target application before voting commands are run.
A-3: The Open Library and Google Books APIs are publicly accessible.
D-1: anthropic Python SDK ≥ 0.40.0.
D-2: playwright Python package ≥ 1.44.0.
D-3: httpx ≥ 0.27.0 for HTTP requests.

3. Specific Requirements¶

3.1 Functional requirements¶

3.1.1 Browser automation¶

ID	Requirement
FR-BA-01	The system SHALL navigate the active browser context to any absolute or site-relative URL.
FR-BA-02	The system SHALL click elements identified by CSS selector or visible text.
FR-BA-03	The system SHALL fill input fields identified by CSS selector with a given string value.
FR-BA-04	The system SHALL press named keyboard keys (e.g. Enter, Tab, Escape).
FR-BA-05	The system SHALL capture a full-colour PNG screenshot and return it as base64.
FR-BA-06	The system SHALL return the visible text content of the active page, capped at 4 000 characters.
FR-BA-07	The system SHALL return the current URL of the active page.
FR-BA-08	The system SHALL scroll the active page to the bottom.
FR-BA-09	The system SHALL wait up to a configurable timeout for a CSS selector to appear.
FR-BA-10	The system SHALL evaluate arbitrary JavaScript in the active page and return the result.

3.1.2 Multi-user context management¶

ID	Requirement
FR-UC-01	The system SHALL maintain a pool of named browser contexts, one per simulated user.
FR-UC-02	The system SHALL lazily create a new context when a previously unseen user key is requested.
FR-UC-03	Switching user contexts SHALL NOT close existing contexts.
FR-UC-04	Each context SHALL have isolated cookies and session storage.
FR-UC-05	The system SHALL close all contexts when the `BrowserManager` exits.

3.1.3 Cover image fetching¶

ID	Requirement
FR-CF-01	The system SHALL search Open Library for book covers matching a query string.
FR-CF-02	The system SHALL fall back to Google Books when Open Library returns fewer results than requested.
FR-CF-03	Returned cover URLs SHALL use HTTPS.
FR-CF-04	Results SHALL be deduplicated by normalised title.
FR-CF-05	The system SHALL verify that a cover URL resolves via HTTP HEAD on request.
FR-CF-06	The system SHALL support random cover-pair generation for seeding A/B tests.

3.1.4 Agentic loop¶

ID	Requirement
FR-AL-01	The agent SHALL accept a plain-English goal string.
FR-AL-02	The agent SHALL pass the goal, message history, and tool schemas to Claude.
FR-AL-03	The agent SHALL execute all tool calls returned by Claude in sequence.
FR-AL-04	The agent SHALL feed tool results back to Claude as `tool_result` messages.
FR-AL-05	The agent SHALL terminate when Claude returns `stop_reason == "end_turn"`.
FR-AL-06	The agent SHALL terminate after `MAX_AGENT_STEPS` iterations if end_turn is not reached.
FR-AL-07	The agent SHALL return a plain-text summary of what was accomplished.

3.1.5 Observer / event reporting¶

ID	Requirement
FR-OB-01	The agent SHALL emit an `AgentEvent` for each of: `GOAL_START`, `STEP_START`, `AGENT_TEXT`, `TOOL_CALL`, `TOOL_RESULT`, `GOAL_COMPLETE`, `STEP_LIMIT`.
FR-OB-02	Any number of `AgentObserver` instances SHALL be registerable at construction time.
FR-OB-03	Observers SHALL be addable and removable at runtime.
FR-OB-04	`ConsoleObserver` SHALL render events as Rich-formatted terminal output.
FR-OB-05	`LoggingObserver` SHALL write events to the Python logging hierarchy.

3.1.6 Command-line interface¶

ID	Requirement
FR-CLI-01	The CLI SHALL provide sub-commands: `create`, `vote`, `simulate`, `seed`, `goal`, `users`, `info`, `covers`.
FR-CLI-02	The CLI SHALL support global options: `--log-level`, `--log-file`, `--model`, `--max-steps`, `--site-url`.
FR-CLI-03	The CLI SHALL print the version via `--version`.
FR-CLI-04	The `info` command SHALL display all active configuration values.
FR-CLI-05	The `covers` command SHALL work without launching a browser.
FR-CLI-06	All sub-commands that launch a browser SHALL support `--headless`.

3.2 Non-functional requirements¶

3.2.1 Performance¶

ID	Requirement
NFR-P-01	Context switching SHALL complete in < 500 ms for already-created contexts.
NFR-P-02	Screenshots SHALL be captured within the Playwright page timeout (default 15 s).
NFR-P-03	Cover search requests SHALL complete within 10 s per source.

3.2.2 Reliability¶

ID	Requirement
NFR-R-01	HTTP errors from cover APIs SHALL be caught; the system SHALL return an empty list rather than raising.
NFR-R-02	Browser action failures SHALL return an `"ERROR: ..."` string rather than raising an exception.
NFR-R-03	The agent loop SHALL NOT crash on unknown tool names; it SHALL return an error string to Claude.

3.2.3 Maintainability¶

ID	Requirement
NFR-M-01	All public classes and functions SHALL have Sphinx-compatible docstrings.
NFR-M-02	Code SHALL conform to PEP 8 style.
NFR-M-03	Test coverage SHALL be tracked via `pytest-cov`; the coverage report SHALL be generated on each test run.
NFR-M-04	New cover sources SHALL be addable without modifying `CoverFetcher`.

3.2.4 Security¶

ID	Requirement
NFR-S-01	The Anthropic API key SHALL be loaded from environment variables or `.env`; it SHALL NOT be hardcoded.
NFR-S-02	Browser contexts SHALL NOT share cookies or session storage across users.
NFR-S-03	The `--log-file` output SHALL NOT include the API key in plaintext.

3.2.5 Portability¶

ID	Requirement
NFR-PT-01	The tool SHALL run on macOS, Linux, and Windows.
NFR-PT-02	All dependencies SHALL be declared in `pyproject.toml`.
NFR-PT-03	The tool SHALL be installable in a Python virtual environment via `pip install -e .`.

4. Supporting information¶

4.1 Requirements traceability matrix¶

Requirement	Source module	Test file
FR-BA-01 – FR-BA-10	`browser.py`	`tests/unit/test_browser.py`
FR-UC-01 – FR-UC-05	`browser.py`	`tests/unit/test_browser.py`
FR-CF-01 – FR-CF-06	`covers.py`	`tests/unit/test_covers.py`
FR-AL-01 – FR-AL-07	`agent.py`	`tests/unit/test_agent.py`
FR-OB-01 – FR-OB-05	`agent.py`	`tests/unit/test_agent.py`
FR-CLI-01 – FR-CLI-06	`cli.py`	(CLI tests — future)

4.2 Out of scope for v1.0¶

Cross-browser support (Firefox, WebKit)
Parallel multi-user execution
Persistent test result storage
GUI dashboard or web API
Authentication methods other than username/password form fill