Software Design¶
1. Introduction¶
tot-agent is an autonomous browser testing agent that combines Claude's vision and tool-use capabilities with a real Playwright browser to execute natural-language test scenarios against web applications.
This document describes the architectural decisions, design patterns, module structure, and key data flows that define the system.
2. Architectural overview¶
graph TB
subgraph CLI Layer
CLI["cli.py<br/>Click commands"]
end
subgraph Agent Layer
Agent["agent.py<br/>BrowserAgent"]
Goals["Goal templates<br/>CreateTestsGoal, VoteGoal…"]
Observers["Observers<br/>ConsoleObserver, LoggingObserver"]
end
subgraph Infrastructure Layer
Browser["browser.py<br/>BrowserManager"]
Tools["tools.py<br/>TOOL_DEFINITIONS + dispatch()"]
Covers["covers.py<br/>CoverFetcher + Sources"]
end
subgraph External
Claude["Anthropic Claude API<br/>(vision + tool use)"]
PW["Playwright / Chromium"]
OL["Open Library API"]
GB["Google Books API"]
end
CLI --> Agent
CLI --> Goals
Agent --> Claude
Agent --> Tools
Agent --> Observers
Tools --> Browser
Tools --> Covers
Browser --> PW
Covers --> OL
Covers --> GB
3. Design patterns¶
3.1 Strategy — Cover sources¶
The covers.py module uses the Strategy pattern to make book-cover data sources interchangeable.
classDiagram
class CoverSource {
<<abstract>>
+search(query, limit) list~BookCover~
}
class OpenLibrarySource {
+search(query, limit) list~BookCover~
}
class GoogleBooksSource {
+search(query, limit) list~BookCover~
}
class CoverFetcher {
-_sources: list~CoverSource~
+fetch(query, count) list~BookCover~
}
CoverSource <|-- OpenLibrarySource
CoverSource <|-- GoogleBooksSource
CoverFetcher o-- CoverSource
Why: Adding a new cover source (e.g. Amazon, Goodreads) requires only implementing CoverSource.search() and passing the instance to CoverFetcher. No existing code needs to change.
3.2 Observer — Agent events¶
The agent.py module uses the Observer pattern to decouple the core loop from output concerns.
classDiagram
class AgentObserver {
<<abstract>>
+on_event(event) None
}
class ConsoleObserver {
+on_event(event) None
}
class LoggingObserver {
+on_event(event) None
}
class BrowserAgent {
-_observers: list~AgentObserver~
+add_observer(obs) None
+remove_observer(obs) None
-_emit(event) None
+run(goal) str
}
AgentObserver <|-- ConsoleObserver
AgentObserver <|-- LoggingObserver
BrowserAgent --> AgentObserver
Why: CI/CD pipelines need file-based logging; interactive use needs Rich terminal output; tests need neither. Observers can be mixed and matched without modifying BrowserAgent.
3.3 Template Method — Goal builders¶
The GoalTemplate hierarchy applies the Template Method pattern to goal construction.
classDiagram
class GoalTemplate {
<<abstract>>
+build(**kwargs) str
}
class CreateTestsGoal {
+build() str
}
class VoteGoal {
+build() str
}
class SimulateAllUsersGoal {
+build() str
}
class FullSeedGoal {
+build() str
}
GoalTemplate <|-- CreateTestsGoal
GoalTemplate <|-- VoteGoal
GoalTemplate <|-- SimulateAllUsersGoal
GoalTemplate <|-- FullSeedGoal
Why: Goal strings follow a common structure (context + instructions + reporting request) but vary in specifics. Sub-classes encapsulate the variation while the interface stays stable.
3.4 Context Object — BrowserManager¶
BrowserManager acts as a Context Object, encapsulating all Playwright state (browser instance, context pool, active user) behind a clean async API. This isolates the agent and tool dispatcher from Playwright's async complexity.
4. Module dependency graph¶
graph LR
cli --> agent
cli --> config
agent --> browser
agent --> tools
agent --> config
tools --> browser
tools --> covers
covers --> config
browser --> config
All modules depend only on the layer below them, preventing circular imports.
5. Agentic loop — sequence diagram¶
sequenceDiagram
participant Agent as BrowserAgent.run()
participant Claude as Anthropic API
participant Dispatcher as tools.dispatch()
participant Browser as BrowserManager
participant Observer
Agent->>Observer: emit(GOAL_START)
loop Until end_turn or step limit
Agent->>Observer: emit(STEP_START)
Agent->>Claude: messages.create(history + tools)
Claude-->>Agent: content blocks
loop For each text block
Agent->>Observer: emit(AGENT_TEXT)
end
loop For each tool_use block
Agent->>Observer: emit(TOOL_CALL)
Agent->>Dispatcher: dispatch(tool_name, input, bm)
Dispatcher->>Browser: browser action
Browser-->>Dispatcher: result
Dispatcher-->>Agent: result
Agent->>Observer: emit(TOOL_RESULT)
end
Agent->>Claude: tool_result messages
end
Agent->>Observer: emit(GOAL_COMPLETE or STEP_LIMIT)
6. Multi-user context model¶
Each simulated user gets an isolated Playwright BrowserContext with its own cookies and session storage. Switching users is O(1) — contexts are lazily created and cached.
graph TB
BM["BrowserManager"]
C1["Context: admin<br/>(cookies, sessionStorage)"]
C2["Context: alice<br/>(cookies, sessionStorage)"]
C3["Context: bob<br/>(cookies, sessionStorage)"]
BM --> C1
BM --> C2
BM --> C3
P1["Page"] --> C1
P2["Page"] --> C2
P3["Page"] --> C3
7. Key design decisions¶
| Decision | Rationale |
|---|---|
| Vision-first (screenshots) over selector-based | Adapts to any UI; no maintenance when the app changes |
| Structured tool schemas | Claude can reason about available actions; schemas are validated by the API |
| Strategy for cover sources | Open/Closed principle — new sources don't require modifying the orchestrator |
| Observer for agent output | Separates concerns; tests can use a no-op observer |
dataclass for SimUser |
Zero-boilerplate, auto-generated __eq__ and __repr__, IDE-friendly |
pyproject.toml over setup.py |
PEP 517/518 modern packaging standard |
src layout |
Prevents accidental imports of non-installed code during development |
8. Error handling strategy¶
| Layer | Approach |
|---|---|
| Browser actions | Return "ERROR: ..." strings rather than raising; the agent can read and recover |
| Cover sources | Catch httpx.HTTPError, log a warning, return empty list |
| Agent loop | Tool errors are fed back to Claude as text; Claude decides how to recover |
| CLI | Click's BadParameter for invalid user input; unhandled exceptions propagate |
9. Testing strategy¶
pyramid
accTitle: Test pyramid
accDescr: Unit tests form the base; integration tests are fewer
section Unit (fast, offline)
"config — SimUser, routes, env vars" : 15
"covers — strategy pattern, HTTP mocking" : 20
"browser — async mock, BrowserManager" : 18
"tools — dispatch routing, format_tool_result" : 14
"agent — observer pattern, loop logic" : 16
section Integration (real browser / network)
"BrowserManager — live Chromium" : 4
"CoverFetcher — live Open Library" : 2
All external HTTP calls in unit tests are intercepted with respx. Browser tests use pytest-asyncio with a live_browser fixture that spins up real Chromium (skipped by default in CI via SKIP_INTEGRATION=1).