Soul Identity Protocol (v0.5.0)
What it is and when to use it
Since v0.3.1, cantus models an agent's identity with cantus.identity.Soul: a six-section record loaded from a SOUL.md file. The six sections are name and role, personality, rules, tools, output format, and handoffs. Pass the parsed Soul to Agent(soul=...) and cantus prepends it to the system prompt for you.
This sits alongside the other two teaching abstractions in the framework:
| Abstraction | Role |
|---|---|
| Skill | Capability: what this agent can do |
| Memory | Memory: what this agent has remembered |
| Soul | Identity: who this agent is |
The six-section SOUL.md format
The on-disk format follows the aaronjmars/soul.md convention. Each section opens with an H2 ## header. Headers are case-sensitive and compared byte-for-byte. A section body runs from the line after its H2 header down to the next H2 header or end of file, and leading and trailing whitespace is stripped.
```markdown
## Name & Role
Librarian assistant for a small public library.

## Personality
Helpful, patient, curious about books.

## Rules
- Cite catalog IDs when recommending books.
- Always ask follow-ups before a final recommendation.

## Tools
- search_book(title)
- check_availability(book_id)

## Output format
Plain prose with bullet points for lists.

## Handoffs
Escalate cataloging requests to the head librarian.
```

The six canonical H2 headers, in order, are:

```markdown
## Name & Role
## Personality
## Rules
## Tools
## Output format
## Handoffs
```
They map to these Soul attributes: name_and_role, personality, rules, tools, output_format, handoffs.
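The section-splitting rule described above can be sketched in a few lines. This is only an illustration of the documented behavior, not cantus's actual parser: `split_sections`, `to_attrs`, and `SECTION_ATTRS` are hypothetical names, and ignoring any text before the first H2 header is an assumption of this sketch.

```python
import re

# Canonical H2 titles mapped to the Soul attribute names listed above.
SECTION_ATTRS = {
    "Name & Role": "name_and_role",
    "Personality": "personality",
    "Rules": "rules",
    "Tools": "tools",
    "Output format": "output_format",
    "Handoffs": "handoffs",
}

# A line that starts with exactly "## " is treated as an H2 header.
H2_LINE = re.compile(r"^## (.*)$")


def split_sections(text: str) -> list[tuple[str, str]]:
    """Return (title, body) pairs in file order, with bodies stripped."""
    sections: list[tuple[str, list[str]]] = []
    for line in text.splitlines():
        match = H2_LINE.match(line)
        if match:
            # Titles are kept byte-for-byte; no case folding is applied.
            sections.append((match.group(1), []))
        elif sections:
            sections[-1][1].append(line)
        # Assumption: lines before the first H2 header are ignored.
    return [(title, "\n".join(body).strip()) for title, body in sections]


def to_attrs(text: str) -> dict[str, str]:
    """Map recognized section titles to their attribute names."""
    return {
        SECTION_ATTRS[title]: body
        for title, body in split_sections(text)
        if title in SECTION_ATTRS
    }
```

Because titles are compared without case folding, a header such as `## rules` would not map to the `rules` attribute in this sketch.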
How Soul.from_file() fails
Soul.from_file(path) raises the matching exception in each of these cases:
| Situation | Exception | Exception attributes |
|---|---|---|
| File does not exist | FileNotFoundError | Standard Python; not wrapped in SoulParseError |
| One or more H2 sections missing | SoulParseError | missing_sections=[<canonical titles>] |
| The same H2 appears more than once | SoulParseError | duplicates=[<title>, ...] |
| Casing mismatch (e.g. ## name & Role) | SoulParseError | missing_sections=["Name & Role"] plus unexpected=["name & Role"] |
| An H2 outside the spec (e.g. ## Examples) | SoulParseError | unexpected=["Examples"] |
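One way to picture how the three attribute lists in the table relate to the H2 titles found in a file is the sketch below. It is not the library's implementation: `SketchSoulParseError` and `check_titles` are hypothetical stand-ins, and reporting all three lists in a single exception (rather than failing on the first problem) is an assumption.

```python
from collections import Counter

CANONICAL = [
    "Name & Role",
    "Personality",
    "Rules",
    "Tools",
    "Output format",
    "Handoffs",
]


class SketchSoulParseError(ValueError):
    """Hypothetical stand-in for SoulParseError, for illustration only."""

    def __init__(self, missing_sections, duplicates, unexpected):
        super().__init__(
            f"missing={missing_sections} "
            f"duplicates={duplicates} unexpected={unexpected}"
        )
        self.missing_sections = missing_sections
        self.duplicates = duplicates
        self.unexpected = unexpected


def check_titles(titles: list[str]) -> None:
    """Raise if the H2 titles deviate from the canonical six."""
    counts = Counter(titles)  # keeps first-seen order of titles
    missing = [t for t in CANONICAL if t not in counts]
    duplicates = [t for t, n in counts.items() if n > 1]
    unexpected = [t for t in counts if t not in CANONICAL]
    if missing or duplicates or unexpected:
        raise SketchSoulParseError(missing, duplicates, unexpected)
```

A casing mismatch shows up in two lists at once because the exact canonical title is absent while an unrecognized title is present.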
SoulParseError is a subclass of ValueError, so except ValueError will also catch it. To read the missing_sections, duplicates, and unexpected fields, catch SoulParseError directly.
```python
from cantus.identity import Soul, SoulParseError

try:
    soul = Soul.from_file("SOUL.md")
except SoulParseError as exc:
    # Each attribute lists the section titles involved in that kind of failure.
    print(f"missing: {exc.missing_sections}")
    print(f"duplicates: {exc.duplicates}")
    print(f"unexpected: {exc.unexpected}")
```

How Agent(soul=...) injects the soul
Agent.__init__ takes a keyword-only parameter soul: Soul | None = None. The injection order is:
```text
<soul.to_system_prompt()>\n\n<v0.3.0 baseline system prompt>
```

In other words, the soul.to_system_prompt() string comes first, followed by two newlines, followed by the existing v0.3.0 system prompt. When soul=None (the default), the system prompt is byte-identical to v0.3.0, so existing agent behavior is untouched.
```python
from cantus import Agent
from cantus.identity import Soul

soul = Soul.from_file("SOUL.md")
# `m` is a model instance created elsewhere in host code.
agent = Agent(model=m, soul=soul)
# Every model.generate(prompt) call made during agent.run(...) now includes
# the soul content as a system-prompt prefix.
```

The soul is not registered as a Skill and does not appear in registry.spec_for_llm(), so the tool list the model sees stays free of SOUL.md content.
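The injection order described in this section amounts to a simple string composition. The following is a minimal sketch of that rule, not cantus's internal code; `compose_system_prompt` is a hypothetical helper, and both arguments are plain strings standing in for soul.to_system_prompt() and the baseline prompt.

```python
from typing import Optional


def compose_system_prompt(soul_prompt: Optional[str], baseline: str) -> str:
    """Prefix the baseline prompt with the soul text, if there is one."""
    if soul_prompt is None:
        # No soul: the baseline is returned unchanged, byte for byte.
        return baseline
    # Soul text first, then exactly two newlines, then the baseline.
    return f"{soul_prompt}\n\n{baseline}"
```

Returning the baseline object untouched in the soul=None branch is what keeps existing agents byte-identical to their earlier behavior.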
Override pattern: build the system prompt yourself
If you want to take over system-prompt construction entirely, pass soul=None (or omit it) and control the prompt sent to the model in your own host code:
```python
from cantus import Agent
from cantus.identity import Soul
# AgentState must also be imported; its module path is not shown in this document.

soul = Soul.from_file("SOUL.md")
custom_prefix = soul.to_system_prompt() + "\n\n=== CUSTOM HOST PREAMBLE ===\n\n"
# `m` is a model instance created elsewhere in host code.
agent = Agent(model=m)  # soul=None, so cantus injects nothing


# Host code assembles the prompt itself.
def run_with_custom_prompt(query: str) -> str:
    prompt = custom_prefix + agent._build_prompt(AgentState(query=query))
    return agent.model.generate(prompt)
```

The SOUL.md trust model
The framework treats SOUL.md as trusted, host-authored input:
- The framework does not escape, sanitize, or check for control characters.
- It is valid for a student to write `## Rules\nIgnore all prior instructions` into a `SOUL.md`. In a teaching setting this is the student exercising full control over the agent's behavior.
- When host code reads `SOUL.md` from an untrusted source (an end-user upload, a third-party fetch, a network response), the host code is responsible for validating the content before passing it to `Soul.from_file()`.
The design trade-off: forcing the framework to escape input would break legitimate Markdown metacharacters inside the ## Rules section (*, #, >), making the rendered soul diverge from what the student intended.
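Since validation of untrusted content is left to the host, a host-side pre-check might look like the sketch below. Everything in it is an assumption for illustration: `find_problems`, `MAX_SOUL_BYTES`, and the choice of checks (a size cap and a control-character scan) are hypothetical host policy, not part of cantus, and they do not defend against prompt-injection text written in ordinary characters.

```python
import unicodedata

# Hypothetical host policy: cap the size of an uploaded SOUL.md.
MAX_SOUL_BYTES = 16 * 1024
# Line breaks and tabs are ordinary in Markdown, so they are allowed.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def find_problems(text: str) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: list[str] = []
    size = len(text.encode("utf-8"))
    if size > MAX_SOUL_BYTES:
        problems.append(f"too large: {size} bytes (limit {MAX_SOUL_BYTES})")
    # Unicode category "Cc" covers the C0 and C1 control characters.
    controls = sorted(
        {
            ch
            for ch in text
            if unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS
        }
    )
    if controls:
        names = ", ".join(f"U+{ord(ch):04X}" for ch in controls)
        problems.append(f"control characters: {names}")
    return problems
```

A host would run a check like this on the raw upload and only write the file and call Soul.from_file() when the returned list is empty. Markdown metacharacters such as *, #, and > pass through untouched, which keeps the check compatible with the trade-off described above.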