Next Phase Directions
Slice 12 — Directions for future Forge work, organized by risk and readiness.
This document is not a roadmap. It is a compass. It records what is safe to attempt, what is dangerous to attempt, what remains unknown, and what thematic slices might structure future work.
Post-Inspection Direction Lock
After Slice 12 local inspection, one constraint is now explicit:
- prioritize operational Promptsmithing loop delivery before broad architectural expansion
Why:
- semantic geography is now understandable enough to support shell-based operation
- Runtime + Observatory + Ecology already forms a usable inspection substrate
- premature expansion risks reintroducing complexity before first real loop validation
Operational focus for immediate phase:
- move Forge toward a real executable work-order loop (questioning -> warding -> slice output)
- keep provider/persistence work bounded and reversible
Safe Next Steps
These are well-bounded, low-risk extensions that build on existing foundations without requiring architectural changes. Items 1 and 2 touch live providers and persistence, but only in bounded, reversible form; the rest need neither.
1. Live Provider Staging
Stage a single provider (e.g., OpenAI) behind a feature flag in the runtime layer.
Why safe: The runtime foundation (forge/src/lib/runtime/) already has provider types, model registry, and mock execution. Wiring a real provider is a bounded integration task.
Suggested approach:
- Create forge/src/lib/runtime/providers/openai.ts with a pure function that builds fetch requests (no SDK import)
- Add a LIVE_PROVIDERS feature flag in runtime config
- Keep mock providers as fallback when flag is off
- No env vars in code; read from .env.local at the page/route boundary
Risk: Low. Provider calls are stateless. No persistence required.
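The approach above can be sketched roughly as follows. This is a minimal illustration, not the Forge runtime's actual API: RuntimeConfig, ChatMessage, ProviderRequest, buildOpenAIRequest, and planExecution are hypothetical names, and only the endpoint URL and bearer-token header reflect OpenAI's public HTTP interface. The builder performs no I/O, so it can be inspected or tested without a network call.

```typescript
// Hypothetical shapes: none of these names come from the Forge codebase.
interface RuntimeConfig {
  LIVE_PROVIDERS: boolean;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProviderRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Pure: describes a fetch call without performing it.
// The API key is passed in by the caller, which reads it at the page/route boundary.
function buildOpenAIRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ProviderRequest {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

type ExecutionPlan =
  | { kind: "mock" }
  | { kind: "live"; request: ProviderRequest };

// Falls back to the mock path when the flag is off or no key is available.
function planExecution(
  config: RuntimeConfig,
  apiKey: string | undefined,
  model: string,
  messages: ChatMessage[],
): ExecutionPlan {
  if (!config.LIVE_PROVIDERS || !apiKey) return { kind: "mock" };
  return { kind: "live", request: buildOpenAIRequest(apiKey, model, messages) };
}
```

A caller at the route boundary would pass plan.request.url and plan.request.init to fetch only when plan.kind is "live", which keeps the live path easy to switch off.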
2. Local Persistence (Session Save/Load)
Add localStorage-based session persistence for chamber sessions and dashboard runs.
Why safe: localStorage is synchronous, client-only, and requires no server infrastructure. The observatory types (forge/src/lib/observatory/types.ts) already define the data shapes.
Suggested approach:
- Create forge/src/lib/persistence/ with pure save/load helpers
- Add a useSessionPersistence() hook for React components
- No database, no filesystem, no server actions
Risk: Low. localStorage is well-understood and easily reversible.
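One possible shape for the save/load helpers is sketched below. The KeyValueStore interface, the key prefix, and all function names are assumptions for illustration; the interface is a subset of the Web Storage API, so window.localStorage could be passed in directly, while an in-memory stand-in keeps the helpers testable outside a browser.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it structurally.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical key prefix, chosen only for this sketch.
const SESSION_KEY_PREFIX = "forge:session:";

function saveSession<T>(store: KeyValueStore, id: string, session: T): void {
  store.setItem(SESSION_KEY_PREFIX + id, JSON.stringify(session));
}

function loadSession<T>(store: KeyValueStore, id: string): T | null {
  const raw = store.getItem(SESSION_KEY_PREFIX + id);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as T;
  } catch {
    return null; // corrupted entry: treat it as absent rather than crash
  }
}

function clearSession(store: KeyValueStore, id: string): void {
  store.removeItem(SESSION_KEY_PREFIX + id);
}

// In-memory stand-in for tests and server-side rendering.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

Injecting the store keeps the helpers free of direct browser globals, which also makes the feature easy to remove again.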
3. Observatory Replay System
Build a replay viewer that steps through a chamber session event-by-event.
Why safe: The chamber bridge layer (forge/src/lib/chamber/) already produces ChamberSession objects with timestamped events. Replay is a pure UI concern.
Suggested approach:
- Create a SessionReplay component that takes a ChamberSession and renders events sequentially with play/pause/step controls
- No new data layer required; consumes existing types
Risk: Low. Pure UI, no side effects.
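The play/pause/step behaviour could be modelled as a small pure reducer that a SessionReplay component drives. The sketch below assumes a simplified event shape (timestamp plus text); the real ChamberSession type may differ, and every name here is illustrative.

```typescript
// Assumed event shape; the real ChamberSession events may carry more fields.
interface ReplayEvent {
  timestamp: number;
  text: string;
}

interface ReplayState {
  cursor: number; // number of events currently revealed
  playing: boolean;
}

type ReplayAction =
  | { type: "play" }
  | { type: "pause" }
  | { type: "step" }
  | { type: "reset" };

const initialReplayState: ReplayState = { cursor: 0, playing: false };

function replayReducer(
  state: ReplayState,
  action: ReplayAction,
  eventCount: number,
): ReplayState {
  switch (action.type) {
    case "play":
      // Nothing to play once every event is already revealed.
      return { ...state, playing: state.cursor < eventCount };
    case "pause":
      return { ...state, playing: false };
    case "step": {
      const cursor = Math.min(state.cursor + 1, eventCount);
      // Playback stops automatically when the last event is reached.
      return { cursor, playing: state.playing && cursor < eventCount };
    }
    case "reset":
      return initialReplayState;
  }
}

// The slice of the session the viewer should render for a given state.
function visibleEvents(events: ReplayEvent[], state: ReplayState): ReplayEvent[] {
  return events.slice(0, state.cursor);
}
```

In a React component this reducer would sit behind useReducer, with a timer dispatching "step" while playing is true.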
4. Chamber Session Export
Add markdown export for chamber sessions (similar to existing observatory export).
Why safe: The observatory layer already has renderChamberSessionInspectionMarkdown() in the chamber adapter. This is a UI wrapper around existing pure functions.
Suggested approach:
- Add an "Export" button to the Chamber Inspection Panel
- Generate markdown from existing adapter functions
- Use navigator.clipboard, or download as a .md file
Risk: Low. Pure UI, no new data layer.
5. Richer IRC Chrome
Enhance the /chamber page with more IRC-ambient UI elements: member list sidebar, status line animations, timestamps, join/part messages.
Why safe: The chamber-runtime layer already produces all the data needed. This is purely a UI enhancement pass.
Suggested approach:
- Add member list sidebar showing present users and moderator state
- Animate status line updates
- Add timestamps to log entries
- Style join/part messages differently
Risk: Low. Pure UI, no behavioral changes.
6. Runtime Evaluation Tooling
Build a simple evaluation harness that runs a mock scenario against multiple mock providers and compares outputs.
Why safe: The simulation harness (forge/src/lib/simulation/) already has scenarios and mock execution. Evaluation is a pure function that compares outputs.
Suggested approach:
- Create forge/src/lib/evaluation/ with comparison helpers
- Add an Evaluate tab in the simulation page
- No live providers required; mock-only comparison
Risk: Low. Pure data layer, no side effects.
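A comparison helper of this kind might look like the sketch below. MockProvider, ComparisonRow, and compareProviders are hypothetical names, and treating the first provider's output as the baseline is an assumption made only to keep the example small.

```typescript
// Hypothetical mock provider: a pure function from prompt to output.
interface MockProvider {
  id: string;
  run(prompt: string): string;
}

interface ComparisonRow {
  providerId: string;
  output: string;
  charCount: number;
  matchesBaseline: boolean; // true when output equals the first provider's output
}

// Runs one prompt through every provider and compares each output
// against the first provider's output (the assumed baseline).
function compareProviders(prompt: string, providers: MockProvider[]): ComparisonRow[] {
  const outputs = providers.map((p) => ({ providerId: p.id, output: p.run(prompt) }));
  const baseline = outputs[0]?.output ?? "";
  return outputs.map(({ providerId, output }) => ({
    providerId,
    output,
    charCount: output.length,
    matchesBaseline: output === baseline,
  }));
}

// Usage with two toy providers.
const echo: MockProvider = { id: "echo", run: (prompt) => prompt };
const shout: MockProvider = { id: "shout", run: (prompt) => prompt.toUpperCase() };
const rows = compareProviders("ward the intent", [echo, shout]);
```

Because providers are plain functions, the harness stays mock-only and side-effect free, which matches the risk assessment above.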
Dangerous Premature Moves
These are tempting but risky. They should wait until the foundations are proven with live providers and real sessions.
1. Full Multi-User Networking
Adding WebSocket-based multi-user chat to the Chamber.
Why dangerous: Multi-user networking introduces state synchronization, conflict resolution, presence broadcasting, and server infrastructure. It would overwhelm the current mock-only architecture and require significant server-side work.
Wait until: Live provider staging is complete, persistence works, and the single-user chamber experience is validated.
2. Infinite Memory Claims
Building a system that claims to "remember everything" across sessions.
Why dangerous: Infinite memory is a UX promise that is extremely difficult to fulfill well. It creates expectations of perfect recall, which leads to disappointment when the system inevitably forgets or misremembers.
Wait until: Persistence is stable, the observatory can measure memory quality, and the chamber has enough session history to test against.
3. Assistant-Style Over-Responsiveness
Removing selective response in favor of always-available assistant behavior.
Why dangerous: This would undo the core behavioral insight of the Chamber. "She may or may not answer" is foundational, not a bug. Over-responsiveness flattens the room into another chatbot interface.
Do not do this. If users want an assistant, they can use the old labs or a separate interface. The Chamber is not an assistant.
4. Aggressive Refactors
Renaming modules, restructuring directories, or rewriting working code for "cleanliness."
Why dangerous: The forge is still stabilizing. Aggressive refactors introduce churn without adding capability. They make continuity harder and increase the risk of breaking something that works.
Wait until: At least one live provider is connected and a full session cycle has been validated end-to-end.
5. Deleting Old Labs Prematurely
Removing apps/teteh-lab or apps/api-lab because "everything is in forge now."
Why dangerous: Not everything is in forge yet. The old labs contain:
- Live provider orchestration (not ported)
- Continuity Engine v.00 (not ported)
- Room-culture calibration notes (not ported)
- Working chat API routes (not ported)
- Historical reference implementations
Wait until: All critical functionality is ported and verified in forge, and the old labs have been stable for at least one development cycle.
6. Over-Gamifying the Chamber
Adding points, levels, achievements, or progress bars to the Chamber.
Why dangerous: Gamification changes the fundamental nature of the room. The Chamber is a low-pressure continuity space, not a game. Gamification introduces extrinsic motivation that undermines the intrinsic value of ambient presence.
Do not do this. If gamification is desired, create a separate experiment.
Open Research Questions
These are questions that emerged during the consolidation arc but remain unanswered. They are worth exploring — but not in the context of a single slice.
1. Continuity Through Atmosphere
Can continuity be maintained through ambient environmental signals (status messages, presence changes, pacing shifts) rather than explicit memory recall?
Hypothesis: Yes — and this is more natural than explicit memory systems. A room that feels familiar through its atmosphere creates a stronger sense of continuity than a room that explicitly states "I remember you said X."
2. Environmental Memory
What does it mean for a room to remember, as opposed to a user or an agent?
Hypothesis: Room memory is about atmosphere, residue, and pacing — not facts. The room remembers that it was quiet last night, that the conversation drifted toward a certain topic, that the moderator was in a particular state. This is different from a database of facts about the user.
3. Selective Response Psychology
What is the psychological effect of selective response on users? Does it increase or decrease engagement? Does it create frustration or appreciation?
Hypothesis: Selective response increases the perceived value of responses when they do come. Silence creates space for reflection. The absence of constant engagement makes engagement more meaningful when it occurs.
4. Room Persistence vs Identity Persistence
Should the room persist its state across sessions, or should the user's identity persist across rooms?
Hypothesis: Room persistence is more important than identity persistence. A room that remembers its atmosphere, its pacing, and its history creates continuity more effectively than a user profile that follows the user between rooms.
5. Observatory Interpretation Drift
As the observatory accumulates more data, how do we prevent interpretation drift — where the same signal is interpreted differently over time?
Hypothesis: Observatory drift is itself a signal worth measuring. The meta-observatory (observatory of the observatory) may be necessary for long-term continuity.
Suggested Future Slice Themes
These are thematic slices that could structure future work. Each is focused, bounded, and builds on existing foundations.
Slice Theme A: Provider Sandboxing
Wire a single live provider behind a feature flag. Create a sandboxed execution environment that isolates provider calls from the rest of the forge. Add provider health monitoring and fallback to mock providers.
Dependencies: Runtime foundation (Slice 09), simulation harness (Slice 10)
Slice Theme B: Replayable Chamber Sessions
Build a session recording and replay system. Record chamber sessions to localStorage. Add a replay viewer that steps through events. Add export to markdown.
Dependencies: Chamber runtime loop (Slice 11), observatory foundation (Slice 04)
Slice Theme C: Observatory Timelines
Build a timeline view that shows observatory data over time. Plot dimension scores, drift events, cost trends, and annotation patterns across multiple sessions.
Dependencies: Observatory foundation (Slice 04), persistence (Slice Theme B)
Slice Theme D: Residue Persistence
Implement a residue system that persists chamber session artifacts (status messages, moderator notes, silence markers) across sessions. The room accumulates residue over time.
Dependencies: Chamber runtime loop (Slice 11), persistence (Slice Theme B)
Slice Theme E: Canonical Manuscript Export
Build a manuscript export system that renders forge sessions, observatory reads, and ecology states into a canonical markdown manuscript format suitable for archival.
Dependencies: All foundations, persistence
Slice 13 Update — Promptsmithing Loop Shell
Slice 13 delivered the first usable promptsmithing loop shell:
- forge/src/lib/promptsmith/: 4 pure, mock-only modules (types, sample intents, work order generation, barrel export)
- forge/src/app/promptsmith/page.tsx: 5-phase interactive shell (Intent Intake → Questioning Surface → Semantic Warding → Work Order Generation → Exportable Prompt Output)
- Homepage updated with Promptsmith section card and nav link
This aligns with the "prioritize operational Promptsmithing loop delivery before broad architectural expansion" direction from Slice 12. The shell is standing — no live providers, no persistence, no API routes.
Next logical slice themes after Slice 13:
- Wire real intent parsing and questioning generation
- Add operator-authored intent input (free text)
- Add operator-authored ward creation
- Wire work order generation to real structured output
- Add session persistence (localStorage)
Closing Note
The forge is not a project to be finished. It is a room to be returned to.
The next phase is not "complete the forge." The next phase is "decide which room to enter next." Each thematic slice is a door. Choose wisely, enter fully, and leave the room ready for return.
Slice 14 Update — Promptsmith Session Samples
Slice 14 strengthened the promptsmithing loop with 5 fully shaped sample sessions:
- Expanded sample-intents.ts: 5 sessions (Coding Onboarding, UI Readability, Manuscript Synthesis, Visual Icon Set, UMKM Landing Page) — each with RawIntent, StructuredIntent, 5 questions, 5-6 wards, complete WorkOrder
- New SampleSession type: Registry pattern with SAMPLE_SESSIONS array
- New promptsmith-lexicon.md: 11 stabilized operational terms in manuscript style
- Updated promptsmith page.tsx: Session selector with domain/ward badges, ward strength summary cards, risk indicators, category breakdown badges
- Fixed work-order.ts: Replaced broken require() with proper imports
- Updated index.ts: Added SampleSession type and SAMPLE_SESSIONS exports
All legacy exports preserved. No providers, no persistence, no APIs.
Slice 15 Update — Portable Manuscript Export
Slice 15 transformed the Promptsmithing loop from an internal shaping surface into a portable manuscript generator:
- New manuscript.ts: Pure markdown manuscript renderer — no React, no side effects, no I/O. Generates canonical Forge Manuscript format with Metadata, Intent, Questions, Wards, Slice Plan, Verification, Done, Next Slice, Continuity Notes, Export Spell.
- New forge-manuscript-realization.md: Canonical finding documenting the manuscript as portable artifact — not trapped SaaS prompt, not infinite interaction.
- Updated promptsmith page.tsx: "Seal Manuscript" action, closure ritual, manuscript preview panel, Copy/Download .md buttons, disabled shaping controls after sealing.
- Updated index.ts: Added ManuscriptMeta type and manuscript function exports.
Filenames follow forge-manuscript-{slug}.md convention. No providers, no persistence, no APIs.
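The filename convention could be produced by a helper along these lines. The slug rules here (lowercase ASCII, hyphen-separated, "untitled" as fallback) are assumptions; the source only fixes the forge-manuscript-{slug}.md pattern, and slugify and manuscriptFilename are illustrative names.

```typescript
// Assumed slug rules: strip accents, lowercase, collapse non-alphanumerics to "-".
function slugify(title: string): string {
  return title
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accent marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Applies the forge-manuscript-{slug}.md convention.
function manuscriptFilename(title: string): string {
  const slug = slugify(title) || "untitled"; // hypothetical fallback for empty titles
  return `forge-manuscript-${slug}.md`;
}

const exampleName = manuscriptFilename("UMKM Landing Page");
// exampleName is "forge-manuscript-umkm-landing-page.md"
```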
Written 2026-05-21. Slice 15 complete.
Slice 16 Update — Readability Stabilization
Slice 16 delivered a comprehensive readability stabilization pass across all forge pages:
- Typography system: CSS custom properties for font sizes (11px–24px), line-height (1.7 body, 1.3 headings), proportional fonts for body reading, monospace for metadata/sigils
- New CSS utility classes: .forge-section, .forge-manuscript-preview, .forge-data-panel, .forge-data-grid-2/3, .chamber-room styles, .forge-phase-indicator, .forge-tabs, .forge-ward-, .forge-btn, .forge-shell-banner
- All pages updated: No inline text-[10px] or text-[9px] remains. Body text at 15px throughout. Metadata at 11-12px mono. Navigation at 12px. Section headings at 18px. Page titles at 22px.
- Mobile baseline: 640px breakpoint, never below 14px for important reading text
- New finding: bibliotheca/findings/readability-before-expansion.md
This aligns with the "prioritize operational Promptsmithing loop delivery before broad architectural expansion" direction. The forge is now readable without losing its dense manuscript atmosphere.
Next logical slice themes after Slice 16:
- Wire real intent parsing and questioning generation
- Add operator-authored intent input (free text)
- Add operator-authored ward creation
- Wire work order generation to real structured output
- Add session persistence (localStorage)
Slice 17 Direction Lock — Furnace Trials Without Roadmap Drift
Slice 17 closes the loop on Slices 13-16 and locks the next direction:
- run Furnace Trials as bounded research operations
- compare Raw Prompt vs Forge Manuscript behavior
- score outcomes manually through an observatory rubric
- stabilize a practical spellbook from repeated cross-furnace tests
Immediate Research Track
- Furnace Trials System: repeatable test protocol across external cognition engines.
- Raw vs Manuscript Evaluation: same intent, two artifacts, compare semantic adherence and drift.
- Manual Observatory Scoring: human-first rubric for coherence, ward integrity, and objective completion.
- Cross-Furnace Spell Testing: run identical spells across multiple engines and capture deltas.
- Spellbook Stabilization: promote reliable shaping patterns into canonical reusable forms.
- Observatory Research Structure: keep findings short, operational, and continuity-linked.
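As a rough illustration of how one Raw vs Manuscript trial might be recorded, the sketch below holds human-entered rubric scores and computes the per-dimension difference. The 1 to 5 scale, the field names, and the TrialRecord shape are all assumptions; scores are typed in by an operator, and nothing here is computed automatically from furnace output.

```typescript
// Human-entered scores on an assumed 1-5 scale, one per rubric dimension.
interface RubricScore {
  coherence: number;
  wardIntegrity: number;
  objectiveCompletion: number;
}

// One trial: same intent, two artifacts, one furnace (external engine).
interface TrialRecord {
  furnace: string;
  intent: string;
  raw: RubricScore;
  manuscript: RubricScore;
}

// Positive values mean the Forge Manuscript scored higher than the Raw Prompt.
function scoreDelta(trial: TrialRecord): RubricScore {
  return {
    coherence: trial.manuscript.coherence - trial.raw.coherence,
    wardIntegrity: trial.manuscript.wardIntegrity - trial.raw.wardIntegrity,
    objectiveCompletion:
      trial.manuscript.objectiveCompletion - trial.raw.objectiveCompletion,
  };
}

// Illustrative values only, not real trial results.
const sampleTrial: TrialRecord = {
  furnace: "engine-a",
  intent: "UI readability pass",
  raw: { coherence: 3, wardIntegrity: 2, objectiveCompletion: 3 },
  manuscript: { coherence: 4, wardIntegrity: 4, objectiveCompletion: 3 },
};
const sampleDelta = scoreDelta(sampleTrial);
```

Keeping the record this small fits the guardrail that follows: it is a note-taking shape for manual trials, not an evaluation pipeline.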
Anti-Roadmap Guardrail
Do not interpret this direction as product expansion.
Explicitly out of scope:
- agents
- orchestration
- persistence infra
- accounts
- provider coupling
- autonomous execution
- SaaS surface inflation
Forge remains a shaping layer and manuscript workshop, not an automation platform.
Semantic Handling Note
When instructions fail, first test for semantic drowning before claiming semantic disappearance. The signal may still exist but be submerged under stronger or noisier directives.
Slice 17 Update — Ecology Room Stabilization
Slice 17 delivered the third pillar of the Forge Triforce — the Ecology room:
- New library: forge/src/lib/ecology-room/ with 7 modules (types, primer, conditions, climate, pressure, field-notes, index), covering 14 type definitions, 10 foundational concepts, 8 environmental conditions, 8 climate dimensions, 6 pressure types, and sample field data
- New page: forge/src/app/ecology/page.tsx, a full manuscript-style room with 9 sections (Primer, Conditions, Climate, Pressure, Correction Load, Drift Residue, Returnability, Furnace Notes, Field Observations)
- New CSS: ~500 lines of ecology-specific styles in globals.css (restrained, monochrome, breathable spacing, desktop-first, no charts or dashboard elements)
- Updated homepage: Ecology description now reads "Interaction conditions, semantic climate, continuity pressure, field notes. The weather layer of human × AI collaboration."
- New finding: bibliotheca/findings/ecology-room-stabilization.md, the canonical documentation of the Ecology room architecture and design decisions
Key Design Decisions
- No runtime logic. The library is pure types and constants. No API calls, no persistence, no side effects.
- Sample data, not real measurements. Field notes are illustrative samples demonstrating the kind of observations Ecology might contain after real furnace trials.
- Observatory-adjacent but distinct. Ecology studies why it feels the way it does; Observatory studies how we inspect.
- No charts. Data is presented as field notes, tables, and cards — manuscript-native forms.
Relationship to Existing Chamber Ecology
The existing forge/src/lib/ecology/ is chamber-focused (room-culture, pacing, moderator presence). The new forge/src/lib/ecology-room/ is the Triforce Ecology — the study of interaction coherence at the conceptual level. They are different layers.
Next After Slice 17
The Ecology room is stabilized as a conceptual space. The next logical direction remains Furnace Trials (as documented in the Slice 17 Direction Lock above). The Ecology room provides the vocabulary and framework for interpreting furnace trial results.
Slice 18 Update — Observatory Room Stabilization
Slice 18 delivered the second pillar of the Forge Triforce — the Observatory room, completing the Triforce:
- New library: forge/src/lib/observatory-room/ with 7 modules (types, primer, inspection-lenses, drift-signals, comparison-fields, sample-observations, index), covering 18 type definitions, 10 foundational concepts, 8 inspection lenses, 6 drift signals, 8 comparison fields, and sample observation data
- New page: forge/src/app/observatory/page.tsx, a full manuscript-style inspection room with 12 sections (Primer, Inspection Lenses, Lens Readings, Drift Signals, Manuscript Comparison, Instruction Survival, Atmosphere Preservation, Correction Load, Furnace Traces, Operator Notes, Findings Extraction, Re-entry Clarity)
- New CSS: ~500 lines of observatory-specific styles in globals.css (monochrome, restrained, breathable spacing, desktop-first, no charts or dashboard elements)
- Updated homepage: Observatory description now reads "Semantic inspection, drift signals, manuscript comparison, output survival. The inspection table of the Forge."
- New finding: bibliotheca/findings/observatory-room-stabilization.md, the canonical documentation of the Observatory room architecture and design decisions
Triforce Complete
The Forge Triforce is now operationally complete:
    Runtime ──→ Observatory ──→ Ecology
       │             │             │
       │       inspects what   interprets why
       │         happened     it felt that way
       │             │             │
    records        reads      understands

- Runtime (Slice 09): what happened
- Observatory (Slice 18): how we inspect what happened
- Ecology (Slice 17): why the interaction felt coherent or incoherent
Key Design Decisions
- Observatory is NOT analytics. It does not measure, score, or rank. It inspects. The distinction is critical.
- Sample data, not real measurements. All data is illustrative. Real observations come from furnace trials.
- Lenses over metrics. Instead of numerical scores, the Observatory uses inspection lenses — each framing the artifact differently.
- Drift as primary signal. Drift (meaning shift between intent and output) is the most common failure mode in semantic shaping.
- Manuscript comparison as core operation. The comparison table is the Observatory's most important analytical tool.
- Findings as stable output. Findings survive multiple readings and multiple furnace passes. They feed the Ecology layer.
- Re-entry clarity as closing ritual. Every inspection session ends with a clarity assessment.
Relationship to Existing Observatory Foundation
The existing forge/src/lib/observatory/ (Slice 04) is the Dashboard-facing observatory foundation. The new forge/src/lib/observatory-room/ is the Triforce Observatory room library — conceptually distinct. They serve different purposes:
- observatory/ → Dashboard instrumentation
- observatory-room/ → Triforce inspection room
Next After Slice 18
The Triforce is complete. The next logical direction remains Furnace Trials — running real prompts through external cognition engines and using the Observatory to inspect the results. The Observatory provides the vocabulary and framework for interpreting furnace trial output.
Direction lock updated 2026-05-21. Triforce operational. Furnace next, roadmap drift blocked.