Next Phase Directions

Slice 12 — Directions for future Forge work, organized by risk and readiness.

This document is not a roadmap. It is a compass. It records what is safe to attempt, what is dangerous to attempt, what remains unknown, and what thematic slices might structure future work.


Post-Inspection Direction Lock

After Slice 12 local inspection, one constraint is now explicit:

  • prioritize operational Promptsmithing loop delivery before broad architectural expansion

Why:

  • semantic geography is now understandable enough to support shell-based operation
  • Runtime + Observatory + Ecology already forms a usable inspection substrate
  • premature expansion risks reintroducing complexity before first real loop validation

Operational focus for immediate phase:

  • move Forge toward a real executable work-order loop (questioning -> warding -> slice output)
  • keep provider/persistence work bounded and reversible

Safe Next Steps

These are well-bounded, low-risk extensions that build on existing foundations without requiring architectural changes. Live providers and persistence appear only in their smallest, reversible forms (items 1 and 2); everything else stays mock-only.

1. Live Provider Staging

Stage a single provider (e.g., OpenAI) behind a feature flag in the runtime layer.

Why safe: The runtime foundation (forge/src/lib/runtime/) already has provider types, model registry, and mock execution. Wiring a real provider is a bounded integration task.

Suggested approach:

  • Create forge/src/lib/runtime/providers/openai.ts with a pure function that builds fetch requests (no SDK import)

  • Add a LIVE_PROVIDERS feature flag in runtime config
  • Keep mock providers as fallback when flag is off
  • Keep env var reads out of library code: read the key from .env.local at the page/route boundary and pass it in

Risk: Low. Provider calls are stateless. No persistence required.
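
A minimal sketch of the approach above, assuming a hypothetical buildOpenAIRequest helper and a planExecution gate. Only the file location, the LIVE_PROVIDERS flag name, and the "no SDK, pure function" constraint come from this document; every type and function name below is illustrative, and the endpoint shown is OpenAI's public Chat Completions URL.

```typescript
// Hypothetical shapes: none of these names are confirmed by the forge codebase.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProviderRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface RuntimeConfig {
  LIVE_PROVIDERS: boolean; // feature flag named in the text above
}

// Pure: describes a fetch-ready request and performs no I/O.
// The API key is supplied by the caller, which reads it at the route boundary.
function buildOpenAIRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ProviderRequest {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

type ExecutionPlan =
  | { kind: "mock"; reply: string }
  | { kind: "live"; request: ProviderRequest };

// Falls back to the mock path whenever the flag is off or no key is available.
function planExecution(
  config: RuntimeConfig,
  apiKey: string | undefined,
  model: string,
  messages: ChatMessage[],
): ExecutionPlan {
  if (!config.LIVE_PROVIDERS || !apiKey) {
    return { kind: "mock", reply: "[mock] no live provider call made" };
  }
  return { kind: "live", request: buildOpenAIRequest(apiKey, model, messages) };
}
```

A route handler that owns the key could hand a live plan's url and remaining fields to fetch. With the flag off, nothing leaves the process, which keeps the change reversible.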

2. Local Persistence (Session Save/Load)

Add localStorage-based session persistence for chamber sessions and dashboard runs.

Why safe: localStorage is synchronous, client-only, and requires no server infrastructure. The observatory types (forge/src/lib/observatory/types.ts) already define the data shapes.

Suggested approach:

  • Create forge/src/lib/persistence/ with pure save/load helpers
  • Add a useSessionPersistence() hook for React components
  • No database, no filesystem, no server actions

Risk: Low. localStorage is well-understood and easily reversible.
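
The pure save/load helpers could look roughly like the sketch below. It assumes a versioned JSON envelope and a "forge:session:" key prefix, neither of which is specified in this document, and it takes the store as a parameter so the helpers stay testable outside the browser (window.localStorage satisfies the same three-method interface).

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical envelope; the real data shapes live in observatory/types.ts.
interface StoredSession<T> {
  version: 1;
  savedAt: string;
  payload: T;
}

const SESSION_KEY_PREFIX = "forge:session:"; // assumed naming, not from the source

function saveSession<T>(
  store: KeyValueStore,
  id: string,
  payload: T,
  now: Date = new Date(),
): void {
  const envelope: StoredSession<T> = {
    version: 1,
    savedAt: now.toISOString(),
    payload,
  };
  store.setItem(SESSION_KEY_PREFIX + id, JSON.stringify(envelope));
}

// Returns null for missing, corrupted, or unknown-version entries.
function loadSession<T>(store: KeyValueStore, id: string): T | null {
  const raw = store.getItem(SESSION_KEY_PREFIX + id);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw) as Partial<StoredSession<T>>;
    if (parsed.version !== 1 || parsed.payload === undefined) return null;
    return parsed.payload as T;
  } catch {
    return null; // unreadable JSON is treated as "no session"
  }
}

// In-memory stand-in for tests and server-side rendering.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

A useSessionPersistence() hook would then be a thin wrapper that passes window.localStorage to these functions on the client.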

3. Observatory Replay System

Build a replay viewer that steps through a chamber session event-by-event.

Why safe: The chamber bridge layer (forge/src/lib/chamber/) already produces ChamberSession objects with timestamped events. Replay is a pure UI concern.

Suggested approach:

  • Create a SessionReplay component that takes a ChamberSession and renders events sequentially with play/pause/step controls

  • No new data layer required — consumes existing types

Risk: Low. Pure UI, no side effects.
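
One way to keep the replay controls free of side effects is a small reducer that the SessionReplay component drives. The sketch below is illustrative only: ReplayEvent is a stand-in for whatever event type ChamberSession actually carries, and the action names are assumptions.

```typescript
// Hypothetical minimal event shape; the real types live in forge/src/lib/chamber/.
interface ReplayEvent {
  timestamp: number;
  text: string;
}

interface ReplayState {
  cursor: number; // how many events are currently visible
  playing: boolean;
}

type ReplayAction =
  | { type: "play" }
  | { type: "pause" }
  | { type: "step" }
  | { type: "reset" };

// Pure transition function: same inputs always produce the same next state.
function replayReducer(
  state: ReplayState,
  action: ReplayAction,
  total: number,
): ReplayState {
  switch (action.type) {
    case "play":
      // Playing past the last event makes no sense, so refuse to start there.
      return { ...state, playing: state.cursor < total };
    case "pause":
      return { ...state, playing: false };
    case "step": {
      const cursor = Math.min(state.cursor + 1, total);
      // Auto-pause once the final event has been revealed.
      return { cursor, playing: state.playing && cursor < total };
    }
    case "reset":
      return { cursor: 0, playing: false };
  }
}

function visibleEvents(events: ReplayEvent[], state: ReplayState): ReplayEvent[] {
  return events.slice(0, state.cursor);
}
```

In a component, a timer would dispatch "step" while playing is true; the reducer itself never touches timers or the DOM.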

4. Chamber Session Export

Add markdown export for chamber sessions (similar to existing observatory export).

Why safe: The observatory layer already has renderChamberSessionInspectionMarkdown() in the chamber adapter. This is a UI wrapper around existing pure functions.

Suggested approach:

  • Add an "Export" button to the Chamber Inspection Panel
  • Generate markdown from existing adapter functions
  • Use navigator.clipboard or download as .md file

Risk: Low. Pure UI, no new data layer.

5. Richer IRC Chrome

Enhance the /chamber page with more IRC-ambient UI elements: member list sidebar, status line animations, timestamps, join/part messages.

Why safe: The chamber-runtime layer already produces all the data needed. This is purely a UI enhancement pass.

Suggested approach:

  • Add member list sidebar showing present users and moderator state
  • Animate status line updates
  • Add timestamps to log entries
  • Style join/part messages differently

Risk: Low. Pure UI, no behavioral changes.

6. Runtime Evaluation Tooling

Build a simple evaluation harness that runs a mock scenario against multiple mock providers and compares outputs.

Why safe: The simulation harness (forge/src/lib/simulation/) already has scenarios and mock execution. Evaluation is a pure function that compares outputs.

Suggested approach:

  • Create forge/src/lib/evaluation/ with comparison helpers
  • Add an Evaluate tab in the simulation page
  • No live providers required — mock-only comparison

Risk: Low. Pure data layer, no side effects.
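
As a rough illustration of what "comparison helpers" might mean here, the sketch below runs one prompt through several synchronous mock providers and tabulates the outputs against the first provider as a baseline. The MockProvider interface and the chosen comparison fields (word count, exact match) are assumptions, not the simulation harness's actual API.

```typescript
// Hypothetical mock provider shape; the real one lives in forge/src/lib/simulation/.
interface MockProvider {
  id: string;
  run(prompt: string): string;
}

interface EvaluationRow {
  providerId: string;
  output: string;
  wordCount: number;
  matchesBaseline: boolean; // exact match against the first provider's output
}

// Pure: runs each mock once and returns one comparison row per provider.
function evaluateScenario(prompt: string, providers: MockProvider[]): EvaluationRow[] {
  const outputs = providers.map((p) => ({ providerId: p.id, output: p.run(prompt) }));
  const baseline = outputs[0]?.output ?? "";
  return outputs.map(({ providerId, output }) => ({
    providerId,
    output,
    wordCount: output
      .trim()
      .split(/\s+/)
      .filter((w) => w.length > 0).length,
    matchesBaseline: output === baseline,
  }));
}
```

The rows are plain data, so an Evaluate tab could render them as a table without any further logic.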


Dangerous Premature Moves

These are tempting but risky. They should wait until the foundations are proven with live providers and real sessions.

1. Full Multi-User Networking

Adding WebSocket-based multi-user chat to the Chamber.

Why dangerous: Multi-user networking introduces state synchronization, conflict resolution, presence broadcasting, and server infrastructure. It would overwhelm the current mock-only architecture and require significant server-side work.

Wait until: Live provider staging is complete, persistence works, and the single-user chamber experience is validated.

2. Infinite Memory Claims

Building a system that claims to "remember everything" across sessions.

Why dangerous: Infinite memory is a UX promise that is extremely difficult to fulfill well. It creates expectations of perfect recall, which leads to disappointment when the system inevitably forgets or misremembers.

Wait until: Persistence is stable, the observatory can measure memory quality, and the chamber has enough session history to test against.

3. Assistant-Style Over-Responsiveness

Removing selective response in favor of always-available assistant behavior.

Why dangerous: This would undo the core behavioral insight of the Chamber. "She may or may not answer" is foundational, not a bug. Over-responsiveness flattens the room into another chatbot interface.

Do not do this. If users want an assistant, they can use the old labs or a separate interface. The Chamber is not an assistant.

4. Aggressive Refactors

Renaming modules, restructuring directories, or rewriting working code for "cleanliness."

Why dangerous: The forge is still stabilizing. Aggressive refactors introduce churn without adding capability. They make continuity harder and increase the risk of breaking something that works.

Wait until: At least one live provider is connected and a full session cycle has been validated end-to-end.

5. Deleting Old Labs Prematurely

Removing apps/teteh-lab or apps/api-lab because "everything is in forge now."

Why dangerous: Not everything is in forge yet. The old labs contain:

  • Live provider orchestration (not ported)
  • Continuity Engine v.00 (not ported)
  • Room-culture calibration notes (not ported)
  • Working chat API routes (not ported)
  • Historical reference implementations

Wait until: All critical functionality is ported and verified in forge, and the old labs have been stable for at least one development cycle.

6. Over-Gamifying the Chamber

Adding points, levels, achievements, or progress bars to the Chamber.

Why dangerous: Gamification changes the fundamental nature of the room. The Chamber is a low-pressure continuity space, not a game. Gamification introduces extrinsic motivation that undermines the intrinsic value of ambient presence.

Do not do this. If gamification is desired, create a separate experiment.


Open Research Questions

These are questions that emerged during the consolidation arc but remain unanswered. They are worth exploring — but not in the context of a single slice.

1. Continuity Through Atmosphere

Can continuity be maintained through ambient environmental signals (status messages, presence changes, pacing shifts) rather than explicit memory recall?

Hypothesis: Yes — and this is more natural than explicit memory systems. A room that feels familiar through its atmosphere creates a stronger sense of continuity than a room that explicitly states "I remember you said X."

2. Environmental Memory

What does it mean for a room to remember, as opposed to a user or an agent?

Hypothesis: Room memory is about atmosphere, residue, and pacing — not facts. The room remembers that it was quiet last night, that the conversation drifted toward a certain topic, that the moderator was in a particular state. This is different from a database of facts about the user.

3. Selective Response Psychology

What is the psychological effect of selective response on users? Does it increase or decrease engagement? Does it create frustration or appreciation?

Hypothesis: Selective response increases the perceived value of responses when they do come. Silence creates space for reflection. The absence of constant engagement makes engagement more meaningful when it occurs.

4. Room Persistence vs Identity Persistence

Should the room persist its state across sessions, or should the user's identity persist across rooms?

Hypothesis: Room persistence is more important than identity persistence. A room that remembers its atmosphere, its pacing, and its history creates continuity more effectively than a user profile that follows the user between rooms.

5. Observatory Interpretation Drift

As the observatory accumulates more data, how do we prevent interpretation drift — where the same signal is interpreted differently over time?

Hypothesis: Observatory drift is itself a signal worth measuring. The meta-observatory (observatory of the observatory) may be necessary for long-term continuity.


Suggested Future Slice Themes

These are thematic slices that could structure future work. Each is focused, bounded, and builds on existing foundations.

Slice Theme A: Provider Sandboxing

Wire a single live provider behind a feature flag. Create a sandboxed execution environment that isolates provider calls from the rest of the forge. Add provider health monitoring and fallback to mock providers.

Dependencies: Runtime foundation (Slice 09), simulation harness (Slice 10)

Slice Theme B: Replayable Chamber Sessions

Build a session recording and replay system. Record chamber sessions to localStorage. Add a replay viewer that steps through events. Add export to markdown.

Dependencies: Chamber runtime loop (Slice 11), observatory foundation (Slice 04)

Slice Theme C: Observatory Timelines

Build a timeline view that shows observatory data over time. Plot dimension scores, drift events, cost trends, and annotation patterns across multiple sessions.

Dependencies: Observatory foundation (Slice 04), persistence (Slice Theme B)

Slice Theme D: Residue Persistence

Implement a residue system that persists chamber session artifacts (status messages, moderator notes, silence markers) across sessions. The room accumulates residue over time.

Dependencies: Chamber runtime loop (Slice 11), persistence (Slice Theme B)

Slice Theme E: Canonical Manuscript Export

Build a manuscript export system that renders forge sessions, observatory reads, and ecology states into a canonical markdown manuscript format suitable for archival.

Dependencies: All foundations, persistence


Slice 13 Update — Promptsmithing Loop Shell

Slice 13 delivered the first usable promptsmithing loop shell:

  • forge/src/lib/promptsmith/ — 4 pure, mock-only modules (types, sample intents, work order generation, barrel export)
  • forge/src/app/promptsmith/page.tsx — 5-phase interactive shell (Intent Intake → Questioning Surface → Semantic Warding → Work Order Generation → Exportable Prompt Output)
  • Homepage updated with Promptsmith section card and nav link

This aligns with the "prioritize operational Promptsmithing loop delivery before broad architectural expansion" direction from Slice 12. The shell is standing — no live providers, no persistence, no API routes.
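
The five-phase progression can be modeled as an ordered list with a pure "next" function. This is a sketch under the assumption that phases advance strictly in order; the phase names are taken from the list above, while PHASES, Phase, and nextPhase are hypothetical names and may not match the page's implementation.

```typescript
// Phase names come from the Slice 13 shell description; the identifiers are illustrative.
const PHASES = [
  "Intent Intake",
  "Questioning Surface",
  "Semantic Warding",
  "Work Order Generation",
  "Exportable Prompt Output",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once the loop is complete.
function nextPhase(current: Phase): Phase | null {
  return PHASES[PHASES.indexOf(current) + 1] ?? null;
}
```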

Next logical slice themes after Slice 13:

  • Wire real intent parsing and questioning generation
  • Add operator-authored intent input (free text)
  • Add operator-authored ward creation
  • Wire work order generation to real structured output
  • Add session persistence (localStorage)

Closing Note

The forge is not a project to be finished. It is a room to be returned to.

The next phase is not "complete the forge." The next phase is "decide which room to enter next." Each thematic slice is a door. Choose wisely, enter fully, and leave the room ready for return.

Slice 14 Update — Promptsmith Session Samples

Slice 14 strengthened the promptsmithing loop with 5 fully shaped sample sessions:

  • Expanded sample-intents.ts: 5 sessions (Coding Onboarding, UI Readability, Manuscript Synthesis, Visual Icon Set, UMKM Landing Page) — each with RawIntent, StructuredIntent, 5 questions, 5-6 wards, complete WorkOrder
  • New SampleSession type: Registry pattern with SAMPLE_SESSIONS array
  • New promptsmith-lexicon.md: 11 stabilized operational terms in manuscript style
  • Updated promptsmith page.tsx: Session selector with domain/ward badges, ward strength summary cards, risk indicators, category breakdown badges
  • Fixed work-order.ts: Replaced broken require() with proper imports
  • Updated index.ts: Added SampleSession type and SAMPLE_SESSIONS exports

All legacy exports preserved. No providers, no persistence, no APIs.


Slice 15 Update — Portable Manuscript Export

Slice 15 transformed the Promptsmithing loop from an internal shaping surface into a portable manuscript generator:

  • New manuscript.ts: Pure markdown manuscript renderer — no React, no side effects, no I/O. Generates canonical Forge Manuscript format with Metadata, Intent, Questions, Wards, Slice Plan, Verification, Done, Next Slice, Continuity Notes, Export Spell.
  • New forge-manuscript-realization.md: Canonical finding documenting the manuscript as portable artifact — not trapped SaaS prompt, not infinite interaction.
  • Updated promptsmith page.tsx: "Seal Manuscript" action, closure ritual, manuscript preview panel, Copy/Download .md buttons, disabled shaping controls after sealing.
  • Updated index.ts: Added ManuscriptMeta type and manuscript function exports.

Filenames follow forge-manuscript-{slug}.md convention. No providers, no persistence, no APIs.
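
For illustration, the filename convention could be derived from a session title as in the sketch below. The slug rules (lowercase ASCII, hyphen-separated, "untitled" fallback) are assumptions; only the forge-manuscript-{slug}.md pattern comes from this document.

```typescript
// Assumed slug rules: lowercase, accents stripped, non-alphanumerics collapsed to "-".
function slugify(title: string): string {
  return title
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accent marks
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Applies the forge-manuscript-{slug}.md convention; the fallback slug is hypothetical.
function manuscriptFilename(title: string): string {
  const slug = slugify(title) || "untitled";
  return `forge-manuscript-${slug}.md`;
}

// manuscriptFilename("UMKM Landing Page") returns "forge-manuscript-umkm-landing-page.md"
```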


Written 2026-05-21. Slice 15 complete.


Slice 16 Update — Readability Stabilization

Slice 16 delivered a comprehensive readability stabilization pass across all forge pages:

  • Typography system: CSS custom properties for font sizes (11px–24px), line-height (1.7 body, 1.3 headings), proportional fonts for body reading, monospace for metadata/sigils
  • New CSS utility classes: .forge-section, .forge-manuscript-preview, .forge-data-panel, .forge-data-grid-2/3, .chamber-* room styles, .forge-phase-indicator, .forge-tabs, .forge-ward-*, .forge-btn, .forge-shell-banner
  • All pages updated: No inline text-[10px] or text-[9px] remains. Body text at 15px throughout. Metadata at 11-12px mono. Navigation at 12px. Section headings at 18px. Page titles at 22px.
  • Mobile baseline: 640px breakpoint, never below 14px for important reading text
  • New finding: bibliotheca/findings/readability-before-expansion.md

This aligns with the "prioritize operational Promptsmithing loop delivery before broad architectural expansion" direction. The forge is now readable without losing its dense manuscript atmosphere.

Next logical slice themes after Slice 16:

  • Wire real intent parsing and questioning generation
  • Add operator-authored intent input (free text)
  • Add operator-authored ward creation
  • Wire work order generation to real structured output
  • Add session persistence (localStorage)

Slice 17 Direction Lock — Furnace Trials Without Roadmap Drift

Slice 17 closes the loop on Slices 13-16 and locks the next direction:

  • run Furnace Trials as bounded research operations
  • compare Raw Prompt vs Forge Manuscript behavior
  • score outcomes manually through an observatory rubric
  • stabilize a practical spellbook from repeated cross-furnace tests

Immediate Research Track

  1. Furnace Trials System: repeatable test protocol across external cognition engines.
  2. Raw vs Manuscript Evaluation: same intent, two artifacts, compare semantic adherence and drift.
  3. Manual Observatory Scoring: human-first rubric for coherence, ward integrity, and objective completion.
  4. Cross-Furnace Spell Testing: run identical spells across multiple engines and capture deltas.
  5. Spellbook Stabilization: promote reliable shaping patterns into canonical reusable forms.
  6. Observatory Research Structure: keep findings short, operational, and continuity-linked.
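
A trial record for the Raw vs Manuscript comparison might be kept as plain data, with the human-entered rubric scores stored side by side. The sketch below is bookkeeping only, with nothing scored automatically. The three rubric dimensions come from item 3 above; the 0-5 scale, the field names, and the helper functions are assumptions.

```typescript
type RubricDimension = "coherence" | "wardIntegrity" | "objectiveCompletion";

// Manually entered scores; a 0-5 scale is assumed here, not specified by the source.
type RubricScore = Record<RubricDimension, number>;

const DIMENSIONS: RubricDimension[] = [
  "coherence",
  "wardIntegrity",
  "objectiveCompletion",
];

// One intent run through one furnace (engine) as both artifacts.
interface FurnaceTrial {
  furnace: string;
  intent: string;
  raw: RubricScore;
  manuscript: RubricScore;
}

// Positive values mean the Forge Manuscript outscored the raw prompt.
function scoreDelta(trial: FurnaceTrial): RubricScore {
  return {
    coherence: trial.manuscript.coherence - trial.raw.coherence,
    wardIntegrity: trial.manuscript.wardIntegrity - trial.raw.wardIntegrity,
    objectiveCompletion:
      trial.manuscript.objectiveCompletion - trial.raw.objectiveCompletion,
  };
}

// Averages per-dimension deltas across furnaces; null when there is nothing to average.
function meanDelta(trials: FurnaceTrial[]): RubricScore | null {
  if (trials.length === 0) return null;
  const sum: RubricScore = { coherence: 0, wardIntegrity: 0, objectiveCompletion: 0 };
  for (const trial of trials) {
    const delta = scoreDelta(trial);
    for (const dim of DIMENSIONS) sum[dim] += delta[dim];
  }
  for (const dim of DIMENSIONS) sum[dim] /= trials.length;
  return sum;
}
```

Keeping the record this small fits the guardrail that follows: it is a notebook entry for a human operator, not an evaluation pipeline.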

Anti-Roadmap Guardrail

Do not interpret this direction as product expansion.

Explicitly out of scope:

  • agents
  • orchestration
  • persistence infra
  • accounts
  • provider coupling
  • autonomous execution
  • SaaS surface inflation

Forge remains a shaping layer and manuscript workshop, not an automation platform.

Semantic Handling Note

When instructions fail, first test for semantic drowning before claiming semantic disappearance. The signal may still exist but be submerged under stronger or noisier directives.


Slice 17 Update — Ecology Room Stabilization

Slice 17 delivered the third pillar of the Forge Triforce — the Ecology room:

  • New library: forge/src/lib/ecology-room/ — 7 modules (types, primer, conditions, climate, pressure, field-notes, index) with 14 type definitions, 10 foundational concepts, 8 environmental conditions, 8 climate dimensions, 6 pressure types, and sample field data
  • New page: forge/src/app/ecology/page.tsx — full manuscript-style room with 9 sections (Primer, Conditions, Climate, Pressure, Correction Load, Drift Residue, Returnability, Furnace Notes, Field Observations)
  • New CSS: ~500 lines of ecology-specific styles in globals.css — restrained, monochrome, breathable spacing, desktop-first, no charts or dashboard elements
  • Updated homepage: Ecology description now reads "Interaction conditions, semantic climate, continuity pressure, field notes. The weather layer of human × AI collaboration."
  • New finding: bibliotheca/findings/ecology-room-stabilization.md — canonical documentation of the Ecology room architecture and design decisions

Key Design Decisions

  1. No runtime logic. The library is pure types and constants. No API calls, no persistence, no side effects.
  2. Sample data, not real measurements. Field notes are illustrative samples demonstrating the kind of observations Ecology might contain after real furnace trials.
  3. Observatory-adjacent but distinct. Ecology studies why an interaction feels the way it does; Observatory studies how we inspect what happened.
  4. No charts. Data is presented as field notes, tables, and cards — manuscript-native forms.

Relationship to Existing Chamber Ecology

The existing forge/src/lib/ecology/ is chamber-focused (room-culture, pacing, moderator presence). The new forge/src/lib/ecology-room/ is the Triforce Ecology — the study of interaction coherence at the conceptual level. They are different layers.

Next After Slice 17

The Ecology room is stabilized as a conceptual space. The next logical direction remains Furnace Trials (as documented in the Slice 17 Direction Lock above). The Ecology room provides the vocabulary and framework for interpreting furnace trial results.


Slice 18 Update — Observatory Room Stabilization

Slice 18 delivered the second pillar of the Forge Triforce — the Observatory room, completing the Triforce:

  • New library: forge/src/lib/observatory-room/ — 7 modules (types, primer, inspection-lenses, drift-signals, comparison-fields, sample-observations, index) with 18 type definitions, 10 foundational concepts, 8 inspection lenses, 6 drift signals, 8 comparison fields, and sample observation data
  • New page: forge/src/app/observatory/page.tsx — full manuscript-style inspection room with 12 sections (Primer, Inspection Lenses, Lens Readings, Drift Signals, Manuscript Comparison, Instruction Survival, Atmosphere Preservation, Correction Load, Furnace Traces, Operator Notes, Findings Extraction, Re-entry Clarity)
  • New CSS: ~500 lines of observatory-specific styles in globals.css — monochrome, restrained, breathable spacing, desktop-first, no charts or dashboard elements
  • Updated homepage: Observatory description now reads "Semantic inspection, drift signals, manuscript comparison, output survival. The inspection table of the Forge."
  • New finding: bibliotheca/findings/observatory-room-stabilization.md — canonical documentation of the Observatory room architecture and design decisions

Triforce Complete

The Forge Triforce is now operationally complete:

Runtime ──→ Observatory ──→ Ecology
  │              │               │
  │        inspects what     interprets why
  │        happened          it felt that way
  │              │               │
  records      reads          understands

  • Runtime (Slice 09) — what happened
  • Observatory (Slice 18) — how we inspect what happened
  • Ecology (Slice 17) — why the interaction felt coherent or incoherent

Key Design Decisions

  1. Observatory is NOT analytics. It does not measure, score, or rank. It inspects. The distinction is critical.
  2. Sample data, not real measurements. All data is illustrative. Real observations come from furnace trials.
  3. Lenses over metrics. Instead of numerical scores, the Observatory uses inspection lenses — each framing the artifact differently.
  4. Drift as primary signal. Drift (meaning shift between intent and output) is the most common failure mode in semantic shaping.
  5. Manuscript comparison as core operation. The comparison table is the Observatory's most important analytical tool.
  6. Findings as stable output. Findings survive multiple readings and multiple furnace passes. They feed the Ecology layer.
  7. Re-entry clarity as closing ritual. Every inspection session ends with a clarity assessment.

Relationship to Existing Observatory Foundation

The existing forge/src/lib/observatory/ (Slice 04) is the Dashboard-facing observatory foundation. The new forge/src/lib/observatory-room/ is the Triforce Observatory room library — conceptually distinct. They serve different purposes:

  • observatory/ → Dashboard instrumentation
  • observatory-room/ → Triforce inspection room

Next After Slice 18

The Triforce is complete. The next logical direction remains Furnace Trials — running real prompts through external cognition engines and using the Observatory to inspect the results. The Observatory provides the vocabulary and framework for interpreting furnace trial output.


Direction lock updated 2026-05-21. Triforce operational. Furnace next, roadmap drift blocked.