AI Integration Architecture
Script analysis, asset generation, and AI chat with Gemini and the Agent Development Kit (ADK).
Overview
On Book Pro uses Gemini for:
- Script Analysis — PDF parsing to extract scenes, characters, props, and cues
- SVG Asset Generation — AI-generated stage furniture icons
- Barry AI Assistant — ADK-powered conversational agent with tool-calling for production queries
All AI features run on Firebase Cloud Functions to keep the client app lightweight.
Script Analysis Flow
1. Client Upload
// Client: Upload PDF to Firebase Storage
const storageRef = ref(storage, `scripts/${projectId}/${pdfFile.name}`);
await uploadBytes(storageRef, pdfFile);
// Trigger Cloud Function
const analyzeScript = httpsCallable(functions, 'analyzeScriptPDF');
const result = await analyzeScript({
projectId,
fileUrl: await getDownloadURL(storageRef)
});
2. Cloud Function Processing
Located: functions/src/ai/analyzeScript.ts
import { genkit } from 'genkit';
import { googleAI, gemini15Pro } from '@genkit-ai/googleai';
export const analyzeScriptPDF = onCall(async (request) => {
const { projectId, fileUrl } = request.data;
// 1. Download PDF from Storage
const pdfBuffer = await downloadFile(fileUrl);
// 2. Extract text (using pdf-parse or similar)
const scriptText = await parsePDF(pdfBuffer);
// 3. Send to Gemini for structured extraction
const ai = genkit({ plugins: [googleAI()] });
const result = await ai.generate({
model: gemini15Pro,
prompt: buildAnalysisPrompt(scriptText),
output: {
schema: ScriptAnalysisSchema, // Zod schema
},
});
return result.output;
});
3. Structured Schema
import { z } from 'zod';
// ScriptBlock represents a single unit of script content
const ScriptBlockSchema = z.object({
id: z.string(),
pageNumber: z.number(),
blockIndex: z.number(),
type: z.enum([
'scene_heading', // ACT/SCENE headers
'character_cue', // Character name before dialogue
'dialogue', // Spoken lines
'lyrics', // Sung text
'stage_direction', // Directions, parentheticals
'transition', // BLACKOUT, END OF ACT
'technical_note', // SFX, LX, VAMP markers
'other' // Front matter, unclassified
]),
text: z.string(),
actId: z.string().nullable(),
sceneId: z.string().nullable(),
characterId: z.string().nullable(),
confidence: z.number().optional(),
isVerified: z.boolean(),
originalText: z.string().optional(),
});
const ScriptAnalysisSchema = z.object({
showStructure: z.object({
acts: z.array(z.object({
id: z.string(),
name: z.string(),
isPreset: z.boolean().optional(),
sceneIds: z.array(z.string()),
})),
scenes: z.array(z.object({
id: z.string(),
name: z.string(),
notes: z.string().nullish(),
})),
scriptBlocks: z.array(ScriptBlockSchema).optional(),
}),
characters: z.array(z.object({
id: z.string(),
name: z.string(),
notes: z.string().nullish(),
isEnsemble: z.boolean(),
sceneIds: z.array(z.string()).optional(),
})),
// Props and soundCues are now extracted on-demand (lazy extraction)
props: z.array(z.object({ id: z.string(), name: z.string(), sceneId: z.string() })).optional(),
soundCues: z.array(z.object({ id: z.string(), description: z.string(), sceneId: z.string() })).optional(),
});
4. Client Data Mapping
// Client receives AI response
const { showStructure, characters, props, soundCues } = result.data;
// Map acts and scenes to Zustand store format, remapping AI IDs to fresh client IDs
const sceneIdMap = new Map<string, string>();
showStructure.scenes.forEach(scene => sceneIdMap.set(scene.id, generateId()));
const mappedScenes = showStructure.scenes.map(scene => ({
  id: sceneIdMap.get(scene.id),
  name: scene.name,
  notes: scene.notes ?? '',
}));
const mappedActs = showStructure.acts.map(act => ({
  id: generateId(),
  name: act.name,
  sceneIds: act.sceneIds.map(id => sceneIdMap.get(id)).filter(Boolean),
}));
// Map characters to store format
const mappedCharacters = characters.map(char => ({
id: generateId(),
name: char.name,
notes: char.notes ?? '',
sceneIds: [],
parentId: null,
}));
// Merge into store - characters go into showStructure.characters
mergeShowStructure({
acts: mappedActs,
scenes: mappedScenes,
characters: mappedCharacters, // Characters consolidated in showStructure
});
// Props and sound cues remain separate slices
mergeProps(props);
mergeSoundCues(soundCues);
Note: Since Jan 2026, characters are stored in showStructure.characters rather than a separate top-level array. The AI response schema returns characters separately, but the client maps them into the unified showStructure.
Prompt Engineering
Analysis Prompt Structure
const buildAnalysisPrompt = (scriptText: string) => `
You are analyzing a theatrical script. Extract the following structured data:
1. Show Structure:
- Acts and their scenes
- Scene titles and locations
- Characters appearing in each scene
2. Characters:
- All character names (consistent spelling)
- Brief descriptions if provided
3. Props:
- All props mentioned in stage directions
- Scene where they first appear
4. Sound Cues:
- Audio effects mentioned
- When they occur
SCRIPT TEXT:
${scriptText}
Return ONLY valid JSON matching the schema. Do not include any markdown or explanation.
`;
Key Strategies:
- Explicit output format — "Return ONLY valid JSON"
- Schema enforcement — Genkit validates against Zod schema
- Defensive filtering — Client validates data before merging
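The defensive-filtering strategy can be illustrated in isolation. This is a minimal sketch: the AIScene shape and both helpers are hypothetical stand-ins, not the production validators.

```typescript
// Hypothetical scene shape (assumption: mirrors the fields in ScriptAnalysisSchema)
interface AIScene {
  id: string;
  name: string;
  notes?: string | null;
}

// Type guard: accept only objects whose required fields are strings
const isValidScene = (s: unknown): s is AIScene =>
  typeof s === 'object' &&
  s !== null &&
  typeof (s as AIScene).id === 'string' &&
  typeof (s as AIScene).name === 'string';

// Defensive filter: tolerate a missing or malformed scenes array from the model
const filterScenes = (data: unknown): AIScene[] => {
  const scenes = (data as { scenes?: unknown } | null)?.scenes;
  return Array.isArray(scenes) ? scenes.filter(isValidScene) : [];
};
```

The same pattern applies to characters, props, and cues: validate shape first, then merge.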
SVG Asset Generation
1. User Request
// Client: Request AI-generated asset
const generateAsset = httpsCallable(functions, 'generateStageAsset');
const result = await generateAsset({
description: 'blue victorian sofa',
dimensions: { width: 6, height: 3 } // feet
});
2. Cloud Function
Located: functions/src/ai/generateAsset.ts
export const generateStageAsset = onCall(async (request) => {
const { description, dimensions } = request.data;
const ai = genkit({ plugins: [googleAI()] });
const result = await ai.generate({
model: gemini15Pro,
prompt: `
Generate an SVG icon for a stage prop: "${description}".
The SVG should:
- Be a simple, top-down view
- Use single color (#333333)
- Be 100x100 viewBox
- Be production-ready (no text labels)
Return ONLY the SVG code, no explanation.
`,
});
// Validate SVG
const svg = sanitizeSVG(result.text);
return { svg };
});
3. Client Integration
// Add to Set Builder canvas
const asset = {
id: generateId(),
type: 'custom',
svg: result.data.svg,
x: 0,
y: 0,
width: dimensions.width * scalePixelsPerFoot,
height: dimensions.height * scalePixelsPerFoot,
};
addAssetToSet(asset);
Hardening Patterns
1. Non-Determinism Guards
AI responses vary—always validate:
// ❌ Unsafe
const scenes = result.data.scenes;
updateScenes(scenes);
// ✅ Safe
const scenes = Array.isArray(result.data?.scenes)
? result.data.scenes.filter(isValidScene)
: [];
updateScenes(scenes);
2. Duplicate Filtering
AI may return duplicate characters:
const uniqueCharacters = characters.reduce((acc, char) => {
const existing = acc.find(c =>
c.name.toLowerCase() === char.name.toLowerCase()
);
if (!existing) acc.push(char);
return acc;
}, []);
3. Defensive Merging
Never overwrite existing data—append with notes:
// Characters are now in showStructure.characters
const mergeCharacters = (aiCharacters) => {
const existingCharacters = useAppStore.getState().showStructure.characters;
aiCharacters.forEach(aiChar => {
const existing = existingCharacters.find(c => c.name === aiChar.name);
if (existing) {
// Append AI description as note
showStructureActions.updateCharacter(existing.id, {
notes: `${existing.notes || ''}\n[AI]: ${aiChar.notes}`,
});
} else {
// Add new character via showStructureActions
const newCharacters = [
...existingCharacters,
{ ...aiChar, id: generateId(), sceneIds: [], parentId: null }
];
showStructureActions.setCharacters(newCharacters);
}
});
};
Error Handling
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Response doesn't match schema | LLM hallucination or prompt drift | Add retry with schema validation |
| Timeout | Large scripts | Chunk script into acts before sending |
| Incomplete extraction | Poor formatting in script | Allow manual editing post-import |
| Weird SVG output | Misinterpreted prompt | Add SVG validation (check for <svg> tag) |
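The SVG validation suggested in the table can be sketched as follows. This sanitizeSVG is a hypothetical reconstruction, not the actual helper used in generateAsset.ts.

```typescript
// Hypothetical sanitizeSVG sketch: strips markdown fences the model sometimes
// emits, checks for an <svg> root element, and drops script tags as a basic guard.
const sanitizeSVG = (raw: string): string => {
  const cleaned = raw.replace(/```(?:svg|xml)?/g, '').trim();
  if (!/^<svg[\s>]/.test(cleaned) || !cleaned.includes('</svg>')) {
    throw new Error('Model output is not an SVG document');
  }
  return cleaned.replace(/<script[\s\S]*?<\/script>/gi, '');
};
```

Throwing here feeds naturally into the retry loop below: a rejected response triggers another generation attempt.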
Retry Logic
const analyzeWithRetry = async (scriptText: string, attempts = 3) => {
for (let i = 0; i < attempts; i++) {
try {
const result = await ai.generate({ ... });
// Validate against schema
const validated = ScriptAnalysisSchema.parse(result.output);
return validated;
} catch (error) {
if (i === attempts - 1) throw error;
console.warn(`Attempt ${i + 1} failed, retrying...`);
}
}
};
Cost Optimization
Caching
Cache AI results in Firestore to avoid re-analyzing:
// Check if script already analyzed (scriptHash: e.g. a SHA-256 of the script text)
const cacheRef = doc(db, `aiCache/${scriptHash}`);
const cached = await getDoc(cacheRef);
if (cached.exists()) {
return cached.data();
}
// Analyze and cache
const result = await analyzeScript(scriptText);
await setDoc(cacheRef, result);
Token Limiting
Limit input size to control costs:
const MAX_SCRIPT_LENGTH = 50000; // characters
if (scriptText.length > MAX_SCRIPT_LENGTH) {
throw new Error('Script too long. Please upload acts separately.');
}
Testing
Mock AI Responses
// tests/mocks/aiResponses.ts
export const mockScriptAnalysis = {
  showStructure: {
    acts: [
      { id: 'act-1', name: 'Act 1', sceneIds: ['scene-1'] },
    ],
    scenes: [
      { id: 'scene-1', name: 'Elsinore Castle', notes: null },
    ],
  },
  characters: [
    { id: 'char-1', name: 'Hamlet', notes: 'Prince of Denmark', isEnsemble: false },
  ],
  props: [
    { id: 'prop-1', name: 'Skull', sceneId: 'scene-1' },
  ],
};
Barry AI Agent
"Barry" is an in-app production assistant that answers natural-language questions by querying project data. Built on the Google Agent Development Kit (ADK) using LlmAgent and FunctionTool abstractions.
Architecture
┌─────────────────┐
│ BarryPanel.tsx │ ← Liquid Glass FAB → morphing chat panel
└────────┬────────┘
│ httpsCallable('barryChat')
▼
┌─────────────────┐
│ barry-chat.ts │ ← ADK orchestrator (onCall)
│ (Cloud Fn) │
└────────┬────────┘
│ InMemoryRunner.runAsync()
▼
┌─────────────────┐
│ barry-agent.ts │ ← LlmAgent + FunctionTools (Zod schemas)
└────────┬────────┘
│ ADK auto-dispatches tool calls
┌────┴────┐
▼ ▼
┌────────┐ ┌──────────────┐
│searchFi│ │getProduction │
│les │ │Data │
└────┬───┘ └──────┬───────┘
│ │
▼ ▼
┌────────┐ ┌──────────────┐
│Vertex │ │ Firestore │
│AI Data │ │ Shards │
│Store │ │ │
└────────┘ └──────────────┘
Key Files
| File | Purpose |
|---|---|
functions/src/barry-agent.ts | ADK LlmAgent definition with FunctionTool instances (Zod-validated schemas, closures for per-request context) |
functions/src/barry-chat.ts | Cloud Function orchestrator — creates agent, runs InMemoryRunner.runAsync(), collects events |
functions/src/barry-tools.ts | Tool execution — executeSearchFiles (Vertex AI Data Store RAG), executeGetProductionData (Firestore shard reads with optional filtering) |
functions/src/barry-datastore.ts | Vertex AI Search Data Store integration — ensures data store exists, executes search queries |
functions/src/barry-config.ts | Model selection, shard path mappings, generation config (temperature, token limits) |
functions/src/barry-prompts.ts | System prompts with theatrical domain context, few-shot examples |
functions/src/barry-model-provider.ts | v2 — Tiered model provider: selects Flash Lite (managed) or Pro (BYOK) based on project settings |
functions/src/barry-callbacks.ts | v2 — Server-side guardrails: scope checking (production-topic enforcement) and daily token budget (50K managed, unlimited BYOK) |
functions/src/barry-embeddings.ts | v2 — MiniLM-L6-v2 embedding generation via @huggingface/transformers for Firestore onWrite triggers |
functions/src/barry-settings.ts | v2 — updateBarrySettings Cloud Function for BYOK API key management with RBAC gating |
src/features/barry/BarryPanel.tsx | Client UI — Liquid Glass FAB morphing into chat panel, message bubbles, suggestion chips |
src/features/barry/useBarryChat.ts | Client hook — send/receive, streaming via Firestore onSnapshot, loading state, error handling |
src/features/barry/store.ts | Zustand slice — messages, open/close, loading state, model tier indicator |
src/features/barry/barry-semantic.ts | v2 — Client-side semantic search using ONNX MiniLM-L6-v2 model for instant local retrieval |
src/features/barry/components/BarryConfigPanel.tsx | v2 — Config panel UI for model selection, BYOK key entry, and tier badge display |
src/features/barry/settings-api.ts | v2 — Client API for updateBarrySettings callable (BYOK key save/clear) |
ADK Tool Definitions
Tools are defined as FunctionTool instances with Zod schemas in barry-agent.ts:
import { FunctionTool } from '@google/adk';
import { z } from 'zod';
const searchFilesTool = new FunctionTool({
name: 'searchFiles',
description: 'Search uploaded project files (scripts, notes, riders)',
parameters: z.object({
query: z.string().describe('The search query'),
}),
execute: async ({ query }) => {
return { result: await executeSearchFiles(dataStoreId, query) };
},
});
const getProductionDataTool = new FunctionTool({
name: 'getProductionData',
description: 'Read live production data from a specific domain',
parameters: z.object({
shard: z.enum(['structure', 'cast', 'props', 'scheduler', ...]),
filter: z.string().optional(),
}),
execute: async ({ shard, filter }) => {
return { result: await executeGetProductionData(projectId, shard, filter) };
},
});
Tiered Model Provider (v2)
Barry v2 introduces a two-tier model strategy managed in barry-model-provider.ts:
| Tier | Model | API Key | Daily Token Budget | Use Case |
|---|---|---|---|---|
| Lite (default) | gemini-3.1-flash-lite | Managed (platform key) | 50,000 tokens | Quick lookups, schedule queries |
| Pro (BYOK) | gemini-3.1-pro | User-provided (Google AI Studio) | Unlimited | Complex reasoning, deep analysis |
The model provider reads the project's barrySettings document from Firestore and selects the appropriate model + API key at request time.
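That selection logic can be sketched as below. The settings shape, function name, and managed-key parameter are assumptions for illustration; see barry-model-provider.ts for the real implementation.

```typescript
// Hypothetical settings shape (assumption: a barrySettings doc with an optional BYOK key)
type BarrySettings = { byokApiKey?: string | null };

// Pick model + key per request: a BYOK key unlocks Pro, otherwise use managed Flash Lite
const selectModel = (settings: BarrySettings, managedKey: string) =>
  settings.byokApiKey
    ? { model: 'gemini-3.1-pro', apiKey: settings.byokApiKey, tier: 'pro' as const }
    : { model: 'gemini-3.1-flash-lite', apiKey: managedKey, tier: 'lite' as const };
```

Resolving the tier at request time means a user can switch tiers mid-session simply by saving or clearing a key.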
BYOK Settings (v2)
The updateBarrySettings Cloud Function (barry-settings.ts) handles API key management:
// Client: save BYOK key (write-only — key is never returned to client)
const updateSettings = httpsCallable(functions, 'updateBarrySettings');
await updateSettings({ projectId, apiKey: 'user-key-here' });
// Client: clear key (revert to Lite)
await updateSettings({ projectId, apiKey: null });
RBAC: Only owner, stage_manager, and production_manager roles can call updateBarrySettings. The function validates permissions server-side before writing.
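The role gate can be sketched as a simple allow-list check. The role names come from the RBAC rule above; the helper itself is a hypothetical simplification of the server-side validation.

```typescript
// Roles allowed to manage BYOK settings (per the RBAC rule above)
const ALLOWED_ROLES = ['owner', 'stage_manager', 'production_manager'] as const;

// Hypothetical server-side gate, run before any settings write
const canUpdateBarrySettings = (role: string): boolean =>
  (ALLOWED_ROLES as readonly string[]).includes(role);
```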
Semantic Retrieval (v2)
Barry v2 adds semantic search for production data using MiniLM-L6-v2 embeddings:
- Server-side embedding (barry-embeddings.ts) — Firestore onWrite triggers generate 384-dimensional embeddings via @huggingface/transformers whenever production data changes
- Client-side search (barry-semantic.ts) — Loads the ONNX MiniLM-L6-v2 model in the browser for instant vector similarity search against cached embeddings
- Hybrid retrieval — Results from client-side semantic search are combined with Vertex AI Data Store results for comprehensive coverage
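The client-side similarity search reduces to a cosine ranking over cached embedding vectors. This is a simplified sketch; the real barry-semantic.ts first runs the ONNX model to embed the query, which is omitted here.

```typescript
// Cosine similarity between two embedding vectors of equal length
const cosine = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

// Rank cached documents by similarity to the query embedding, keeping the top K
const rankBySimilarity = <T>(
  query: number[],
  docs: { item: T; embedding: number[] }[],
  topK = 5,
) =>
  docs
    .map(d => ({ item: d.item, score: cosine(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
```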
Token Budget & Guardrails (v2)
Server-side enforcement in barry-callbacks.ts:
- Scope checking — Validates queries are production-related before forwarding to the model
- Daily token budget — Tracks per-project daily usage; 50K limit for managed tier, unlimited for BYOK
- Budget exceeded — Returns a friendly message directing users to configure a BYOK key
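The budget gate can be sketched as follows. The function shape is an assumption (per-project daily usage is assumed to be tracked elsewhere); only the 50K limit and the BYOK exemption come from the doc.

```typescript
// Daily budget for the managed tier; BYOK (pro) has no cap
const DAILY_TOKEN_BUDGET = 50_000;

// Hypothetical budget check run before each model call
const checkBudget = (usedToday: number, requested: number, tier: 'lite' | 'pro') => {
  if (tier === 'pro') return { allowed: true }; // BYOK: unlimited
  if (usedToday + requested > DAILY_TOKEN_BUDGET) {
    return {
      allowed: false,
      message: 'Daily token budget reached. Add a BYOK key in Barry settings for unlimited use.',
    };
  }
  return { allowed: true };
};
```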
Streaming Responses (v2)
Barry v2 streams responses via Firestore streaming documents:
- Cloud Function writes partial response chunks to a Firestore barryStreaming/{sessionId} document
- Client listens via onSnapshot for real-time updates
- UI renders tokens as they arrive (word-by-word appearance)
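Chunk reassembly on the client can be sketched as below. The chunk shape (an explicit ordering index) is an assumption, since onSnapshot offers no ordering guarantee across writes.

```typescript
// Hypothetical chunk shape (assumption: each Firestore write carries an ordering index)
interface StreamChunk {
  index: number;
  text: string;
}

// Reassemble the streamed response; chunks may arrive out of order
const assembleResponse = (chunks: StreamChunk[]): string =>
  [...chunks].sort((a, b) => a.index - b.index).map(c => c.text).join('');
```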
Session Continuity (v2)
Chat history persists across sessions using ADK's SessionService backed by Firestore. Users can close and reopen Barry without losing conversation context.
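Persisted history can grow without bound across sessions. A minimal windowing sketch (the turn limit and message shape are assumptions, not the actual SessionService behavior):

```typescript
interface ChatMessage {
  role: 'user' | 'model';
  text: string;
}

// Keep only the most recent turns before handing history to the agent
const trimHistory = (history: ChatMessage[], maxTurns = 20): ChatMessage[] =>
  history.slice(-maxTurns * 2); // one turn = a user message + a model reply
```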
Environment Configuration
ADK uses environment variables for Vertex AI, set at module level in barry-chat.ts:
process.env.GOOGLE_GENAI_USE_VERTEXAI = 'TRUE';
process.env.GOOGLE_CLOUD_PROJECT = GCP_PROJECT_ID;
process.env.GOOGLE_CLOUD_LOCATION = GCP_LOCATION;
Client Integration
// useBarryChat.ts — sends message and listens for streaming response
const barryChat = httpsCallable<BarryChatRequest, BarryChatResponse>(
functions, 'barryChat'
);
const send = async (message: string) => {
const result = await barryChat({
projectId,
message,
history: messages.map(m => ({ role: m.role, text: m.text })),
});
// result.data.response — model's answer
// result.data.toolsUsed — which tools were invoked
// result.data.modelTier — 'lite' | 'pro'
};
"Ask Barry" Contextual Help Pattern
A lightweight, inline AI help system that provides contextual assistance without requiring the full Barry chat panel. This pattern is used at friction points — moments where users are likely confused or need guidance.
Architecture
AskBarryChip (UI) → useAskBarry (Hook) → sendBarryMessage (API)
↓
[INLINE_HELP] prefix
↓
barryChat Cloud Function
Key Files
| File | Purpose |
|---|---|
src/features/barry/hooks/useAskBarry.ts | Hook managing inline AI query lifecycle |
src/features/barry/components/AskBarryChip.tsx | Compact UI trigger with suggestion display |
src/features/barry/api/barry-api.ts | Shared sendBarryMessage callable |
useAskBarry Hook
const { suggestion, loading, error, ask } = useAskBarry();
// Trigger with context
ask("CSV headers and sample data:\nName,Email\nJohn,john@test.com\n\nParse warnings:\nMissing phone column");
The hook:
- Prepends [INLINE_HELP] to the message (signals Barry to give a concise, actionable response)
- Calls sendBarryMessage (the same Cloud Function as the full chat)
- Returns a short suggestion string, not a full conversation
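The message construction can be sketched as follows. Only the [INLINE_HELP] prefix is documented; the exact prompt layout and fallback question are assumptions.

```typescript
const INLINE_HELP_PREFIX = '[INLINE_HELP]';

// Hypothetical message builder: the prefix signals Barry to reply concisely
const buildInlineHelpMessage = (context: string, question?: string): string =>
  `${INLINE_HELP_PREFIX} ${question ?? 'Explain what might be wrong here.'}\n\nContext:\n${context}`;
```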
AskBarryChip Component
<AskBarryChip
context={`CSV headers:\n${headerLine}\n\nParse warnings:\n${warnings.join('\n')}`}
question="What might be wrong with this CSV data?"
label="Ask Barry for help"
/>
Props:
- context — Background data Barry needs (not shown to user)
- question — The question to ask (optional, defaults to context-based inference)
- label — Button text (defaults to "Ask Barry")
Integration Example: CSV Import Wizard
In RosterImportWizard.tsx Step 2 (Preview), when parse warnings exist:
{parseResult.errors.length > 0 && (
<AskBarryChip
context={`CSV headers and sample data:\n${csvText.split('\n').slice(0, 3).join('\n')}\n\nParse warnings:\n${parseResult.errors.join('\n')}`}
question="Help me understand these CSV import warnings"
label="Ask Barry about these warnings"
/>
)}
When to Use This Pattern
Use AskBarryChip when:
- A user action produces warnings or ambiguous results
- The UI can't fully explain what happened (e.g., fuzzy header matching decisions)
- Users might need guidance but a full modal or documentation link is too heavy
Do not use for:
- Simple validation errors (show inline error messages instead)
- Actions that need immediate correction (use form validation)
- Questions that require multi-turn conversation (direct to the full Barry panel)
Last updated: March 20, 2026 (Barry v2 — tiered models, BYOK, semantic retrieval, streaming)