AI Integration Architecture

Script analysis, asset generation, and AI chat with Gemini and the Agent Development Kit (ADK).


Overview

On Book Pro uses Gemini for:

  • Script Analysis — PDF parsing to extract scenes, characters, props, and cues
  • SVG Asset Generation — AI-generated stage furniture icons
  • Barry AI Assistant — ADK-powered conversational agent with tool-calling for production queries

All AI features run on Firebase Cloud Functions to keep the client app lightweight.


Script Analysis Flow

1. Client Upload

typescript
// Client: Upload PDF to Firebase Storage
const storageRef = ref(storage, `scripts/${projectId}/${pdfFile.name}`);
await uploadBytes(storageRef, pdfFile);

// Trigger Cloud Function
const analyzeScript = httpsCallable(functions, 'analyzeScriptPDF');
const result = await analyzeScript({ 
  projectId, 
  fileUrl: await getDownloadURL(storageRef) 
});

2. Cloud Function Processing

Located: functions/src/ai/analyzeScript.ts

typescript
import { onCall } from 'firebase-functions/v2/https';
import { genkit } from 'genkit';
import { googleAI, gemini15Pro } from '@genkit-ai/googleai';

export const analyzeScriptPDF = onCall(async (request) => {
  const { projectId, fileUrl } = request.data;
  
  // 1. Download PDF from Storage
  const pdfBuffer = await downloadFile(fileUrl);
  
  // 2. Extract text (using pdf-parse or similar)
  const scriptText = await parsePDF(pdfBuffer);
  
  // 3. Send to Gemini for structured extraction
  const ai = genkit({ plugins: [googleAI()] });
  
  const result = await ai.generate({
    model: gemini15Pro,
    prompt: buildAnalysisPrompt(scriptText),
    output: {
      schema: ScriptAnalysisSchema, // Zod schema
    },
  });
  
  return result.output;
});

3. Structured Schema

typescript
import { z } from 'zod';

// ScriptBlock represents a single unit of script content
const ScriptBlockSchema = z.object({
  id: z.string(),
  pageNumber: z.number(),
  blockIndex: z.number(),
  type: z.enum([
    'scene_heading',      // ACT/SCENE headers
    'character_cue',      // Character name before dialogue
    'dialogue',           // Spoken lines
    'lyrics',             // Sung text
    'stage_direction',    // Directions, parentheticals
    'transition',         // BLACKOUT, END OF ACT
    'technical_note',     // SFX, LX, VAMP markers
    'other'               // Front matter, unclassified
  ]),
  text: z.string(),
  actId: z.string().nullable(),
  sceneId: z.string().nullable(),
  characterId: z.string().nullable(),
  confidence: z.number().optional(),
  isVerified: z.boolean(),
  originalText: z.string().optional(),
});

const ScriptAnalysisSchema = z.object({
  showStructure: z.object({
    acts: z.array(z.object({
      id: z.string(),
      name: z.string(),
      isPreset: z.boolean().optional(),
      sceneIds: z.array(z.string()),
    })),
    scenes: z.array(z.object({
      id: z.string(),
      name: z.string(),
      notes: z.string().nullish(),
    })),
    scriptBlocks: z.array(ScriptBlockSchema).optional(),
  }),
  characters: z.array(z.object({
    id: z.string(),
    name: z.string(),
    notes: z.string().nullish(),
    isEnsemble: z.boolean(),
    sceneIds: z.array(z.string()).optional(),
  })),
  // Props and soundCues are now extracted on-demand (lazy extraction)
  props: z.array(z.object({ id: z.string(), name: z.string(), sceneId: z.string() })).optional(),
  soundCues: z.array(z.object({ id: z.string(), description: z.string(), sceneId: z.string() })).optional(),
});

4. Client Data Mapping

typescript
// Client receives AI response
const { showStructure, characters, props, soundCues } = result.data;

// Map acts and scenes to Zustand store format (the client regenerates all IDs)
const sceneIdMap = new Map();  // AI-provided scene id → client-generated id

const mappedScenes = showStructure.scenes.map(scene => {
  const id = generateId();
  sceneIdMap.set(scene.id, id);
  return { id, name: scene.name, notes: scene.notes ?? '' };
});

const mappedActs = showStructure.acts.map(act => ({
  id: generateId(),
  name: act.name,
  sceneIds: act.sceneIds
    .map(sceneId => sceneIdMap.get(sceneId))
    .filter(Boolean),
}));

// Map characters to store format
const mappedCharacters = characters.map(char => ({
  id: generateId(),
  name: char.name,
  notes: char.notes ?? '',
  sceneIds: [],
  parentId: null,
}));

// Merge into store - characters go into showStructure.characters
mergeShowStructure({
  acts: mappedActs,
  scenes: mappedScenes,
  characters: mappedCharacters,  // Characters consolidated in showStructure
});

// Props and sound cues remain separate slices
mergeProps(props ?? []);
mergeSoundCues(soundCues ?? []);

Note: Since Jan 2026, characters are stored in showStructure.characters rather than a separate top-level array. The AI response schema returns characters separately, but the client maps them into the unified showStructure.


Prompt Engineering

Analysis Prompt Structure

typescript
const buildAnalysisPrompt = (scriptText: string) => `
You are analyzing a theatrical script. Extract the following structured data:

1. Show Structure:
   - Acts and their scenes
   - Scene titles and locations
   - Characters appearing in each scene

2. Characters:
   - All character names (consistent spelling)
   - Brief descriptions if provided

3. Props:
   - All props mentioned in stage directions
   - Scene where they first appear

4. Sound Cues:
   - Audio effects mentioned
   - When they occur

SCRIPT TEXT:
${scriptText}

Return ONLY valid JSON matching the schema. Do not include any markdown or explanation.
`;

Key Strategies:

  • Explicit output format — "Return ONLY valid JSON"
  • Schema enforcement — Genkit validates against Zod schema
  • Defensive filtering — Client validates data before merging

SVG Asset Generation

1. User Request

typescript
// Client: Request AI-generated asset
const generateAsset = httpsCallable(functions, 'generateStageAsset');
const result = await generateAsset({ 
  description: 'blue victorian sofa',
  dimensions: { width: 6, height: 3 } // feet
});

2. Cloud Function

Located: functions/src/ai/generateAsset.ts

typescript
export const generateStageAsset = onCall(async (request) => {
  const { description, dimensions } = request.data;
  
  const ai = genkit({ plugins: [googleAI()] });
  
  const result = await ai.generate({
    model: gemini15Pro,
    prompt: `
      Generate an SVG icon for a stage prop: "${description}".
      The SVG should:
      - Be a simple, top-down view
      - Use single color (#333333)
      - Be 100x100 viewBox
      - Be production-ready (no text labels)
      
      Return ONLY the SVG code, no explanation.
    `,
  });
  
  // Validate SVG
  const svg = sanitizeSVG(result.text);
  
  return { svg };
});
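The sanitizeSVG helper referenced above is not shown in this doc. A minimal sketch of what it might do — assuming the model occasionally wraps its output in markdown fences despite the prompt — could look like:

```typescript
// Hypothetical sanitizeSVG — illustrative, not the app's actual implementation
const sanitizeSVG = (raw: string): string => {
  // Strip markdown fences the model sometimes adds around the SVG
  const cleaned = raw.replace(/`{3}(?:svg|xml)?/g, '').trim();

  // Must be a single <svg> document
  if (!cleaned.startsWith('<svg') || !cleaned.endsWith('</svg>')) {
    throw new Error('Model did not return a valid SVG document');
  }

  // Reject active content before the client injects it into the DOM
  if (/<script|\son\w+\s*=/i.test(cleaned)) {
    throw new Error('SVG contains scripts or event handlers');
  }
  return cleaned;
};
```

Rejecting (rather than silently stripping) active content keeps the failure visible so the retry logic below can regenerate the asset.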

3. Client Integration

typescript
// Add to Set Builder canvas
const asset = {
  id: generateId(),
  type: 'custom',
  svg: result.data.svg,
  x: 0,
  y: 0,
  width: dimensions.width * scalePixelsPerFoot,
  height: dimensions.height * scalePixelsPerFoot,
};

addAssetToSet(asset);

Hardening Patterns

1. Non-Determinism Guards

AI responses vary—always validate:

typescript
// ❌ Unsafe
const scenes = result.data.scenes;
updateScenes(scenes);

// ✅ Safe
const scenes = Array.isArray(result.data?.scenes) 
  ? result.data.scenes.filter(isValidScene)
  : [];
updateScenes(scenes);
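isValidScene is referenced but not defined here; a hypothetical type guard (field names assumed from the Zod schema earlier in this doc) might look like:

```typescript
// Hypothetical isValidScene guard — illustrative, not the app's actual validator
type SceneLike = { id: string; name: string; notes?: string | null };

const isValidScene = (value: unknown): value is SceneLike => {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' && v.id.length > 0 &&
    typeof v.name === 'string' &&
    (v.notes === undefined || v.notes === null || typeof v.notes === 'string')
  );
};
```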

2. Duplicate Filtering

AI may return duplicate characters:

typescript
const uniqueCharacters = characters.reduce((acc, char) => {
  const existing = acc.find(c => 
    c.name.toLowerCase() === char.name.toLowerCase()
  );
  if (!existing) acc.push(char);
  return acc;
}, []);

3. Defensive Merging

Never overwrite existing data—append with notes:

typescript
// Characters are now in showStructure.characters
const mergeCharacters = (aiCharacters) => {
  const merged = [...useAppStore.getState().showStructure.characters];
  
  aiCharacters.forEach(aiChar => {
    const existing = merged.find(c =>
      c.name.toLowerCase() === aiChar.name.toLowerCase()
    );
    
    if (existing) {
      // Append the AI-provided notes rather than overwriting
      showStructureActions.updateCharacter(existing.id, {
        notes: `${existing.notes || ''}\n[AI]: ${aiChar.notes}`,
      });
    } else {
      // Accumulate into the local array so earlier additions aren't dropped
      merged.push({ ...aiChar, id: generateId(), sceneIds: [], parentId: null });
    }
  });
  
  // Single write containing all existing + new characters
  showStructureActions.setCharacters(merged);
};

Error Handling

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Response doesn't match schema | LLM hallucination or prompt drift | Add retry with schema validation |
| Timeout | Large scripts | Chunk script into acts before sending |
| Incomplete extraction | Poor formatting in script | Allow manual editing post-import |
| Weird SVG output | Misinterpreted prompt | Add SVG validation (check for svg tag) |

Retry Logic

typescript
const analyzeWithRetry = async (scriptText: string, attempts = 3) => {
  for (let i = 0; i < attempts; i++) {
    try {
      const result = await ai.generate({ ... });
      
      // Validate against schema
      const validated = ScriptAnalysisSchema.parse(result.output);
      return validated;
      
    } catch (error) {
      if (i === attempts - 1) throw error;
      console.warn(`Attempt ${i + 1} failed, retrying...`);
    }
  }
};

Cost Optimization

Caching

Cache AI results in Firestore to avoid re-analyzing:

typescript
// Check if script already analyzed
const cacheRef = doc(db, `aiCache/${scriptHash}`);
const cached = await getDoc(cacheRef);

if (cached.exists()) {
  return cached.data();
}

// Analyze and cache
const result = await analyzeScript(scriptText);
await setDoc(cacheRef, result);
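scriptHash is assumed to be a content hash of the uploaded text. One way to derive it (hypothetical helper, using Node's built-in crypto):

```typescript
import { createHash } from 'node:crypto';

// Hypothetical scriptHash derivation — a stable cache key per script content
const scriptHashOf = (scriptText: string): string =>
  createHash('sha256').update(scriptText.trim()).digest('hex').slice(0, 32);
```

Hashing the text rather than the file name means re-uploading an identical script hits the cache even if the file was renamed.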

Token Limiting

Limit input size to control costs:

typescript
const MAX_SCRIPT_LENGTH = 50000; // characters

if (scriptText.length > MAX_SCRIPT_LENGTH) {
  throw new Error('Script too long. Please upload acts separately.');
}
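The chunking suggested under Error Handling could be sketched as follows, assuming conventional "ACT" headings at the start of a line (the real splitter is not shown in this doc):

```typescript
// Split a script into per-act chunks so each Gemini call stays under the limit.
// Assumes acts begin with an "ACT <numeral>" heading at the start of a line.
const splitByActs = (scriptText: string): string[] =>
  scriptText
    .split(/^(?=ACT\s+[IVXLC\d]+)/m)
    .map(chunk => chunk.trim())
    .filter(chunk => chunk.length > 0);
```

Any front matter before ACT I comes back as its own chunk, which can be analyzed separately or skipped.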

Testing

Mock AI Responses

typescript
// tests/mocks/aiResponses.ts
export const mockScriptAnalysis = {
  showStructure: {
    acts: [
      { id: 'act-1', name: 'Act 1', sceneIds: ['scene-1'] },
    ],
    scenes: [
      { id: 'scene-1', name: 'Elsinore Castle', notes: null },
    ],
  },
  characters: [
    { id: 'char-1', name: 'Hamlet', notes: 'Prince of Denmark', isEnsemble: false },
  ],
  props: [
    { id: 'prop-1', name: 'Skull', sceneId: 'scene-1' },
  ],
};

Barry AI Agent

"Barry" is an in-app production assistant that answers natural-language questions by querying project data. Built on the Google Agent Development Kit (ADK) using LlmAgent and FunctionTool abstractions.

Architecture

┌─────────────────┐
│  BarryPanel.tsx  │  ← Liquid Glass FAB → morphing chat panel
└────────┬────────┘
         │ httpsCallable('barryChat')

┌─────────────────┐
│  barry-chat.ts  │  ← ADK orchestrator (onCall)
│  (Cloud Fn)     │
└────────┬────────┘
         │ InMemoryRunner.runAsync()

┌─────────────────┐
│  barry-agent.ts │  ← LlmAgent + FunctionTools (Zod schemas)
└────────┬────────┘
         │ ADK auto-dispatches tool calls
    ┌────┴────┐
    ▼         ▼
┌────────┐  ┌──────────────┐
│searchFi│  │getProduction │
│les     │  │Data          │
└────┬───┘  └──────┬───────┘
     │             │
     ▼             ▼
┌────────┐  ┌──────────────┐
│Vertex  │  │  Firestore   │
│AI Data │  │  Shards      │
│Store   │  │              │
└────────┘  └──────────────┘

Key Files

| File | Purpose |
|---|---|
| functions/src/barry-agent.ts | ADK LlmAgent definition with FunctionTool instances (Zod-validated schemas, closures for per-request context) |
| functions/src/barry-chat.ts | Cloud Function orchestrator — creates agent, runs InMemoryRunner.runAsync(), collects events |
| functions/src/barry-tools.ts | Tool execution — executeSearchFiles (Vertex AI Data Store RAG), executeGetProductionData (Firestore shard reads with optional filtering) |
| functions/src/barry-datastore.ts | Vertex AI Search Data Store integration — ensures data store exists, executes search queries |
| functions/src/barry-config.ts | Model selection, shard path mappings, generation config (temperature, token limits) |
| functions/src/barry-prompts.ts | System prompts with theatrical domain context, few-shot examples |
| functions/src/barry-model-provider.ts | v2 — Tiered model provider: selects Flash Lite (managed) or Pro (BYOK) based on project settings |
| functions/src/barry-callbacks.ts | v2 — Server-side guardrails: scope checking (production-topic enforcement) and daily token budget (50K managed, unlimited BYOK) |
| functions/src/barry-embeddings.ts | v2 — MiniLM-L6-v2 embedding generation via @huggingface/transformers for Firestore onWrite triggers |
| functions/src/barry-settings.ts | v2 — updateBarrySettings Cloud Function for BYOK API key management with RBAC gating |
| src/features/barry/BarryPanel.tsx | Client UI — Liquid Glass FAB morphing into chat panel, message bubbles, suggestion chips |
| src/features/barry/useBarryChat.ts | Client hook — send/receive, streaming via Firestore onSnapshot, loading state, error handling |
| src/features/barry/store.ts | Zustand slice — messages, open/close, loading state, model tier indicator |
| src/features/barry/barry-semantic.ts | v2 — Client-side semantic search using the ONNX MiniLM-L6-v2 model for instant local retrieval |
| src/features/barry/components/BarryConfigPanel.tsx | v2 — Config panel UI for model selection, BYOK key entry, and tier badge display |
| src/features/barry/settings-api.ts | v2 — Client API for the updateBarrySettings callable (BYOK key save/clear) |

ADK Tool Definitions

Tools are defined as FunctionTool instances with Zod schemas in barry-agent.ts:

typescript
import { FunctionTool } from '@google/adk';
import { z } from 'zod';

const searchFilesTool = new FunctionTool({
    name: 'searchFiles',
    description: 'Search uploaded project files (scripts, notes, riders)',
    parameters: z.object({
        query: z.string().describe('The search query'),
    }),
    execute: async ({ query }) => {
        return { result: await executeSearchFiles(dataStoreId, query) };
    },
});

const getProductionDataTool = new FunctionTool({
    name: 'getProductionData',
    description: 'Read live production data from a specific domain',
    parameters: z.object({
        shard: z.enum(['structure', 'cast', 'props', 'scheduler', ...]),
        filter: z.string().optional(),
    }),
    execute: async ({ shard, filter }) => {
        return { result: await executeGetProductionData(projectId, shard, filter) };
    },
});
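The tools above reference dataStoreId and projectId as free variables; per the key-files table, they are captured via closures at agent-creation time. A sketch of that pattern (illustrative shape only, not the actual barry-agent.ts code):

```typescript
// Per-request context is captured in a closure when the agent is created, so
// the model never has to supply projectId or dataStoreId as tool parameters.
interface BarryContext {
  projectId: string;
  dataStoreId: string;
}

const createBarryTools = (ctx: BarryContext) => ({
  // Stand-ins for executeSearchFiles / executeGetProductionData
  searchFiles: (query: string) => `search ${ctx.dataStoreId} for "${query}"`,
  getProductionData: (shard: string) => `read ${ctx.projectId}/${shard}`,
});
```

This keeps the tool schemas the model sees minimal — only query and shard — while each request still operates on the correct project.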

Tiered Model Provider (v2)

Barry v2 introduces a two-tier model strategy managed in barry-model-provider.ts:

| Tier | Model | API Key | Daily Token Budget | Use Case |
|---|---|---|---|---|
| Lite (default) | gemini-3.1-flash-lite | Managed (platform key) | 50,000 tokens | Quick lookups, schedule queries |
| Pro (BYOK) | gemini-3.1-pro | User-provided (Google AI Studio) | Unlimited | Complex reasoning, deep analysis |

The model provider reads the project's barrySettings document from Firestore and selects the appropriate model + API key at request time.
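Under those assumptions, the selection step might reduce to the following (hypothetical names and key handling; the real provider also validates the stored key):

```typescript
type Tier = 'lite' | 'pro';

interface BarrySettings { apiKey?: string | null; }   // shape of the barrySettings doc (assumed)
interface ModelChoice { model: string; apiKey: string; tier: Tier; }

const PLATFORM_KEY = 'managed-platform-key';          // stand-in for the managed credential

const selectModel = (settings?: BarrySettings): ModelChoice =>
  settings?.apiKey
    ? { model: 'gemini-3.1-pro', apiKey: settings.apiKey, tier: 'pro' }
    : { model: 'gemini-3.1-flash-lite', apiKey: PLATFORM_KEY, tier: 'lite' };
```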

BYOK Settings (v2)

The updateBarrySettings Cloud Function (barry-settings.ts) handles API key management:

typescript
// Client: save BYOK key (write-only — key is never returned to client)
const updateSettings = httpsCallable(functions, 'updateBarrySettings');
await updateSettings({ projectId, apiKey: 'user-key-here' });

// Client: clear key (revert to Lite)
await updateSettings({ projectId, apiKey: null });

RBAC: Only owner, stage_manager, and production_manager roles can call updateBarrySettings. The function validates permissions server-side before writing.

Semantic Retrieval (v2)

Barry v2 adds semantic search for production data using MiniLM-L6-v2 embeddings:

  1. Server-side embedding (barry-embeddings.ts) — Firestore onWrite triggers generate 384-dimensional embeddings via @huggingface/transformers whenever production data changes
  2. Client-side search (barry-semantic.ts) — Loads the ONNX MiniLM-L6-v2 model in the browser for instant vector similarity search against cached embeddings
  3. Hybrid retrieval — Results from client-side semantic search are combined with Vertex AI Data Store results for comprehensive coverage
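The client-side similarity step in point 2 can be sketched as plain cosine ranking over the cached 384-dimensional vectors. (The embeddings themselves come from the ONNX model; this sketch assumes they are already available.)

```typescript
// Cosine similarity between two embedding vectors
const cosine = (a: number[], b: number[]): number => {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
};

// Rank cached items by similarity to the query embedding
const rankBySimilarity = (
  query: number[],
  items: { id: string; embedding: number[] }[],
) =>
  items
    .map(item => ({ id: item.id, score: cosine(query, item.embedding) }))
    .sort((a, b) => b.score - a.score);
```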

Token Budget & Guardrails (v2)

Server-side enforcement in barry-callbacks.ts:

  • Scope checking — Validates queries are production-related before forwarding to the model
  • Daily token budget — Tracks per-project daily usage; 50K limit for managed tier, unlimited for BYOK
  • Budget exceeded — Returns a friendly message directing users to configure a BYOK key
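A sketch of the daily budget check described above (assumed shape — the actual barry-callbacks.ts tracks usage in Firestore):

```typescript
const DAILY_LIMIT = 50_000; // managed-tier tokens per project per day

interface DailyUsage { date: string; tokens: number; }

const checkBudget = (
  usage: DailyUsage,
  today: string,
  tier: 'lite' | 'pro',
  requestTokens: number,
): { allowed: boolean; usage: DailyUsage } => {
  if (tier === 'pro') return { allowed: true, usage };     // BYOK: unlimited
  const spent = usage.date === today ? usage.tokens : 0;   // a new day resets the count
  if (spent + requestTokens > DAILY_LIMIT) {
    return { allowed: false, usage: { date: today, tokens: spent } };
  }
  return { allowed: true, usage: { date: today, tokens: spent + requestTokens } };
};
```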

Streaming Responses (v2)

Barry v2 streams responses via Firestore streaming documents:

  1. Cloud Function writes partial response chunks to a Firestore barryStreaming/{sessionId} document
  2. Client listens via onSnapshot for real-time updates
  3. UI renders tokens as they arrive (word-by-word appearance)

Session Continuity (v2)

Chat history persists across sessions using ADK's SessionService backed by Firestore. Users can close and reopen Barry without losing conversation context.

Environment Configuration

ADK uses environment variables for Vertex AI, set at module level in barry-chat.ts:

typescript
process.env.GOOGLE_GENAI_USE_VERTEXAI = 'TRUE';
process.env.GOOGLE_CLOUD_PROJECT = GCP_PROJECT_ID;
process.env.GOOGLE_CLOUD_LOCATION = GCP_LOCATION;

Client Integration

typescript
// useBarryChat.ts — sends message and listens for streaming response
const barryChat = httpsCallable<BarryChatRequest, BarryChatResponse>(
    functions, 'barryChat'
);

const send = async (message: string) => {
    const result = await barryChat({
        projectId,
        message,
        history: messages.map(m => ({ role: m.role, text: m.text })),
    });
    // result.data.response — model's answer
    // result.data.toolsUsed — which tools were invoked
    // result.data.modelTier — 'lite' | 'pro'
};

"Ask Barry" Contextual Help Pattern

A lightweight, inline AI help system that provides contextual assistance without requiring the full Barry chat panel. This pattern is used at friction points — moments where users are likely confused or need guidance.

Architecture

AskBarryChip (UI)  →  useAskBarry (Hook)  →  sendBarryMessage (API)
                                                      │
                                                      │ [INLINE_HELP] prefix
                                                      ▼
                                            barryChat Cloud Function

Key Files

| File | Purpose |
|---|---|
| src/features/barry/hooks/useAskBarry.ts | Hook managing the inline AI query lifecycle |
| src/features/barry/components/AskBarryChip.tsx | Compact UI trigger with suggestion display |
| src/features/barry/api/barry-api.ts | Shared sendBarryMessage callable |

useAskBarry Hook

typescript
const { suggestion, loading, error, ask } = useAskBarry();

// Trigger with context
ask("CSV headers and sample data:\nName,Email\nJohn,john@test.com\n\nParse warnings:\nMissing phone column");

The hook:

  1. Prepends [INLINE_HELP] to the message (signals Barry to give a concise, actionable response)
  2. Calls sendBarryMessage (the same Cloud Function as the full chat)
  3. Returns a short suggestion string, not a full conversation
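The prefixing step could be as simple as the following (hypothetical helper; the hook's internals are not shown in this doc):

```typescript
const INLINE_HELP_PREFIX = '[INLINE_HELP]';

// Combine the optional question and the hidden context into a single prompt
const buildInlineHelpMessage = (context: string, question?: string): string =>
  `${INLINE_HELP_PREFIX} ${question ?? 'Briefly explain what is happening here.'}\n\n${context}`;
```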

AskBarryChip Component

tsx
<AskBarryChip
    context={`CSV headers:\n${headerLine}\n\nParse warnings:\n${warnings.join('\n')}`}
    question="What might be wrong with this CSV data?"
    label="Ask Barry for help"
/>

Props:

  • context — Background data Barry needs (not shown to user)
  • question — The question to ask (optional, defaults to context-based inference)
  • label — Button text (defaults to "Ask Barry")

Integration Example: CSV Import Wizard

In RosterImportWizard.tsx Step 2 (Preview), when parse warnings exist:

tsx
{parseResult.errors.length > 0 && (
    <AskBarryChip
        context={`CSV headers and sample data:\n${csvText.split('\n').slice(0, 3).join('\n')}\n\nParse warnings:\n${parseResult.errors.join('\n')}`}
        question="Help me understand these CSV import warnings"
        label="Ask Barry about these warnings"
    />
)}

When to Use This Pattern

Use AskBarryChip when:

  • A user action produces warnings or ambiguous results
  • The UI can't fully explain what happened (e.g., fuzzy header matching decisions)
  • Users might need guidance but a full modal or documentation link is too heavy

Do not use for:

  • Simple validation errors (show inline error messages instead)
  • Actions that need immediate correction (use form validation)
  • Questions that require multi-turn conversation (direct to the full Barry panel)


Last updated: March 20, 2026 (Barry v2 — tiered models, BYOK, semantic retrieval, streaming)