# Output Parsers: Structured Output Extraction

**Part 2: Composition - Lesson 2**

> LLMs return text. You need data.

## Overview

You've learned to create great prompts. LLMs return unstructured text, and in some cases you might need structured data:

```javascript
// LLM returns this:
"The sentiment is positive with a confidence of 0.92"

// You need this:
{
    sentiment: "positive",
    confidence: 0.92
}
```

**Output parsers** transform LLM text into structured data you can use in your applications.

## Why This Matters

### The Problem: Parsing Chaos

Without parsers, your code is full of brittle string manipulation:

```javascript
const response = await llm.invoke("Classify: I love this product!");

// Fragile parsing code everywhere
if (response.includes("positive")) {
    sentiment = "positive";
} else if (response.includes("negative")) {
    sentiment = "negative";
}

// What if format changes?
// What if LLM adds extra text?
// How do you handle errors?
```

Problems:
- Brittle regex and string matching
- No validation of output format
- Hard to test parsing logic
- Inconsistent error handling
- Parser code duplicated everywhere

### The Solution: Output Parsers

```javascript
const parser = new JsonOutputParser();

const prompt = new PromptTemplate({
    template: `Classify the sentiment. Respond in JSON:
{{"sentiment": "positive/negative/neutral", "confidence": 0.0-1.0}}

Text: {text}`,
    inputVariables: ["text"]
});

const chain = prompt.pipe(llm).pipe(parser);

const result = await chain.invoke({ text: "I love this!" });
// { sentiment: "positive", confidence: 0.95 }
```

Benefits:
- ✅ Reliable structured extraction
- ✅ Format validation
- ✅ Error handling built-in
- ✅ Reusable parsing logic
- ✅ Type-safe outputs

## Learning Objectives

By the end of this lesson, you will:

- ✅ Build a BaseOutputParser abstraction
- ✅ Create a StringOutputParser for text cleanup
- ✅ Implement JsonOutputParser for JSON extraction
- ✅ Build ListOutputParser for arrays
- ✅ Create StructuredOutputParser with schemas
- ✅ Use parsers in chains with prompts
- ✅ Handle parsing errors gracefully

## Core Concepts

### What is an Output Parser?

An output parser **transforms LLM text output into structured data**.

**Flow:**
```
LLM Output (text) → Parser → Structured Data
    ↓                ↓              ↓
"positive: 0.95"  parse()    {sentiment: "positive", confidence: 0.95}
```

### The Parser Hierarchy

```
BaseOutputParser (abstract)
    ├── StringOutputParser (clean text)
    ├── JsonOutputParser (extract JSON)
    ├── ListOutputParser (extract lists)
    ├── RegexOutputParser (regex patterns)
    └── StructuredOutputParser (schema validation)
```

Each parser handles a specific output format.

### Key Operations

1. **Parse**: Extract structured data from text
2. **Get Format Instructions**: Tell LLM how to format response
3. **Validate**: Check output matches expected structure
4. **Handle Errors**: Gracefully handle malformed outputs

## Implementation Guide

### Step 1: Base Output Parser

**Location:** `src/output-parsers/base-parser.js`

This is the abstract base class all parsers inherit from.

**What it does:**
- Defines the interface for all parsers
- Extends Runnable (so parsers work in chains)
- Provides format instruction generation
- Handles parsing errors

**Implementation:**

```javascript
import { Runnable } from '../core/runnable.js';

/**
 * Base class for all output parsers
 * Transforms LLM text output into structured data
 */
export class BaseOutputParser extends Runnable {
    constructor() {
        super();
        this.name = this.constructor.name;
    }

    /**
     * Parse the LLM output into structured data
     * @abstract
     * @param {string} text - Raw LLM output
     * @returns {Promise<any>} Parsed data
     */
    async parse(text) {
        throw new Error(`${this.name} must implement parse()`);
    }

    /**
     * Get instructions for the LLM on how to format output
     * @returns {string} Format instructions
     */
    getFormatInstructions() {
        return '';
    }

    /**
     * Runnable interface: parse the output
     */
    async _call(input, config) {
        // Input can be a string or a Message
        const text = typeof input === 'string' 
            ? input 
            : input.content;
        
        return await this.parse(text);
    }

    /**
     * Parse with error handling
     */
    async parseWithPrompt(text, prompt) {
        try {
            return await this.parse(text);
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse output from prompt: ${error.message}`,
                text,
                error
            );
        }
    }
}

/**
 * Exception thrown when parsing fails
 */
export class OutputParserException extends Error {
    constructor(message, llmOutput, originalError) {
        super(message);
        this.name = 'OutputParserException';
        this.llmOutput = llmOutput;
        this.originalError = originalError;
    }
}
```

**Key insights:**
- Extends `Runnable` so parsers can be piped in chains
- `_call` extracts text from strings or Messages
- `getFormatInstructions()` helps prompt the LLM
- Error handling wraps parse failures with context

---

### Step 2: String Output Parser

**Location:** `src/output-parsers/string-parser.js`

The simplest parser - cleans up text output.

**What it does:**
- Strips leading/trailing whitespace
- Optionally removes markdown code blocks
- Returns clean string

**Use when:**
- You just need clean text
- No structure needed
- Want to remove formatting artifacts

**Implementation:**

```javascript
import { BaseOutputParser } from './base-parser.js';

/**
 * Parser that returns cleaned string output
 * Strips whitespace and optionally removes markdown
 * 
 * Example:
 *   const parser = new StringOutputParser();
 *   const result = await parser.parse("  Hello World  ");
 *   // "Hello World"
 */
export class StringOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.stripMarkdown = options.stripMarkdown ?? true;
    }

    /**
     * Parse: clean the text
     */
    async parse(text) {
        let cleaned = text.trim();

        if (this.stripMarkdown) {
            cleaned = this._stripMarkdownCodeBlocks(cleaned);
        }

        return cleaned;
    }

    /**
     * Remove markdown code blocks (```code```)
     */
    _stripMarkdownCodeBlocks(text) {
        // Remove ```language\ncode\n```
        return text.replace(/```[\w]*\n([\s\S]*?)\n```/g, '$1').trim();
    }

    getFormatInstructions() {
        return 'Respond with plain text. No markdown formatting.';
    }
}
```

**Usage:**

```javascript
const parser = new StringOutputParser();

// Handles various formats
await parser.parse("  Hello  ");           // "Hello"
await parser.parse("```\ncode\n```");      // "code"
await parser.parse("   \n  Text  \n   "); // "Text"
```

---

### Step 3: JSON Output Parser

**Location:** `src/output-parsers/json-parser.js`

Extracts and validates JSON from LLM output.

**What it does:**
- Finds JSON in text (handles markdown, extra text)
- Parses and validates JSON
- Optionally validates against a schema

**Use when:**
- Need structured objects
- Want type-safe data
- Need validation

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser that extracts JSON from LLM output
 * Handles markdown code blocks and extra text
 * 
 * Example:
 *   const parser = new JsonOutputParser();
 *   const result = await parser.parse('```json\n{"name": "Alice"}\n```');
 *   // { name: "Alice" }
 */
export class JsonOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.schema = options.schema;
    }

    /**
     * Parse JSON from text
     */
    async parse(text) {
        try {
            // Try to extract JSON from the text
            const jsonText = this._extractJson(text);
            const parsed = JSON.parse(jsonText);

            // Validate against schema if provided
            if (this.schema) {
                this._validateSchema(parsed);
            }

            return parsed;
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse JSON: ${error.message}`,
                text,
                error
            );
        }
    }

    /**
     * Extract JSON from text (handles markdown, extra text)
     */
    _extractJson(text) {
        // Try direct parse first
        try {
            JSON.parse(text.trim());
            return text.trim();
        } catch {
            // Not direct JSON, try to find it
        }

        // Look for JSON in markdown code blocks
        const markdownMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/);
        if (markdownMatch) {
            return markdownMatch[1].trim();
        }

        // Look for JSON object/array patterns
        const jsonObjectMatch = text.match(/\{[\s\S]*\}/);
        if (jsonObjectMatch) {
            return jsonObjectMatch[0];
        }

        const jsonArrayMatch = text.match(/\[[\s\S]*\]/);
        if (jsonArrayMatch) {
            return jsonArrayMatch[0];
        }

        // Give up, return original
        return text.trim();
    }

    /**
     * Validate parsed JSON against schema
     */
    _validateSchema(parsed) {
        if (!this.schema) return;

        for (const [key, type] of Object.entries(this.schema)) {
            if (!(key in parsed)) {
                throw new Error(`Missing required field: ${key}`);
            }

            const actualType = typeof parsed[key];
            if (actualType !== type) {
                throw new Error(
                    `Field ${key} should be ${type}, got ${actualType}`
                );
            }
        }
    }

    getFormatInstructions() {
        let instructions = 'Respond with valid JSON.';
        
        if (this.schema) {
            const schemaDesc = Object.entries(this.schema)
                .map(([key, type]) => `"${key}": ${type}`)
                .join(', ');
            instructions += ` Schema: { ${schemaDesc} }`;
        }

        return instructions;
    }
}
```

**Usage:**

```javascript
const parser = new JsonOutputParser({
    schema: {
        name: 'string',
        age: 'number',
        active: 'boolean'
    }
});

// Handles various JSON formats
await parser.parse('{"name": "Alice", "age": 30, "active": true}');
await parser.parse('```json\n{"name": "Bob", "age": 25, "active": false}\n```');
await parser.parse('Sure! Here\'s the data: {"name": "Charlie", "age": 35, "active": true}');
```

---

### Step 4: List Output Parser

**Location:** `src/output-parsers/list-parser.js`

Extracts lists/arrays from text.

**What it does:**
- Parses numbered lists, bullet points, comma-separated
- Returns array of items
- Cleans each item

**Use when:**
- Need arrays of strings
- LLM outputs lists
- Want simple arrays

**Implementation:**

```javascript
import { BaseOutputParser } from './base-parser.js';

/**
 * Parser that extracts lists from text
 * Handles: numbered lists, bullets, comma-separated
 * 
 * Example:
 *   const parser = new ListOutputParser();
 *   const result = await parser.parse("1. Apple\n2. Banana\n3. Orange");
 *   // ["Apple", "Banana", "Orange"]
 */
export class ListOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.separator = options.separator;
    }

    /**
     * Parse list from text
     */
    async parse(text) {
        const cleaned = text.trim();

        // If separator specified, use it
        if (this.separator) {
            return cleaned
                .split(this.separator)
                .map(item => item.trim())
                .filter(item => item.length > 0);
        }

        // Try to detect format
        if (this._isNumberedList(cleaned)) {
            return this._parseNumberedList(cleaned);
        }

        if (this._isBulletList(cleaned)) {
            return this._parseBulletList(cleaned);
        }

        // Try comma-separated
        if (cleaned.includes(',')) {
            return cleaned
                .split(',')
                .map(item => item.trim())
                .filter(item => item.length > 0);
        }

        // Try newline-separated
        return cleaned
            .split('\n')
            .map(item => item.trim())
            .filter(item => item.length > 0);
    }

    /**
     * Check if text is numbered list (1. Item\n2. Item)
     */
    _isNumberedList(text) {
        return /^\d+\./.test(text);
    }

    /**
     * Check if text is bullet list (- Item\n- Item or * Item)
     */
    _isBulletList(text) {
        return /^[-*•]/.test(text);
    }

    /**
     * Parse numbered list
     */
    _parseNumberedList(text) {
        return text
            .split('\n')
            .map(line => line.replace(/^\d+\.\s*/, '').trim())
            .filter(item => item.length > 0);
    }

    /**
     * Parse bullet list
     */
    _parseBulletList(text) {
        return text
            .split('\n')
            .map(line => line.replace(/^[-*•]\s*/, '').trim())
            .filter(item => item.length > 0);
    }

    getFormatInstructions() {
        if (this.separator) {
            return `Respond with items separated by "${this.separator}".`;
        }
        return 'Respond with a numbered list (1. Item) or bullet list (- Item).';
    }
}
```

**Usage:**

```javascript
const parser = new ListOutputParser();

// Handles various list formats
await parser.parse("1. Apple\n2. Banana\n3. Orange");
// ["Apple", "Banana", "Orange"]

await parser.parse("- Red\n- Green\n- Blue");
// ["Red", "Green", "Blue"]

await parser.parse("cat, dog, bird");
// ["cat", "dog", "bird"]

// Custom separator
const csvParser = new ListOutputParser({ separator: ',' });
await csvParser.parse("apple,banana,orange");
// ["apple", "banana", "orange"]
```

---

### Step 5: Regex Output Parser

**Location:** `src/output-parsers/regex-parser.js`

Uses regex patterns to extract structured data.

**What it does:**
- Applies regex to extract groups
- Maps groups to field names
- Returns structured object

**Use when:**
- Output has predictable patterns
- Need custom extraction logic
- Regex is simplest solution

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser that uses regex to extract structured data
 * 
 * Example:
 *   const parser = new RegexOutputParser({
 *       regex: /Sentiment: (\w+), Confidence: ([\d.]+)/,
 *       outputKeys: ["sentiment", "confidence"]
 *   });
 *   
 *   const result = await parser.parse("Sentiment: positive, Confidence: 0.92");
 *   // { sentiment: "positive", confidence: "0.92" }
 */
export class RegexOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.regex = options.regex;
        this.outputKeys = options.outputKeys || [];
        this.dotAll = options.dotAll ?? false;

        if (this.dotAll) {
            // Add 's' flag for dotAll if not present
            const flags = this.regex.flags.includes('s') 
                ? this.regex.flags 
                : this.regex.flags + 's';
            this.regex = new RegExp(this.regex.source, flags);
        }
    }

    /**
     * Parse using regex
     */
    async parse(text) {
        const match = text.match(this.regex);

        if (!match) {
            throw new OutputParserException(
                `Text does not match regex pattern: ${this.regex}`,
                text
            );
        }

        // If no output keys, return the groups as array
        if (this.outputKeys.length === 0) {
            return match.slice(1); // Exclude full match
        }

        // Map groups to keys
        const result = {};
        for (let i = 0; i < this.outputKeys.length; i++) {
            result[this.outputKeys[i]] = match[i + 1]; // +1 to skip full match
        }

        return result;
    }

    getFormatInstructions() {
        if (this.outputKeys.length > 0) {
            return `Format your response to match: ${this.outputKeys.join(', ')}`;
        }
        return 'Follow the specified format exactly.';
    }
}
```

**Usage:**

```javascript
const parser = new RegexOutputParser({
    regex: /Sentiment: (\w+), Confidence: ([\d.]+)/,
    outputKeys: ["sentiment", "confidence"]
});

const result = await parser.parse("Sentiment: positive, Confidence: 0.92");
// { sentiment: "positive", confidence: "0.92" }
```

---
# Output Parsers: Advanced Patterns & Integration

## Advanced Parser: Structured Output Parser

### Step 6: Structured Output Parser

**Location:** `src/output-parsers/structured-parser.js`

The most powerful parser - validates against a full schema with types and descriptions.

**What it does:**
- Defines expected schema with types
- Generates format instructions for LLM
- Validates all fields and types
- Provides detailed error messages

**Use when:**
- Need complex structured data
- Want strong type validation
- Need to generate format instructions automatically

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser with full schema validation
 * 
 * Example:
 *   const parser = new StructuredOutputParser({
 *       responseSchemas: [
 *           {
 *               name: "sentiment",
 *               type: "string",
 *               description: "The sentiment (positive/negative/neutral)",
 *               enum: ["positive", "negative", "neutral"]
 *           },
 *           {
 *               name: "confidence",
 *               type: "number",
 *               description: "Confidence score between 0 and 1"
 *           }
 *       ]
 *   });
 */
export class StructuredOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.responseSchemas = options.responseSchemas || [];
    }

    /**
     * Parse and validate against schema
     */
    async parse(text) {
        try {
            // Extract JSON
            const jsonText = this._extractJson(text);
            const parsed = JSON.parse(jsonText);

            // Validate against schema
            this._validateAgainstSchema(parsed);

            return parsed;
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse structured output: ${error.message}`,
                text,
                error
            );
        }
    }

    /**
     * Extract JSON from text (same as JsonOutputParser)
     */
    _extractJson(text) {
        try {
            JSON.parse(text.trim());
            return text.trim();
        } catch {}

        const markdownMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/);
        if (markdownMatch) return markdownMatch[1].trim();

        const jsonMatch = text.match(/\{[\s\S]*\}/);
        if (jsonMatch) return jsonMatch[0];

        return text.trim();
    }

    /**
     * Validate parsed data against schema
     */
    _validateAgainstSchema(parsed) {
        for (const schema of this.responseSchemas) {
            const { name, type, enum: enumValues, required = true } = schema;

            // Check required fields
            if (required && !(name in parsed)) {
                throw new Error(`Missing required field: ${name}`);
            }

            if (name in parsed) {
                const value = parsed[name];

                // Check type
                if (!this._checkType(value, type)) {
                    throw new Error(
                        `Field ${name} should be ${type}, got ${typeof value}`
                    );
                }

                // Check enum values
                if (enumValues && !enumValues.includes(value)) {
                    throw new Error(
                        `Field ${name} must be one of: ${enumValues.join(', ')}`
                    );
                }
            }
        }
    }

    /**
     * Check if value matches expected type
     */
    _checkType(value, type) {
        switch (type) {
            case 'string':
                return typeof value === 'string';
            case 'number':
                return typeof value === 'number' && !isNaN(value);
            case 'boolean':
                return typeof value === 'boolean';
            case 'array':
                return Array.isArray(value);
            case 'object':
                return typeof value === 'object' && value !== null && !Array.isArray(value);
            default:
                return true;
        }
    }

    /**
     * Generate format instructions for LLM
     */
    getFormatInstructions() {
        const schemaDescriptions = this.responseSchemas.map(schema => {
            let desc = `"${schema.name}": ${schema.type}`;
            if (schema.description) {
                desc += ` // ${schema.description}`;
            }
            if (schema.enum) {
                desc += ` (one of: ${schema.enum.join(', ')})`;
            }
            return desc;
        });

        return `Respond with valid JSON matching this schema:
{
${schemaDescriptions.map(d => '  ' + d).join(',\n')}
}`;
    }

    /**
     * Static helper to create from simple schema
     */
    static fromNamesAndDescriptions(schemas) {
        const responseSchemas = Object.entries(schemas).map(([name, description]) => ({
            name,
            description,
            type: 'string' // Default type
        }));

        return new StructuredOutputParser({ responseSchemas });
    }
}
```

**Usage:**

```javascript
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "sentiment",
            type: "string",
            description: "The sentiment of the text",
            enum: ["positive", "negative", "neutral"],
            required: true
        },
        {
            name: "confidence",
            type: "number",
            description: "Confidence score from 0 to 1",
            required: true
        },
        {
            name: "keywords",
            type: "array",
            description: "Key themes in the text",
            required: false
        }
    ]
});

// Get format instructions to add to prompt
const instructions = parser.getFormatInstructions();
console.log(instructions);

// Parse and validate
const result = await parser.parse(`{
    "sentiment": "positive",
    "confidence": 0.92,
    "keywords": ["great", "love", "excellent"]
}`);
```

---

## Real-World Examples

### Example 1: Email Classification with Structured Parser

```javascript
import { StructuredOutputParser } from './output-parsers/structured-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';
import { LlamaCppLLM } from './llm/llama-cpp-llm.js';

// Define the output structure
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "category",
            type: "string",
            description: "Email category",
            enum: ["spam", "invoice", "meeting", "urgent", "personal", "other"]
        },
        {
            name: "confidence",
            type: "number",
            description: "Confidence score (0-1)"
        },
        {
            name: "reason",
            type: "string",
            description: "Brief explanation for classification"
        },
        {
            name: "actionRequired",
            type: "boolean",
            description: "Does email require action?"
        }
    ]
});

// Build prompt with format instructions
const prompt = new PromptTemplate({
    template: `Classify this email.

Email:
From: {from}
Subject: {subject}
Body: {body}

{format_instructions}`,
    inputVariables: ["from", "subject", "body"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

// Create chain
const llm = new LlamaCppLLM({ modelPath: './model.gguf' });
const chain = prompt.pipe(llm).pipe(parser);

// Use it
const result = await chain.invoke({
    from: "billing@company.com",
    subject: "Invoice #12345",
    body: "Payment due by March 15th"
});

console.log(result);
// {
//   category: "invoice",
//   confidence: 0.98,
//   reason: "Email contains invoice number and payment deadline",
//   actionRequired: true
// }
```

---

### Example 2: Content Extraction with JSON Parser

```javascript
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { ChatPromptTemplate } from './prompts/chat-prompt-template.js';

const parser = new JsonOutputParser({
    schema: {
        title: 'string',
        summary: 'string',
        tags: 'object',  // Will be array
        author: 'string'
    }
});

const prompt = ChatPromptTemplate.fromMessages([
    ["system", "Extract article metadata. Respond with JSON."],
    ["human", "Article: {article}"]
]);

const chain = prompt.pipe(llm).pipe(parser);

const result = await chain.invoke({
    article: "Title: AI Revolution\nBy: John Doe\n\nAI is transforming..."
});

// {
//   title: "AI Revolution",
//   summary: "Article discusses AI's transformative impact",
//   tags: ["AI", "technology", "future"],
//   author: "John Doe"
// }
```

---

### Example 3: List Extraction for Recommendations

```javascript
import { ListOutputParser } from './output-parsers/list-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';

const parser = new ListOutputParser();

const prompt = new PromptTemplate({
    template: `Recommend 5 {category} for someone interested in {interest}.

{format_instructions}

List:`,
    inputVariables: ["category", "interest"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

const chain = prompt.pipe(llm).pipe(parser);

const books = await chain.invoke({
    category: "books",
    interest: "machine learning"
});

console.log(books);
// [
//   "Pattern Recognition and Machine Learning",
//   "Deep Learning by Goodfellow",
//   "Hands-On Machine Learning",
//   "The Hundred-Page Machine Learning Book",
//   "Machine Learning Yearning"
// ]
```

---

### Example 4: Sentiment Analysis with Retry

```javascript
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';

const parser = new JsonOutputParser();

// If parsing fails, retry with clearer instructions
async function robustSentimentAnalysis(text) {
    const prompt = new PromptTemplate({
        template: `Analyze sentiment of: "{text}"

Respond with ONLY valid JSON:
{{"sentiment": "positive/negative/neutral", "score": 0.0-1.0}}`
    });

    const chain = prompt.pipe(llm).pipe(parser);

    try {
        return await chain.invoke({ text });
    } catch (error) {
        console.log('Parse failed, retrying with stricter prompt...');
        
        // Retry with more explicit prompt
        const strictPrompt = new PromptTemplate({
            template: `Analyze: "{text}"

IMPORTANT: Respond with ONLY this JSON structure, nothing else:
{{"sentiment": "positive", "score": 0.9}}

Your response:`
        });

        const retryChain = strictPrompt.pipe(llm).pipe(parser);
        return await retryChain.invoke({ text });
    }
}
```

---

## Advanced Patterns

### Pattern 1: Fallback Parsing

```javascript
class FallbackOutputParser extends BaseOutputParser {
    constructor(parsers) {
        super();
        this.parsers = parsers;
    }

    async parse(text) {
        const errors = [];

        for (const parser of this.parsers) {
            try {
                return await parser.parse(text);
            } catch (error) {
                errors.push({ parser: parser.name, error });
            }
        }

        throw new OutputParserException(
            `All parsers failed. Errors: ${JSON.stringify(errors)}`,
            text
        );
    }
}

// Usage
const parser = new FallbackOutputParser([
    new JsonOutputParser(),      // Try JSON first
    new RegexOutputParser({...}), // Try regex second
    new StringOutputParser()      // Fallback to string
]);
```

---

### Pattern 2: Transform After Parse

```javascript
class TransformOutputParser extends BaseOutputParser {
    constructor(parser, transform) {
        super();
        this.parser = parser;
        this.transform = transform;
    }

    async parse(text) {
        const parsed = await this.parser.parse(text);
        return this.transform(parsed);
    }
}

// Usage: parse JSON then transform values
const parser = new TransformOutputParser(
    new JsonOutputParser(),
    (data) => ({
        ...data,
        confidence: parseFloat(data.confidence),
        timestamp: new Date().toISOString()
    })
);
```

---

### Pattern 3: Conditional Parsing

```javascript
class ConditionalOutputParser extends BaseOutputParser {
    constructor(condition, trueParser, falseParser) {
        super();
        this.condition = condition;
        this.trueParser = trueParser;
        this.falseParser = falseParser;
    }

    async parse(text) {
        const useTrue = this.condition(text);
        const parser = useTrue ? this.trueParser : this.falseParser;
        return await parser.parse(text);
    }
}

// Usage: different parsers based on content
const parser = new ConditionalOutputParser(
    (text) => text.includes('{'),  // Has JSON?
    new JsonOutputParser(),
    new ListOutputParser()
);
```

---

### Pattern 4: Validated Output

```javascript
class ValidatedOutputParser extends BaseOutputParser {
    constructor(parser, validator) {
        super();
        this.parser = parser;
        this.validator = validator;
    }

    async parse(text) {
        const parsed = await this.parser.parse(text);
        
        const isValid = this.validator(parsed);
        if (!isValid) {
            throw new OutputParserException(
                'Parsed output failed validation',
                text
            );
        }

        return parsed;
    }
}

// Usage: ensure confidence is in range
const parser = new ValidatedOutputParser(
    new JsonOutputParser(),
    (data) => data.confidence >= 0 && data.confidence <= 1
);
```

---

## Integration with Full Chain

### Complete Example: Sentiment Analysis API

```javascript
import { PromptTemplate } from './prompts/prompt-template.js';
import { LlamaCppLLM } from './llm/llama-cpp-llm.js';
import { StructuredOutputParser } from './output-parsers/structured-parser.js';
import { ConsoleCallback } from './utils/callbacks.js';

// Define output structure
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "sentiment",
            type: "string",
            enum: ["positive", "negative", "neutral"]
        },
        {
            name: "confidence",
            type: "number"
        },
        {
            name: "emotions",
            type: "array",
            description: "List of detected emotions"
        }
    ]
});

// Build prompt
const prompt = new PromptTemplate({
    template: `Analyze the sentiment of this text:

"{text}"

{format_instructions}`,
    inputVariables: ["text"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

// Create LLM
const llm = new LlamaCppLLM({
    modelPath: './model.gguf',
    temperature: 0.1  // Low temp for consistent classification
});

// Build chain with logging
const chain = prompt.pipe(llm).pipe(parser);

const logger = new ConsoleCallback();

// Analyze sentiment
async function analyzeSentiment(text) {
    try {
        const result = await chain.invoke(
            { text },
            { callbacks: [logger] }
        );

        return {
            success: true,
            data: result
        };
    } catch (error) {
        return {
            success: false,
            error: error.message,
            rawOutput: error.llmOutput
        };
    }
}

// Use it
const result = await analyzeSentiment("I absolutely love this product! It's amazing!");
console.log(result);
// {
//   success: true,
//   data: {
//     sentiment: "positive",
//     confidence: 0.95,
//     emotions: ["joy", "excitement", "satisfaction"]
//   }
// }
```

---

## Error Handling

### Pattern: Graceful Degradation

```javascript
async function parseWithFallback(text, primaryParser, fallbackValue) {
    try {
        return await primaryParser.parse(text);
    } catch (error) {
        console.warn('Primary parser failed:', error.message);
        console.warn('Using fallback value:', fallbackValue);
        return fallbackValue;
    }
}

// Usage
const result = await parseWithFallback(
    llmOutput,
    new JsonOutputParser(),
    { error: true, message: "Failed to parse", raw: llmOutput }
);
```

---

### Pattern: Retry with Fix Instructions

```javascript
async function parseWithRetry(text, parser, llm, maxRetries = 2) {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return await parser.parse(text);
        } catch (error) {
            if (attempt === maxRetries - 1) throw error;

            // Ask LLM to fix the output
            const fixPrompt = `The following output is malformed:
${text}

Error: ${error.message}

Please provide the output in correct format:
${parser.getFormatInstructions()}`;

            text = await llm.invoke(fixPrompt);
        }
    }
}
```

---

## Testing Parsers

### Unit Tests

```javascript
import { describe, it, expect } from 'your-test-framework';
import { JsonOutputParser } from './output-parsers/json-parser.js';

describe('JsonOutputParser', () => {
    it('should parse plain JSON', async () => {
        const parser = new JsonOutputParser();
        const result = await parser.parse('{"name": "Alice", "age": 30}');
        
        expect(result.name).toBe('Alice');
        expect(result.age).toBe(30);
    });

    it('should extract JSON from markdown', async () => {
        const parser = new JsonOutputParser();
        const text = '```json\n{"key": "value"}\n```';
        const result = await parser.parse(text);
        
        expect(result.key).toBe('value');
    });

    it('should validate against schema', async () => {
        const parser = new JsonOutputParser({
            schema: { name: 'string', age: 'number' }
        });

        await expect(
            parser.parse('{"name": "Bob", "age": "invalid"}')
        ).rejects.toThrow();
    });

    it('should throw on invalid JSON', async () => {
        const parser = new JsonOutputParser();
        await expect(parser.parse('not json')).rejects.toThrow();
    });
});
```

---

## Best Practices

### ✅ DO:

**1. Include format instructions in prompts**
```javascript
const prompt = new PromptTemplate({
    template: `{task}

{format_instructions}`,
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});
```

**2. Use schema validation for complex outputs**
```javascript
const parser = new StructuredOutputParser({
    responseSchemas: [
        { name: "field1", type: "string", required: true },
        { name: "field2", type: "number", required: true }
    ]
});
```

**3. Handle parsing errors gracefully**
```javascript
try {
    const parsed = await parser.parse(text);
} catch (error) {
    console.error('Parsing failed:', error.message);
    // Fallback or retry logic
}
```

**4. Test parsers independently**
```javascript
// Test without LLM
const result = await parser.parse(mockLLMOutput);
expect(result).toMatchSchema();
```

**5. Use low temperature for structured outputs**
```javascript
const llm = new LlamaCppLLM({
    temperature: 0.1  // More consistent formatting
});
```

---

### ❌ DON'T:

**1. Don't assume perfect LLM formatting**
```javascript
// Bad
const data = JSON.parse(llmOutput);  // Will fail often

// Good
const data = await jsonParser.parse(llmOutput);  // Handles variations
```

**2. Don't skip validation**
```javascript
// Bad
const result = await parser.parse(text);
// Use result.field without checking

// Good
const result = await parser.parse(text);
if (result.field && typeof result.field === 'string') {
    // Use result.field
}
```

**3. Don't use parsers for simple text**
```javascript
// Bad
const parser = new JsonOutputParser();
const result = await parser.parse(simpleText);

// Good
const parser = new StringOutputParser();
const result = await parser.parse(simpleText);
```

---

## Exercises

Practice using output parsers in real-world scenarios from simple to complex:

### Exercise 21: Product Review Analyzer 
Extract clean summaries and sentiment from product reviews using StringOutputParser.  
**Starter code**: [`exercises/21-review-analyzer.js`](exercises/21-review-analyzer.js)

### Exercise 22: Contact Information Extractor 
Parse structured contact details and skills from unstructured text using JSON and List parsers.  
**Starter code**: [`exercises/22-contact-extractor.js`](exercises/22-contact-extractor.js)

### Exercise 23: Article Metadata Extractor 
Extract complex metadata with schema validation using StructuredOutputParser.  
**Starter code**: [`exercises/23-article-metadata.js`](exercises/23-article-metadata.js)

### Exercise 24: Multi-Parser Content Pipeline 
Build production-ready pipelines with multiple parsers, fallback strategies, and content routing.  
**Starter code**: [`exercises/24-multi-parser-pipeline.js`](exercises/24-multi-parser-pipeline.js)

---

## Summary

You've built a complete output parsing system!

### Key Takeaways

1. **BaseOutputParser**: Foundation for all parsers
2. **StringOutputParser**: Clean text output
3. **JsonOutputParser**: Extract and validate JSON
4. **ListOutputParser**: Parse lists/arrays
5. **RegexOutputParser**: Pattern-based extraction
6. **StructuredOutputParser**: Full schema validation

### What You Built

A parsing system that:
- ✅ Extracts structured data reliably
- ✅ Validates output formats
- ✅ Handles errors gracefully
- ✅ Generates format instructions
- ✅ Works in chains with prompts
- ✅ Is testable in isolation

### Next Steps

Now you can combine prompts + LLMs + parsers into complete chains.

➡️ **Next: [LLM Chains](./03-llm-chain.md)**

Learn how to build complete prompt → LLM → parser pipelines.

---

**Built with ❤️ for learners who want to understand AI frameworks deeply**

[← Previous: Prompts](./01-prompts.md) | [Tutorial Index](../README.md) | [Next: LLM Chains →](./03-llm-chain.md)