
# Output Parsers: Structured Output Extraction

**Part 2: Composition - Lesson 2**

> LLMs return text. You need data.

## Overview

You've learned to write effective prompts. But LLMs return unstructured text, and your application often needs structured data:

```javascript
// LLM returns this:
"The sentiment is positive with a confidence of 0.92"

// You need this:
{
    sentiment: "positive",
    confidence: 0.92
}
```

**Output parsers** transform LLM text into structured data you can use in your applications.

## Why This Matters

### The Problem: Parsing Chaos

Without parsers, your code is full of brittle string manipulation:

```javascript
const response = await llm.invoke("Classify: I love this product!");

// Fragile parsing code everywhere
let sentiment;
if (response.includes("positive")) {
    sentiment = "positive";
} else if (response.includes("negative")) {
    sentiment = "negative";
}

// What if format changes?
// What if LLM adds extra text?
// How do you handle errors?
```

Problems:
- Brittle regex and string matching
- No validation of output format
- Hard to test parsing logic
- Inconsistent error handling
- Parser code duplicated everywhere

### The Solution: Output Parsers

```javascript
const parser = new JsonOutputParser();

const prompt = new PromptTemplate({
    template: `Classify the sentiment. Respond in JSON:
{{"sentiment": "positive/negative/neutral", "confidence": 0.0-1.0}}

Text: {text}`,
    inputVariables: ["text"]
});

const chain = prompt.pipe(llm).pipe(parser);

const result = await chain.invoke({ text: "I love this!" });
// { sentiment: "positive", confidence: 0.95 }
```

Benefits:
- ✅ Reliable structured extraction
- ✅ Format validation
- ✅ Error handling built-in
- ✅ Reusable parsing logic
- ✅ Outputs checked against expected types (when a schema is supplied)

## Learning Objectives

By the end of this lesson, you will:

- ✅ Build a BaseOutputParser abstraction
- ✅ Create a StringOutputParser for text cleanup
- ✅ Implement JsonOutputParser for JSON extraction
- ✅ Build ListOutputParser for arrays
- ✅ Build RegexOutputParser for pattern-based extraction
- ✅ Create StructuredOutputParser with schemas
- ✅ Use parsers in chains with prompts
- ✅ Handle parsing errors gracefully

## Core Concepts

### What is an Output Parser?

An output parser **transforms LLM text output into structured data**.

**Flow:**
```
LLM Output (text) → Parser → Structured Data
    ↓                ↓              ↓
"positive: 0.95"  parse()    {sentiment: "positive", confidence: 0.95}
```

### The Parser Hierarchy

```
BaseOutputParser (abstract)
    ├── StringOutputParser (clean text)
    ├── JsonOutputParser (extract JSON)
    ├── ListOutputParser (extract lists)
    ├── RegexOutputParser (regex patterns)
    └── StructuredOutputParser (schema validation)
```

Each parser handles a specific output format.

### Key Operations

1. **Parse**: Extract structured data from text
2. **Get Format Instructions**: Tell LLM how to format response
3. **Validate**: Check output matches expected structure
4. **Handle Errors**: Gracefully handle malformed outputs
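
The four operations can be sketched with a tiny standalone parser before building the real hierarchy. `YesNoParser` is a hypothetical name used only for illustration; it is synchronous and does not extend the lesson's base class.

```javascript
// Hypothetical minimal parser illustrating the four key operations.
class YesNoParser {
    // 2. Get format instructions: text you append to the prompt
    getFormatInstructions() {
        return 'Answer with exactly one word: "yes" or "no".';
    }

    // 1. Parse and 3. Validate: turn text into a boolean, reject anything else
    parse(text) {
        const cleaned = text.trim().toLowerCase().replace(/[.!]+$/, '');
        if (cleaned === 'yes') return true;
        if (cleaned === 'no') return false;

        // 4. Handle errors: fail loudly and keep the raw output for debugging
        const error = new Error(`Expected "yes" or "no", got: ${text}`);
        error.llmOutput = text;
        throw error;
    }
}

const yesNo = new YesNoParser();
console.log(yesNo.parse(' Yes. '));  // true
console.log(yesNo.parse('no'));      // false
```

The parsers in the following steps follow the same shape, with the error carried by a dedicated exception class.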

## Implementation Guide

### Step 1: Base Output Parser

**Location:** `src/output-parsers/base-parser.js`

This is the abstract base class all parsers inherit from.

**What it does:**
- Defines the interface for all parsers
- Extends Runnable (so parsers work in chains)
- Provides format instruction generation
- Handles parsing errors

**Implementation:**

```javascript
import { Runnable } from '../core/runnable.js';

/**
 * Base class for all output parsers
 * Transforms LLM text output into structured data
 */
export class BaseOutputParser extends Runnable {
    constructor() {
        super();
        this.name = this.constructor.name;
    }

    /**
     * Parse the LLM output into structured data
     * @abstract
     * @param {string} text - Raw LLM output
     * @returns {Promise<any>} Parsed data
     */
    async parse(text) {
        throw new Error(`${this.name} must implement parse()`);
    }

    /**
     * Get instructions for the LLM on how to format output
     * @returns {string} Format instructions
     */
    getFormatInstructions() {
        return '';
    }

    /**
     * Runnable interface: parse the output
     */
    async _call(input, config) {
        // Input can be a string or a Message
        const text = typeof input === 'string'
            ? input
            : input.content;

        return await this.parse(text);
    }

    /**
     * Parse with error handling
     */
    async parseWithPrompt(text, prompt) {
        try {
            return await this.parse(text);
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse output from prompt: ${error.message}`,
                text,
                error
            );
        }
    }
}

/**
 * Exception thrown when parsing fails
 */
export class OutputParserException extends Error {
    constructor(message, llmOutput, originalError) {
        super(message);
        this.name = 'OutputParserException';
        this.llmOutput = llmOutput;
        this.originalError = originalError;
    }
}
```

**Key insights:**
- Extends `Runnable` so parsers can be piped in chains
- `_call` extracts text from strings or Messages
- `getFormatInstructions()` helps prompt the LLM
- Error handling wraps parse failures with context
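
To make the last point concrete, here is a minimal sketch of what a caller sees when a wrapped parse fails. The exception class is copied from the listing so the snippet runs on its own; `parseWithContext` is a hypothetical helper, not part of the lesson's code.

```javascript
// Copied from base-parser.js so this sketch is self-contained
class OutputParserException extends Error {
    constructor(message, llmOutput, originalError) {
        super(message);
        this.name = 'OutputParserException';
        this.llmOutput = llmOutput;
        this.originalError = originalError;
    }
}

// Hypothetical helper: wrap any parse function so failures carry context
function parseWithContext(parseFn, text) {
    try {
        return parseFn(text);
    } catch (error) {
        throw new OutputParserException(
            `Failed to parse output: ${error.message}`,
            text,
            error
        );
    }
}

try {
    parseWithContext(JSON.parse, 'Sure! The answer is 42');
} catch (error) {
    console.log(error.name);                                  // "OutputParserException"
    console.log(error.llmOutput);                             // "Sure! The answer is 42"
    console.log(error.originalError instanceof SyntaxError);  // true
}
```

Keeping the raw `llmOutput` on the exception is what lets calling code log or retry with the exact text the model produced.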

---

### Step 2: String Output Parser

**Location:** `src/output-parsers/string-parser.js`

The simplest parser - cleans up text output.

**What it does:**
- Strips leading/trailing whitespace
- Removes markdown code fences (on by default; disable with `stripMarkdown: false`)
- Returns clean string

**Use when:**
- You just need clean text
- No structure needed
- Want to remove formatting artifacts

**Implementation:**

```javascript
import { BaseOutputParser } from './base-parser.js';

/**
 * Parser that returns cleaned string output
 * Strips whitespace and optionally removes markdown
 *
 * Example:
 *   const parser = new StringOutputParser();
 *   const result = await parser.parse("  Hello World  ");
 *   // "Hello World"
 */
export class StringOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.stripMarkdown = options.stripMarkdown ?? true;
    }

    /**
     * Parse: clean the text
     */
    async parse(text) {
        let cleaned = text.trim();

        if (this.stripMarkdown) {
            cleaned = this._stripMarkdownCodeBlocks(cleaned);
        }

        return cleaned;
    }

    /**
     * Remove markdown code blocks (```code```)
     */
    _stripMarkdownCodeBlocks(text) {
        // Remove ```language\ncode\n```
        return text.replace(/```[\w]*\n([\s\S]*?)\n```/g, '$1').trim();
    }

    getFormatInstructions() {
        return 'Respond with plain text. No markdown formatting.';
    }
}
```


**Usage:**

```javascript
const parser = new StringOutputParser();

// Handles various formats
await parser.parse("  Hello  ");           // "Hello"
await parser.parse("```\ncode\n```");      // "code"
await parser.parse("   \n  Text  \n   "); // "Text"
```
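
It helps to know exactly what the fence-stripping regex does and does not match. The sketch below copies the regex from `_stripMarkdownCodeBlocks` into a standalone function (`stripFences` is a hypothetical name) and runs it on two inputs: a fenced block surrounded by prose, and a single-line fence.

```javascript
// Same regex as _stripMarkdownCodeBlocks, copied so the sketch runs on its own
function stripFences(text) {
    return text.replace(/```[\w]*\n([\s\S]*?)\n```/g, '$1').trim();
}

// Fence with a language tag, surrounded by prose: only the fence markers go away
console.log(JSON.stringify(stripFences('Here you go:\n```js\nconst x = 1;\n```\nDone.')));
// "Here you go:\nconst x = 1;\nDone."

// Single-line fence: the regex needs a newline after the opening fence
console.log(stripFences('```code```'));
// ```code```  (unchanged)
```

Surrounding prose is kept, so if the model tends to add commentary around the code, you may still need a stricter prompt or a different parser.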

---

### Step 3: JSON Output Parser

**Location:** `src/output-parsers/json-parser.js`

Extracts and validates JSON from LLM output.

**What it does:**
- Finds JSON in text (handles markdown, extra text)
- Parses and validates JSON
- Optionally validates against a schema

**Use when:**
- Need structured objects
- Want type-safe data
- Need validation

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser that extracts JSON from LLM output
 * Handles markdown code blocks and extra text
 *
 * Example:
 *   const parser = new JsonOutputParser();
 *   const result = await parser.parse('```json\n{"name": "Alice"}\n```');
 *   // { name: "Alice" }
 */
export class JsonOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.schema = options.schema;
    }

    /**
     * Parse JSON from text
     */
    async parse(text) {
        try {
            // Try to extract JSON from the text
            const jsonText = this._extractJson(text);
            const parsed = JSON.parse(jsonText);

            // Validate against schema if provided
            if (this.schema) {
                this._validateSchema(parsed);
            }

            return parsed;
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse JSON: ${error.message}`,
                text,
                error
            );
        }
    }

    /**
     * Extract JSON from text (handles markdown, extra text)
     */
    _extractJson(text) {
        // Try direct parse first
        try {
            JSON.parse(text.trim());
            return text.trim();
        } catch {
            // Not direct JSON, try to find it
        }

        // Look for JSON in markdown code blocks
        const markdownMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/);
        if (markdownMatch) {
            return markdownMatch[1].trim();
        }

        // Look for JSON object/array patterns
        const jsonObjectMatch = text.match(/\{[\s\S]*\}/);
        if (jsonObjectMatch) {
            return jsonObjectMatch[0];
        }

        const jsonArrayMatch = text.match(/\[[\s\S]*\]/);
        if (jsonArrayMatch) {
            return jsonArrayMatch[0];
        }

        // Give up, return original
        return text.trim();
    }

    /**
     * Validate parsed JSON against schema
     */
    _validateSchema(parsed) {
        if (!this.schema) return;

        for (const [key, type] of Object.entries(this.schema)) {
            if (!(key in parsed)) {
                throw new Error(`Missing required field: ${key}`);
            }

            const actualType = typeof parsed[key];
            if (actualType !== type) {
                throw new Error(
                    `Field ${key} should be ${type}, got ${actualType}`
                );
            }
        }
    }

    getFormatInstructions() {
        let instructions = 'Respond with valid JSON.';

        if (this.schema) {
            const schemaDesc = Object.entries(this.schema)
                .map(([key, type]) => `"${key}": ${type}`)
                .join(', ');
            instructions += ` Schema: { ${schemaDesc} }`;
        }

        return instructions;
    }
}
```

**Usage:**

```javascript
const parser = new JsonOutputParser({
    schema: {
        name: 'string',
        age: 'number',
        active: 'boolean'
    }
});

// Handles various JSON formats
await parser.parse('{"name": "Alice", "age": 30, "active": true}');
await parser.parse('```json\n{"name": "Bob", "age": 25, "active": false}\n```');
await parser.parse('Sure! Here\'s the data: {"name": "Charlie", "age": 35, "active": true}');
```
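
One limitation worth knowing: the `\{[\s\S]*\}` fallback in `_extractJson` is greedy, so it captures everything from the first `{` to the last `}`. If the model emits two objects, or prose containing braces after the JSON, the captured span is not valid JSON. One possible alternative, sketched below, scans for the first balanced value instead. `extractBalancedJson` is a hypothetical helper, not part of the lesson's parser, and it does not check that bracket types match.

```javascript
// Hypothetical helper: return the first balanced {...} or [...] span in text,
// ignoring brackets that appear inside JSON strings. Returns null if none found.
function extractBalancedJson(text) {
    const start = text.search(/[{[]/);
    if (start === -1) return null;

    let depth = 0;
    let inString = false;
    let escaped = false;

    for (let i = start; i < text.length; i++) {
        const ch = text[i];

        if (inString) {
            if (escaped) escaped = false;          // character after a backslash
            else if (ch === '\\') escaped = true;  // start of an escape sequence
            else if (ch === '"') inString = false; // closing quote
            continue;
        }

        if (ch === '"') inString = true;
        else if (ch === '{' || ch === '[') depth++;
        else if (ch === '}' || ch === ']') {
            depth--;
            if (depth === 0) return text.slice(start, i + 1);
        }
    }

    return null; // brackets never balanced
}

const sample = 'First: {"a": "}"} and later {"b": 2}';

console.log(sample.match(/\{[\s\S]*\}/)[0]);
// {"a": "}"} and later {"b": 2}   (greedy match, not valid JSON)

console.log(extractBalancedJson(sample));
// {"a": "}"}
```

The result still has to go through `JSON.parse`; the scan only narrows down which span to try.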

---

### Step 4: List Output Parser

**Location:** `src/output-parsers/list-parser.js`

Extracts lists/arrays from text.

**What it does:**
- Parses numbered lists, bullet lists, and comma- or newline-separated items
- Returns array of items
- Cleans each item

**Use when:**
- Need arrays of strings
- LLM outputs lists
- Want simple arrays

**Implementation:**

```javascript
import { BaseOutputParser } from './base-parser.js';

/**
 * Parser that extracts lists from text
 * Handles: numbered lists, bullets, comma-separated
 *
 * Example:
 *   const parser = new ListOutputParser();
 *   const result = await parser.parse("1. Apple\n2. Banana\n3. Orange");
 *   // ["Apple", "Banana", "Orange"]
 */
export class ListOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.separator = options.separator;
    }

    /**
     * Parse list from text
     */
    async parse(text) {
        const cleaned = text.trim();

        // If separator specified, use it
        if (this.separator) {
            return cleaned
                .split(this.separator)
                .map(item => item.trim())
                .filter(item => item.length > 0);
        }

        // Try to detect format
        if (this._isNumberedList(cleaned)) {
            return this._parseNumberedList(cleaned);
        }

        if (this._isBulletList(cleaned)) {
            return this._parseBulletList(cleaned);
        }

        // Try comma-separated
        if (cleaned.includes(',')) {
            return cleaned
                .split(',')
                .map(item => item.trim())
                .filter(item => item.length > 0);
        }

        // Try newline-separated
        return cleaned
            .split('\n')
            .map(item => item.trim())
            .filter(item => item.length > 0);
    }

    /**
     * Check if text is numbered list (1. Item\n2. Item)
     */
    _isNumberedList(text) {
        return /^\d+\./.test(text);
    }

    /**
     * Check if text is bullet list (- Item\n- Item or * Item)
     */
    _isBulletList(text) {
        return /^[-*•]/.test(text);
    }

    /**
     * Parse numbered list
     */
    _parseNumberedList(text) {
        return text
            .split('\n')
            .map(line => line.replace(/^\d+\.\s*/, '').trim())
            .filter(item => item.length > 0);
    }

    /**
     * Parse bullet list
     */
    _parseBulletList(text) {
        return text
            .split('\n')
            .map(line => line.replace(/^[-*•]\s*/, '').trim())
            .filter(item => item.length > 0);
    }

    getFormatInstructions() {
        if (this.separator) {
            return `Respond with items separated by "${this.separator}".`;
        }
        return 'Respond with a numbered list (1. Item) or bullet list (- Item).';
    }
}
```

**Usage:**

```javascript
const parser = new ListOutputParser();

// Handles various list formats
await parser.parse("1. Apple\n2. Banana\n3. Orange");
// ["Apple", "Banana", "Orange"]

await parser.parse("- Red\n- Green\n- Blue");
// ["Red", "Green", "Blue"]

await parser.parse("cat, dog, bird");
// ["cat", "dog", "bird"]

// Custom separator
const csvParser = new ListOutputParser({ separator: ',' });
await csvParser.parse("apple,banana,orange");
// ["apple", "banana", "orange"]
```
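
Format detection in `ListOutputParser` only looks at the start of the text, so an introductory line such as "Here are three fruits:" sends a numbered list down the newline-separated path with its markers intact. A per-line approach is one way around this. The sketch below uses a hypothetical `parseMarkedLines` helper that is not part of the lesson's parser.

```javascript
// Hypothetical helper: keep only lines that start with a list marker
// ("1.", "2)", "-", "*", "•") and strip the marker. If no line has a
// marker, fall back to all non-empty lines.
function parseMarkedLines(text) {
    const marker = /^\s*(?:\d+[.)]|[-*•])\s+/;

    const lines = text.split('\n').filter(line => line.trim().length > 0);
    const marked = lines.filter(line => marker.test(line));
    const source = marked.length > 0 ? marked : lines;

    return source.map(line => line.replace(marker, '').trim());
}

console.log(parseMarkedLines('Here are three fruits:\n1. Apple\n2. Banana\n3. Orange'));
// ["Apple", "Banana", "Orange"]

console.log(parseMarkedLines('Red\nGreen\nBlue'));
// ["Red", "Green", "Blue"]
```

The trade-off is that unmarked lines are silently dropped once any marked line is present, which is usually what you want for chatty model output but not for mixed content.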

---

### Step 5: Regex Output Parser

**Location:** `src/output-parsers/regex-parser.js`

Uses regex patterns to extract structured data.

**What it does:**
- Applies regex to extract groups
- Maps groups to field names
- Returns structured object

**Use when:**
- Output has predictable patterns
- Need custom extraction logic
- Regex is the simplest solution

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser that uses regex to extract structured data
 *
 * Example:
 *   const parser = new RegexOutputParser({
 *       regex: /Sentiment: (\w+), Confidence: ([\d.]+)/,
 *       outputKeys: ["sentiment", "confidence"]
 *   });
 *
 *   const result = await parser.parse("Sentiment: positive, Confidence: 0.92");
 *   // { sentiment: "positive", confidence: "0.92" }
 */
export class RegexOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.regex = options.regex;
        this.outputKeys = options.outputKeys || [];
        this.dotAll = options.dotAll ?? false;

        if (this.dotAll) {
            // Add 's' flag for dotAll if not present
            const flags = this.regex.flags.includes('s')
                ? this.regex.flags
                : this.regex.flags + 's';
            this.regex = new RegExp(this.regex.source, flags);
        }
    }

    /**
     * Parse using regex
     */
    async parse(text) {
        const match = text.match(this.regex);

        if (!match) {
            throw new OutputParserException(
                `Text does not match regex pattern: ${this.regex}`,
                text
            );
        }

        // If no output keys, return the groups as array
        if (this.outputKeys.length === 0) {
            return match.slice(1); // Exclude full match
        }

        // Map groups to keys
        const result = {};
        for (let i = 0; i < this.outputKeys.length; i++) {
            result[this.outputKeys[i]] = match[i + 1]; // +1 to skip full match
        }

        return result;
    }

    getFormatInstructions() {
        if (this.outputKeys.length > 0) {
            return `Format your response to match: ${this.outputKeys.join(', ')}`;
        }
        return 'Follow the specified format exactly.';
    }
}
```

**Usage:**

```javascript
const parser = new RegexOutputParser({
    regex: /Sentiment: (\w+), Confidence: ([\d.]+)/,
    outputKeys: ["sentiment", "confidence"]
});

const result = await parser.parse("Sentiment: positive, Confidence: 0.92");
// { sentiment: "positive", confidence: "0.92" }
```
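
Note that regex captures are always strings, which is why `confidence` comes back as `"0.92"` rather than `0.92`. If you need numbers, convert them explicitly. The sketch below is a standalone variation rather than the lesson's `RegexOutputParser`: it uses JavaScript's named capture groups in place of `outputKeys`, and `parseSentiment` is a hypothetical function name.

```javascript
// Named groups put the field names next to the pattern they capture
const pattern = /Sentiment: (?<sentiment>\w+), Confidence: (?<confidence>[\d.]+)/;

function parseSentiment(text) {
    const match = text.match(pattern);
    if (!match) {
        throw new Error(`Text does not match pattern: ${pattern}`);
    }

    const { sentiment, confidence } = match.groups;

    // [\d.]+ also matches things like "1.2.3", so check the conversion
    const value = Number(confidence);
    if (Number.isNaN(value)) {
        throw new Error(`Confidence is not a number: ${confidence}`);
    }

    return { sentiment, confidence: value };
}

console.log(parseSentiment('Sentiment: positive, Confidence: 0.92'));
// { sentiment: "positive", confidence: 0.92 }
```

The same conversion could be layered on top of `RegexOutputParser` without changing it, by post-processing the object it returns.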

---
# Output Parsers: Advanced Patterns & Integration

## Advanced Parser: Structured Output Parser

### Step 6: Structured Output Parser

**Location:** `src/output-parsers/structured-parser.js`

The most powerful parser - validates against a full schema with types and descriptions.

**What it does:**
- Defines expected schema with types
- Generates format instructions for LLM
- Validates all fields and types
- Provides detailed error messages

**Use when:**
- Need complex structured data
- Want strong type validation
- Need to generate format instructions automatically

**Implementation:**

```javascript
import { BaseOutputParser, OutputParserException } from './base-parser.js';

/**
 * Parser with full schema validation
 *
 * Example:
 *   const parser = new StructuredOutputParser({
 *       responseSchemas: [
 *           {
 *               name: "sentiment",
 *               type: "string",
 *               description: "The sentiment (positive/negative/neutral)",
 *               enum: ["positive", "negative", "neutral"]
 *           },
 *           {
 *               name: "confidence",
 *               type: "number",
 *               description: "Confidence score between 0 and 1"
 *           }
 *       ]
 *   });
 */
export class StructuredOutputParser extends BaseOutputParser {
    constructor(options = {}) {
        super();
        this.responseSchemas = options.responseSchemas || [];
    }

    /**
     * Parse and validate against schema
     */
    async parse(text) {
        try {
            // Extract JSON
            const jsonText = this._extractJson(text);
            const parsed = JSON.parse(jsonText);

            // Validate against schema
            this._validateAgainstSchema(parsed);

            return parsed;
        } catch (error) {
            throw new OutputParserException(
                `Failed to parse structured output: ${error.message}`,
                text,
                error
            );
        }
    }

    /**
     * Extract JSON from text (same as JsonOutputParser)
     */
    _extractJson(text) {
        try {
            JSON.parse(text.trim());
            return text.trim();
        } catch {}

        const markdownMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/);
        if (markdownMatch) return markdownMatch[1].trim();

        const jsonMatch = text.match(/\{[\s\S]*\}/);
        if (jsonMatch) return jsonMatch[0];

        return text.trim();
    }

    /**
     * Validate parsed data against schema
     */
    _validateAgainstSchema(parsed) {
        for (const schema of this.responseSchemas) {
            const { name, type, enum: enumValues, required = true } = schema;

            // Check required fields
            if (required && !(name in parsed)) {
                throw new Error(`Missing required field: ${name}`);
            }

            if (name in parsed) {
                const value = parsed[name];

                // Check type
                if (!this._checkType(value, type)) {
                    throw new Error(
                        `Field ${name} should be ${type}, got ${typeof value}`
                    );
                }

                // Check enum values
                if (enumValues && !enumValues.includes(value)) {
                    throw new Error(
                        `Field ${name} must be one of: ${enumValues.join(', ')}`
                    );
                }
            }
        }
    }

    /**
     * Check if value matches expected type
     */
    _checkType(value, type) {
        switch (type) {
            case 'string':
                return typeof value === 'string';
            case 'number':
                return typeof value === 'number' && !isNaN(value);
            case 'boolean':
                return typeof value === 'boolean';
            case 'array':
                return Array.isArray(value);
            case 'object':
                return typeof value === 'object' && value !== null && !Array.isArray(value);
            default:
                return true;
        }
    }

    /**
     * Generate format instructions for LLM
     */
    getFormatInstructions() {
        const schemaDescriptions = this.responseSchemas.map(schema => {
            let desc = `"${schema.name}": ${schema.type}`;
            if (schema.description) {
                desc += ` // ${schema.description}`;
            }
            if (schema.enum) {
                desc += ` (one of: ${schema.enum.join(', ')})`;
            }
            return desc;
        });

        return `Respond with valid JSON matching this schema:
{
${schemaDescriptions.map(d => '  ' + d).join(',\n')}
}`;
    }

    /**
     * Static helper to create from simple schema
     */
    static fromNamesAndDescriptions(schemas) {
        const responseSchemas = Object.entries(schemas).map(([name, description]) => ({
            name,
            description,
            type: 'string' // Default type
        }));

        return new StructuredOutputParser({ responseSchemas });
    }
}
```

**Usage:**

```javascript
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "sentiment",
            type: "string",
            description: "The sentiment of the text",
            enum: ["positive", "negative", "neutral"],
            required: true
        },
        {
            name: "confidence",
            type: "number",
            description: "Confidence score from 0 to 1",
            required: true
        },
        {
            name: "keywords",
            type: "array",
            description: "Key themes in the text",
            required: false
        }
    ]
});

// Get format instructions to add to prompt
const instructions = parser.getFormatInstructions();
console.log(instructions);

// Parse and validate
const result = await parser.parse(`{
    "sentiment": "positive",
    "confidence": 0.92,
    "keywords": ["great", "love", "excellent"]
}`);
```
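
To see what the model is actually told, it can help to print the generated instructions. The sketch below is a standalone function that mirrors the logic of `getFormatInstructions()` (`formatInstructions` is a hypothetical name, and it assumes the template places one field per line with no blank lines between them), applied to the same three fields.

```javascript
// Standalone copy of the formatting logic, for illustration only
function formatInstructions(responseSchemas) {
    const rows = responseSchemas.map(schema => {
        let desc = `"${schema.name}": ${schema.type}`;
        if (schema.description) desc += ` // ${schema.description}`;
        if (schema.enum) desc += ` (one of: ${schema.enum.join(', ')})`;
        return '  ' + desc;
    });

    return `Respond with valid JSON matching this schema:\n{\n${rows.join(',\n')}\n}`;
}

console.log(formatInstructions([
    {
        name: 'sentiment',
        type: 'string',
        description: 'The sentiment of the text',
        enum: ['positive', 'negative', 'neutral']
    },
    { name: 'confidence', type: 'number', description: 'Confidence score from 0 to 1' },
    { name: 'keywords', type: 'array', description: 'Key themes in the text', required: false }
]));
// Respond with valid JSON matching this schema:
// {
//   "sentiment": string // The sentiment of the text (one of: positive, negative, neutral),
//   "confidence": number // Confidence score from 0 to 1,
//   "keywords": array // Key themes in the text
// }
```

Two things stand out in the output: the `required: false` flag never reaches the prompt, so the model cannot tell that `keywords` is optional, and the separating comma lands after the `//` comment. If either matters for your model, consider stating optionality in the field's `description`.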

---

## Real-World Examples

### Example 1: Email Classification with Structured Parser

```javascript
import { StructuredOutputParser } from './output-parsers/structured-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';
import { LlamaCppLLM } from './llm/llama-cpp-llm.js';

// Define the output structure
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "category",
            type: "string",
            description: "Email category",
            enum: ["spam", "invoice", "meeting", "urgent", "personal", "other"]
        },
        {
            name: "confidence",
            type: "number",
            description: "Confidence score (0-1)"
        },
        {
            name: "reason",
            type: "string",
            description: "Brief explanation for classification"
        },
        {
            name: "actionRequired",
            type: "boolean",
            description: "Does email require action?"
        }
    ]
});

// Build prompt with format instructions
const prompt = new PromptTemplate({
    template: `Classify this email.

Email:
From: {from}
Subject: {subject}
Body: {body}

{format_instructions}`,
    inputVariables: ["from", "subject", "body"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

// Create chain
const llm = new LlamaCppLLM({ modelPath: './model.gguf' });
const chain = prompt.pipe(llm).pipe(parser);

// Use it
const result = await chain.invoke({
    from: "billing@company.com",
    subject: "Invoice #12345",
    body: "Payment due by March 15th"
});

console.log(result);
// {
//   category: "invoice",
//   confidence: 0.98,
//   reason: "Email contains invoice number and payment deadline",
//   actionRequired: true
// }
```

---

### Example 2: Content Extraction with JSON Parser

```javascript
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { ChatPromptTemplate } from './prompts/chat-prompt-template.js';

// Assumes `llm` has been created as in Example 1

const parser = new JsonOutputParser({
    schema: {
        title: 'string',
        summary: 'string',
        tags: 'object',  // Arrays report typeof 'object'
        author: 'string'
    }
});

const prompt = ChatPromptTemplate.fromMessages([
    ["system", "Extract article metadata. Respond with JSON."],
    ["human", "Article: {article}"]
]);

const chain = prompt.pipe(llm).pipe(parser);

const result = await chain.invoke({
    article: "Title: AI Revolution\nBy: John Doe\n\nAI is transforming..."
});

// {
//   title: "AI Revolution",
//   summary: "Article discusses AI's transformative impact",
//   tags: ["AI", "technology", "future"],
//   author: "John Doe"
// }
```
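
The `tags: 'object'` entry works because `JsonOutputParser` compares each field against JavaScript's `typeof`, which reports arrays (and `null`) as `"object"`. The short sketch below shows the behavior and one way to get a more specific name; `describeType` is a hypothetical helper, not part of the lesson's parser.

```javascript
// typeof cannot tell arrays, plain objects, and null apart
console.log(typeof ['AI', 'technology']);  // "object"
console.log(typeof { a: 1 });              // "object"
console.log(typeof null);                  // "object"

// Hypothetical helper that reports a more specific type name
function describeType(value) {
    if (value === null) return 'null';
    if (Array.isArray(value)) return 'array';
    return typeof value;
}

console.log(describeType(['AI', 'technology']));  // "array"
console.log(describeType(null));                  // "null"
console.log(describeType('John Doe'));            // "string"
```

In practice this means a `'object'` schema entry in `JsonOutputParser` also accepts `null` and plain objects where you expected an array. When that distinction matters, `StructuredOutputParser` with `type: "array"` is the stricter choice.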

---

### Example 3: List Extraction for Recommendations

```javascript
import { ListOutputParser } from './output-parsers/list-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';

// Assumes `llm` has been created as in Example 1

const parser = new ListOutputParser();

const prompt = new PromptTemplate({
    template: `Recommend 5 {category} for someone interested in {interest}.

{format_instructions}

List:`,
    inputVariables: ["category", "interest"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

const chain = prompt.pipe(llm).pipe(parser);

const books = await chain.invoke({
    category: "books",
    interest: "machine learning"
});

console.log(books);
// [
//   "Pattern Recognition and Machine Learning",
//   "Deep Learning by Goodfellow",
//   "Hands-On Machine Learning",
//   "The Hundred-Page Machine Learning Book",
//   "Machine Learning Yearning"
// ]
```

---

### Example 4: Sentiment Analysis with Retry

```javascript
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { PromptTemplate } from './prompts/prompt-template.js';

// Assumes `llm` has been created as in Example 1

const parser = new JsonOutputParser();

// If parsing fails, retry with clearer instructions
async function robustSentimentAnalysis(text) {
    const prompt = new PromptTemplate({
        template: `Analyze sentiment of: "{text}"

Respond with ONLY valid JSON:
{{"sentiment": "positive/negative/neutral", "score": 0.0-1.0}}`,
        inputVariables: ["text"]
    });

    const chain = prompt.pipe(llm).pipe(parser);

    try {
        return await chain.invoke({ text });
    } catch (error) {
        console.log('Parse failed, retrying with stricter prompt...');

        // Retry with more explicit prompt
        const strictPrompt = new PromptTemplate({
            template: `Analyze: "{text}"

IMPORTANT: Respond with ONLY this JSON structure, nothing else:
{{"sentiment": "positive", "score": 0.9}}

Your response:`,
            inputVariables: ["text"]
        });

        const retryChain = strictPrompt.pipe(llm).pipe(parser);
        return await retryChain.invoke({ text });
    }
}
```
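
The try-then-retry shape above generalizes to any number of attempts. As a rough sketch, a hypothetical `firstSuccessful` helper can take a list of async functions and return the first result that does not throw. The stand-in attempts below use `JSON.parse` in place of real `chain.invoke()` calls so the snippet runs without a model.

```javascript
// Hypothetical helper: run each attempt in order, return the first success.
// Each attempt is an async function, e.g. () => chain.invoke({ text }).
async function firstSuccessful(attempts) {
    const errors = [];

    for (const attempt of attempts) {
        try {
            return await attempt();
        } catch (error) {
            errors.push(error.message);
        }
    }

    throw new Error(`All ${attempts.length} attempts failed: ${errors.join(' | ')}`);
}

// Stand-ins for chain.invoke(): the first fails to parse, the second succeeds
const attempts = [
    async () => JSON.parse('The sentiment is positive'),
    async () => JSON.parse('{"sentiment": "positive", "score": 0.9}')
];

firstSuccessful(attempts).then(result => console.log(result));
// { sentiment: "positive", score: 0.9 }
```

Each retry costs another model call, so it is worth capping the list at two or three attempts.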

---

## Advanced Patterns

### Pattern 1: Fallback Parsing

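```javascript
import { BaseOutputParser, OutputParserException } from './output-parsers/base-parser.js';
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { RegexOutputParser } from './output-parsers/regex-parser.js';
import { StringOutputParser } from './output-parsers/string-parser.js';

class FallbackOutputParser extends BaseOutputParser {
    constructor(parsers) {
        super();
        this.parsers = parsers;
    }

    async parse(text) {
        const errors = [];

        for (const parser of this.parsers) {
            try {
                return await parser.parse(text);
            } catch (error) {
                // Store the message: JSON.stringify drops an Error's message
                errors.push({ parser: parser.name, error: error.message });
            }
        }

        throw new OutputParserException(
            `All parsers failed. Errors: ${JSON.stringify(errors)}`,
            text
        );
    }
}

// Usage
const parser = new FallbackOutputParser([
    new JsonOutputParser(),      // Try JSON first
    new RegexOutputParser({...}), // Try regex second
    new StringOutputParser()      // Fallback to string
]);
```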

---

### Pattern 2: Transform After Parse

```javascript
import { BaseOutputParser } from './output-parsers/base-parser.js';
import { JsonOutputParser } from './output-parsers/json-parser.js';

class TransformOutputParser extends BaseOutputParser {
    constructor(parser, transform) {
        super();
        this.parser = parser;
        this.transform = transform;
    }

    async parse(text) {
        const parsed = await this.parser.parse(text);
        return this.transform(parsed);
    }
}

// Usage: parse JSON then transform values
const parser = new TransformOutputParser(
    new JsonOutputParser(),
    (data) => ({
        ...data,
        confidence: parseFloat(data.confidence),
        timestamp: new Date().toISOString()
    })
);
```

---

### Pattern 3: Conditional Parsing

```javascript
import { BaseOutputParser } from './output-parsers/base-parser.js';
import { JsonOutputParser } from './output-parsers/json-parser.js';
import { ListOutputParser } from './output-parsers/list-parser.js';

class ConditionalOutputParser extends BaseOutputParser {
    constructor(condition, trueParser, falseParser) {
        super();
        this.condition = condition;
        this.trueParser = trueParser;
        this.falseParser = falseParser;
    }

    async parse(text) {
        const useTrue = this.condition(text);
        const parser = useTrue ? this.trueParser : this.falseParser;
        return await parser.parse(text);
    }
}

// Usage: different parsers based on content
const parser = new ConditionalOutputParser(
    (text) => text.includes('{'),  // Has JSON?
    new JsonOutputParser(),
    new ListOutputParser()
);
```

---

### Pattern 4: Validated Output

```javascript
import { BaseOutputParser, OutputParserException } from './output-parsers/base-parser.js';
import { JsonOutputParser } from './output-parsers/json-parser.js';

class ValidatedOutputParser extends BaseOutputParser {
    constructor(parser, validator) {
        super();
        this.parser = parser;
        this.validator = validator;
    }

    async parse(text) {
        const parsed = await this.parser.parse(text);

        const isValid = this.validator(parsed);
        if (!isValid) {
            throw new OutputParserException(
                'Parsed output failed validation',
                text
            );
        }

        return parsed;
    }
}

// Usage: ensure confidence is in range
const parser = new ValidatedOutputParser(
    new JsonOutputParser(),
    (data) => data.confidence >= 0 && data.confidence <= 1
);
```
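
One subtlety with the range check above: JavaScript's comparison operators coerce strings to numbers, so a model that returns `"confidence": "0.9"` (a string) still passes. If you want the validator to also enforce the type, a stricter version might look like the sketch below. `looseCheck` and `strictCheck` are hypothetical names used only for this comparison.

```javascript
// The range-only check from the usage above
const looseCheck = (data) => data.confidence >= 0 && data.confidence <= 1;

// Hypothetical stricter validator: must be a finite number in [0, 1]
const strictCheck = (data) =>
    typeof data?.confidence === 'number' &&
    Number.isFinite(data.confidence) &&
    data.confidence >= 0 &&
    data.confidence <= 1;

console.log(looseCheck({ confidence: '0.9' }));   // true  (string coerced to number)
console.log(strictCheck({ confidence: '0.9' }));  // false
console.log(strictCheck({ confidence: 0.9 }));    // true
console.log(strictCheck({ confidence: 1.5 }));    // false
console.log(strictCheck({}));                     // false
```

Either function can be passed as the `validator` argument of `ValidatedOutputParser`.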

---

## Integration with Full Chain

### Complete Example: Sentiment Analysis API

```javascript
import { PromptTemplate } from './prompts/prompt-template.js';
import { LlamaCppLLM } from './llm/llama-cpp-llm.js';
import { StructuredOutputParser } from './output-parsers/structured-parser.js';
import { ConsoleCallback } from './utils/callbacks.js';

// Define output structure
const parser = new StructuredOutputParser({
    responseSchemas: [
        {
            name: "sentiment",
            type: "string",
            enum: ["positive", "negative", "neutral"]
        },
        {
            name: "confidence",
            type: "number"
        },
        {
            name: "emotions",
            type: "array",
            description: "List of detected emotions"
        }
    ]
});

// Build prompt
const prompt = new PromptTemplate({
    template: `Analyze the sentiment of this text:

"{text}"

{format_instructions}`,
    inputVariables: ["text"],
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});

// Create LLM
const llm = new LlamaCppLLM({
    modelPath: './model.gguf',
    temperature: 0.1  // Low temp for consistent classification
});

// Build chain with logging
const chain = prompt.pipe(llm).pipe(parser);

const logger = new ConsoleCallback();

// Analyze sentiment
async function analyzeSentiment(text) {
    try {
        const result = await chain.invoke(
            { text },
            { callbacks: [logger] }
        );

        return {
            success: true,
            data: result
        };
    } catch (error) {
        return {
            success: false,
            error: error.message,
            rawOutput: error.llmOutput
        };
    }
}

// Use it
const result = await analyzeSentiment("I absolutely love this product! It's amazing!");
console.log(result);
// {
//   success: true,
//   data: {
//     sentiment: "positive",
//     confidence: 0.95,
//     emotions: ["joy", "excitement", "satisfaction"]
//   }
// }
```

---

## Error Handling

### Pattern: Graceful Degradation

```javascript
async function parseWithFallback(text, primaryParser, fallbackValue) {
    try {
        return await primaryParser.parse(text);
    } catch (error) {
        console.warn('Primary parser failed:', error.message);
        console.warn('Using fallback value:', fallbackValue);
        return fallbackValue;
    }
}

// Usage
const result = await parseWithFallback(
    llmOutput,
    new JsonOutputParser(),
    { error: true, message: "Failed to parse", raw: llmOutput }
);
```
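
Graceful degradation can also mean trying a more lenient parser before giving up. The following is a minimal sketch, assuming each parser exposes an async `parse(text)` method as the parsers in this lesson do. `parseWithFallbackChain` and the two stub parsers (`strictJson`, `lineList`) are hypothetical stand-ins so the example runs without the lesson's classes.

```javascript
// Try each parser in order and return the first successful result.
// If every parser fails, return the caller's fallback value.
async function parseWithFallbackChain(text, parsers, fallbackValue) {
    for (const parser of parsers) {
        try {
            return await parser.parse(text);
        } catch (error) {
            // Ignore and move on to the next, more lenient parser
        }
    }
    return fallbackValue;
}

// Stand-in parsers for illustration only
const strictJson = { parse: async (text) => JSON.parse(text) };
const lineList = {
    parse: async (text) => {
        const items = text.split('\n').map((s) => s.trim()).filter(Boolean);
        if (items.length === 0) throw new Error('No list items found');
        return items;
    }
};
```

Ordering the parsers from strictest to most lenient keeps the richest structure whenever the model cooperates and only degrades when it does not.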

---

### Pattern: Retry with Fix Instructions

```javascript
async function parseWithRetry(text, parser, llm, maxRetries = 2) {
    // maxRetries counts LLM fix-up calls, so the parser runs up to maxRetries + 1 times
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return await parser.parse(text);
        } catch (error) {
            // Out of retries: surface the last parsing error
            if (attempt === maxRetries) throw error;

            // Ask LLM to fix the output
            const fixPrompt = `The following output is malformed:
${text}

Error: ${error.message}

Please provide the output in correct format:
${parser.getFormatInstructions()}`;

            text = await llm.invoke(fixPrompt);
        }
    }
}
```
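
To exercise the retry loop without loading a model, the parser and the LLM can be replaced by stubs. The sketch below is a self-contained variant of the pattern rather than the lesson's own function: `parseWithRepair`, `maxAttempts`, `stubParser`, and `stubLlm` are hypothetical names, and the stub LLM simply returns a fixed JSON string.

```javascript
// Compact variant of the retry loop, self-contained for offline testing.
// `maxAttempts` counts every parse attempt, including the first.
async function parseWithRepair(text, parser, llm, maxAttempts = 2) {
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await parser.parse(text);
        } catch (error) {
            lastError = error;
            if (attempt === maxAttempts) break;
            // Ask the (stubbed) LLM for a corrected version
            text = await llm.invoke(
                `Fix this output:\n${text}\nError: ${error.message}`
            );
        }
    }
    throw lastError || new Error('maxAttempts must be at least 1');
}

// Hypothetical stubs: a strict JSON parser and an LLM that always
// answers with valid JSON, counting how often it was called.
const stubParser = { parse: async (text) => JSON.parse(text) };
const stubLlm = {
    calls: 0,
    async invoke(prompt) {
        this.calls += 1;
        return '{"sentiment": "positive"}';
    }
};
```

Counting calls on the stub makes it easy to assert that a well-formed first response never triggers a repair request.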

---

## Testing Parsers

### Unit Tests

```javascript
import { describe, it, expect } from 'your-test-framework';
import { JsonOutputParser } from './output-parsers/json-parser.js';

describe('JsonOutputParser', () => {
    it('should parse plain JSON', async () => {
        const parser = new JsonOutputParser();
        const result = await parser.parse('{"name": "Alice", "age": 30}');

        expect(result.name).toBe('Alice');
        expect(result.age).toBe(30);
    });

    it('should extract JSON from markdown', async () => {
        const parser = new JsonOutputParser();
        const text = '```json\n{"key": "value"}\n```';
        const result = await parser.parse(text);

        expect(result.key).toBe('value');
    });

    it('should validate against schema', async () => {
        const parser = new JsonOutputParser({
            schema: { name: 'string', age: 'number' }
        });

        await expect(
            parser.parse('{"name": "Bob", "age": "invalid"}')
        ).rejects.toThrow();
    });

    it('should throw on invalid JSON', async () => {
        const parser = new JsonOutputParser();
        await expect(parser.parse('not json')).rejects.toThrow();
    });
});
```

---

## Best Practices

### ✅ DO:

**1. Include format instructions in prompts**
```javascript
const prompt = new PromptTemplate({
    template: `{task}

{format_instructions}`,
    partialVariables: {
        format_instructions: parser.getFormatInstructions()
    }
});
```

**2. Use schema validation for complex outputs**
```javascript
const parser = new StructuredOutputParser({
    responseSchemas: [
        { name: "field1", type: "string", required: true },
        { name: "field2", type: "number", required: true }
    ]
});
```

**3. Handle parsing errors gracefully**
```javascript
try {
    const parsed = await parser.parse(text);
} catch (error) {
    console.error('Parsing failed:', error.message);
    // Fallback or retry logic
}
```

**4. Test parsers independently**
```javascript
// Test without LLM
const result = await parser.parse(mockLLMOutput);
// toMatchSchema is a custom matcher: define it yourself or assert on fields directly
expect(result).toMatchSchema();
```
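
If your test framework has no `toMatchSchema` matcher, a small helper can stand in for it. The sketch below assumes the simple `{ field: 'type' }` schema shape used in the unit tests above; `matchesSchema`, `parsedMock`, and `schemaOk` are hypothetical names, and `JSON.parse` replaces the lesson's parser so the example runs on its own.

```javascript
// Hypothetical helper: checks that `value` has every key in `schema`
// with the expected type ('string', 'number', 'boolean', or 'array').
function matchesSchema(value, schema) {
    if (value === null || typeof value !== 'object') return false;

    return Object.entries(schema).every(([key, expectedType]) => {
        const actual = value[key];
        if (expectedType === 'array') return Array.isArray(actual);
        return typeof actual === expectedType;
    });
}

// Example with a canned LLM output instead of a live model
const mockLLMOutput = '{"sentiment": "positive", "confidence": 0.92}';
const parsedMock = JSON.parse(mockLLMOutput);
const schemaOk = matchesSchema(parsedMock, {
    sentiment: 'string',
    confidence: 'number'
});
```

In a test, the boolean result can then be asserted with the framework's ordinary equality matcher.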

**5. Use low temperature for structured outputs**
```javascript
const llm = new LlamaCppLLM({
    temperature: 0.1  // More consistent formatting
});
```

---

### ❌ DON'T:

**1. Don't assume perfect LLM formatting**
```javascript
// Bad
const data = JSON.parse(llmOutput);  // Throws on code fences or surrounding prose

// Good
const data = await jsonParser.parse(llmOutput);  // Handles variations
```
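
To make "handles variations" concrete, here is a minimal sketch of the kind of cleanup a lenient JSON parser might perform before calling `JSON.parse`. It is an assumption-laden illustration, not the lesson's `JsonOutputParser` implementation: `extractJson`, `FENCE`, and `fencePattern` are hypothetical names, and only two variations are handled (a markdown code fence and prose around a single JSON object).

```javascript
// Build the markdown fence marker programmatically to keep this snippet readable
const FENCE = '`'.repeat(3);
const fencePattern = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');

// Hypothetical stand-in for the "handles variations" step:
// strip a markdown code fence or surrounding prose, then parse.
function extractJson(text) {
    // 1. Prefer the contents of a fenced block if one is present
    const fenced = text.match(fencePattern);
    const candidate = fenced ? fenced[1] : text;

    // 2. Narrow to the span from the first '{' to the last '}'
    const start = candidate.indexOf('{');
    const end = candidate.lastIndexOf('}');
    if (start === -1 || end <= start) {
        throw new Error('No JSON object found in LLM output');
    }

    // 3. JSON.parse still throws if the extracted span is malformed
    return JSON.parse(candidate.slice(start, end + 1));
}
```

The function still throws on output that contains no object at all, so the error-handling patterns from the previous section continue to apply.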

**2. Don't skip validation**
```javascript
// Bad
const result = await parser.parse(text);
// Use result.field without checking

// Good
const result = await parser.parse(text);
if (result.field && typeof result.field === 'string') {
    // Use result.field
}
```

**3. Don't use structured parsers for plain text**
```javascript
// Bad: a JSON parser on free-form text
const parser = new JsonOutputParser();
const result = await parser.parse(simpleText);

// Good: a string parser only cleans up the text
const parser = new StringOutputParser();
const result = await parser.parse(simpleText);
```

---

## Exercises

Practice using output parsers in real-world scenarios, ordered from simple to complex:

### Exercise 21: Product Review Analyzer 
Extract clean summaries and sentiment from product reviews using StringOutputParser.  
**Starter code**: [`exercises/21-review-analyzer.js`](exercises/21-review-analyzer.js)

### Exercise 22: Contact Information Extractor 
Parse structured contact details and skills from unstructured text using JSON and List parsers.  
**Starter code**: [`exercises/22-contact-extractor.js`](exercises/22-contact-extractor.js)

### Exercise 23: Article Metadata Extractor 
Extract complex metadata with schema validation using StructuredOutputParser.  
**Starter code**: [`exercises/23-article-metadata.js`](exercises/23-article-metadata.js)

### Exercise 24: Multi-Parser Content Pipeline 
Build production-ready pipelines with multiple parsers, fallback strategies, and content routing.  
**Starter code**: [`exercises/24-multi-parser-pipeline.js`](exercises/24-multi-parser-pipeline.js)

---

## Summary

You've built a complete output parsing system!

### Key Takeaways

1. **BaseOutputParser**: Foundation for all parsers
2. **StringOutputParser**: Clean text output
3. **JsonOutputParser**: Extract and validate JSON
4. **ListOutputParser**: Parse lists/arrays
5. **RegexOutputParser**: Pattern-based extraction
6. **StructuredOutputParser**: Full schema validation

### What You Built

A parsing system that:
- ✅ Extracts structured data reliably
- ✅ Validates output formats
- ✅ Handles errors gracefully
- ✅ Generates format instructions
- ✅ Works in chains with prompts
- ✅ Is testable in isolation

### Next Steps

Now you can combine prompts + LLMs + parsers into complete chains.

➡️ **Next: [LLM Chains](./03-llm-chain.md)**

Learn how to build complete prompt → LLM → parser pipelines.

---

**Built with ❤️ for learners who want to understand AI frameworks deeply**

[← Previous: Prompts](./01-prompts.md) | [Tutorial Index](../README.md) | [Next: LLM Chains →](./03-llm-chain.md)