Why Structured Output?
LLMs naturally return free-form text, but most applications need structured data: JSON objects, arrays, specific fields. LangChain provides multiple strategies to ensure LLMs return data in the exact format your application expects.
🎯 When You Need Structured Output
- API Responses: Return JSON from AI-powered endpoints
- Data Extraction: Pull structured info from unstructured text
- Form Generation: Create dynamic forms from descriptions
- Database Inserts: Extract data to save to your database
Method 1: withStructuredOutput (Recommended)
The simplest and most reliable approach: it uses the model's native function calling to guarantee valid JSON.
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// Define your schema with Zod
const PersonSchema = z.object({
  name: z.string().describe("The person's full name"),
  age: z.number().describe("Age in years"),
  occupation: z.string().describe("Current job title"),
  skills: z.array(z.string()).describe("List of technical skills"),
  isAvailable: z.boolean().describe("Whether they're available for hire"),
});

// Create a model that returns structured output
const model = new ChatOpenAI({
  modelName: "gpt-4",
  temperature: 0, // Lower = more deterministic
}).withStructuredOutput(PersonSchema);

// The response is automatically typed and validated
const person = await model.invoke(
  "Extract info: John Smith is a 32-year-old senior React developer " +
    "who knows TypeScript, Node.js, and GraphQL. He's currently looking for work."
);

console.log(person);
// {
//   name: "John Smith",
//   age: 32,
//   occupation: "Senior React Developer",
//   skills: ["TypeScript", "Node.js", "GraphQL", "React"],
//   isAvailable: true
// }
```
Method 2: StructuredOutputParser
Uses prompt instructions to guide the model to output JSON. Works with any model, even those without function calling.
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { StructuredOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { z } from "zod";

// Define schema
const parser = StructuredOutputParser.fromZodSchema(
  z.object({
    summary: z.string().describe("A brief summary"),
    sentiment: z.enum(["positive", "negative", "neutral"]),
    topics: z.array(z.string()).describe("Main topics discussed"),
    wordCount: z.number().describe("Approximate word count"),
  })
);

// Plain chat model; the format instructions and parser handle the structure
const model = new ChatOpenAI({ modelName: "gpt-4", temperature: 0 });

// Create prompt with format instructions
const prompt = ChatPromptTemplate.fromTemplate(`
Analyze the following text and provide structured output.

{format_instructions}

Text: {text}
`);

// Build chain
const chain = prompt.pipe(model).pipe(parser);

const result = await chain.invoke({
  text: "React 19 introduces exciting new features like Server Components...",
  format_instructions: parser.getFormatInstructions(),
});

console.log(result.sentiment); // "positive"
console.log(result.topics); // ["React", "Server Components"]
```
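Conceptually, this parsing step amounts to locating the JSON payload inside the model's free-form reply, then running `JSON.parse` plus schema validation on it. A simplified, dependency-free sketch of that extraction step (not LangChain's actual implementation):

```typescript
// Pull the first JSON object out of a model reply that may contain
// surrounding prose, by slicing from the first "{" to the last "}".
function extractJson(reply: string): unknown {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("no JSON object found in reply");
  return JSON.parse(reply.slice(start, end + 1));
}

const reply = 'Here is the analysis:\n{"sentiment": "positive", "topics": ["React"]}';
const data = extractJson(reply) as { sentiment: string; topics: string[] };
console.log(data.sentiment); // "positive"
```

This also shows why prompt-based parsing is more fragile than function calling: if the model wraps the JSON in extra braces or truncates it, the parse fails, which is exactly the case OutputFixingParser (below) exists to recover from.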
Method 3: JsonOutputParser
Simple JSON parsing that works with streaming responses.
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { JsonOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";

// Works with streaming
const parser = new JsonOutputParser();

const model = new ChatOpenAI({ modelName: "gpt-4", temperature: 0 });

const prompt = ChatPromptTemplate.fromTemplate(`
Return a JSON object with "title", "description", and "tags" array
for the following topic:

Topic: {topic}

Return ONLY valid JSON, no other text.
`);

const chain = prompt.pipe(model).pipe(parser);
const result = await chain.invoke({ topic: "Docker Compose" });
// { title: "...", description: "...", tags: [...] }
```
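To see why JSON parsing can work with streaming at all, picture the parser retrying a parse on a growing buffer as chunks arrive. A conceptual, dependency-free sketch of that idea (the real JsonOutputParser is smarter: it can also complete partial JSON to emit intermediate objects mid-stream):

```typescript
// Accumulate streamed chunks and emit a parsed object whenever the
// buffer forms complete, valid JSON.
function* parseJsonStream(chunks: string[]): Generator<unknown> {
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    try {
      yield JSON.parse(buffer); // only succeeds once the JSON is complete
    } catch {
      // buffer is still a partial document; keep accumulating
    }
  }
}

const chunks = ['{"title": "Docker', ' Compose", "tags": ["devops"]', "}"];
const results = [...parseJsonStream(chunks)];
console.log(results); // [{ title: "Docker Compose", tags: ["devops"] }]
```

In a real chain you would consume `chain.stream(...)` instead of a fixed array, but the accumulate-and-retry principle is the same.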
Extracting Data from Unstructured Text
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// Schema for extracting product reviews
const ReviewSchema = z.object({
  productName: z.string(),
  rating: z.number().min(1).max(5),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
  wouldRecommend: z.boolean(),
});

const model = new ChatOpenAI({ modelName: "gpt-4" })
  .withStructuredOutput(ReviewSchema);

const review = await model.invoke(`
Extract the review data from this text:

"I bought the XPS 15 last month. Amazing screen quality and
the keyboard is great for coding. Battery life could be better
though, and it runs hot under load. Overall I'd give it a 4/5
and would definitely recommend it to developers."
`);

console.log(review);
// {
//   productName: "XPS 15",
//   rating: 4,
//   pros: ["Amazing screen quality", "Great keyboard for coding"],
//   cons: ["Battery life could be better", "Runs hot under load"],
//   wouldRecommend: true
// }
```
Handling Arrays and Complex Types
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// Extract multiple items from text
const MenuSchema = z.object({
  restaurantName: z.string(),
  items: z.array(
    z.object({
      name: z.string(),
      price: z.number(),
      category: z.enum(["appetizer", "main", "dessert", "drink"]),
      isVegetarian: z.boolean(),
    })
  ),
  totalItems: z.number(),
});

const model = new ChatOpenAI({ modelName: "gpt-4" })
  .withStructuredOutput(MenuSchema);

const menu = await model.invoke(
  "Parse this menu: Caesar Salad $12 (vegetarian), " +
    "Grilled Salmon $28, Chocolate Cake $10, Lemonade $5"
);

// Fully typed response with validated data!
menu.items.forEach((item) => {
  console.log(`${item.name}: $${item.price} (${item.category})`);
});
```
Error Handling and Retry
```typescript
import { ChatOpenAI } from "@langchain/openai";
import { OutputFixingParser } from "langchain/output_parsers";

// Auto-fix malformed JSON responses
const fixingParser = OutputFixingParser.fromLLM(
  new ChatOpenAI({ modelName: "gpt-4" }),
  parser // the StructuredOutputParser defined earlier
);

// If the initial response has invalid JSON, the fixing parser sends it
// back to the LLM together with the error message and retries the parse
try {
  const result = await chain.invoke({ text: "..." });
} catch (e) {
  // Fall back to the fixing parser on the raw text the model returned
  const fixed = await fixingParser.parse(badOutput);
}
```
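The try/fix pattern generalizes beyond LangChain: attempt a strict parse, and on failure hand the raw text plus the error message to a repair step. A dependency-free sketch of the loop OutputFixingParser automates (the `fix` callback here is hypothetical and stands in for the second LLM call):

```typescript
// Generic parse-with-repair: try once, then let a fixer rewrite the raw text.
async function parseWithFix<T>(
  raw: string,
  parse: (s: string) => T,
  fix: (raw: string, error: string) => Promise<string>
): Promise<T> {
  try {
    return parse(raw);
  } catch (e) {
    const repaired = await fix(raw, String(e)); // e.g. ask an LLM to fix the JSON
    return parse(repaired); // throws again if the repair also fails
  }
}

// Usage with a toy fixer that appends the missing closing brace
const result = await parseWithFix(
  '{"ok": true',
  (s) => JSON.parse(s),
  async (raw) => raw + "}"
);
console.log(result); // { ok: true }
```

One retry is usually enough in practice; if the repaired output still fails to parse, it is better to surface the error than to loop indefinitely.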
💡 Key Takeaways
- Use `withStructuredOutput()` with Zod schemas: it's the most reliable method
- Set `temperature: 0` for more consistent structured output
- Zod's `.describe()` helps the LLM understand what each field should contain
- `StructuredOutputParser` works with any model, even without function calling
- Use `OutputFixingParser` for automatic retry on malformed responses