What Happens After You Hit “Batch”?

Graduated from KIIT, Bhubaneswar in 2023 with a B.Tech in CS. Did my majors in AI and Computational Mathematics. For me, Covid was a blessing in disguise. I got plenty of time, staying at home, tinkering and building stuff. Tried IoT, App Development, Backend, Cloud. Did a few internships in Flutter in my second year of college. Moved to Full stack, majorly focusing on backend. Single-handedly built a WhatsApp-like video calling solution for a CA-based social media company. Teaching was also a passion. So, started up an ed-tech platform with a friend, Sridipto. That's our first venture together - Snipe. Raised some capital from a Bangalore-based VC during 3rd year of college. Came to Bangalore. Scaled Snipe to around a million users. But monetisation was a challenge, and the downfall of ed-tech made it worse. Had to pivot. Gamification was our core. Switched to a B2B model and got some early success. A few big names onboarded: Burger King, Pedigree and Saffola, to name a few. Cut to September 2024, we're a team of 20+. Business is doing well. But we realised scaling is a problem. We can't just remain a Gamification Service company. We thought, let's build something big. Let's Build the Future of Computing. The biggest learning: if you have a big problem, break it up into smaller problems. Divide and Conquer. It becomes a lot easier.

If you’ve read my previous post, you know how to structure a .jsonl file, upload it, and create a batch request to OpenAI.

But what happens after the request is made?

This blog walks you through:

  • How to track a batch job

  • How to download and interpret the output

  • What to do when something fails

  • A few underrated batch methods you should know


Step 1: Wait and Watch (Fetching Status)

After creating a batch, the first thing you should do is check its status. This helps you:

  • Know if it’s still validating, in progress, or completed

  • Ensure there were no silent failures

You can poll the status like so:

batch = client.batches.retrieve(batch_id="your_batch_id")
print(batch.status)

Once the status flips to "completed" and output_file_id is available, you’re ready to extract the results.
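Checking the status once is rarely enough, so here is a minimal polling sketch. The helper name wait_for_batch, the 30-second interval and the 24-hour timeout are my own assumptions, not part of the OpenAI SDK. The set of terminal statuses follows OpenAI's Batch API docs as I understand them, so double-check it against the current reference.

```python
import time

# Statuses after which the batch will not change any more (assumed from the docs).
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(retrieve, batch_id, poll_seconds=30.0,
                   timeout_seconds=24 * 60 * 60, sleep=time.sleep):
    """Poll until the batch reaches a terminal status, then return it.

    `retrieve` is any callable that takes a batch ID and returns an object
    with a `.status` attribute, for example `client.batches.retrieve`.
    `sleep` is injectable so the loop can be tested without waiting.
    """
    waited = 0.0
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Batch {batch_id} is still '{batch.status}' after {waited:.0f}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

With a real client you would call wait_for_batch(client.batches.retrieve, "your_batch_id") and then check that the returned status is "completed" before touching output_file_id. A terminal status is not always a successful one.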


Step 2: Get the Output (Safely)

Here’s the complete script I used to:

  • Fetch the output from OpenAI

  • Save it in both .jsonl and .json formats

  • Cleanly extract just the custom_id and final response message

from openai import OpenAI
import os
import json
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Provide your completed batch ID
batch_id = "batch_682aed00c6d88190990751eb7966abeb"

# Retrieve batch info
batch = client.batches.retrieve(batch_id=batch_id)

# Ensure it's completed and output file exists
if batch.status != "completed" or not batch.output_file_id:
    raise ValueError(f"Batch not ready or missing output_file_id. Status: {batch.status}")

# Get the output content using `files.content`
output_file = client.files.content(batch.output_file_id)

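# Make sure the output directory exists before writing to it
os.makedirs("openai_calls", exist_ok=True)

# Save raw output as JSONL
lines = output_file.text
with open("openai_calls/batch_output.jsonl", "w", encoding="utf-8") as f:
    f.write(lines)

# Parse each line once, skipping blank lines, and keep only what we need
json_data = []
for line in lines.splitlines():
    if not line.strip():
        continue
    record = json.loads(line)
    json_data.append({
        "custom_id": record["custom_id"],
        "response": record["response"]["body"]["choices"][0]["message"]["content"],
    })

# Save the cleaned-up results as a JSON array
with open("openai_calls/batch_output.json", "w", encoding="utf-8") as f:
    json.dump(json_data, f, indent=4)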

This structure makes it easy to read, log, or even pipe into downstream analytics tools or databases.
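As for what to do when something fails: individual requests can fail even when the batch itself completes. Below is a rough sketch of how you might separate good lines from bad ones. The function name split_batch_lines is hypothetical, and the line shape it expects (custom_id, response.status_code, response.body, and a top-level error) is my reading of the documented output format, so treat it as an assumption and inspect a real file first.

```python
import json


def split_batch_lines(jsonl_text):
    """Return (successes, failures) from the text of a batch output or error file."""
    successes, failures = [], []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        custom_id = record.get("custom_id")
        response = record.get("response") or {}
        body = response.get("body") or {}
        # An error may sit at the top level or inside the response body.
        error = record.get("error") or body.get("error")
        status_code = response.get("status_code")
        if error or status_code != 200:
            failures.append({
                "custom_id": custom_id,
                "error": error or f"HTTP {status_code}",
            })
            continue
        choices = body.get("choices") or []
        content = choices[0]["message"]["content"] if choices else None
        successes.append({"custom_id": custom_id, "response": content})
    return successes, failures
```

The batch object also carries an error_file_id. When it is set, you can fetch that file with client.files.content(batch.error_file_id).text and pass it through the same function. The custom_id values in the failures list tell you which requests to fix and resubmit in a new batch.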


Bonus Functions You Should Know

Batch APIs come with a couple of helpful utilities that make your workflow smoother:

1. List All Your Batches

See what you’ve run recently:

batches = client.batches.list()

Use this to track jobs across your team or workspace, especially when running multiple experiments.
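The list call returns batch objects, not a printed table, so a tiny formatter helps. This is only a sketch: summarize_batches is a name I made up, and it assumes each batch exposes id, status and a created_at Unix timestamp, which matches the SDK's batch object as far as I know.

```python
from datetime import datetime, timezone


def summarize_batches(batches, limit=10):
    """Return one readable line per batch, for at most `limit` batches."""
    rows = []
    for index, batch in enumerate(batches):
        if index >= limit:
            break
        # created_at is assumed to be seconds since the Unix epoch.
        created = datetime.fromtimestamp(batch.created_at, tz=timezone.utc)
        rows.append(f"{batch.id}  {batch.status:<12}  {created:%Y-%m-%d %H:%M} UTC")
    return rows
```

Something like print("\n".join(summarize_batches(client.batches.list()))) should then give you a quick overview. The limit guard matters because iterating over the list result can keep fetching further pages.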


2. Cancel a Batch (If You Catch a Mistake)

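Did you spot an error in your batch input right after launching it? Cancel it as early as you can:

client.batches.cancel(batch_id="your_batch_id")

Note: According to OpenAI's API reference, cancelling is not instant. The batch sits in the cancelling status for up to 10 minutes before it becomes cancelled, and any requests that finished before then still show up in the output file. The sooner you cancel, the less work gets done.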
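If you cancel from a script, it can help to look at the current status first so you don't send a cancel for a batch that is already done. A minimal sketch, where cancel_if_active and the FINISHED_OR_STOPPING set are my own assumptions and not something the SDK provides:

```python
# Statuses where a cancel request is pointless (assumed, check the current docs).
FINISHED_OR_STOPPING = {"completed", "failed", "expired", "cancelling", "cancelled"}


def cancel_if_active(client, batch_id):
    """Cancel the batch unless it has already finished or is being cancelled.

    `client` is expected to look like an OpenAI client, i.e. to expose
    `client.batches.retrieve` and `client.batches.cancel`.
    Returns the latest batch object either way.
    """
    batch = client.batches.retrieve(batch_id)
    if batch.status in FINISHED_OR_STOPPING:
        return batch  # nothing left to cancel
    return client.batches.cancel(batch_id)
```

Check the status on the returned object afterwards to see whether the cancel request was actually sent.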


Summary: Life After Batch Creation

Here’s what a full batch lifecycle looks like:

  1. Create → Upload input and start the batch

  2. Track → Poll for status until completed

  3. Download → Use the output file ID to fetch responses

  4. Parse → Extract insights, summaries, or tags from the JSONL

  5. Repeat or Cancel → Use list() to audit, or cancel() when needed

If you're working with any asynchronous, large-scale LLM task, batch APIs are not just a convenience; they're an optimisation layer.


What’s Next?

I’m currently chaining these batch summaries into:

  • Embedding pipelines (for search)

  • Auto-tagging workflows (for knowledge org)

  • Notification systems (summarize & alert)

If you're exploring something similar, feel free to fork the script or ping me. I’ll also share more about embedding and search in future blogs.

Stay tuned.


Read the previous blog → What are Batch APIs? feat. OpenAI
Docs: OpenAI Batch API Guide

Let's OpenAI

Part 2 of 3

A hands-on blog series exploring how to build real-world tools with OpenAI — covering Batch APIs, chat/completion endpoints, summarization, embeddings, function calling, and workflow optimization.
