<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[My AI Journal]]></title><description><![CDATA[My AI Journal]]></description><link>https://blogs.sirsho.xyz</link><generator>RSS for Node</generator><lastBuildDate>Wed, 06 May 2026 07:33:57 GMT</lastBuildDate><atom:link href="https://blogs.sirsho.xyz/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Build Smarter AI Apps with This LangGraph + Vercel AI SDK Starter Template 🚀]]></title><description><![CDATA[For the past few weeks, I’ve been exploring how to build more sophisticated AI applications – not just chatbots, but actual agents that can reason, search the web, interact with APIs, and deliver useful outcomes.
I wanted a starter project that could...]]></description><link>https://blogs.sirsho.xyz/build-smarter-ai-apps-with-this-langgraph-vercel-ai-sdk-starter-template</link><guid isPermaLink="true">https://blogs.sirsho.xyz/build-smarter-ai-apps-with-this-langgraph-vercel-ai-sdk-starter-template</guid><category><![CDATA[langchain]]></category><category><![CDATA[vercel ai sdk]]></category><category><![CDATA[Vercel]]></category><category><![CDATA[AI]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[React]]></category><category><![CDATA[Next.js]]></category><category><![CDATA[serpapi]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[ai agents]]></category><category><![CDATA[opensource]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Mon, 09 Jun 2025 13:00:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749473915402/c2a4562c-ba99-45ad-927c-f0c156cc21a8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For the past few weeks, I’ve been exploring how to build more sophisticated AI applications – not just chatbots, but actual <em>agents</em> that can reason, search the web, interact with APIs, and deliver useful outcomes.</p>
<p>I wanted a starter project that could bring together the power of the <strong>Vercel AI SDK</strong>, the flexibility of <strong>LangGraph (via LangChain)</strong>, and the usability of a <strong>modern React frontend</strong>, all with built-in support for <strong>real-time web search using SerpAPI</strong>.</p>
<p>To my surprise, there wasn’t a good one out there.</p>
<p>So, I decided to build it myself.</p>
<p>👉 <a target="_blank" href="https://github.com/Sirsho29/dia-langchain"><strong>GitHub Repo</strong></a></p>
<hr />
<h2 id="heading-what-this-project-solves">🧠 What This Project Solves</h2>
<p>Most AI starter kits are either too simple or too fragmented – they might show you how to call an OpenAI model, or how to use LangChain chains, or how to build a React UI... but not all of it working together in a coherent, extendable structure.</p>
<p>This project aims to <strong>bridge that gap</strong> and give developers a solid foundation to build truly intelligent applications.</p>
<hr />
<h2 id="heading-tech-stack">🔧 Tech Stack</h2>
<p>Here’s what’s under the hood:</p>
<h3 id="heading-1-vercel-ai-sdk">1. <strong>Vercel AI SDK</strong></h3>
<p>Makes it easier to stream responses from LLMs with built-in support for OpenAI, Anthropic, and others. No need to handle SSE manually.</p>
<h3 id="heading-2-langchain-langgraph">2. <strong>LangChain + LangGraph</strong></h3>
<p>Used to create structured, multi-step flows that let the AI reason, take actions, and return meaningful outputs – not just one-shot completions.</p>
<p>LangGraph brings agent workflows to life through a graph-based architecture, which is perfect for branching logic and tool use.</p>
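<p>To make the idea concrete – this is a conceptual sketch in plain Python, not code from the repo (which uses the TypeScript APIs), and every node name here is made up – a graph-based agent is just a set of node functions plus an edge map that routes shared state between them until a terminal node is reached:</p>

```python
# Conceptual sketch of a graph-based agent loop (no LangGraph dependency).
# Node names and routing logic are illustrative only.

def decide(state):
    # Route to the search tool when the question needs fresh facts.
    state["route"] = "search" if "latest" in state["question"] else "answer"
    return state

def search(state):
    state["context"] = f"(web results for: {state['question']})"
    return state

def answer(state):
    state["output"] = f"Answer using {state.get('context', 'model knowledge')}"
    return state

NODES = {"decide": decide, "search": search, "answer": answer}
# Edges decide where to go after each node; None marks the end of the graph.
EDGES = {
    "decide": lambda s: s["route"],
    "search": lambda s: "answer",
    "answer": lambda s: None,
}

def run_graph(state, start="decide"):
    node = start
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node](state)
    return state

result = run_graph({"question": "latest LangGraph release"})
```

<p>LangGraph formalises exactly this pattern – typed state, conditional edges, cycles – so you don’t have to hand-roll the loop yourself.</p>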
<h3 id="heading-3-react-frontend">3. <strong>React Frontend</strong></h3>
<p>A clean and responsive interface to interact with your AI agents, using the <code>useStream</code> React hook for a real-time streaming UX.</p>
<h3 id="heading-4-serpapi">4. <strong>SerpAPI</strong></h3>
<p>Brings live, contextual web search into your LangGraph agents. Your AI can now look things up before answering, and soon – cite sources too.</p>
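<p>For a sense of what the search tool does under the hood: a SerpAPI call is just an HTTP GET with your query and key. A rough sketch (parameter names follow SerpAPI’s Google engine; the key below is a placeholder):</p>

```python
import urllib.parse

SERPAPI_ENDPOINT = "https://serpapi.com/search.json"

def build_search_url(query, api_key, num_results=5):
    """Build the request URL for a SerpAPI Google search (nothing is sent here)."""
    params = {
        "engine": "google",
        "q": query,
        "num": num_results,
        "api_key": api_key,
    }
    return SERPAPI_ENDPOINT + "?" + urllib.parse.urlencode(params)

url = build_search_url("LangGraph tutorials", "YOUR_SERPAPI_KEY")
```

<p>The agent’s search tool wraps a request like this and feeds the organic results back into the graph as context for the model.</p>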
<hr />
<h2 id="heading-why-this-project-matters">⚡ Why This Project Matters</h2>
<p>If you’re building:</p>
<ul>
<li><p>AI copilots</p>
</li>
<li><p>Research agents</p>
</li>
<li><p>Knowledge assistants</p>
</li>
<li><p>Workflow automation bots</p>
</li>
</ul>
<p>...then starting with a strong, integrated foundation saves weeks of effort and gets you building features faster.</p>
<hr />
<h2 id="heading-whats-next">📌 What’s Next?</h2>
<p>This is just the beginning. Here’s what I’m planning to add next:</p>
<ol>
<li><p><strong>Source Attribution</strong><br /> Show which web links or sources were used to generate the answer – transparency matters.</p>
</li>
<li><p><strong>Composio Integration</strong><br /> Enable the agent to take action across multiple tools and APIs (e.g., Notion, Gmail, Google Calendar) with just one setup.</p>
</li>
<li><p><strong>Complex LangGraph Architectures</strong><br /> Think beyond simple tool calls — build real multi-step agents with memory, conditional flows, and fallback strategies.</p>
</li>
<li><p><strong>Mem0 for Memory</strong><br /> Persistent, contextual memory for users, allowing the agent to retain information across sessions and be truly helpful over time.</p>
</li>
</ol>
<hr />
<h2 id="heading-contribute-or-fork">🤝 Contribute or Fork</h2>
<p>This project is open-source and built to be extended. Whether you want to build your own custom AI product or contribute back, I’d love your feedback and ideas.</p>
<p>Check out the code and get started here:<br />👉 <a target="_blank" href="https://github.com/Sirsho29/dia-langchain"><strong>https://github.com/Sirsho29/dia-langchain</strong></a></p>
<p>If you find it useful, give it a star 🌟 and feel free to share what you build!</p>
<hr />
<h2 id="heading-lets-connect">💬 Let’s Connect</h2>
<p>If you’re working on AI apps, agents, or just interested in the future of intelligent tooling – let’s talk!<br />I’d love to learn from others building in this space.</p>
<hr />
<p><strong>#LangChain #LangGraph #VercelAI #ReactJS #SerpAPI #AIagents #LLMs #OpenSource #Hackernotes #AItools</strong></p>
]]></content:encoded></item><item><title><![CDATA[Let’s Decode Google I/O 2025]]></title><description><![CDATA[Some days you dive into code. Other days, you sit back, sip chai (or coffee if you must), and try to make sense of everything Sundar Pichai just casually dropped like it’s no big deal. Today’s one of those days. Google I/O 2025 was less of a keynote ...]]></description><link>https://blogs.sirsho.xyz/lets-decode-google-io-2025</link><guid isPermaLink="true">https://blogs.sirsho.xyz/lets-decode-google-io-2025</guid><category><![CDATA[Google]]></category><category><![CDATA[google io]]></category><category><![CDATA[google i/o 2025]]></category><category><![CDATA[AI]]></category><category><![CDATA[gemini]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Thu, 22 May 2025 09:17:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747904941725/e3d40db3-f2c6-49ce-8365-bc054a91f17d.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Some days you dive into code. Other days, you sit back, sip chai (or coffee if you must), and try to make sense of everything Sundar Pichai just casually dropped like it’s no big deal. Today’s one of those days. Google I/O 2025 was less of a keynote and more of a full-blown “AI is eating the world” performance. And honestly? I’m here for it.</p>
<p>Let’s break it down!</p>
<hr />
<h3 id="heading-shipping-at-a-relentless-pace-and-they-mean-it">🚢 <em>“Shipping at a Relentless Pace” – and they mean it</em></h3>
<p>Gone are the days when tech giants waited for I/O to launch their shiny toys. In this Gemini era, Google just yeets state-of-the-art models on a random Tuesday. The highlight? Gemini 2.5 Pro now <em>sweeps</em> LMArena like it’s cleaning house. They’ve cranked their Elo scores up by over 300 points since the OG Gemini Pro. Casual.</p>
<p>Also, the new <strong>Ironwood TPUs</strong>? 42.5 exaflops per pod. That’s not a typo. That’s just <em>obscene</em> compute power casually packed into a machine. Apparently, AI inference is the new GPU arms race – and Google’s lifting heavy.</p>
<hr />
<h3 id="heading-480-trillion-tokens-later">🌍 <em>480 Trillion Tokens Later…</em></h3>
<p>Let’s talk scale. Last year they were processing 9.7 trillion tokens a month. Now it’s <strong>480 trillion</strong>. That’s not growth. That’s exponential puberty. And it’s not just devs geeking out.</p>
<ul>
<li><p>400M+ monthly Gemini users</p>
</li>
<li><p>7M+ developers building with Gemini</p>
</li>
<li><p>Usage on Vertex AI? Up 40x.</p>
</li>
</ul>
<p>Google has clearly figured out how to get their models <em>in</em> – in your apps, in your systems, and soon, in your cereal.</p>
<hr />
<h3 id="heading-project-starline-google-beam-holograms-but-real">👋 Project Starline → <strong>Google Beam</strong>: Holograms, But Real</h3>
<p>Remember Starline – that futuristic 3D calling concept? It just got real. Now called <strong>Google Beam</strong>, it uses six cameras, real-time head tracking, and a new AI-first video model to give you full-on <em>3D presence</em>.<br />Think Zoom… but if Zoom went to the gym, studied optics, and came out with an HP partnership.</p>
<p>Also – <strong>real-time speech translation</strong> in Google Meet is here. Matching your voice, tone, and facial expressions. English ↔ Spanish to start. So yes, your meetings might finally be both <em>global</em> and <em>understandable</em>.</p>
<hr />
<h3 id="heading-project-astra-gemini-live">👁️ Project Astra → <strong>Gemini Live</strong></h3>
<p>Now this is where it gets wild. Gemini Live is giving full-on Black Mirror vibes – in a good way (hopefully). Your phone camera becomes the eyes of the assistant. People are already using it to prep for interviews, plan marathons, and probably ask it if their outfit slaps.</p>
<p>Screen sharing, file uploads, and real-time assistant magic — all baked into your phone. Rolling out on Android now, iOS catching up.</p>
<hr />
<h3 id="heading-project-mariner-agent-mode">🕹️ Project Mariner → <strong>Agent Mode</strong></h3>
<p>This one is huge. Google’s building the agent economy.<br />Agent Mode can now use a computer like a person – click things, search Zillow, filter listings, and even <em>schedule a tour</em>. With "teach and repeat", you only have to show it once.</p>
<p>It’s like having an intern. A <em>very</em> competent, tireless, slightly sentient intern.</p>
<p>Bonus points to Google for backing interoperability. Their Agent2Agent protocol is now playing nice with Anthropic’s Model Context Protocol. In plain English? Agents can now talk to each other like it’s the beginning of an AI Avengers crossover.</p>
<hr />
<h3 id="heading-personal-context-smart-replies-that-actually-sound-like-you">✉️ Personal Context – Smart Replies That <em>Actually</em> Sound Like You</h3>
<p>Gemini will soon dig through your Docs, Drive, and Gmail (with permission) to craft hyper-personalised smart replies.<br />Your friend asks for road trip tips. Gemini will find that chaotic itinerary from 2018, capture your tone, and maybe even sneak in your usual “cheers, bro” signoff.</p>
<p>Gmail replies that actually sound like you <em>wrote them</em>? My inbox might finally stand a chance.</p>
<hr />
<h3 id="heading-ai-mode-in-search-a-full-redesign">🔍 <strong>AI Mode in Search</strong> – A Full Redesign</h3>
<p>We’ve seen AI Overviews. But now, <strong>AI Mode</strong> is a full-on <em>tab</em> in Google Search.</p>
<ul>
<li><p>You can ask longer, more complex queries</p>
</li>
<li><p>You can follow up naturally</p>
</li>
<li><p>You’ll actually want to scroll down</p>
</li>
</ul>
<p>It’s now live in the U.S. and coming soon elsewhere. Google is effectively turning Search into a conversation — but with the world’s most overqualified librarian.</p>
<hr />
<h3 id="heading-gemini-25-pro-flash-deep-think">⚡ Gemini 2.5 Pro + Flash + Deep Think</h3>
<p>We’re now entering <strong>boss-level model mode</strong>:</p>
<ul>
<li><p>Gemini 2.5 Flash: Fast, cheap, and nearly as good as Pro.</p>
</li>
<li><p>Gemini 2.5 Pro: Getting a turbo boost called <strong>Deep Think</strong> — a new reasoning mode using parallel thinking.</p>
</li>
</ul>
<p>It’s like Gemini got a brain upgrade and now thinks in multiple tabs <em>simultaneously</em>.</p>
<hr />
<h3 id="heading-media-models-go-full-hollywood">🎨 <strong>Media Models Go Full Hollywood</strong></h3>
<p>Enter <strong>Veo 3</strong> (AI video with sound) and <strong>Imagen 4</strong> (top-tier AI images). These are already in the Gemini app. And there’s <strong>Flow</strong> – a new filmmaker tool that lets you stitch scenes and extend clips.</p>
<p>If you’re creative, this is your playground. If you’re not, Flow might make you one.</p>
<hr />
<h3 id="heading-the-big-picture">💡 The Big Picture</h3>
<p>What really stood out to me wasn’t just the firehose of features. It was about how <em>personal</em> this AI wave is getting.</p>
<ul>
<li><p>From personalization in Gmail</p>
</li>
<li><p>To immersive video calls</p>
</li>
<li><p>To AI assistants that know your context, tone, and tasks</p>
</li>
<li><p>And models that think better, faster, deeper</p>
</li>
</ul>
<p>Google’s clearly aiming for AI that <em>doesn’t just work</em> — it <em>works for you</em>. And maybe that’s the ultimate unlock: AI that understands your world, your files, your tone, your mess — and helps you get through the day like a silent, brilliant partner.</p>
<hr />
<h3 id="heading-final-thought">Final Thought</h3>
<p>Sundar ended his talk with a sweet anecdote about his dad being wowed by Waymo. It’s easy to forget that the stuff we build — the tech, the models, the hype — eventually lands in the hands of real people. People who are just trying to get home, or reply to an email, or call their family from another city.</p>
<p>And when that tech makes life a little easier, a little more magical — that’s when it hits different.</p>
<p>Google I/O 2025 wasn’t just about product updates. It was a quiet declaration:<br />The future is here. It just wants to be useful.</p>
<hr />
<p>If you're still reading, go play with the Gemini app — and maybe tell it to write your next email. You might be surprised by how much it sounds like… well, you.</p>
<p><strong>PS – If you haven’t played the I/O game yet, just go and check it out</strong> <a target="_blank" href="https://io.google/2025/puzzle"><strong>here</strong></a><strong>.</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747905256095/0184caef-df43-4755-b2c8-ad1292ddc48d.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item><item><title><![CDATA[What Happens After You Hit “Batch”?]]></title><description><![CDATA[If you’ve read my previous post, you know how to structure a .jsonlfile, upload it, and create a batch request to OpenAI.
But what happens after the request is made?
This blog walks you through:

How to track a batch job

How to download and interpre...]]></description><link>https://blogs.sirsho.xyz/what-happens-after-you-hit-batch</link><guid isPermaLink="true">https://blogs.sirsho.xyz/what-happens-after-you-hit-batch</guid><category><![CDATA[AI]]></category><category><![CDATA[openai]]></category><category><![CDATA[chatgpt]]></category><category><![CDATA[#ai-tools]]></category><category><![CDATA[APIs]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Wed, 21 May 2025 13:10:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747832895951/22d1fdf8-e782-436c-afd1-bfaca2c34d09.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you’ve read <a target="_blank" href="https://blogs.sirsho.xyz/what-are-batch-apis-feat-openai">my previous post</a>, you know how to structure a <code>.jsonl</code> file, upload it, and create a batch request to OpenAI.</p>
<p>But what happens <em>after</em> the request is made?</p>
<p>This blog walks you through:</p>
<ul>
<li><p>How to track a batch job</p>
</li>
<li><p>How to download and interpret the output</p>
</li>
<li><p>What to do when something fails</p>
</li>
<li><p>A few underrated batch methods you should know</p>
</li>
</ul>
<hr />
<h2 id="heading-step-1-wait-and-watch-fetching-status">Step 1: Wait and Watch (Fetching Status)</h2>
<p>After creating a batch, the first thing you should do is <strong>check its status</strong>. This helps you:</p>
<ul>
<li><p>Know if it’s still validating, in progress, or completed</p>
</li>
<li><p>Ensure there were no silent failures</p>
</li>
</ul>
<p>You can poll the status like so:</p>
<pre><code class="lang-python">batch = client.batches.retrieve(batch_id=<span class="hljs-string">"your_batch_id"</span>)
print(batch.status)
</code></pre>
<p>Once the status flips to <code>"completed"</code> and <code>output_file_id</code> is available, you’re ready to extract the results.</p>
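<p>In practice you’ll want to poll in a loop rather than re-running a script by hand. Here’s a small helper – a sketch, where <code>fetch_status</code> is any zero-argument callable (for example, one wrapping <code>client.batches.retrieve</code>), and the terminal state names follow OpenAI’s documented batch statuses:</p>

```python
import time

# Statuses after which a batch will never change again (per OpenAI's docs)
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(fetch_status, poll_seconds=30, max_polls=2880):
    """Poll fetch_status() until the batch reaches a terminal state.

    fetch_status: zero-arg callable returning the current status string,
    e.g. lambda: client.batches.retrieve(batch_id=bid).status
    The defaults roughly cover the 24h completion window (2880 * 30s).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("batch did not reach a terminal state in time")
```

<p>Call it with <code>wait_for_batch(lambda: client.batches.retrieve(batch_id=batch_id).status)</code> and branch on the returned status before fetching the output file.</p>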
<hr />
<h2 id="heading-step-2-get-the-output-safely">Step 2: Get the Output (Safely)</h2>
<p>Here’s the complete script I used to:</p>
<ul>
<li><p>Fetch the output from OpenAI</p>
</li>
<li><p>Save it in both <code>.jsonl</code> and <code>.json</code> formats</p>
</li>
<li><p>Cleanly extract just the <code>custom_id</code> and final response message</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

<span class="hljs-comment"># Load environment variables from .env file</span>
load_dotenv()

<span class="hljs-comment"># Initialize OpenAI client</span>
client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

<span class="hljs-comment"># Provide your completed batch ID</span>
batch_id = <span class="hljs-string">"batch_682aed00c6d88190990751eb7966abeb"</span>

<span class="hljs-comment"># Retrieve batch info</span>
batch = client.batches.retrieve(batch_id=batch_id)

<span class="hljs-comment"># Ensure it's completed and output file exists</span>
<span class="hljs-keyword">if</span> batch.status != <span class="hljs-string">"completed"</span> <span class="hljs-keyword">or</span> <span class="hljs-keyword">not</span> batch.output_file_id:
    <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Batch not ready or missing output_file_id. Status: <span class="hljs-subst">{batch.status}</span>"</span>)

<span class="hljs-comment"># Get the output content using `files.content`</span>
output_file = client.files.content(batch.output_file_id)

<span class="hljs-comment"># Save raw output as JSONL</span>
lines = output_file.text
<span class="hljs-keyword">with</span> open(<span class="hljs-string">"openai_calls/batch_output.jsonl"</span>, <span class="hljs-string">"w"</span>, encoding=<span class="hljs-string">"utf-8"</span>) <span class="hljs-keyword">as</span> f:
    f.write(lines)

<span class="hljs-comment"># Parse and save a clean JSON array</span>
<span class="hljs-keyword">with</span> open(<span class="hljs-string">"openai_calls/batch_output.json"</span>, <span class="hljs-string">"w"</span>, encoding=<span class="hljs-string">"utf-8"</span>) <span class="hljs-keyword">as</span> f:
    json_data = [
        {
            <span class="hljs-string">'custom_id'</span>: json.loads(line)[<span class="hljs-string">'custom_id'</span>],
            <span class="hljs-string">'response'</span>: json.loads(line)[<span class="hljs-string">'response'</span>][<span class="hljs-string">'body'</span>][<span class="hljs-string">'choices'</span>][<span class="hljs-number">0</span>][<span class="hljs-string">'message'</span>][<span class="hljs-string">'content'</span>]
        }
        <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> lines.splitlines()
    ]
    json.dump(json_data, f, indent=<span class="hljs-number">4</span>)
</code></pre>
<p>This structure makes it easy to read, log, or even pipe into downstream analytics tools or databases.</p>
<hr />
<h2 id="heading-bonus-functions-you-should-know">Bonus Functions You Should Know</h2>
<p>Batch APIs come with a couple of helpful utilities that make your workflow smoother:</p>
<h3 id="heading-1-list-all-your-batches">1. List All Your Batches</h3>
<p>See what you’ve run recently:</p>
<pre><code class="lang-python">batches = client.batches.list()
</code></pre>
<p>Use this to track jobs across your team or workspace, especially when running multiple experiments.</p>
<hr />
<h3 id="heading-2-cancel-a-batch-if-you-catch-a-mistake">2. Cancel a Batch (If You Catch a Mistake)</h3>
<p>Did you spot an error in your batch input right after launching it? Cancel it before it starts processing:</p>
<pre><code class="lang-python">client.batches.cancel(batch_id=<span class="hljs-string">"your_batch_id"</span>)
</code></pre>
<p><em>Note:</em> You can cancel a batch while it’s <code>validating</code> or <code>in_progress</code>. Cancellation isn’t instant – the status moves to <code>cancelling</code> while in-flight requests finish (up to about 10 minutes), then to <code>cancelled</code>, and any responses completed before that point are still available in the output file.</p>
<hr />
<h2 id="heading-summary-life-after-batch-creation">Summary: Life After Batch Creation</h2>
<p>Here’s what a full batch lifecycle looks like:</p>
<ol>
<li><p><strong>Create</strong> → Upload input and start the batch</p>
</li>
<li><p><strong>Track</strong> → Poll for status until completed</p>
</li>
<li><p><strong>Download</strong> → Use the output file ID to fetch responses</p>
</li>
<li><p><strong>Parse</strong> → Extract insights, summaries, or tags from the JSONL</p>
</li>
<li><p><strong>Repeat or Cancel</strong> → Use <code>list()</code> to audit, or <code>cancel()</code> when needed</p>
</li>
</ol>
<p>If you're working with any asynchronous, large-scale LLM task, batch APIs are not just a convenience; they're an optimisation layer.</p>
<hr />
<h2 id="heading-whats-next">What’s Next?</h2>
<p>I’m currently chaining these batch summaries into:</p>
<ul>
<li><p>Embedding pipelines (for search)</p>
</li>
<li><p>Auto-tagging workflows (for knowledge org)</p>
</li>
<li><p>Notification systems (summarize &amp; alert)</p>
</li>
</ul>
<p>If you're exploring something similar, feel free to fork the script or ping me. I’ll also share more about embedding and search in future blogs.</p>
<p>Stay tuned.</p>
<hr />
<p><strong>Read the previous blog →</strong> <a target="_blank" href="https://blogs.sirsho.xyz/what-are-batch-apis-feat-openai">What are Batch APIs? feat. OpenAI</a><br /><strong>Docs:</strong> <a target="_blank" href="https://platform.openai.com/docs/guides/batch?lang=python">OpenAI Batch API Guide</a></p>
]]></content:encoded></item><item><title><![CDATA[What are Batch APIs? feat. OpenAI]]></title><description><![CDATA[🧠 TL;DR
Batch APIs are your best friends when you want to run large-scale LLM tasks – without maxing out your rate limits or sending requests one at a time like it’s 2022.
In this post:

Why batch APIs are useful

A real-world use case: summarizing ...]]></description><link>https://blogs.sirsho.xyz/what-are-batch-apis-feat-openai</link><guid isPermaLink="true">https://blogs.sirsho.xyz/what-are-batch-apis-feat-openai</guid><category><![CDATA[AI]]></category><category><![CDATA[openai]]></category><category><![CDATA[coding]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[chatgpt]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Mon, 19 May 2025 11:32:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747654288034/ed860093-d6a5-40ad-8a3f-25f3804328e2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-tldr">🧠 TL;DR</h2>
<p>Batch APIs are your best friends when you want to run large-scale LLM tasks – without maxing out your rate limits or sending requests one at a time like it’s 2022.</p>
<p>In this post:</p>
<ul>
<li><p>Why batch APIs are useful</p>
</li>
<li><p>A real-world use case: summarizing a bunch of CXO emails</p>
</li>
<li><p>How to set it up with OpenAI</p>
</li>
<li><p>What I learned (and what to watch out for)</p>
</li>
</ul>
<hr />
<h2 id="heading-the-problem">📨 The Problem</h2>
<p>If you’re building anything LLM-powered, this probably sounds familiar:</p>
<blockquote>
<p>“I’ve got hundreds of emails/docs/chats... and I want a summary for each.”</p>
</blockquote>
<p>Now imagine calling OpenAI's chat endpoint 500 times, one after another. You’ll:</p>
<ul>
<li><p>Hit rate limits</p>
</li>
<li><p>Burn through API tokens inefficiently</p>
</li>
<li><p>Lose time and, frankly, patience</p>
</li>
</ul>
<p>So instead, we use…</p>
<hr />
<h2 id="heading-enter-batch-apis">🧩 Enter: Batch APIs</h2>
<p><strong>Batch APIs</strong> let you send a bunch of requests together – in a single file – and OpenAI will process them asynchronously on their side.</p>
<p>Here’s what makes them awesome:</p>
<ul>
<li><p>✅ More efficient than real-time calls</p>
</li>
<li><p>✅ No need to manage retries or throttling</p>
</li>
<li><p>✅ Great for summarization, embedding, tagging, etc.</p>
</li>
</ul>
<p>📌 <em>Important</em>: As of now, OpenAI only supports a <strong>24h completion window</strong>. That means your batch gets processed within a day.</p>
<p>👉 <a target="_blank" href="https://platform.openai.com/docs/guides/batch?lang=python">OpenAI Docs: Batch APIs</a></p>
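<p>Concretely, every line of the input file is one self-contained JSON request object. A minimal sketch of a single line (the <code>custom_id</code>, model name, and prompt are placeholders):</p>

```python
import json

# One request per line of the .jsonl input file (all values are placeholders)
line = json.dumps({
    "custom_id": "task-0",  # your own ID, echoed back so you can match outputs
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4.1-nano",
        "messages": [{"role": "user", "content": "Summarize this email ..."}],
    },
})
```

<p>Write a few hundred of these lines to a file, upload it, and OpenAI works through them asynchronously – that’s exactly what the walkthrough below builds.</p>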
<hr />
<h2 id="heading-real-use-case-summarizing-emails-from-cxos">🛠️ Real Use Case: Summarizing Emails from CXOs</h2>
<p>I prepared a dataset of internal and external emails (generated with ChatGPT) – like this one:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"from"</span>: <span class="hljs-string">"customer@loyalclient.com"</span>,
  <span class="hljs-attr">"to"</span>: <span class="hljs-string">"ceo@company.com"</span>,
  <span class="hljs-attr">"subject"</span>: <span class="hljs-string">"Praise and Feedback: Exceptional Support Experience"</span>,
  <span class="hljs-attr">"body"</span>: <span class="hljs-string">"I wanted to personally commend your support team—especially Priya and Omar..."</span>
}
</code></pre>
<p>And I wanted to generate short summaries like:</p>
<blockquote>
<p>"Michael O'Connor from LoyalClient Corp praised Priya and Omar for excellent integration support."</p>
</blockquote>
<p>So I did what any tired dev would do – I batch-processed all of them with GPT-4.1 nano using OpenAI’s Batch API.</p>
<hr />
<h2 id="heading-step-by-step-code-to-batch-like-a-pro">🧪 Step-by-Step: Code to Batch Like a Pro</h2>
<h3 id="heading-1-first-install-the-required-package">1. First, install the required package:</h3>
<pre><code class="lang-bash">pip install openai python-dotenv
</code></pre>
<h3 id="heading-2-set-up-your-environment">2. Set up your environment:</h3>
<p>Make sure your <code>.env</code> file contains:</p>
<pre><code class="lang-plaintext">OPENAI_API_KEY="your-api-key-here"
</code></pre>
<hr />
<h3 id="heading-3-python-code-batch-creation">3. Python Code (Batch Creation)</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> json, os
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

load_dotenv()
client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

emails = json.load(open(<span class="hljs-string">"./dataset/email_samples.json"</span>, <span class="hljs-string">"r"</span>, encoding=<span class="hljs-string">"utf-8"</span>))

<span class="hljs-keyword">with</span> open(<span class="hljs-string">"openai_calls/batch_input.jsonl"</span>, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
    <span class="hljs-keyword">for</span> i, email <span class="hljs-keyword">in</span> enumerate(emails):
        prompt = <span class="hljs-string">f"From: <span class="hljs-subst">{email[<span class="hljs-string">'from'</span>]}</span>\nSubject: <span class="hljs-subst">{email[<span class="hljs-string">'subject'</span>]}</span>\n\n<span class="hljs-subst">{email[<span class="hljs-string">'body'</span>]}</span>"</span>
        obj = {
            <span class="hljs-string">"method"</span>: <span class="hljs-string">"POST"</span>,
            <span class="hljs-string">"url"</span>: <span class="hljs-string">"/v1/chat/completions"</span>,
            <span class="hljs-string">"body"</span>: {
                <span class="hljs-string">"model"</span>: <span class="hljs-string">"gpt-4.1-nano-2025-04-14"</span>,
                <span class="hljs-string">"messages"</span>: [
                    {
                        <span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>,
                        <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are an email summarizer. Summarize this email in 2–3 sentences. Make sure to include all important pointers in the email."</span>
                    },
                    {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}
                ]
            },
            <span class="hljs-string">"custom_id"</span>: <span class="hljs-string">f"email-<span class="hljs-subst">{i}</span>"</span>
        }
        f.write(json.dumps(obj) + <span class="hljs-string">"\n"</span>)
</code></pre>
<hr />
<h3 id="heading-4-upload-the-input-file-to-openai">4. Upload the input file to OpenAI</h3>
<pre><code class="lang-python">batch_input_file = client.files.create(
    file=open(<span class="hljs-string">"openai_calls/batch_input.jsonl"</span>, <span class="hljs-string">"rb"</span>),
    purpose=<span class="hljs-string">"batch"</span>
)
batch_input_file_id = batch_input_file.id
print(<span class="hljs-string">f"Uploaded batch file ID: <span class="hljs-subst">{batch_input_file_id}</span>"</span>)
</code></pre>
<hr />
<h3 id="heading-5-create-the-batch-request">5. Create the batch request</h3>
<pre><code class="lang-python">batch = client.batches.create(
    input_file_id=batch_input_file_id,
    endpoint=<span class="hljs-string">"/v1/chat/completions"</span>,
    completion_window=<span class="hljs-string">"24h"</span>,
    metadata={
        <span class="hljs-string">"description"</span>: <span class="hljs-string">"Summarize CXO email samples for AI blog"</span>
    }
)
print(<span class="hljs-string">"Batch request created:"</span>, batch)
</code></pre>
<p>⚠️ Don’t forget to store <code>batch.id</code> somewhere – you’ll need it to track status and fetch results later.</p>
<hr />
<h2 id="heading-checking-status">🔁 Checking Status</h2>
<pre><code class="lang-python">batch = client.batches.retrieve(batch_id=<span class="hljs-string">"your_batch_id_here"</span>)
print(batch.status)
</code></pre>
<p>Common statuses:</p>
<ul>
<li><p><code>validating</code></p>
</li>
<li><p><code>in_progress</code></p>
</li>
<li><p><code>completed</code></p>
</li>
<li><p><code>failed</code></p>
</li>
</ul>
<p>You can also list all batches:</p>
<pre><code class="lang-python">batches = client.batches.list()
<span class="hljs-keyword">for</span> b <span class="hljs-keyword">in</span> batches:
    print(b.id, b.status)
</code></pre>
<hr />
<h2 id="heading-when-should-you-use-batch-apis">🔎 When Should You Use Batch APIs?</h2>
<p>Here’s a quick cheat sheet:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Use Case</td><td>Good for Batch?</td></tr>
</thead>
<tbody>
<tr>
<td>Email summarization</td><td>✅</td></tr>
<tr>
<td>Document tagging</td><td>✅</td></tr>
<tr>
<td>Large-scale embedding</td><td>✅</td></tr>
<tr>
<td>Real-time chat</td><td>❌</td></tr>
<tr>
<td>Function calling with rapid response</td><td>❌</td></tr>
</tbody>
</table>
</div><p>Batch APIs shine in background tasks where latency doesn’t matter – but cost, efficiency, and scale do.</p>
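<p>The cost angle is worth spelling out: OpenAI prices batch requests at roughly half the synchronous rate (a ~50% discount at the time of writing – verify on the pricing page). A quick back-of-the-envelope helper, with the per-million-token prices left as placeholders for you to fill in:</p>
<pre><code class="lang-python">def batch_savings(n_requests, in_tokens, out_tokens,
                  in_price_per_m, out_price_per_m, discount=0.5):
    """Return (sync_cost, batch_cost) in dollars.

    Prices are per million tokens; discount=0.5 assumes the ~50%
    Batch API discount - check current rates before relying on this.
    """
    per_request = (in_tokens * in_price_per_m
                   + out_tokens * out_price_per_m) / 1_000_000
    sync_cost = n_requests * per_request
    return sync_cost, sync_cost * (1 - discount)

# e.g. 1,000 summaries at 1,000 input / 500 output tokens each,
# with placeholder prices of $1 / $2 per million tokens:
# batch_savings(1000, 1000, 500, 1.0, 2.0)  # (2.0, 1.0)
</code></pre>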
<hr />
<h2 id="heading-future-ideas">📦 Future Ideas</h2>
<p>I’m thinking of chaining this with:</p>
<ul>
<li><p>Embedding the summaries</p>
</li>
<li><p>Automatic storage into a vector database</p>
</li>
<li><p>Semantic search across summaries</p>
</li>
<li><p>Tag generation (next experiment?)</p>
</li>
</ul>
<hr />
<h2 id="heading-final-thoughts">🗨️ Final Thoughts</h2>
<p>If you’re still sending LLM requests one by one and hitting rate limits – do yourself a favour: batch it.</p>
<p>It’s faster. Cheaper. And built exactly for use cases like summarizing, tagging, classification, etc.</p>
<p>Let me know if you want a GitHub template or help plugging it into your own workflow.</p>
<p>📎 Docs link again (bookmark this): <a target="_blank" href="https://platform.openai.com/docs/guides/batch?lang=python">OpenAI Batch API Guide</a></p>
<p>📎 Codebase: <a target="_blank" href="https://github.com/Sirsho29/ai_blog_codebase">Github</a></p>
<p>📬 Got questions? DM me or drop a comment. Always happy to debug, rant, or batch together.</p>
]]></content:encoded></item><item><title><![CDATA[Enough theory; let's get our hands dirty!]]></title><description><![CDATA[Alright, fellow Gemini explorers, buckle up! In my last blog, we scratched the surface of the amazing things you can do with Vertex AI.
But enough theory, right? Let's dive into the good stuff: actually using the Gemini APIs with Python and the ever-...]]></description><link>https://blogs.sirsho.xyz/enough-theory-lets-get-our-hands-dirty</link><guid isPermaLink="true">https://blogs.sirsho.xyz/enough-theory-lets-get-our-hands-dirty</guid><category><![CDATA[AI]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[gemini]]></category><category><![CDATA[langchain]]></category><category><![CDATA[Google]]></category><category><![CDATA[Vertex-AI]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Fri, 16 May 2025 15:28:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/1xE5QnNXJH0/upload/cc3a460f3d0744aff5bde8306fe8db6b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Alright, fellow Gemini explorers, buckle up! In my last blog, we scratched the surface of the amazing things you can do with Vertex AI.</p>
<p>But enough theory, right? Let's dive into the good stuff: actually <em>using</em> the Gemini APIs with Python and the ever-so-handy Langchain.</p>
<h3 id="heading-whats-in-your-toolkit">What's in Your Toolkit?</h3>
<p>Before we embark on this coding adventure, make sure you have a few essentials:</p>
<ol>
<li><p><strong>A Python Virtual Environment:</strong> This is like creating a neat little sandbox for our project, keeping all our specific tools (packages) in one place without messing with your main Python setup. If you haven't got one, setting it up is a breeze. Most Python installations come with <code>venv</code>. Just navigate to your project directory in your terminal and type:</p>
<pre><code class="lang-bash"> python -m venv gemini_blog_env
</code></pre>
<p> And to activate it:</p>
<ul>
<li><p>On macOS and Linux: <code>source gemini_blog_env/bin/activate</code></p>
</li>
<li><p>On Windows: <code>.\gemini_blog_env\Scripts\activate</code> You'll know it's active when you see your environment's name in the terminal prompt.</p>
</li>
</ul>
</li>
<li><p><strong>A Few Key Packages:</strong> We'll need to invite some friends to our coding party. The main guests are:</p>
<ul>
<li><p><a target="_blank" href="https://pypi.org/project/langchain-google-genai/"><code>langchain-google-genai</code></a>: This is the star player, allowing Langchain to talk to Google's Gemini models.</p>
</li>
<li><p><a target="_blank" href="https://pypi.org/project/google-genai/"><code>google-genai</code></a>: The official Google AI Python SDK.</p>
</li>
<li><p><a target="_blank" href="https://pypi.org/search/?q=python-dotenv"><code>python-dotenv</code></a> (optional but recommended): Super useful for managing your precious API key without hardcoding it.</p>
</li>
</ul>
</li>
<li><p><strong>Your Gemini API Key:</strong> This is your golden ticket to access the Gemini models. You can grab one from Google AI Studio. Keep it secret, keep it safe!</p>
</li>
</ol>
<h3 id="heading-lets-get-installing">Let's Get Installing!</h3>
<p>Assuming your virtual environment is up and running (you'll see its name in your terminal prompt), let's install those packages. Open your terminal and type:</p>
<pre><code class="lang-bash">pip install langchain langchain-google-genai google-genai python-dotenv
</code></pre>
<p>Pip, Python's package installer, will fetch and install everything for you.</p>
<h3 id="heading-time-to-write-some-actual-code-the-exciting-part">Time to Write Some Actual Code! (The Exciting Part!)</h3>
<p>Alright, the stage is set. Let's get Langchain and Gemini to chat.</p>
<p>First, if you're using <code>python-dotenv</code> (which I highly recommend for keeping your API key secure), create a file named <code>.env</code> in your project directory and add your API key like this:</p>
<pre><code class="lang-plaintext">GOOGLE_API_KEY="YOUR_SUPER_SECRET_API_KEY_HERE"
</code></pre>
<p>Now, for the Python magic. Create a Python file (e.g., <code>gemini_chat.py</code>) and let's get coding:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">from</span> langchain_google_genai <span class="hljs-keyword">import</span> ChatGoogleGenerativeAI
<span class="hljs-keyword">from</span> langchain.schema <span class="hljs-keyword">import</span> HumanMessage, SystemMessage

<span class="hljs-comment"># Load environment variables from .env file</span>
load_dotenv()

<span class="hljs-comment"># Securely get your API key (optional if you set it directly)</span>
<span class="hljs-comment"># Make sure your GOOGLE_API_KEY is set in your environment or .env file</span>
google_api_key = os.getenv(<span class="hljs-string">"GOOGLE_API_KEY"</span>)
<span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> google_api_key:
    <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"GOOGLE_API_KEY not found in environment variables."</span>)

<span class="hljs-comment"># Initialize the Gemini LLM with Langchain</span>
<span class="hljs-comment"># You can choose different models like "gemini-2.0-flash" etc.</span>
<span class="hljs-comment"># Check the Google AI documentation for the latest model names and capabilities.</span>
llm = ChatGoogleGenerativeAI(model=<span class="hljs-string">"gemini-2.5-flash-preview-04-17"</span>, google_api_key=google_api_key)

<span class="hljs-comment"># Define our roles with System and Human messages</span>
system_prompt_text = <span class="hljs-string">"""I am writing a series on Learning Gemini in form of blogs.
I am writing these blogs while I am learning myself.
You are an expert in using Python, Langchain and Gemini APIs.
Help me write blogs on topics that I give."""</span>

user_prompt_text = <span class="hljs-string">"Write a short blurb on Google Gemini."</span>

<span class="hljs-comment"># Create the messages</span>
messages = [
    SystemMessage(content=system_prompt_text),
    HumanMessage(content=user_prompt_text)
]

<span class="hljs-comment"># Let's get the response!</span>
response = llm.invoke(messages)

print(<span class="hljs-string">"Assistant's Response:"</span>)
print(response.content)
</code></pre>
<p>Run this script from your activated virtual environment: <code>python gemini_chat.py</code></p>
<p>And voila! You should see Gemini, guided by your system prompt, generating a blurb about itself. Talk about meta!</p>
<h3 id="heading-hold-on-isnt-this-recursion-deja-vu">Hold on, isn't this recursion? Deja Vu!</h3>
<p>You got me! Asking an AI that I'm learning about to help me write a blog about learning that AI... but don't you worry your human heads; I'm still the one typing these blogs out, adding my own (questionable) humour and insights. No infinite AI loops here... yet! 😉</p>
<h3 id="heading-lets-get-streamy-implementing-streaming-responses">Let's Get Streamy: Implementing Streaming Responses</h3>
<p>Sometimes, you don't want to wait for the whole answer to generate. You want it to flow, like a good conversation. Langchain and Gemini support streaming responses beautifully.</p>
<p>Here's how you can modify the code to get a streaming response:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">from</span> langchain_google_genai <span class="hljs-keyword">import</span> ChatGoogleGenerativeAI
<span class="hljs-keyword">from</span> langchain.schema <span class="hljs-keyword">import</span> HumanMessage, SystemMessage

<span class="hljs-comment"># Load environment variables from .env file</span>
load_dotenv()

google_api_key = os.getenv(<span class="hljs-string">"GOOGLE_API_KEY"</span>)
<span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> google_api_key:
    <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"GOOGLE_API_KEY not found in environment variables."</span>)

llm = ChatGoogleGenerativeAI(model=<span class="hljs-string">"gemini-2.5-flash-preview-04-17"</span>, google_api_key=google_api_key) <span class="hljs-comment"># No constructor flag needed - calling .stream() below is what triggers streaming</span>

system_prompt_text = <span class="hljs-string">"""I am writing a series on Learning Gemini in form of blogs.
I am writing these blogs while I am learning myself.
You are an expert in using Python, Langchain and Gemini APIs.
Help me write blogs on topics that I give."""</span>

user_prompt_text = <span class="hljs-string">"Write a short blurb on Google Gemini, and make it snappy!"</span>

messages = [
    SystemMessage(content=system_prompt_text),
    HumanMessage(content=user_prompt_text)
]

print(<span class="hljs-string">"Assistant's Streaming Response:"</span>)
<span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> llm.stream(messages):
    print(chunk.content, end=<span class="hljs-string">""</span>, flush=<span class="hljs-literal">True</span>)
print() <span class="hljs-comment"># For a new line at the end</span>
</code></pre>
<p>When you run this, you'll see the response appear chunk by chunk, which is pretty neat for more interactive applications.</p>
<p>And there you have it! Our first foray into coding with Gemini and Langchain. We've set up our environment, installed the necessary tools, had a (slightly recursive) chat with Gemini, and even made it stream its wisdom.</p>
<p>Adding the GitHub repo as well: <a target="_blank" href="https://github.com/Sirsho29/gemini_blog">https://github.com/Sirsho29/gemini_blog</a></p>
<p>Stay tuned for the next blog, where we'll dive deeper into more advanced features. Until then, happy coding, and don't let the AIs write <em>all</em> your content!</p>
]]></content:encoded></item><item><title><![CDATA[My First Week with Vertex AI & Gemini]]></title><description><![CDATA[First things first:If you’re wondering where to begin —👉 Head to Google Cloud Console → Vertex AI

Over the last few days, I’ve been diving into Gemini and Vertex AI — trying to figure out what’s changed, what’s possible, and how to actually use thi...]]></description><link>https://blogs.sirsho.xyz/my-first-week-with-vertex-ai-and-gemini</link><guid isPermaLink="true">https://blogs.sirsho.xyz/my-first-week-with-vertex-ai-and-gemini</guid><category><![CDATA[gemini]]></category><category><![CDATA[Google]]></category><category><![CDATA[GCP]]></category><category><![CDATA[google cloud]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Thu, 15 May 2025 05:14:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747285828316/5d80b3cb-af24-4e2f-a517-64c3e7d65738.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>First things first:</strong><br />If you’re wondering where to begin —<br />👉 <a target="_blank" href="https://console.cloud.google.com/vertex-ai/dashboard">Head to Google Cloud Console → Vertex AI</a></p>
<hr />
<p>Over the last few days, I’ve been diving into <strong>Gemini and Vertex AI</strong> — trying to figure out what’s changed, what’s possible, and how to actually use this stack to build useful stuff.</p>
<p>Spoiler: A lot has changed.<br />I started out watching some YouTube videos on fine-tuning Gemini, and half the UI references were already outdated. Turns out, <strong>Google AI Studio</strong> (which used to handle this) now mostly focuses on chat + API keys, while <strong>Vertex AI</strong> has become the main control room.</p>
<p>So, this blog is the <strong>first in a series</strong> I’ll be writing while I learn. I’m not an expert — just documenting what I’m trying, what’s working, and what’s breaking. If you’re building with Gemini or exploring Vertex AI, you might find this helpful (or at least relatable).</p>
<hr />
<h2 id="heading-step-zero-setup">✅ Step Zero: Setup</h2>
<p>Once you're inside the Vertex AI dashboard, the first thing you need to do is click<br /><strong>“Enable all recommended APIs.”</strong><br />No fancy config required. That one click gets the engine running.</p>
<hr />
<h2 id="heading-what-youll-find-inside-vertex-ai-as-of-now">🧭 What You’ll Find Inside Vertex AI (as of now)</h2>
<p>Here’s a quick overview of what I’ve explored so far. I’ll be digging into each of these in later posts — this is just a surface-level map to orient myself (and maybe you too).</p>
<hr />
<h3 id="heading-model-garden">🌱 <strong>Model Garden</strong></h3>
<p>This is where most of your playing around will start.</p>
<p>You get access to:</p>
<ul>
<li><p><strong>Gemini (of course)</strong></p>
</li>
<li><p><strong>Claude (Anthropic)</strong></p>
</li>
<li><p><strong>LLaMA models</strong></p>
</li>
<li><p><strong>DeepSeek</strong></p>
</li>
<li><p>And a ton of open models from <strong>HuggingFace</strong></p>
</li>
<li><p>Sadly, no OpenAI (they are “non-profits”)</p>
</li>
</ul>
<p>You can try these out in the UI or deploy them into your own GCP setup.<br />⚠️ Just be careful with deployments — these models eat credits for breakfast.</p>
<hr />
<h3 id="heading-prompt-management">🧠 <strong>Prompt Management</strong></h3>
<p>This one’s underrated.</p>
<p>It’s like a <strong>CMS for prompts</strong> — super useful if you’re experimenting a lot.<br />What it helps with:</p>
<ul>
<li><p><strong>Version control</strong> for your prompts</p>
</li>
<li><p><strong>Decoupling prompts from your code</strong></p>
</li>
<li><p><strong>Performance tracking</strong> of different versions</p>
</li>
<li><p>And clean <strong>integration into your stack</strong></p>
</li>
</ul>
<hr />
<h3 id="heading-prompt-gallery">📚 <strong>Prompt Gallery</strong></h3>
<p>This feels like a little idea board — a public collection of prompts to:</p>
<ul>
<li><p>Learn better prompt writing</p>
</li>
<li><p>Find working examples</p>
</li>
<li><p>Share your own experiments</p>
</li>
</ul>
<p>It’s basically a way to <strong>avoid reinventing the wheel</strong>, especially when you’re stuck.</p>
<hr />
<h3 id="heading-finetuning">🛠️ <strong>Finetuning</strong></h3>
<p>This is where things get deeper.<br />You can <strong>actually fine-tune models</strong>, including Gemini, by feeding in your own data or instructions. I’m still exploring this — but the idea is to shape the model’s behaviour beyond just writing clever prompts.</p>
<p>Expect a detailed post on this soon – I’m working on it right now!</p>
<hr />
<h3 id="heading-agent-garden">🤖 <strong>Agent Garden</strong></h3>
<p>Think of this as a <strong>library of pre-built agents</strong> and toolkits.<br />Great for inspiration or if you want to fast-track a prototype without starting from zero.</p>
<hr />
<h3 id="heading-agent-engine">⚙️ <strong>Agent Engine</strong></h3>
<p>Once you’ve built your agent, this is where you deploy and manage it.</p>
<p>You get:</p>
<ul>
<li><p>Fully managed infra</p>
</li>
<li><p>Built-in testing + monitoring</p>
</li>
<li><p>Support for any framework you use</p>
</li>
</ul>
<p>Basically, this takes care of the backend mess so you can focus on logic and UX.</p>
<hr />
<h3 id="heading-datasets">🧾 <strong>Datasets</strong></h3>
<p>Vertex AI also provides a <strong>managed dataset service</strong>, so you can store, access, and train on your data directly inside GCP. No bucket shuffling or permission hell.</p>
<hr />
<h3 id="heading-model-development">🧪 <strong>Model Development</strong></h3>
<p>This covers the full journey — from:</p>
<ol>
<li><p>Defining your problem</p>
</li>
<li><p>Prepping data</p>
</li>
<li><p>Training + evaluating</p>
</li>
<li><p>Deploying your model</p>
</li>
</ol>
<p>Whether you’re building an agent, a classifier, or something niche — this is where the actual ML workflow lives.</p>
<hr />
<h2 id="heading-whats-next">💡 What’s Next?</h2>
<p>This blog was just me getting familiar with the Vertex AI ecosystem.<br />I’ll be learning <strong>finetuning, prompt design, agent workflows, and Gemini-specific tricks</strong> next — and documenting all of it as I go.</p>
<p>If you’re exploring this space too, feel free to build alongside. Ping me if you get stuck or figure out something I haven’t covered yet — I would love to include it in future posts.</p>
<hr />
<h3 id="heading-follow-the-series">Follow the Series →</h3>
<p>I’ll be publishing everything here on Hashnode as I go. No fluff — just real-time learning, mistakes, and progress.</p>
<p>And maybe at the end of this series, we’ll both have built something cool.</p>
]]></content:encoded></item><item><title><![CDATA[Quick AI Model Cost Estimator]]></title><description><![CDATA[Like many of you building with LLMs, I often found myself jumping between multiple documentation pages just to figure out how much a certain query would cost across different models.
And let’s be honest — no one has time to memorize OpenAI’s per-mill...]]></description><link>https://blogs.sirsho.xyz/quick-ai-model-cost-estimator</link><guid isPermaLink="true">https://blogs.sirsho.xyz/quick-ai-model-cost-estimator</guid><category><![CDATA[#ai-tools]]></category><category><![CDATA[aitools]]></category><category><![CDATA[AI]]></category><category><![CDATA[openai]]></category><category><![CDATA[#anthropic]]></category><category><![CDATA[claude.ai]]></category><category><![CDATA[gemini]]></category><category><![CDATA[Deepseek]]></category><category><![CDATA[chatgpt]]></category><dc:creator><![CDATA[Sirsho Chakraborty]]></dc:creator><pubDate>Thu, 15 May 2025 04:23:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747282803601/6072846e-4820-4c17-8849-08cd1f3dacfd.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Like many of you building with LLMs, I often found myself jumping between multiple documentation pages just to figure out <strong>how much a certain query would cost</strong> across different models.</p>
<p>And let’s be honest — no one has time to memorize OpenAI’s per-million token costs, compare them with Anthropic, DeepSeek, or Gemini, and then mentally compute costs based on input/output tokens and query types.</p>
<p>So… I built myself a <strong>simple cost calculator</strong> that does exactly what I need:<br />📍 <strong>Give me an approximate cost of running a specific type of query on a selected model.</strong></p>
<hr />
<h2 id="heading-why-i-built-it">💡 Why I Built It</h2>
<p>I was repeatedly:</p>
<ul>
<li><p>Searching for the latest OpenAI pricing</p>
</li>
<li><p>Comparing it with Claude or Gemini</p>
</li>
<li><p>Trying to remember if 75 words = 100 tokens or the other way around</p>
</li>
<li><p>Doing math in my head or a notepad every time I needed to estimate costs</p>
</li>
</ul>
<p>It got old, fast.</p>
<p>So I made a <strong>small spreadsheet-based calculator</strong> that lets me:</p>
<p>✅ Pick a <strong>query type</strong> (normal, research, function calling, or custom)<br />✅ Choose a <strong>model</strong> from OpenAI, Anthropic, Google, or DeepSeek<br />✅ Instantly get a <strong>final estimated cost</strong> for a fixed number of queries (default: 10)</p>
<hr />
<h2 id="heading-what-it-does">🧮 What It Does</h2>
<p>Behind the scenes, the calculator:</p>
<ul>
<li><p>Uses a standard 75 words = 100 tokens conversion</p>
</li>
<li><p>Maps query types to <strong>average input/output token counts</strong></p>
</li>
<li><p>Pulls in model-specific <strong>cost per million tokens</strong></p>
</li>
<li><p>Computes the total cost for the number of queries selected</p>
</li>
</ul>
<p>And that’s it. <strong>Simple, fast, and surprisingly handy.</strong></p>
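<p>If spreadsheets aren’t your thing, the same math fits in a few lines of Python. A sketch using the 75-words-per-100-tokens rule above – the per-million prices and word counts are placeholders for whatever model and query type you’re estimating:</p>
<pre><code class="lang-python">WORDS_PER_100_TOKENS = 75  # the sheet's "75 words = 100 tokens" rule of thumb

def words_to_tokens(words):
    return words * 100 / WORDS_PER_100_TOKENS

def estimate_cost(in_words, out_words, in_price_per_m,
                  out_price_per_m, n_queries=10):
    """Approximate dollar cost for n_queries (default 10, as in the sheet).

    Prices are per million tokens - plug in the current numbers from
    each provider's pricing page.
    """
    in_tokens = words_to_tokens(in_words)
    out_tokens = words_to_tokens(out_words)
    per_query = (in_tokens * in_price_per_m
                 + out_tokens * out_price_per_m) / 1_000_000
    return n_queries * per_query
</code></pre>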
<hr />
<h2 id="heading-whats-inside">🔧 What's Inside?</h2>
<p>Here’s a peek at what powers the tool:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Purpose</td></tr>
</thead>
<tbody>
<tr>
<td>Query Type</td><td>Sets average input/output tokens</td></tr>
<tr>
<td>Model Selection</td><td>Pulls cost per million tokens</td></tr>
<tr>
<td>Token Math</td><td>Computes cost per query type/model</td></tr>
<tr>
<td>Final Cost</td><td>Combines everything × number of queries</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-why-approximate">📌 Why Approximate?</h2>
<p>This tool isn’t meant to be 100% accurate to the last decimal — it’s designed for:</p>
<ul>
<li><p><strong>Quick ballparks</strong> during architecture discussions</p>
</li>
<li><p><strong>Budget estimates</strong> before production usage</p>
</li>
<li><p><strong>Cost comparisons</strong> across providers/models</p>
</li>
</ul>
<p>In real-world use, actual token counts vary due to system messages, model verbosity, and temperature settings. But this gives you a <strong>solid directional cost estimate</strong>.</p>
<hr />
<h2 id="heading-whats-next">🛠️ What's Next?</h2>
<p>I'm thinking of extending it with:</p>
<ul>
<li><p>Support for embedding models</p>
</li>
<li><p>Somehow factoring in cached-token pricing</p>
</li>
<li><p>Batch API estimations</p>
</li>
<li><p>Adjustable verbosity (to estimate more or fewer output tokens)</p>
</li>
<li><p><strong>Overhead token buffers</strong> for function calling</p>
</li>
</ul>
<p>Let me know if that would be useful — or if you want a copy to use or contribute to!</p>
<hr />
<h2 id="heading-want-to-try-it">🔗 Want to Try It?</h2>
<p><a target="_blank" href="https://docs.google.com/spreadsheets/d/18-TPKGPVEiYMH5I0dt9YMRQuL_--mqE-HbdxtW4vkzc/edit?gid=0#gid=0">https://docs.google.com/spreadsheets/d/18-TPKGPVEiYMH5I0dt9YMRQuL_--mqE-HbdxtW4vkzc/edit?gid=0#gid=0</a></p>
<p>Any feedback? Just drop a comment or DM me on Twitter/X.</p>
<hr />
<p>💬 Ever built your own utility out of frustration? Share your tool or workflow below!</p>
]]></content:encoded></item></channel></rss>