Agents vs Sub-Agents: Should You Care?
Claude Chat and Claude Code use the same model but serve completely different purposes — one keeps your entire conversation history, the other spins up with just a repo and leaves when done. The real power of sub-agents is context isolation: a research specialist can read ten papers without polluting a writer's draft with GPU specs and citation noise. At scale, this matters — a single agent that has recently debugged Python, planned a trip, and written a LinkedIn post will subtly contaminate every future task with leftover context.
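The isolation idea above can be sketched in a few lines. This is a toy model, not any real agent framework: `Agent`, `delegate`, and the string "context" are all hypothetical stand-ins for an LLM call plus its accumulated history.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """A toy agent whose 'memory' is just a list of context entries."""
    context: list = field(default_factory=list)

    def run(self, task: str) -> str:
        # A real agent would call an LLM here; we just record the mess it makes.
        self.context.append(f"task: {task}")
        self.context.append("...intermediate notes, tool outputs, citation noise...")
        return f"summary of: {task}"

def delegate(parent: Agent, task: str) -> str:
    """Context isolation: the sub-agent starts empty, and only its
    final answer flows back into the parent's context."""
    sub = Agent()                   # fresh context, no inherited history
    result = sub.run(task)         # the sub-agent accumulates its own noise
    parent.context.append(result)  # only the distilled result survives
    return result

parent = Agent(context=["draft of blog post"])
delegate(parent, "read ten papers on GPU memory bandwidth")
print(len(parent.context))  # 2: the draft plus one summary, no research debris
```

The design choice is the one the post describes: the sub-agent's working context is thrown away on return, so the writer agent never sees the GPU specs.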
The AI 5-Layer Cake
Jensen Huang's AI stack — Energy, Chips, Infrastructure, Models, Applications — is usually discussed in trillion-dollar data center terms, but the same framework reveals exactly where personal AI setups bleed money and lose freedom. Running six agents on a Mac Mini M4 at 15W ($1.40/month in electricity) versus renting through Claude or ChatGPT exposes the real cost of letting one company own the whole stack: you pay hidden margins on every layer and cannot leave because they own your workflow, your prompts, and your data. The composable alternative — owning your application layer in OpenClaw, mixing local inference (60% of workload at $0) with cloud APIs only when needed — cuts costs by 77% while giving you model portability and data sovereignty that no closed platform can match.
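The $1.40/month figure is simple arithmetic. Here is the back-of-the-envelope calculation, assuming a roughly $0.13/kWh electricity rate (rates vary by region; the wattage is the post's own 15W figure):

```python
watts = 15                            # Mac Mini M4 drawing ~15W around the clock
hours_per_month = 24 * 30
kwh = watts / 1000 * hours_per_month  # 10.8 kWh per month
rate = 0.13                           # assumed $/kWh; adjust for your utility
cost = kwh * rate
print(f"${cost:.2f}/month")           # → $1.40/month
```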
Saving with Model Tiering and Token Optimization
Running six agents with heartbeats every 30 minutes and daily cron jobs meant hitting Anthropic's usage cap twice in one month — the default of 'always use the smartest model' doesn't scale. The fix is three levers: model tiering, where GLM 5.1 handles routine summarization at 3-5x cheaper rates while Sonnet and GPT-5.4 serve as fallbacks for complex reasoning; token optimization, trimming context so you're not paying for 10,000 tokens of workspace files on every single message; and call frequency, questioning whether every agent really needs to check in every 30 minutes. The result was a significantly lower bill with zero loss in actual capability.
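The tiering lever boils down to a cheapest-capable-model router. A minimal sketch of that idea — the task categories, model identifiers, and prices here are illustrative placeholders, not real API names or rates:

```python
# Tiers ordered cheapest-first; names and prices are hypothetical.
TIERS = [
    {"model": "glm-5.1", "usd_per_mtok": 0.5, "handles": {"summarize", "classify", "extract"}},
    {"model": "sonnet",  "usd_per_mtok": 3.0, "handles": {"code", "plan"}},
    {"model": "gpt-5.4", "usd_per_mtok": 5.0, "handles": {"research", "hard-debug"}},
]

def route(task_kind: str) -> str:
    """Pick the cheapest tier that claims the task; unknown or novel
    tasks fall through to the top tier as the reasoning fallback."""
    for tier in TIERS:
        if task_kind in tier["handles"]:
            return tier["model"]
    return TIERS[-1]["model"]

print(route("summarize"))    # → glm-5.1 (routine work stays cheap)
print(route("novel-proof"))  # → gpt-5.4 (fallback for the unclassified)
```

The savings come from the ordering: routine, high-volume tasks never reach the expensive models unless they fail to match a cheap tier.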
Why I Stopped Typing to My AI Agents
Switching from typing to voice input fundamentally changed how I build context with AI agents — not because dictation is faster, but because voice naturally produces richer, more textured prompts without the friction of composition. The key insight is that LLMs perform best with more words and more context, not with beautifully structured prompts; voice delivers that effortlessly. The secondary lesson is 'plan first, command second' — build the full picture through conversation before letting an agent execute, something voice makes natural and typing makes a chore.
Avoiding the road not taken - my OpenClaw setup journey
I tried WSL2 on a remote Windows desktop first — it worked, but the security boundaries were blurry and debugging meant guessing whether the issue was Windows, WSL2, Ubuntu, or OpenClaw. Switching to a dedicated Mac Mini with a separate standard user account gave me clean access controls, working GUI tools, and an environment I could actually reason about. The simpler setup won not because it had more features, but because I could explain to myself exactly what the agent could and couldn't touch.
How I Gave My AI Setup a Real Memory
Running six specialized agents 24/7 revealed two hard limits of LLMs: conversations degrade once they hit context limits, and every session starts from zero with no memory of yesterday's insights. I fixed both with Lossless Claw, which compresses older conversation turns into structured summaries so context never silently drops, and an Obsidian Knowledge Wiki where agents incrementally compile persistent knowledge following a shared schema. The result is agents that remember your preferences from last week and can connect insights across months instead of re-deriving everything from scratch.
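The compression half of the fix can be sketched as a rolling window: keep the most recent turns verbatim and collapse everything older into one structured summary, so context stays bounded instead of silently truncating. This is only an illustration of the idea, not the actual Lossless Claw implementation; `summarize` stands in for a real model call.

```python
def summarize(turns: list[str]) -> str:
    """Placeholder for an LLM call that compresses turns into a summary."""
    return f"[summary of {len(turns)} earlier turns]"

def compress(history: list[str], keep_recent: int = 4) -> list[str]:
    """Collapse all but the last keep_recent turns into one summary entry,
    so the oldest context is compressed rather than dropped."""
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent

history = [f"turn {i}" for i in range(10)]
compressed = compress(history)
print(compressed[0])    # → [summary of 6 earlier turns]
print(len(compressed))  # → 5
```

Run again on the next overflow, the earlier summary itself gets folded into the new one, which is what keeps month-old insights reachable.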
Notes on tech, design, and AI.
These are my notes — like a semi-frequently updated journal. Everything I write is born out of empirical learnings and everyday observations. There's obviously bias in these notes, but since honest personal accounts from people in tech are rare, I hope this serves as a useful one.