Assah Bismark

AI

Stabilize Claude Code for Open-Weight Models

A three-layer proxy stack to keep Claude Code stable when routing through LiteLLM to open-weight models.
Running Claude Code with open-weight models like DeepSeek or Qwen through a LiteLLM proxy works. Until it doesn't. The port changes every time LiteLLM restarts because it picks a random dynamic port. The request payload grows until backends reject it with 400 errors. Usage stats come back null and t...More ›

Factory Droid with Any OpenAI-Compatible Endpoint

Use LiteLLM as a local proxy to run Factory droid with any OpenAI-compatible model endpoint.
Factory droid supports custom models through BYOK (Bring Your Own Key), but environment variables don't work for the interactive CLI or desktop app. The droid daemon has its own API client and ignores and env vars. The working path is adding your model to the array in and pointing it at a local ...More ›

Hacker News Is High Friction

HN is the best signal source in tech. Reading it shouldn't cost an hour.
Hacker News is where the best technical conversations happen. When a new database gets traction, the thread on HN is where the people who actually built competing systems show up and explain the tradeoffs. When a paper drops, the comments are often better than the paper. The signal is real and it's ...More ›

Context Is What Makes Code Review Work

AI code review is a systems problem, not a prompt problem. Here is how I built Snif to stay useful.
I spent some time researching every AI code review tool I could find: CodeRabbit, Greptile, Qodo Merge, Copilot, and half a dozen smaller ones. The pattern seems to be the same everywhere: teams install these tools, get excited about the feedback they provide, then disable them after some t...More ›

Using Claude Code with any model

Claude Code's toolset works with non-Anthropic models through a local proxy. Here's the setup, the bug, and the tradeoffs.
Claude Code has the best agentic coding toolset available right now. File editing with diffs, bash execution, grep, glob, MCP server integration, subagents, plan mode. It runs in your terminal, reads your entire codebase, and orchestrates multi-step changes autonomously. It only speaks Anthropic Mes...More ›

AI Shipped Faster Than Anyone Can Verify

AI systems ship at exponential speed. The governance and security infrastructure around them is still moving at human speed. That mismatch is the actual problem.
I've been tracking what's actually happening in the AI space day to day. Not product launches or funding rounds. The pain. What's breaking, what's failing, what practitioners are complaining about. The same five problems keep showing up. Agent overload. AI-generated code debt. OSS maintainer burnout...More ›

AI coding agents and the cloud provider gap

Desktop AI coding agents don't work with cloud provider auth. A local proxy fixes all of them at once.
The current generation of AI coding agents ships with a shared assumption: you have an API key from a model provider, and you paste it into a settings field. That works if you pay for a direct subscription. It does not work if you access models through a cloud provider like AWS Bedrock, Google Verte...More ›

Architecture Is the New Product

AI writes the code now. The architecture is what actually ships.
What do software teams actually produce? Most people say features. Ship features, hit deadlines, move the roadmap. For a long time that was close enough. But now AI can generate features. It can scaffold services, write CRUD endpoints, wire up frontends, produce tests. The code itself is approaching...More ›

Taste, Judgment, and the Thing AI Cannot Do

AI handles the logic. But logic was never the hard part.
AI has been around since the very beginning of computers. This is not some new phenomenon. Computing has steadily evolved from binary code and assembly language to the high-level languages we use today, all of it trying to bridge the gap between what a human wants and what a machine can execute. Pre...More ›

There Is No AI Thinking (And You Can't Outsource It)

The machines got faster. The real question is whether we got lazier.
We have officially entered the era of Inference-at-Scale. The marketing hype surrounding "Artificial Intelligence" has never been louder, bolstered by the deployment of NVIDIA's Rubin architecture and the rise of Agentic Ecosystems. These systems...More ›

The Infinite Software Crisis

What Happens When AI Writes Faster Than We Can Think
There's been a lot of talk lately about what's being called the "Infinite Software Crisis". It's not a new observation - people have been warning about this for a while - but it's been circulating more in engineering ...More ›

Solving the Context Problem - A Local RAG System for Code

Building Local Semantic Code Search - 11ms Queries Across 7,620 Files
Yesterday I was using Kilocode to help refactor some authentication code in a 7,620-file Java codebase. The AI agent kept giving me generic advice because it couldn't see the actual implementation patterns in the project or some constraints. I...More ›