Thoughts on software engineering, AI, side projects, and the things I learn along the way.
A rough guide to what nats actually mean in LLM training loss, because I kept seeing this unit everywhere and finally decided to understand it.
I spent a day fine-tuning Qwen3-VL-2B for PDF-to-markdown conversion using SFT + GRPO. Total cost: $4. Here's what worked, what didn't, and why GRPO alone fails on vision models.
Three weeks of running Claude 24/7 taught me how to make coding agents actually work: verification loops, team standards, and the right tooling setup.
Scattered thoughts on the nature of trust in LLMs, how context gives meaning, and language as a facilitator of senses.
Striving for the best in the age of Gen AI, and preparing for the unknown future.
Notes on the DeepSeek-R1 paper — how pure reinforcement learning with GRPO enables emergent reasoning in LLMs without supervised data.