Engineering notes.
Thoughts on software, hardware, and building things that scale.
The eGPU Myth: Why a ~$300 Dock Won't Turn Your GPU Into an AI Workstation
We benchmarked an RTX 3090 over USB4. The engineering is brilliant. The numbers aren't there yet.
Local LLM Inference: The 24GB VRAM Wall
Why the 24GB VRAM limit is the hardest bottleneck in consumer AI, and how quantization techniques try to work around it.
RTX 3090 vs 4090: A Deep Dive for AI Researchers
Is the 4090 worth the premium for local inference, or is the 3090 still the reigning champion of value?