Thoughts on software, hardware, and building things that scale.
How we matched Apple Silicon efficiency on an older RTX 3090 by aggressively fusing kernels.