I put LLMs in production and build the full-stack product around them: serving, MLOps/LLMOps, evals, observability. I founded and run Magoquiz, a profitable B2B recommendation-quiz SaaS for LATAM e-commerce, which I built solo while finishing my CS degree. It has 46 paying customers and 128k quiz-takers.
Inside it I shipped a self-hosted LLM scoring service that costs about 200x less per call than the hosted API. I distilled a Sonnet judge into a fine-tuned Llama 3.1 8B with LoRA/QLoRA on Modal, then compiled it to a TensorRT-LLM engine served behind NVIDIA Triton on a scale-to-zero GPU, at roughly $0.0003 per score. Around the models I run the LLMOps that keeps them honest: a CI gate that uses an LLM-as-judge to block regressions, a W&B model registry, and a daily drift monitor over Langfuse traces.
I'm a full-stack engineer too. One integration (OAuth plus webhooks) drove 47x ROI and a 236% conversion lift in 10 days, and the platform has served 30k requests a day for 18 months with no customer-facing outages. Before Magoquiz I spent three years at SYDLE: multi-tenant rewrites, a 60k-page SEO platform that pulled over a million organic impressions a month, and a year as the technical point of contact between engineering, product, design, and execs.