Building SLM: Small Models, Big Impact

On a Saturday morning in April, 94 developers, product managers, founders, and designers packed into Kovan Labs, IndiQube Echo, for a session that made the case for thinking small. The premise: in a landscape obsessed with scale, the most practical AI breakthroughs are happening at the edge.

Setting the Stage

The session opened with a question the room had clearly been sitting with: why reach for a 70-billion-parameter model when a well-tuned 7-billion one can outperform it on your specific task — at a fraction of the cost and latency? Speaker Sanju Sivalingam, founder of dun and ThisUX and co-founder of Unitedby.AI, framed the morning around that tension.

Sanju brought a rare combination of product intuition and technical depth. Before the first slide, he polled the room on how many had deployed a model in production. The answer shaped how he pitched the rest of the session — less theory, more decisions.

The AI Landscape and the SLM Shift

The first segment traced how the industry arrived here. Large language models grabbed the headlines, but the operational realities — GPU bills, latency, data privacy requirements — have quietly pushed practitioners toward compact alternatives. Sanju walked through the architecture differences that matter in practice: how parameter count, quantisation, and context window interact to determine what a model is actually good for.
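One rough way to see how those three factors interact is a back-of-the-envelope memory estimate. The sketch below is not from the session. It assumes a hypothetical 7-billion-parameter model with 32 layers, 32 key/value heads, and a head dimension of 128, and it ignores activations, runtime overhead, and quantisation metadata, so treat the numbers as a floor rather than a measurement.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> float:
    """Memory needed to hold the weights at a given quantisation level."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Memory for the key/value cache of one sequence at full context.

    The leading 2 accounts for one key tensor and one value tensor per layer.
    bytes_per_value=2 assumes the cache is stored in 16-bit precision.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


def estimate_gb(n_params: int, bits_per_weight: int, n_layers: int,
                n_kv_heads: int, head_dim: int, context_len: int) -> float:
    """Weights plus KV cache, in decimal gigabytes."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len
    )
    return total / 1e9


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(n_layers=32, n_kv_heads=32, head_dim=128)
    for bits in (16, 8, 4):
        for ctx in (4096, 32768):
            gb = estimate_gb(7_000_000_000, bits, context_len=ctx, **shape)
            print(f"{bits:>2}-bit weights, {ctx:>5}-token context: ~{gb:.1f} GB")
```

Under these assumptions, dropping from 16-bit to 4-bit weights cuts weight memory by a factor of four, while a longer context window grows the cache linearly, so at long contexts the cache can outweigh the quantised weights.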

The key insight from this segment: model selection isn't a benchmarks exercise. It's a product decision. The right question isn't "which model scores highest?" but "which model fails least badly on my worst-case inputs?"
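That reframing can be made concrete with a small sketch. The model names and per-input scores below are invented for illustration, and the scoring scale (0 to 1, higher is better) is an assumption. The point is only that ranking by the worst case can pick a different model than ranking by the average.

```python
from statistics import mean


def pick_by_mean(results: dict[str, list[float]]) -> str:
    """Leaderboard-style choice: the highest average score wins."""
    return max(results, key=lambda name: mean(results[name]))


def pick_by_worst_case(results: dict[str, list[float]]) -> str:
    """Product-style choice: the model whose weakest input scores highest."""
    return max(results, key=lambda name: min(results[name]))


# Hypothetical scores on three of your own hard inputs.
example_results = {
    "model_a": [0.98, 0.97, 0.40],  # strong on average, one bad failure
    "model_b": [0.80, 0.78, 0.72],  # lower peak, no bad failure
}

if __name__ == "__main__":
    print("best on average:   ", pick_by_mean(example_results))
    print("best in worst case:", pick_by_worst_case(example_results))
```

In practice the scores would come from an evaluation set built from your own difficult inputs rather than from a public benchmark.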

Strategy and Selection

The second segment got tactical. Sanju walked through a framework for evaluating SLMs for real deployments — covering latency budgets, privacy constraints, offline requirements, and cost per inference. He demonstrated how to read benchmarks critically, flagging the gap between leaderboard performance and production behaviour.
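The recap does not record the exact framework, so the following is only a sketch of how those four criteria could be turned into a hard filter. Every field name, threshold, and candidate below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    p95_latency_ms: float              # measured on your own hardware
    cost_per_1k_inferences_usd: float
    runs_offline: bool
    keeps_data_local: bool


@dataclass
class Requirements:
    max_p95_latency_ms: float
    max_cost_per_1k_inferences_usd: float
    needs_offline: bool
    needs_local_data: bool


def failed_constraints(c: Candidate, req: Requirements) -> list[str]:
    """Return the names of the constraints this candidate violates."""
    failures = []
    if c.p95_latency_ms > req.max_p95_latency_ms:
        failures.append("latency")
    if c.cost_per_1k_inferences_usd > req.max_cost_per_1k_inferences_usd:
        failures.append("cost")
    if req.needs_offline and not c.runs_offline:
        failures.append("offline")
    if req.needs_local_data and not c.keeps_data_local:
        failures.append("privacy")
    return failures


def shortlist(candidates: list[Candidate], req: Requirements) -> list[str]:
    """Keep only the candidates that satisfy every constraint."""
    return [c.name for c in candidates if not failed_constraints(c, req)]


# Hypothetical candidates and product requirements.
example_candidates = [
    Candidate("hosted_large", 900.0, 4.00, False, False),
    Candidate("local_small_4bit", 180.0, 0.20, True, True),
    Candidate("local_small_fp16", 420.0, 0.45, True, True),
]
example_requirements = Requirements(300.0, 1.00, True, True)

if __name__ == "__main__":
    for c in example_candidates:
        print(c.name, failed_constraints(c, example_requirements) or "ok")
```

Treating the constraints as pass/fail before looking at quality scores keeps a model that cannot ship from winning the comparison on accuracy alone.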

For the product and design folks in the room, this segment landed differently: SLM selection is a design constraint, not a backend decision. The model's capabilities define what your product can promise users.

Data Curation and Fine-Tuning

The third segment dove into what separates a capable model from a useful one: fine-tuning. Sanju covered dataset curation — why quality consistently beats quantity — before moving to optimisation techniques. LoRA and QLoRA got hands-on treatment, with real examples of how parameter-efficient fine-tuning lets teams adapt foundation models without the infrastructure overhead of full retraining.
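For readers who missed the lab, the core idea of LoRA fits in a few lines. This is a dependency-free sketch of the general technique, not the code used in the session: the pretrained weight matrix W stays frozen, and training only touches two small matrices, A (r by d_in) and B (d_out by r), whose product is added to W with a scale of alpha / r. QLoRA applies the same adapters on top of a base model stored in 4-bit precision. The matrix sizes and values here are toy assumptions.

```python
Matrix = list[list[float]]
Vector = list[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W: Matrix, A: Matrix, B: Matrix, x: Vector,
                 alpha: float, r: int) -> Vector:
    """Compute W x + (alpha / r) * B (A x). W is frozen; only A and B train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def full_params(d_out: int, d_in: int) -> int:
    """Trainable values when fine-tuning the whole matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable values when fine-tuning only the rank-r adapter."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]   # frozen pretrained weights (toy values)
    A = [[0.5, -0.5]]              # r x d_in, with r = 1
    B_init = [[0.0], [0.0]]        # d_out x r, zero at the start of training
    x = [1.0, 0.0]

    # With B at zero the adapter leaves the base model's output unchanged.
    print(lora_forward(W, A, B_init, x, alpha=2.0, r=1))

    # For one 4096 x 4096 projection at rank 8:
    print(lora_params(4096, 4096, 8), "of", full_params(4096, 4096))
```

Because B starts at zero, fine-tuning begins from exactly the pretrained behaviour, and for a 4096 by 4096 layer at rank 8 the adapter trains well under one percent of the values that full fine-tuning would.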

This was the most technically dense part of the session, and it showed in the room. Questions started stacking up.

Hands-On Implementation

The final segment was a practical fine-tuning lab. Attendees who had brought laptops worked through a structured implementation — from infrastructure choices to a first deployment. Sanju and the AI Weekends team circulated to help debug environment issues and talk through architecture tradeoffs in real time.

The session closed with a conversation about where SLMs are headed: on-device inference, domain-specific foundation models, and the emerging pattern of composing small specialists rather than relying on one large generalist. For a room full of builders, it was a practical north star.

Speaker

Sanju Sivalingam

Founder, dun & ThisUX · Co-founder, Unitedby.AI