The Optimization Rabbit Hole
What followed was an intense six-hour sprint of optimization. Over 40 commits, we tried everything we could think of to make LLM icon generation work well.
Model Upgrades
We started with GPT-4o-mini, then upgraded to GPT-5.2, hoping for better quality. We enabled reasoning mode, then disabled it. We tried different temperature settings. Each change helped a little, but never enough.
Prompt Engineering Adventures
We developed what we called "joyful minimalism"—a set of five design principles for our icons:
- One gesture per icon
- Round over sharp
- Let it breathe (70% empty canvas)
- Subtract until it breaks
- No text ever
We reverse-engineered the Lucide icon style and fed those instructions to the LLM. We provided reference SVGs for common icon types. We tried fixed icon vocabularies where the LLM could only choose from predefined shapes. We experimented with different viewBox sizes—24x24, 256x256, 128x128—each with proportionally scaled stroke widths.
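The "proportionally scaled stroke widths" idea can be sketched in a few lines. This is an illustrative helper, not code from the project; the baseline values follow the Lucide convention of a 2px stroke on a 24x24 viewBox.

```python
# Hypothetical helper: keep stroke weight visually consistent across viewBox sizes.
# Lucide's convention is a 2px stroke on a 24x24 viewBox, so we scale from there.
BASE_VIEWBOX = 24
BASE_STROKE = 2.0

def scaled_stroke(viewbox_size: int) -> float:
    """Return a stroke width proportional to the viewBox edge length."""
    return BASE_STROKE * (viewbox_size / BASE_VIEWBOX)
```

With this, a 128x128 viewBox gets roughly a 10.7px stroke and a 256x256 one roughly 21.3px, so icons render at the same visual weight regardless of canvas size.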
The Image Prompt Era
We added an image_prompt field that generated detailed visual descriptions before the SVG. For weather icons, we tried location-specific skylines—the Golden Gate Bridge for San Francisco, the Empire State Building for New York. For groceries, we described containers and identifying marks. The prompts got increasingly elaborate.
```
## Weather Icon: San Francisco, Rainy
- Silhouette of Golden Gate Bridge in background
- Gentle rain drops falling at 15-degree angle
- Low clouds obscuring bridge towers
- Minimalist, monochrome style
- No text or labels
```
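The two-stage shape of this approach can be sketched as follows. The field names and JSON structure here are assumptions for illustration; the point is that the model writes the image_prompt first, so the description conditions the SVG that follows.

```python
# Illustrative sketch (field names are assumptions): have the model emit a
# detailed image_prompt before the SVG, then parse both out of one response.
import json

def build_messages(icon_request: str) -> list[dict]:
    system = (
        "You design minimalist monochrome icons. "
        "First write an image_prompt describing the icon in visual detail, "
        "then draw it. Respond as JSON: "
        '{"image_prompt": "...", "svg": "<svg ...>...</svg>"}'
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": icon_request},
    ]

def parse_icon_response(raw: str) -> tuple[str, str]:
    """Split the model's JSON reply into (image_prompt, svg)."""
    data = json.loads(raw)
    return data["image_prompt"], data["svg"]
```

Logging the intermediate image_prompt alongside the SVG is also what makes the later MongoDB analysis possible: you can see what the model *intended* to draw, not just what it drew.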
Caching and Logging
We implemented LLM prefix caching to speed up responses. We added client-side caching to prevent duplicate API calls. We logged every generated icon to MongoDB so we could analyze patterns and improve over time. We added cache versioning so we could invalidate old icons when we improved the prompts.
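The cache-versioning piece is simple but worth making concrete. A minimal sketch, with hypothetical names: fold a version number into the cache key, so bumping it after a prompt change orphans every icon generated under the old prompts.

```python
# Sketch of a versioned cache key (names are assumptions, not the project's code).
# Bumping PROMPT_VERSION invalidates all icons generated under older prompts,
# since their keys can no longer be reproduced.
import hashlib

PROMPT_VERSION = 7  # bump whenever the icon prompts change

def cache_key(icon_description: str, viewbox: int = 24) -> str:
    """Deterministic key: same description + version -> same cached icon."""
    payload = f"v{PROMPT_VERSION}|{viewbox}|{icon_description.strip().lower()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The same key works on the client (to suppress duplicate API calls for identical requests) and as the document ID in the MongoDB icon log.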
Looking back at the commit log, we can see the growing desperation in the commit messages: "improve icon quality," "simplify prompts for cleaner icons," "constrain to fixed icon vocabulary," "enhance icon prompts with richer descriptions." Each one a small step forward, but the fundamental problems remained.
The icons were still slow. They were still inconsistent. And we were still paying for every generation. Worse, debugging was nearly impossible—when an icon looked wrong, we couldn't easily explain why the LLM had made that choice.