ProductLens
Agentic AI demo: LLM-driven product comparison tool that ranks products based on user priorities.
Live Demo · Source
- Problem: Product research is slow and inconsistent when criteria are vague or multi-objective.
- Approach: Plan from user intent → research candidates in parallel → normalize findings → score + recommend with tradeoffs.
- Outcome: Deployed a tool-enabled comparison workflow with a simple UI; produces decision-ready summaries from live web sources.
Snapshots
Quick look at the UI/workflow.
ProductLens turns a natural-language request (e.g., “best noise-cancelling headphones for travel and calls”) into a structured comparison: it plans what to evaluate, gathers evidence per product, and returns a ranked recommendation with tradeoffs.
Key components:
- Planner: converts intent into clear evaluation criteria and constraints (see the contract sketch after this list).
- Parallel research: spawns one researcher per product to speed up gathering and reduce omissions.
- Normalizer: aligns findings into consistent fields, then scores with transparent tradeoffs.
- Light tool use: a web browsing tool feeds evidence into the comparison.
- UI: Gradio demo interface; deployed on Hugging Face Spaces.
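A minimal sketch of what the planner contract and a researcher's normalized findings could look like as Pydantic models; the field names here are illustrative, not the deployed schema:

```python
from pydantic import BaseModel, Field


class ComparisonPlan(BaseModel):
    """Planner output: the 'contract' that downstream agents follow."""
    criteria: list[str] = Field(description="What to evaluate, e.g. 'ANC quality'")
    constraints: list[str] = Field(description="Hard limits, e.g. 'under $400'")
    candidates: list[str] = Field(description="Products to research")


class ProductFindings(BaseModel):
    """One researcher's normalized result for a single product."""
    product: str
    scores: dict[str, float]   # criterion -> 0..1 score
    notes: dict[str, str]      # criterion -> short evidence summary
    sources: list[str]         # URLs backing the findings
```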
Workflow
User Request
→ Orchestrator (Comparison Manager)
→ Planner Agent (criteria + candidates)
→ Research Agents (parallel per product)
→ Comparator / Decision Agent (score + tradeoffs)
→ Output (Ranked Recommendation + Table + Sources)
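A rough sketch of that pipeline, assuming the OpenAI Agents SDK's `Agent`/`Runner`/`WebSearchTool` interfaces and reusing the contract models sketched above; agent names and instructions are abbreviated, not the deployed prompts:

```python
import asyncio

from agents import Agent, Runner, WebSearchTool

planner = Agent(
    name="Planner",
    instructions="Turn the user's request into criteria, constraints, and candidate products.",
    output_type=ComparisonPlan,
)
researcher = Agent(
    name="Researcher",
    instructions="Research one product against the given criteria and cite sources.",
    tools=[WebSearchTool()],
    output_type=ProductFindings,
)
comparator = Agent(
    name="Comparator",
    instructions="Score each product per criterion and explain the tradeoffs.",
)


async def compare(request: str) -> str:
    plan = (await Runner.run(planner, request)).final_output
    # Fan out: one researcher run per candidate, executed concurrently.
    runs = await asyncio.gather(*(
        Runner.run(researcher, f"Product: {p}\nCriteria: {plan.criteria}")
        for p in plan.candidates
    ))
    findings = [r.final_output for r in runs]
    result = await Runner.run(comparator, f"Plan: {plan}\nFindings: {findings}")
    return result.final_output
```

The per-product fan-out mirrors the "Research Agents (parallel per product)" step: each researcher sees only its own product and the shared criteria, which keeps prompts small and makes omissions easier to spot.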
Tech stack
- Runtime: Python
- Agent framework: OpenAI Agents SDK (tool calls + traces)
- LLMs: local (Ollama) and/or cloud clients (OpenAI/OpenRouter); see the client-swap sketch below
- Tooling: Web search tool (or equivalent HTTP-based search)
- UI: Gradio
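Swapping between a local and a cloud model can come down to pointing one OpenAI-compatible client at a different base URL. A minimal sketch assuming Ollama's OpenAI-compatible endpoint and the Agents SDK's default-client hooks; the environment variable and URL are examples, not the deployed configuration:

```python
import os

from openai import AsyncOpenAI
from agents import set_default_openai_api, set_default_openai_client

if os.getenv("USE_LOCAL_LLM"):
    # Ollama serves an OpenAI-compatible API under /v1; the api_key is a placeholder.
    client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    set_default_openai_api("chat_completions")  # local servers speak Chat Completions, not Responses
else:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

set_default_openai_client(client)
```

The same pattern works for OpenRouter, which also exposes an OpenAI-compatible endpoint.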
Engineering Notes
- Focused agents: each agent does one job (plan, research, compare), which keeps prompts small and outputs predictable.
- Structured intermediate outputs: planner produces a "contract" (sketched under Key components) so downstream agents stay consistent and comparable.
- Source grounding: captures/returns sources used during research for transparency (see the sketch after this list).
- Failure modes: conflicting specs, outdated reviews, and ambiguous user constraints (e.g., “best” without priorities).
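As an illustration of how grounded findings could become the decision-ready table, a sketch that ranks by weighted criterion scores and carries the sources through to the output, reusing `ProductFindings` from the contract sketch above; the weights and formatting are illustrative:

```python
def build_summary(findings: list[ProductFindings], weights: dict[str, float]) -> str:
    """Rank products by weighted criterion scores and emit a markdown table with sources."""
    def total(f: ProductFindings) -> float:
        return sum(weights.get(c, 1.0) * s for c, s in f.scores.items())

    ranked = sorted(findings, key=total, reverse=True)
    rows = [
        "| Rank | Product | Weighted score | Sources |",
        "| --- | --- | --- | --- |",
    ]
    for i, f in enumerate(ranked, start=1):
        rows.append(f"| {i} | {f.product} | {total(f):.2f} | {', '.join(f.sources)} |")
    return "\n".join(rows)
```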
Limitations
- Not a full price tracker; results depend on what sources are available at query time.
- Not production-hardened (no persistent storage, ranking audits, or robust source filtering).
- Evaluation is qualitative; the next step is a repeatable set of test queries and scoring.