ProductLens

Agentic AI demo: LLM-driven product comparison tool that ranks products based on user priorities.

Live Demo · Source


  • Problem: Product research is slow and inconsistent when criteria are vague or multi-objective.
  • Approach: Plan from user intent → research candidates in parallel → normalize findings → score + recommend with tradeoffs.
  • Outcome: Deployed a tool-enabled comparison workflow with a simple UI; produces decision-ready summaries from live web sources.

Snapshots

Quick look at the UI/workflow.

  • Search + criteria input
  • Recommendation summary
  • Ranking table

ProductLens turns a natural-language request (e.g., “best noise-cancelling headphones for travel and calls”) into a structured comparison: it plans what to evaluate, gathers evidence per product, and returns a ranked recommendation with tradeoffs.

Key components:

  • Planner: converts intent into clear evaluation criteria and constraints.
  • Parallel research: spawns one researcher per product to speed up gathering and reduce omissions.
  • Normalizer: aligns findings into consistent fields, then scores with transparent tradeoffs.
  • Light tool use: a web search tool supplies the evidence that feeds the comparison (see the sketch after this list).
  • UI: Gradio demo interface; deployed on Hugging Face Spaces.
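
A minimal sketch of how these components might be declared with the OpenAI Agents SDK; the agent names, instructions, and Pydantic output models below are illustrative, not the project's exact definitions.

    # Sketch only: focused agents declared with the OpenAI Agents SDK (pip install openai-agents).
    # Names, instructions, and output models are illustrative, not the project's exact ones.
    from pydantic import BaseModel
    from agents import Agent, WebSearchTool

    class ComparisonPlan(BaseModel):
        criteria: list[str]       # what to evaluate (e.g., ANC quality, mic quality, comfort)
        candidates: list[str]     # product names to research

    class ProductFindings(BaseModel):
        product: str
        findings: dict[str, str]  # criterion -> evidence summary
        sources: list[str]        # URLs kept for source grounding

    planner = Agent(
        name="Planner",
        instructions="Turn the user's request into evaluation criteria and a short candidate list.",
        output_type=ComparisonPlan,
    )

    researcher = Agent(
        name="Researcher",
        instructions="Research one product against the given criteria; cite sources.",
        tools=[WebSearchTool()],  # web search as the evidence input
        output_type=ProductFindings,
    )

    comparator = Agent(
        name="Comparator",
        instructions="Score the findings per criterion, explain tradeoffs, and recommend one product.",
    )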

Workflow

            User Request
              → Orchestrator (Comparison Manager)
                → Planner Agent (criteria + candidates)
                  → Research Agents (parallel per product)
                    → Comparator / Decision Agent (score + tradeoffs)
                      → Output (Ranked Recommendation + Table + Sources)
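
In code, this workflow can be a short async pipeline. The sketch below assumes planner, researcher, and comparator agents like the ones declared under Key components; Runner.run and final_output come from the Agents SDK.

    # Sketch of the orchestration step: plan, fan out research in parallel, then compare.
    # Assumes planner/researcher/comparator agents like those sketched above.
    import asyncio
    from agents import Runner

    async def compare(request: str, planner, researcher, comparator) -> str:
        # 1. Plan: turn the request into criteria + candidate products.
        plan = (await Runner.run(planner, request)).final_output

        # 2. Research: one researcher run per candidate, in parallel.
        runs = [
            Runner.run(researcher, f"Product: {name}\nCriteria: {plan.criteria}")
            for name in plan.candidates
        ]
        findings = [r.final_output for r in await asyncio.gather(*runs)]

        # 3. Compare: score the findings and explain tradeoffs.
        result = await Runner.run(
            comparator, f"Criteria: {plan.criteria}\nFindings: {findings}"
        )
        return result.final_output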
            

Tech stack

  • Runtime: Python
  • Agent framework: OpenAI Agents SDK (tool calls + traces)
  • LLMs: local (Ollama) or cloud clients (OpenAI/OpenRouter); see the client sketch after this list
  • Tooling: Web search tool (or equivalent HTTP-based search)
  • UI: Gradio
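
A sketch of how the SDK could be pointed at a local Ollama server or an OpenAI-compatible cloud endpoint; the base URLs are the usual defaults, and the model names and key are placeholders rather than the project's exact configuration.

    # Sketch: swapping the model backend via OpenAI-compatible clients.
    # Base URLs are common defaults; model names and the API key are placeholders.
    from openai import AsyncOpenAI
    from agents import Agent, OpenAIChatCompletionsModel

    # Local Ollama exposes an OpenAI-compatible API at /v1.
    local_model = OpenAIChatCompletionsModel(
        model="llama3.1",
        openai_client=AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    )

    # Or a cloud router such as OpenRouter.
    cloud_model = OpenAIChatCompletionsModel(
        model="openai/gpt-4o-mini",
        openai_client=AsyncOpenAI(base_url="https://openrouter.ai/api/v1", api_key="<key>"),
    )

    planner = Agent(name="Planner", instructions="...", model=local_model)

One caveat: the SDK's hosted web search tool is tied to OpenAI's own models, so a local backend would rely on the HTTP-based search alternative listed above.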

Engineering Notes

  • Focused agents: each agent does one job (plan, research, compare), which keeps prompts small and outputs predictable.
  • Structured intermediate outputs: the planner emits a typed "contract" (e.g., the ComparisonPlan model sketched above) so downstream agents stay consistent and comparable.
  • Source grounding: captures/returns sources used during research for transparency.
  • Failure modes: conflicting specs, outdated reviews, and ambiguous user constraints (e.g., “best” without priorities).
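
To make "scores with transparent tradeoffs" concrete, here is a minimal sketch of a deterministic scoring pass over normalized findings; in the project this step may happen inside the comparator agent, and the field names, 0-10 scale, and weights are assumptions for illustration.

    # Illustrative only: deterministic weighted scoring over normalized findings.
    # The 0-10 rating scale, field names, and weights are assumptions for this sketch.
    from dataclasses import dataclass

    @dataclass
    class NormalizedProduct:
        name: str
        ratings: dict[str, float]  # criterion -> 0..10 rating extracted from research
        sources: list[str]         # URLs kept for transparency

    def rank(products: list[NormalizedProduct], weights: dict[str, float]) -> list[tuple[str, float]]:
        """Weighted score per product, highest first; missing criteria contribute 0."""
        scored = [
            (p.name, sum(weights.get(c, 0.0) * r for c, r in p.ratings.items()))
            for p in products
        ]
        return sorted(scored, key=lambda s: s[1], reverse=True)

    # Example priorities: ANC and call quality matter more than price.
    weights = {"anc": 0.4, "call_quality": 0.4, "price": 0.2}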

Limitations

  • Not a full price tracker; results depend on which sources are available at query time.
  • Not production-hardened (no persistent storage, ranking audits, or robust source filtering).
  • Evaluation is qualitative; next step is a repeatable test set of queries and scoring.
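
One possible shape for that test set, sketched with hypothetical queries and expected answers; a real harness would need many more queries and a richer scoring rubric.

    # Hypothetical sketch of a repeatable evaluation loop; queries and expected
    # products are illustrative examples, not a real benchmark.
    TEST_QUERIES = [
        ("best noise-cancelling headphones for travel and calls",
         {"Sony WH-1000XM5", "Bose QuietComfort Ultra"}),
    ]

    def evaluate(run_comparison) -> float:
        """Fraction of queries whose top recommendation falls in the expected set."""
        hits = 0
        for query, expected in TEST_QUERIES:
            top_product = run_comparison(query)  # returns the top-ranked product name
            hits += top_product in expected
        return hits / len(TEST_QUERIES)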