Projects — Linus Seah

Mar 2026 · Next.js · Supabase · Claude · Exa · Vercel

A production-deployed 3-agent system for automated B2B lead generation. Agent 1 discovers ICP-matching companies using Exa's semantic search (findSimilar). Agent 2 runs a 4-phase enrichment pipeline: company profiling, contact discovery, industry engagement mapping, and qualification scoring against a weighted rubric. Agent 3 drafts personalized outreach for each decision-maker. Results are presented in a Next.js dashboard with a HubSpot-style UI: filterable lead table, sliding detail panel, inline CRM editing. Deployed on Vercel with a Supabase backend. Demonstrates multi-agent orchestration, search strategy design, qualification as explainability (not filtering), and production infrastructure decisions (Streamlit → Next.js, JSON → Supabase).
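As a rough illustration of "qualification as explainability", the sketch below scores a lead against a weighted rubric and returns the per-criterion breakdown alongside the total, so the score explains itself instead of acting as a pass/fail filter. The criterion names, weights, and the `score_lead` function are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: int  # relative importance; normalized against the rubric total


# Hypothetical rubric: names and weights are illustrative assumptions.
RUBRIC = [
    Criterion("icp_fit", 40),
    Criterion("decision_maker_found", 30),
    Criterion("industry_engagement", 20),
    Criterion("company_size_match", 10),
]


def score_lead(signals: dict[str, float], rubric: list[Criterion] = RUBRIC) -> dict:
    """Return a 0-100 score plus the points each criterion contributed.

    `signals` maps a criterion name to a value in [0, 1]; missing
    criteria count as 0 and out-of-range values are clamped.
    """
    total_weight = sum(c.weight for c in rubric)
    breakdown = {}
    for c in rubric:
        raw = min(max(signals.get(c.name, 0.0), 0.0), 1.0)
        breakdown[c.name] = round(100 * raw * c.weight / total_weight, 1)
    return {"score": round(sum(breakdown.values()), 1), "breakdown": breakdown}


result = score_lead({"icp_fit": 1.0, "decision_maker_found": 0.5})
print(result["score"], result["breakdown"])
```

Keeping the breakdown next to the score means a dashboard can show why a lead ranked where it did, and low-scoring leads stay visible instead of being dropped.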

AI Agents Next.js Supabase Lead Generation Production Deploy

Live Demo → · GitHub → · Blog: When Your User Isn't You → · Blog: From Demo to Production →

Daily Digest v2 — Agentic AI News Pipeline

Feb 2026 · Python · Claude Agent SDK · Exa · GitHub Actions

An automated morning news digest that uses the Claude Agent SDK to orchestrate content fetching, curation, and delivery. The agent reads from 6+ sources (RSS, IMAP, web search), makes editorial decisions about relevance and theme, and sends a curated email every morning at 7am. Includes a deterministic fallback pipeline for production reliability.
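One way the deterministic fallback could be wired in is sketched below: try the agentic path first, and if it raises or returns nothing, run the fixed pipeline so an email still goes out. The `build_digest` function and both callables are hypothetical placeholders; no Claude Agent SDK calls are shown, since the project's actual integration is not described here.

```python
import logging
from typing import Callable

Digest = list[str]  # one entry per curated item, for illustration only


def build_digest(
    agent_run: Callable[[], Digest],
    fallback_run: Callable[[], Digest],
) -> tuple[Digest, str]:
    """Return (digest, path_used), preferring the agentic path.

    Falls back to the deterministic pipeline when the agent raises
    or produces an empty digest.
    """
    try:
        digest = agent_run()
        if digest:
            return digest, "agent"
        logging.warning("agent returned an empty digest; using fallback")
    except Exception as exc:  # broad on purpose: any agent failure triggers fallback
        logging.warning("agent path failed (%r); using fallback", exc)
    return fallback_run(), "fallback"


def flaky_agent() -> Digest:
    raise TimeoutError("model call timed out")  # simulated failure


def fixed_pipeline() -> Digest:
    return ["Top RSS headline", "Second RSS headline"]


digest, path = build_digest(flaky_agent, fixed_pipeline)
print(path, digest)
```

Returning which path was used makes it easy to log or report how often the fallback fires.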

AI Agents Claude Sonnet Exa Search Open Source

GitHub → · Blog: What "Agent" Means → · Blog: Evaluation Framework →

Daily Digest v1/v1.5 — Deterministic News Pipeline

Feb 2026 · Python · OpenRouter / Claude Haiku · feedparser

The earlier iterations of the daily digest project. v1 is a simple fetch-summarize-send pipeline using free LLMs. v1.5 adds a curation layer with relevance scoring and a user profile config. Built to understand the progression from deterministic to agentic architectures.
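A curation layer of the kind v1.5 describes might look like the sketch below: a user profile config assigns weights to interests, each item gets a relevance score, and only items above a threshold are kept. The keyword matching here is a simple stand-in chosen for illustration; the profile shape, the weights, and the `relevance` and `curate` functions are assumptions, not the project's actual scoring method.

```python
# Hypothetical user profile config: interest keywords with weights,
# plus the minimum score an item needs to be included.
PROFILE = {
    "interests": {"agents": 3, "evaluation": 2, "rss": 1},
    "min_score": 2,
}


def relevance(title: str, summary: str, profile: dict = PROFILE) -> int:
    """Sum the weights of every interest keyword found in the item text."""
    text = f"{title} {summary}".lower()
    return sum(w for kw, w in profile["interests"].items() if kw in text)


def curate(items: list[dict], profile: dict = PROFILE) -> list[dict]:
    """Keep items at or above the threshold, highest score first."""
    scored = [(relevance(i["title"], i.get("summary", ""), profile), i) for i in items]
    kept = [(s, i) for s, i in scored if s >= profile["min_score"]]
    return [i for _, i in sorted(kept, key=lambda pair: pair[0], reverse=True)]


items = [
    {"title": "Building agents with RSS"},
    {"title": "Cooking tips"},
    {"title": "Evaluation of agents"},
]
print([i["title"] for i in curate(items)])
```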

LLM Pipeline Claude Haiku RSS

GitHub → · Technical Writeup →

LLM-as-a-Judge Evaluation Framework — Custom AI Model Evaluation

Feb 2026 · Python · Claude Opus · Pearson Correlation · Streamlit

A production-grade evaluation system for AI agents using LLM-as-a-judge methodology. Features an 8-dimension rubric, Pearson correlation calibration to align judge scores with human taste (r=0.72), automated scoring pipeline via GitHub Actions, and a Streamlit dashboard for score tracking. Demonstrates task-specific eval design, rubric calibration techniques, and why evaluation costs more than generation.
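The calibration check can be sketched as computing the Pearson correlation between judge scores and human scores for the same outputs. The sample scores below are made up for illustration and are unrelated to the reported r=0.72; `pearson_r` is a plain-Python helper written for this example.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Made-up scores for the same five outputs (illustrative only).
judge_scores = [7.0, 5.5, 8.0, 4.0, 6.5]
human_scores = [6.0, 6.0, 9.0, 3.5, 5.0]
print(f"r = {pearson_r(judge_scores, human_scores):.2f}")
```

A low r would suggest revising the rubric wording or the judge prompt and re-scoring, repeating until judge and human scores move together closely enough for the task.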

LLM Evaluation LLM-as-a-Judge Claude Opus Calibration

GitHub (evals/) → · Blog: Part 1 → · Blog: Part 2 →