KOL Discovery

Find Key Opinion Leaders (KOLs) in a given domain by combining web research with LinkedIn post search. Given a company/idea and target domain, generates authority keywords, searches LinkedIn posts to find prolific authors with high engagement, and merges with web-researched influencers. Use when someone wants to "find influencers in X space" or "who are the KOLs for Y industry."

by Athina AI

Install

npx gooseworks install --all

# then, in Claude Code, Cursor, or Codex:
/gooseworks use the kol-discovery skill

About This Skill

KOL Discovery

Find Key Opinion Leaders in any domain by searching LinkedIn posts for prolific, high-engagement authors and merging with web-researched influencers.

Core principle: Search for authority/thought-leadership keywords, not pain-language. We want people who shape conversation in the space — conference speakers, newsletter writers, podcast hosts, and prolific LinkedIn posters.

Phase 0: Intake

Ask the user these questions:

Domain & Audience

What does your company/product do? What space are you in?
Which specific domain or topic should the KOLs be experts in?
Who is your target audience? (The people the KOLs influence)
Any KOLs you already know about? (LinkedIn URLs — these become the baseline)
Anyone to EXCLUDE? (Competitors, your own team, irrelevant voices)

Phase 1: Generate Domain Keywords

Based on intake, generate 15-25 topic/authority keywords. These are NOT pain-language — they're the terms thought leaders use when sharing expertise:

Industry terms — "freight tech", "supply chain innovation"
Thought leadership signals — "lessons learned in logistics", "future of dispatch"
Conference/event terms — "supply chain summit keynote"
Content creator signals — "newsletter freight", "podcast logistics"

Also generate:

KOL title keywords — titles that signal thought leadership (vp, founder, analyst, editor, host)
Vendor exclusion keywords — titles to filter out (software engineer, recruiter, saas)
Domain relevance keywords — core industry terms for relevance scoring

Present keywords to user for approval before running.
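
Before presenting the list for approval, it can help to clean it up. The sketch below is illustrative only: `prepare_keywords` is a hypothetical helper (not part of the skill's script) that collapses whitespace, drops case-insensitive duplicates, and reports whether the count falls in the 15-25 range named above.

```python
def prepare_keywords(keywords, low=15, high=25):
    """Return (unique_keywords, in_range) for a list of candidate keywords.

    Hypothetical helper: the 15-25 bounds come from the text above;
    everything else is an assumption made for illustration.
    """
    seen = set()
    unique = []
    for kw in keywords:
        cleaned = " ".join(kw.split())   # collapse runs of whitespace
        key = cleaned.lower()            # compare case-insensitively
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    in_range = low <= len(unique) <= high
    return unique, in_range
```

If `in_range` comes back false, the list probably needs more variants or some pruning before it goes to the user.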

Save the config (for example as kol-discovery.json, the name used in the Phase 2 command) in the current working directory or wherever the user prefers:

Config JSON structure:

{
  "client_name": "example",
  "domain_keywords": ["\"freight tech\" thought leadership", "supply chain innovation"],
  "exclusion_patterns": ["hiring.*position", "we.re recruiting"],
  "kol_title_keywords": ["vp", "founder", "analyst", "editor", "host"],
  "vendor_exclude_keywords": ["software engineer", "saas", "recruiter"],
  "domain_relevance_keywords": ["freight", "logistics", "supply chain"],
  "country_filter": "",
  "max_posts_per_keyword": 50,
  "min_posts": 2,
  "min_total_engagement": 50,
  "top_n_kols": 50
}
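
A config like this can be sanity-checked before any paid run. The following is a minimal sketch, not the script's own validation: `validate_config` and `load_config` are hypothetical helpers, and the numeric defaults are simply the values from the example above.

```python
import json
import re

# Keys taken from the example config; treating them as required is an assumption.
REQUIRED_LIST_KEYS = ("domain_keywords", "kol_title_keywords",
                      "vendor_exclude_keywords", "domain_relevance_keywords")
# Defaults copied from the example config above.
NUMERIC_DEFAULTS = {"max_posts_per_keyword": 50, "min_posts": 2,
                    "min_total_engagement": 50, "top_n_kols": 50}

def validate_config(config: dict) -> dict:
    """Return a copy of config with numeric defaults filled in.

    Raises ValueError if a keyword list is missing or empty, a numeric
    setting is not a non-negative integer, or an exclusion pattern is
    not a valid regular expression.
    """
    for key in REQUIRED_LIST_KEYS:
        value = config.get(key)
        if not isinstance(value, list) or not value:
            raise ValueError(f"{key} must be a non-empty list")
    for pattern in config.get("exclusion_patterns", []):
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"bad exclusion pattern {pattern!r}: {exc}") from exc
    checked = dict(config)
    for key, default in NUMERIC_DEFAULTS.items():
        value = checked.setdefault(key, default)
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer")
    return checked

def load_config(path: str) -> dict:
    """Read a JSON config file and validate it."""
    with open(path, encoding="utf-8") as fh:
        return validate_config(json.load(fh))
```

Compiling the exclusion patterns up front means a typo in a regex is caught here rather than partway through a run.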

Phase 2: Run KOL Discovery Pipeline

python3 skills/kol-discovery/scripts/kol_discovery.py \
  --config kol-discovery.json \
  --output-dir . \
  [--test] [--web-kols kol-web-kols.json] [--yes]

Flags:

--config (required) — path to client config JSON
--output-dir — directory for output CSV (default: current working directory)
--test — limit to 5 keywords (validation run)
--web-kols — path to web-researched KOL JSON (agent generates this)
--yes — skip cost confirmation prompts
--max-runs — override Apify run limit

What the script does:

Keyword search — apimaestro/linkedin-posts-search-scraper-no-cookies for each domain keyword
Author aggregation — Group posts by author, compute engagement metrics
Scoring — Composite KOL score: engagement volume (log-scaled) + consistency (post count) + quality (avg engagement) + relevance (keyword breadth) + web research bonus
Merge — Combine post-data KOLs with web-researched KOLs, flag overlaps
Export — Ranked CSV
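
The aggregation and scoring steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the post fields (`author_url`, `author_name`, `reactions`, `comments`, `keyword`) and the equal weights are hypothetical, since the real field names and weighting are not documented here.

```python
import math
from collections import defaultdict

# Hypothetical weights; the real script's weighting is not documented here.
WEIGHTS = {"volume": 1.0, "consistency": 1.0, "quality": 1.0,
           "relevance": 1.0, "web_bonus": 1.0}

def aggregate_authors(posts):
    """Group posts by author URL and sum their engagement."""
    authors = defaultdict(lambda: {"name": "", "posts": 0, "reactions": 0,
                                   "comments": 0, "keywords": set()})
    for post in posts:
        author = authors[post["author_url"]]
        author["name"] = post["author_name"]
        author["posts"] += 1
        author["reactions"] += post["reactions"]
        author["comments"] += post["comments"]
        author["keywords"].add(post["keyword"])  # keywords that surfaced this author
    return dict(authors)

def kol_score(author, total_keywords, in_web_research=False, weights=WEIGHTS):
    """Weighted sum of the five components named in the scoring step."""
    engagement = author["reactions"] + author["comments"]
    parts = {
        "volume": math.log1p(engagement),               # log-scaled engagement
        "consistency": author["posts"],                 # post count
        "quality": engagement / author["posts"],        # average engagement per post
        "relevance": len(author["keywords"]) / total_keywords,  # keyword breadth
        "web_bonus": 1.0 if in_web_research else 0.0,   # web research bonus
    }
    return sum(weights[name] * value for name, value in parts.items())
```

With equal weights the raw average-engagement term will tend to dominate, so a real implementation would likely normalize or rescale the components before summing them.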

Cost estimate: ~$0.10 per keyword. Full run with 20 keywords: ~$2-3.

Always run with --test first.
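
As a rough guide to what a run will cost, the arithmetic can be sketched like this. `estimate_cost` is a hypothetical helper; it uses the approximate $0.10 per keyword figure quoted above and the 5-keyword cap that --test applies, and actual Apify charges may differ.

```python
COST_PER_KEYWORD_USD = 0.10  # approximate figure quoted above
TEST_KEYWORD_LIMIT = 5       # --test limits the run to 5 keywords

def estimate_cost(num_keywords: int, test: bool = False) -> float:
    """Estimate the cost in USD of searching num_keywords keywords."""
    searched = min(num_keywords, TEST_KEYWORD_LIMIT) if test else num_keywords
    return round(searched * COST_PER_KEYWORD_USD, 2)

print(estimate_cost(20))             # 2.0
print(estimate_cost(20, test=True))  # 0.5
```

The 20-keyword result sits at the low end of the ~$2-3 range quoted above, so treat the number as a floor rather than a precise bill.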

Phase 2b: Web Research (Agent-Driven)

Before or alongside the script, do web research to find known KOLs:

Search for "top [industry] influencers on LinkedIn"
Find conference speakers, newsletter authors, podcast hosts
Check industry publications for frequent contributors

Save the results as JSON (for example kol-web-kols.json, the name used in the Phase 2 command) in the current working directory:

[
  {
    "name": "Jane Doe",
    "linkedin_url": "https://www.linkedin.com/in/janedoe/",
    "source": "FreightWaves conference speaker 2025",
    "notes": "Hosts weekly logistics podcast"
  }
]

Pass the file to the script via --web-kols.
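
One plausible way to merge the two lists and flag overlaps is to key both on a normalized LinkedIn URL. The sketch below assumes that approach; `normalize_url` and `merge_kols` are hypothetical helpers, and the script's real matching logic may differ. The web entry's free-text `source` is kept under a separate `web_source` key so it does not collide with the post-data / web-research / both label.

```python
def normalize_url(url: str) -> str:
    """Lowercase, then drop the scheme, 'www.', query string, and trailing slash."""
    url = url.strip().lower().split("?", 1)[0]
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")

def merge_kols(post_kols, web_kols):
    """Combine post-data KOLs with web-researched KOLs and flag overlaps."""
    merged = {}
    for kol in post_kols:
        merged[normalize_url(kol["linkedin_url"])] = {**kol, "source": "post-data"}
    for kol in web_kols:
        key = normalize_url(kol["linkedin_url"])
        if key in merged:
            # Found by both the post search and web research.
            merged[key]["source"] = "both"
        else:
            merged[key] = {"name": kol["name"],
                           "linkedin_url": kol["linkedin_url"],
                           "source": "web-research"}
        merged[key]["web_source"] = kol.get("source", "")
        merged[key]["notes"] = kol.get("notes", "")
    return list(merged.values())
```

Matching on the URL rather than the name avoids false merges between different people who share a name.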

Phase 3: Review & Refine

Present results:

Top 20 KOLs — rank, name, headline, KOL score, total engagement, top post
Source breakdown — how many from post-data vs web-research vs both
Keyword performance — which keywords surfaced the most KOLs
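
For the keyword performance view, a simple count of how many distinct authors each keyword surfaced is usually enough. This sketch assumes a hypothetical mapping of author identifier to the set of keywords that found them; the script's internal data shape is not documented here.

```python
from collections import Counter

def keyword_performance(author_keywords):
    """Rank keywords by the number of distinct authors each one surfaced.

    author_keywords: mapping of author id -> iterable of keywords
    (a hypothetical structure used only for this illustration).
    """
    counts = Counter()
    for keywords in author_keywords.values():
        counts.update(set(keywords))  # count each keyword once per author
    return counts.most_common()
```

Keywords near the bottom of this ranking are natural candidates to drop or rephrase in the next run.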

Common adjustments:

Too many irrelevant authors — refine domain keywords, add exclusion patterns
Missing known KOLs — add more keyword variants, expand web research
Too few results — lower min_posts or min_total_engagement thresholds
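
The two levers mentioned above, exclusion patterns and the minimum thresholds, might be applied along these lines. This is a sketch under assumptions: the helper names and the author dictionary keys are hypothetical, and whether the script treats the thresholds as inclusive or matches patterns case-insensitively is not confirmed by this page.

```python
import re

def is_excluded(text, patterns):
    """True if any exclusion pattern matches the text (case-insensitive here)."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)

def apply_thresholds(authors, min_posts=2, min_total_engagement=50):
    """Keep authors who meet both minimums (treated as inclusive in this sketch)."""
    kept = []
    for author in authors:
        engagement = author["total_reactions"] + author["total_comments"]
        if author["total_posts"] >= min_posts and engagement >= min_total_engagement:
            kept.append(author)
    return kept

patterns = ["hiring.*position", "we.re recruiting"]  # from the example config
print(is_excluded("We're recruiting a dispatcher", patterns))  # True
print(is_excluded("Lessons learned in logistics", patterns))   # False
```

Lowering `min_posts` to 1 admits one-hit authors, so it is worth rechecking the top of the ranking for noise after loosening it.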

Phase 4: Output

CSV exported to the current working directory:

Column	Description
Rank	Overall rank by KOL Score
Name	Full name
LinkedIn URL	Profile link
Headline	From LinkedIn
KOL Score	Composite score
Total Posts	Posts found in search
Total Reactions	Sum of reactions across posts
Total Comments	Sum of comments across posts
Avg Engagement	Average reactions+comments per post
Top Post URL	Highest engagement post
Top Post Preview	First 100 chars of top post
Source	post-data / web-research / both
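
Once the CSV exists, the Phase 3 summary can be built from it with the standard library. The sketch below assumes a comma-delimited file whose header row uses the column names in the table above; the helper names are hypothetical and the output filename is not specified on this page.

```python
import csv
from collections import Counter

def read_kols(fileobj):
    """Read the exported CSV into a list of dicts keyed by column name."""
    return list(csv.DictReader(fileobj))

def top_n(rows, n=20):
    """Return the n best-ranked rows (Rank 1 first)."""
    return sorted(rows, key=lambda row: int(row["Rank"]))[:n]

def source_breakdown(rows):
    """Count rows per Source value: post-data, web-research, or both."""
    return Counter(row["Source"] for row in rows)
```

Usage would look like `with open(path, newline="", encoding="utf-8") as fh: rows = read_kols(fh)`, where `path` points at the exported file.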

Tools Required

Apify API token — set as APIFY_API_TOKEN in .env
Apify actors used:
- apimaestro/linkedin-posts-search-scraper-no-cookies (keyword search)
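
A quick pre-flight check for the token avoids a failed run later. This is a minimal sketch using only the standard library: `parse_env` is a hypothetical, deliberately simple reader for KEY=VALUE lines and does not cover everything a full .env loader handles.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def get_apify_token(env):
    """Return APIFY_API_TOKEN from a mapping, or raise if it is missing."""
    token = env.get("APIFY_API_TOKEN", "")
    if not token:
        raise RuntimeError("APIFY_API_TOKEN is not set; add it to .env")
    return token
```

In practice the mapping passed to `get_apify_token` could be `parse_env(open(".env").read())` or `os.environ`, depending on how the token is supplied.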

Example Usage

Trigger phrases:

"Find KOLs in the freight/logistics space"
"Who are the influencers in [industry]?"
"Discover thought leaders for [domain]"
"Run KOL discovery for [client]"

With existing config:

python3 skills/kol-discovery/scripts/kol_discovery.py \
  --config clients/example/configs/kol-discovery.json \
  --output-dir clients/example/leads --yes
