Extract speaker names, titles, companies, and bios from conference websites. Supports direct HTML scraping and Apify web scraper fallback for JS-heavy sites. Use for pre-event research and outreach targeting.
## Installation

```shell
npx goose-skills install conference-speaker-scraper --claude
# Installs to: ~/.claude/skills/conference-speaker-scraper/
```
Extract speaker names, titles, companies, and bios from conference website /speakers pages. Supports direct HTML scraping with multiple extraction strategies, plus Apify fallback for JS-heavy sites.
The only dependency is `requests` (`pip install requests`). No API key is needed for direct scraping mode.
## Usage

```shell
# Scrape speakers from a conference page
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers"

# Use Apify for JS-heavy sites
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers" --mode apify

# Custom conference name (otherwise inferred from URL)
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers" --conference "Sage Future 2026"

# Output formats
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output json     # default
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output csv
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output summary
```

## How it works

### Direct mode (default)

Fetches the page HTML and tries multiple extraction strategies in order, using whichever returns the most results:
- `<h2>`/`<h3>` + `<p>` structures
- `<script type="application/ld+json">` with speaker data

### Apify mode

Uses the `apify/cheerio-scraper` actor with a custom page function that targets common speaker card selectors. Standard POST/poll/GET dataset pattern.
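To illustrate the strategy-ranking approach, here is a minimal sketch (not the skill's actual code) of one strategy -- pulling schema.org `Person` entries out of a `<script type="application/ld+json">` blob -- plus the "keep whichever returns the most results" selection step. The sample payload and helper names are hypothetical:

```python
import json

# Hypothetical JSON-LD payload as it might appear in a speakers page.
LD_JSON = """
{
  "@context": "https://schema.org",
  "@graph": [
    {"@type": "Person", "name": "Jane Smith", "jobTitle": "VP of Finance",
     "worksFor": {"@type": "Organization", "name": "Acme Corp"}},
    {"@type": "Event", "name": "Sage Future 2026"}
  ]
}
"""

def extract_persons(ld_text):
    """Pull Person entries out of a JSON-LD blob, tolerating both a
    top-level object and an @graph array of nodes."""
    data = json.loads(ld_text)
    nodes = data.get("@graph", [data]) if isinstance(data, dict) else data
    speakers = []
    for node in nodes:
        if isinstance(node, dict) and node.get("@type") == "Person":
            works_for = node.get("worksFor") or {}
            speakers.append({
                "name": node.get("name", ""),
                "title": node.get("jobTitle", ""),
                "company": works_for.get("name", "")
                           if isinstance(works_for, dict) else str(works_for),
            })
    return speakers

def best_of(html, strategies):
    """Run every strategy and keep whichever returns the most records."""
    results = [s(html) for s in strategies]
    return max(results, key=len, default=[])
```

Ranking by result count is what makes the multi-strategy approach degrade gracefully: a strategy that does not match a given site simply returns an empty list and loses to whichever one does.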
## Flags

| Flag | Default | Description |
|---|---|---|
| `--url` | required | Conference speakers page URL |
| `--conference` | inferred | Conference name (otherwise inferred from the URL domain) |
| `--mode` | `direct` | `direct` (HTML scraping) or `apify` (Apify cheerio scraper) |
| `--output` | `json` | Output format: `json`, `csv`, or `summary` |
| `--token` | env var | Apify token (only needed for `apify` mode) |
| `--timeout` | `300` | Max seconds for an Apify run |
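The "standard POST/poll/GET dataset pattern" used in Apify mode could look roughly like this. This is a sketch against Apify's public v2 REST API; the helper names and the shape of `run_input` are illustrative assumptions, not the skill's actual code:

```python
import time
import requests

API = "https://api.apify.com/v2"
ACTOR = "apify~cheerio-scraper"  # "~" separates user and actor in Apify API paths

def start_run(token, run_input):
    # POST starts the actor run and returns immediately with run metadata
    r = requests.post(f"{API}/acts/{ACTOR}/runs",
                      params={"token": token}, json=run_input)
    r.raise_for_status()
    return r.json()["data"]  # contains "id" and "defaultDatasetId"

def wait_for_run(token, run_id, timeout=300):
    # Poll the run status until it reaches a terminal state
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get(f"{API}/actor-runs/{run_id}", params={"token": token})
        r.raise_for_status()
        status = r.json()["data"]["status"]
        if status in ("SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"):
            return status
        time.sleep(5)
    raise TimeoutError("Apify run did not finish within the timeout")

def fetch_items(token, dataset_id):
    # GET the scraped records from the run's default dataset
    r = requests.get(f"{API}/datasets/{dataset_id}/items",
                     params={"token": token, "format": "json"})
    r.raise_for_status()
    return r.json()
```

A caller would chain these: start the run, wait for `SUCCEEDED`, then fetch items from `defaultDatasetId`. The `--timeout` flag maps naturally onto the `timeout` argument of the polling step.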
## Output

Each speaker record looks like:

```json
{
  "name": "Jane Smith",
  "title": "VP of Finance",
  "company": "Acme Corp",
  "bio": "Jane leads the finance transformation at...",
  "linkedin_url": "https://linkedin.com/in/janesmith",
  "image_url": "https://...",
  "conference": "Sage Future 2026",
  "source_url": "https://sagefuture2026.com/speakers"
}
```

## Cost

Apify mode uses the `apify/cheerio-scraper` actor -- minimal Apify credits. Direct mode is free.

## Limitations

HTML scraping is inherently fragile across conference sites. The multi-strategy approach maximizes coverage, but JS-heavy sites will require Apify mode. When direct scraping returns 0 results, try `--mode apify`.
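Since the default output is JSON, downstream outreach tooling can consume it directly. A small hypothetical example (the sample records are made up) of grouping scraped speakers by company:

```python
import json

# Hypothetical sample of the scraper's JSON output: a list of speaker records.
OUTPUT = """
[
  {"name": "Jane Smith", "title": "VP of Finance", "company": "Acme Corp",
   "conference": "Sage Future 2026"},
  {"name": "Ada Doe", "title": "CFO", "company": "Globex",
   "conference": "Sage Future 2026"}
]
"""

speakers = json.loads(OUTPUT)

# Group speaker names by company for outreach targeting
by_company = {}
for s in speakers:
    by_company.setdefault(s["company"], []).append(s["name"])

print(by_company)
```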