capabilities

Customer Discovery

Discover all customers of a given company by scanning websites, case studies, review sites, press, social media, job postings, and more. Use when you need competitive intelligence on who a company sells to.

by Athina AI

Run in Gooseworks View skill in Github

Install

Terminal

npx gooseworks install --claude

# Then in your agent:
/gooseworks <prompt> --skill customer-discovery

About This Skill

Customer Discovery

Find all customers of a company by scanning multiple public data sources. Produces a deduplicated report with confidence scoring.

Quick Start

Find all customers of Datadog

Who are Notion's customers? Use deep mode.

Inputs

Input	Required	Default	Description
Company name	Yes	—	The company to research
Website URL	No	Auto-detected	The company's website URL
Depth	No	standard	`quick`, `standard`, or `deep`

Procedure

Step 1: Gather Inputs

Ask the user for:

Company name (required)
Company website URL (optional — if not provided, WebSearch for it)
Depth tier — present these options, default to Standard:
- Quick (~2-3 min): Website logos, case studies, G2 reviews, press search
- Standard (~5-8 min): Quick + blog posts, Wayback Machine, LinkedIn, Twitter, Reddit, HN, job postings, YouTube
- Deep (~10-15 min): Standard + SEC filings, podcasts, GitHub, integration directories, BuiltWith, Crunchbase

Step 2: Create Output Directory

mkdir -p customer-discovery-[company-slug]

Step 3: Run Sources for Selected Tier

Collect all results into a running list. For each customer found, record:

name: Company name
confidence: high / medium / low
source_type: e.g., "logo_wall", "case_study", "g2_review", "press", "job_posting"
evidence_url: URL where the evidence was found
notes: Brief description of the evidence

Quick Sources

1. Website logo wall

Run the scrape_website_logos.py script:

python3 skills/capabilities/customer-discovery/scripts/scrape_website_logos.py \
  --url "[company-url]" --output json

Parse the JSON output and add each result to the customer list.

2. Case studies page

Use WebFetch on the company's case studies page (try /case-studies, /customers, /resources/case-studies). Extract customer names from page headings and content.

3. G2/Capterra reviews

If the review-site-scraper skill is available, use it to find reviewer companies:

python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \
  --platform g2 --url "[g2-product-url]" --max-reviews 50 --output json

First, WebSearch for the company's G2 page: site:g2.com "[company]". Extract reviewer company names from review author info.

4. Web search for press

WebSearch these queries and extract customer mentions from results:

"[company]" customer OR "case study" OR partnership
"[company]" "we use" OR "switched to" OR "chose"

Standard Sources (in addition to Quick)

5. Company blog posts

WebSearch: site:[company-domain] customer OR "case study" OR partnership OR "customer story"

6. Wayback Machine logos

Run the scrape_wayback_logos.py script:

python3 skills/capabilities/customer-discovery/scripts/scrape_wayback_logos.py \
  --url "[company-url]" --output json

Logos marked still_present: false are especially interesting — they indicate former customers.

7. Founder/exec LinkedIn posts

WebSearch: site:linkedin.com "[company]" customer OR "excited to announce" OR "welcome"

8. Twitter/X mentions

WebSearch: site:twitter.com "[company]" "we use" OR "just switched to" OR "loving"

9. Reddit/HN mentions

WebSearch these queries:

site:reddit.com "we use [company]" OR "[company] customer"
site:news.ycombinator.com "[company]" customer OR user

10. Job postings

WebSearch: "experience with [company]" site:linkedin.com/jobs OR site:greenhouse.io OR site:lever.co

Companies requiring experience with the product are likely customers.

11. YouTube testimonials

WebSearch: site:youtube.com "[company]" customer OR testimonial OR review

Deep Sources (in addition to Standard)

12. SEC filings

WebSearch: site:sec.gov "[company]" — Look for mentions in 10-K and 10-Q filings.

13. Podcast transcripts

WebSearch: "[company]" podcast customer OR transcript OR interview

14. GitHub usage signals

WebSearch: site:github.com "[company-package-name]" in dependency files, package.json, requirements.txt, etc.

15. Integration directories

WebFetch marketplace pages where the company lists integrations:

Salesforce AppExchange
Zapier integrations page
Slack App Directory
Any marketplace relevant to the company

16. BuiltWith detection

python3 skills/capabilities/customer-discovery/scripts/search_builtwith.py \
  --technology "[company-slug]" --max-results 50 --output json

17. Crunchbase

WebSearch: site:crunchbase.com "[company]" customers OR partners

Step 4: Deduplicate Results

Merge results by company name using fuzzy matching:

Normalize: lowercase, strip suffixes (Inc, Corp, LLC, Ltd, Co., GmbH)
Treat "Acme Inc" = "Acme" = "ACME Corp" = "acme.com" as the same company
When merging, keep the highest confidence level and all evidence URLs

Step 5: Assign Confidence

Apply these rules:

High confidence:

Logo on current website (from scrape_website_logos.py with confidence "high")
Published case study or customer story
Direct quote or testimonial on the company's site
Official partnership page listing

Medium confidence:

G2/Capterra review (reviewer's company)
Press article mentioning customer relationship
Job posting requiring experience with the product
YouTube testimonial or video review
Logo found only in Wayback Machine (was on site, now removed)

Low confidence:

Single social media mention (tweet, Reddit post)
Indirect reference ("heard good things about X")
BuiltWith detection only (technology on site doesn't mean they're a paying customer)
HN discussion mention

Step 6: Generate Report

Create two output files:

customer-discovery-[company]/report.md:

# Customer Discovery: [Company Name]
 
**Date:** YYYY-MM-DD
**Depth:** quick | standard | deep
**Total customers found:** N
 
## High Confidence (N)
 
| Customer | Source | Evidence |
|----------|--------|----------|
| Shopify | Case study | [link] |
| ... | ... | ... |
 
## Medium Confidence (N)
 
| Customer | Source | Evidence |
|----------|--------|----------|
| ... | ... | ... |
 
## Low Confidence (N)
 
| Customer | Source | Evidence |
|----------|--------|----------|
| ... | ... | ... |
 
## Sources Scanned
 
- Website logo wall: [url] — N customers found
- G2 reviews: N reviews analyzed — N companies identified
- Wayback Machine: N snapshots checked — N logos found (N removed)
- Web search: N queries — N mentions
- ...
 
## Methodology
 
This report was generated using the customer-discovery skill, which scans
public data sources to identify companies that use [Company Name]. Confidence
levels reflect the strength and directness of the evidence found.

customer-discovery-[company]/customers.csv:

CSV with columns: company_name,confidence,source_type,evidence_url,notes

Write the CSV using a code block or Python script.

Scripts Reference

Script	Purpose	Key flags
`scrape_website_logos.py`	Extract logos from current website	`--url`, `--output json\|summary`
`scrape_wayback_logos.py`	Find historical logos via Wayback Machine	`--url`, `--paths`, `--output json\|summary`
`search_builtwith.py`	BuiltWith technology detection (deep mode)	`--technology`, `--max-results`, `--output json\|summary`

All scripts require requests: pip3 install requests

External skill scripts (use if available):

skills/capabilities/review-site-scraper/scripts/scrape_reviews.py — G2/Capterra/Trustpilot reviews (requires Apify token)
skills/capabilities/linkedin-post-research/scripts/search_posts.py — LinkedIn post search (requires Apify token)

Cost

Quick / Standard: Free (uses WebSearch + free APIs like Wayback Machine CDX)
Deep: Mostly free. BuiltWith paid API is optional (--api-key flag); free scraping is used by default.
External skills (review-site-scraper, linkedin-post-research) may require paid API tokens.

What's included

name*: Company name

confidence*: high / medium / low

source_type*: e.g., "logo_wall", "case_study", "g2_review", "press", "job_posting"

evidence_url*: URL where the evidence was found

notes*: Brief description of the evidence

Meta Ads Analyzer

Diagnose Meta Ads campaign performance using Meta's actual system mechanics — Breakdown Effect, Learning Phase, Auction Overlap, Pacing, and Creative Fatigue — and produce structured, testable recommendations that avoid judging segments by average CPA instead of marginal efficiency.

ads

Meta Ad Policy Checker

Pre-flight policy check for Meta ads. Takes ad copy plus advertiser context, resolves and fetches the relevant Meta transparency-center policy pages at runtime, and returns a Pass / Fix Required / Block verdict with cited findings and rewrites.

ads

Ad Lead Quality Analyzer

For paid lead-gen and participant-recruitment ads, replaces vanity CPA with true CAC per qualified lead by joining ad-platform data with downstream funnel events, surfaces tracking gaps, and classifies every creative into Scale / Keep / Investigate / Cut.

Customer Discovery

Customer Discovery

Quick Start

Inputs

Procedure

Step 1: Gather Inputs

Step 2: Create Output Directory

Step 3: Run Sources for Selected Tier

Quick Sources

Standard Sources (in addition to Quick)

Deep Sources (in addition to Standard)

Step 4: Deduplicate Results

Step 5: Assign Confidence

Step 6: Generate Report

Scripts Reference

Cost

What's included

Meta Ads Analyzer

Meta Ad Policy Checker

Ad Lead Quality Analyzer

Learn to build Growth systems with AI