Automate dataset collection with Peacock Data

Peacock Data searches the web, crawls pages, analyzes their semantic content, and builds a specialized dataset of relevant online content for your application.

AI-Enhanced Web Crawling

Targeted Web Crawling

Crawl the web starting from search queries or from the specific URLs you want to collect data from.

Semantic Filtering

Smart content extraction and filtering ensure your crawl datasets only include relevant content, saving time and money.

Captcha Bypass

Automatic captcha detection and resolution help you collect the highest-quality data from your crawls.

Scheduled Execution

Run crawl subjects once or on a recurring schedule to keep your data current.

Integrated directly with Python

Import the peacockdata package to schedule crawls and load your crawl datasets straight from your Python environment.

Crawl Data Notebooks

Build peacockdata right into your pipelines to analyze web crawl data.

  • Download crawl artifacts
  • Replay web pages
  • Run SQL queries over crawl metadata
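As a rough illustration of the kind of SQL analysis the last item describes, here is a minimal sketch that uses Python's built-in sqlite3 module on a small in-memory table. The `crawl_pages` table, its columns, and the sample rows are hypothetical and chosen only for illustration; the actual Peacock Data metadata schema and query interface may look different.

```python
import sqlite3


def build_sample_metadata() -> sqlite3.Connection:
    """Create an in-memory table of crawl metadata (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE crawl_pages (
            url          TEXT PRIMARY KEY,
            status       INTEGER,
            content_type TEXT,
            bytes        INTEGER
        )
        """
    )
    # Illustrative rows only; not real crawl output.
    rows = [
        ("https://example.com/", 200, "text/html", 5120),
        ("https://example.com/about", 200, "text/html", 2048),
        ("https://example.com/missing", 404, "text/html", 512),
        ("https://example.com/report.pdf", 200, "application/pdf", 90000),
    ]
    conn.executemany("INSERT INTO crawl_pages VALUES (?, ?, ?, ?)", rows)
    return conn


def summarize_by_status(conn: sqlite3.Connection) -> list:
    """Count pages and total bytes per HTTP status code."""
    query = """
        SELECT status, COUNT(*) AS pages, SUM(bytes) AS total_bytes
        FROM crawl_pages
        GROUP BY status
        ORDER BY status
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_sample_metadata()
    for status, pages, total_bytes in summarize_by_status(conn):
        print(status, pages, total_bytes)
```

A query like this is a quick way to spot failed fetches or unexpectedly large responses before the crawl data moves further down a pipeline.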


Use your crawl data to build a vector database for RAG applications, fine-tune an LLM, or monitor the web for events.

Features

Find a plan that's right for you

Get started with a free plan to experiment with the API and start testing out your ideas. Scale up at low cost as you grow.

Free Bird
$ 0 /mo
For developers experimenting with crawl data pipelines.
Features include:
  • 24 hours of crawling per month
  • 10 GiB of crawl data storage
  • Unlimited captcha bypass
  • Up to 1,000 crawl subjects
Most Popular
Peacock Pro
$ 39 /mo
For teams deploying crawl data pipelines into production.
Everything in Free Bird, plus:
  • Crawl at $0.003/minute
  • Store data at $0.025/GiB
  • Up to 1,000,000 crawl subjects
  • Priority crawl scheduling
Big Data
$ 199 /mo
For teams running advanced crawl data pipelines.
Everything in Peacock Pro, plus:
  • Bring your own S3 storage
  • Unload to Apache Iceberg
  • Unload to Snowflake
  • Unlimited crawl subjects
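
To get a feel for how the Peacock Pro rates add up, the sketch below estimates a monthly bill from the listed prices. It rests on several assumptions that the plan descriptions do not confirm: that the Free Bird allotment (24 hours of crawling, 10 GiB of storage) carries over to Peacock Pro, that the metered rates apply only to usage beyond that allotment, and that storage is billed per GiB per month. Treat the result as a back-of-the-envelope figure, not a quote.

```python
# Prices taken from the Peacock Pro plan listing.
PRO_BASE_USD = 39.00
CRAWL_USD_PER_MINUTE = 0.003
STORAGE_USD_PER_GIB = 0.025

# ASSUMPTION: the Free Bird allotment is included in Peacock Pro
# and metered rates apply only above it.
INCLUDED_CRAWL_MINUTES = 24 * 60
INCLUDED_STORAGE_GIB = 10


def estimate_pro_monthly_cost(crawl_minutes: float, storage_gib: float) -> float:
    """Estimate a Peacock Pro monthly bill in USD under the assumptions above."""
    billable_minutes = max(0.0, crawl_minutes - INCLUDED_CRAWL_MINUTES)
    # ASSUMPTION: storage is billed per GiB per month.
    billable_gib = max(0.0, storage_gib - INCLUDED_STORAGE_GIB)
    total = (
        PRO_BASE_USD
        + billable_minutes * CRAWL_USD_PER_MINUTE
        + billable_gib * STORAGE_USD_PER_GIB
    )
    return round(total, 2)


if __name__ == "__main__":
    # 100 hours of crawling and 50 GiB stored in one month.
    print(estimate_pro_monthly_cost(crawl_minutes=100 * 60, storage_gib=50))
```

Under these assumptions, 100 hours of crawling and 50 GiB of storage come to the $39 base fee plus $13.68 for crawling and $1.00 for storage.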

Resources to help you get the most out of Peacock Data

Get started with Peacock Data

It only takes a few minutes to get your first crawl dataset going.