Tavily, Exa, Firecrawl, and Parallel, compared

Tavily vs Exa vs Firecrawl vs Parallel: honest side-by-side comparison of the four serious AI search APIs in 2026. Pricing, semantic depth, when to pick which.

The most common comparison question in this category is the wrong question. Builders type "tavily vs exa" or "firecrawl vs tavily" into a search box expecting to find a winner, the way you would compare two laptops. There is no winner. These four tools do mostly different things, and a team that picks one and tries to make it do everything ends up rebuilding the missing capabilities badly.

We have spent the last year running production agents that touch the open web: a Telegram bot doing multi-agent search across Berlin culture, a stem-cell intelligence feed monitoring fresh research, and a thesis dataset of 500+ naturalistic agent queries. Every one of those systems uses at least two of the tools below. The combinations are deliberate.

This guide is organised by the job you are doing, not by vendor. For each job, we name the tool we reach for first and the specific way the others fall short. At the end there is a decision-tree table to bookmark. We cover the four named tools in depth and bring in Perplexity Sonar, Jina, Brave, and Serper where they actually compete.

The thesis is short. "Which search tool is best?" is the wrong question. The right question is which job you are doing. The same team usually needs two of these tools, not one.

This article extends Garden's complete guide to agentic AI search, which covers all eight jobs and eleven tools. This piece is the practical decision layer on top.

The Four Tools, in One Paragraph Each

Before the per-job comparisons, here is the shortest honest description of what each of these companies is actually doing. Vendor pages will not say it this directly.

Exa runs a proprietary vector index of the open web. The only genuine semantic index in this category at web scale. Queries are descriptions of ideal content; the index returns pages whose embeddings are mathematically close. Also has a keyword mode, livecrawl (fetch a page right now, bypassing the index), findsimilar, and a separate Websets product for structured entity lists.

Tavily has no proprietary index. It is an aggregator over Google and Serper with an AI processing layer that returns clean ranked text in a single call. Strong on keyword queries and fresh news. No genuine semantic search. Acquired by Nebius (formerly Yandex N.V.) in February 2026 for up to $400M.

Firecrawl is not a search engine. It's the most reliable browser-based extractor in the category. Renders JS via headless Chrome by default, crawls whole sites, and is the only tool here that can interact with pages (clicking, filling forms, navigating). 106K+ GitHub stars; AGPL-3.0; self-hostable.

Parallel is a deep research engine with its own index. Built for hard multi-step questions, not fast lookups. On BrowseComp it scored 58% to GPT-5's 38% and Exa Research's 14%. Also ships a FindAll API for bulk entity collection and a Monitor product that watches sites for changes on a schedule.

That is the field. Now the jobs.

Job One: Find by Meaning

When the query describes an idea, a content type, or a theme, and the exact words you would search for do not appear in the pages you want.

Best tool: Exa.

This is the job Exa was built for. The mechanism is not marketing: Exa runs a vector index where pages and queries are encoded as embeddings, and proximity in vector space corresponds to proximity in meaning. When you search for "engineering blogs that explain ML intuitively," Exa finds pages whose embeddings cluster near that description. A different operation than counting word matches.

We tested this on a thesis sub-problem: finding essays that argue LLM evaluation benchmarks overestimate generalisation. None of those essays use the literal phrase. Exa neural mode found six of the eight we already had in our corpus, plus four we did not know about. Tavily, Brave, and Claude's built-in web search found zero of them. They returned pages about benchmarks generally, plus the usual SEO clutter.

How the runners-up fail on this job:

Tavily. Returns pages where the literal query terms appear (BM25 over Google with an AI summary on top). For conceptual queries you get whatever Google ranks for your keyword phrasing, usually wrong for ideas. The failure is silent.
Firecrawl. Has a search tool, but it is an aggregator without a semantic index.
Parallel. Understands meaning through its objective parameter, but the system is tuned for multi-step research. Using it for a single semantic lookup pays deep-research prices for a job Exa solves in under a second.
Perplexity Sonar. Hybrid BM25 plus vector retrieval over a 200B+ URL index, so genuinely semantic at the retrieval layer. The catch is that Sonar returns a finished LLM answer, not raw results. For research where you want to see the pages, not an opinion about the pages, wrong shape.
Brave, Serper, Claude native web search. All keyword-only.

One craft note. Semantic search requires a different query grammar. The mental model for Exa is: describe the ideal page. Not "Python ML tutorial" but "practical tutorials aimed at working software engineers applying ML without deep theoretical background, written in 2024 or 2025." Teams that switch to Exa and write keyword-style queries get keyword-style results and conclude semantic search does not work. It works. The query has to match the mode.
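To make the difference in query grammar concrete, here is a minimal sketch that builds two request bodies for the same underlying need. The `query` and `type` field names mirror the parameters mentioned in this guide; the helper `build_search_body` is hypothetical, and the exact request shape is an assumption that should be checked against Exa's API reference before use.

```python
import json


def build_search_body(query: str, mode: str) -> dict:
    """Build a search request body (hypothetical helper; field names assumed)."""
    if mode not in {"neural", "keyword"}:
        raise ValueError(f"unsupported mode: {mode}")
    return {"query": query, "type": mode}


# Keyword-style phrasing: fine for keyword mode, a poor fit for neural mode.
keyword_body = build_search_body("Python ML tutorial", "keyword")

# A description of the ideal page: the grammar neural mode expects.
neural_body = build_search_body(
    "practical tutorials aimed at working software engineers applying ML "
    "without deep theoretical background, written in 2024 or 2025",
    "neural",
)

print(json.dumps(neural_body, indent=2))
```

The only thing that changes between the two bodies is the phrasing and the mode, which is the whole point: the mode and the query grammar have to be chosen together.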

Garden's guide to semantic search covers this job in more depth: the silent failure modes, genuine versus shallow semantic indexes, and when not to use semantic search at all.

Job Two: Find by Exact Match

When you know exactly what you are looking for (a name, a brand, a product, a date, a price) and you need a page where those literal terms appear.

Best tool: Tavily.

This is where Tavily is genuinely best of these four. Tavily wraps Google in one API call, returns clean Markdown with a relevance score, and can include the full raw content of the top pages without a second extraction step. For exact factual queries like "Anthropic Claude API pricing 2026" or "Exa AI Series B funding amount", this is the most efficient call shape in the category. Basic search latency around 180 ms (p50); the topic: "news" plus days combination is the cleanest recency filter of the four.

How the runners-up fail on this job:

Exa. Set type: "keyword" and Exa runs BM25 over its index. Works, but Exa's index updates less frequently than Google's, so for breaking commercial queries Google has content Exa does not. Solid fallback, not first call.
Firecrawl. Not a search engine for this job. Two steps (search plus scrape) where Tavily is one.
Parallel. Massively overkill. Agentic mode runs multiple internal searches when you only needed one.
Brave Search API. Solid alternative if you specifically want an independent index outside Google and Microsoft. Use Brave when independence from Google matters; use Tavily when you just want the answer fast.
Serper. Cheapest raw Google SERP ($0.30 to $1.00 per 1,000) but no page contents. Serper plus downstream LLM reading raw SERP runs about 40% more tokens than Tavily for the same workflow. Wins on per-call price, loses on total cost in most real pipelines.
Perplexity Sonar. Finished answer with citations at 358 ms median latency. Trade-off: you do not see the raw results, which is a bad fit when you want to audit what the agent saw.

One structural risk on Tavily worth flagging. After the Google API shutdown, its dependency on Google through Serper is a real supply-chain exposure. The tool works beautifully today. If Google tightens its tolerance for SERP scraping, that ripples through Tavily with limited warning. Builders who care about resilience should not put 100% of their exact-match traffic on one Google-dependent aggregator.
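One way to avoid putting all exact-match traffic on a single provider is a small fallback wrapper. The sketch below is vendor-neutral: `search_with_fallback` and the two stand-in providers are hypothetical, and in a real system each callable would wrap an actual client (for example Tavily first, then Brave).

```python
from typing import Callable, Sequence

SearchFn = Callable[[str], list[str]]


def search_with_fallback(
    query: str, providers: Sequence[tuple[str, SearchFn]]
) -> tuple[str, list[str]]:
    """Try each provider in order; return the first non-empty result set."""
    errors = []
    for name, search in providers:
        try:
            results = search(query)
        except Exception as exc:  # any provider failure triggers the fallback
            errors.append(f"{name}: {exc}")
            continue
        if results:
            return name, results
        errors.append(f"{name}: empty result")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers for illustration only.
def primary(query: str) -> list[str]:
    raise TimeoutError("upstream unavailable")


def secondary(query: str) -> list[str]:
    return [f"https://example.com/search?q={query}"]


used, hits = search_with_fallback(
    "pricing", [("primary", primary), ("secondary", secondary)]
)
print(used, hits)
```

Returning the provider name alongside the results keeps the fallback visible in logs, so a silent switch to the second provider does not go unnoticed.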

Job Three: Deep Research

When the question is complex enough that no single query will answer it, and the agent has to iterate (search, read, evaluate, pivot, search again) across dozens of sources.

Best tool: Parallel.

This is Parallel's home turf and the benchmarks back it up. On BrowseComp, Parallel scored 58% against GPT-5's 38% and Exa Research's 14%. On DeepResearch Bench, Parallel's Ultra8x processor scored a 96% win rate against the reference. Customers include Clay, Starbridge, Sourcegraph, and Fortune 100/500 firms.

The architecture is the differentiator. Parallel's Task API exposes nine depth tiers from Lite to Ultra8x, priced $5 to $2,400 per 1,000 tasks. Every result includes a Basis: citations, reasoning, and confidence scores. The separate FindAll API uses a three-stage candidate-verify-enrich pipeline for bulk entity collection ("find every dental clinic in Ohio with a 4+ Google rating").

How the runners-up fail on this job:

Exa. Has a /research endpoint but underperforms Parallel on hard tasks (14% vs 58% on BrowseComp). Fine for exploratory research with a few sources; not the right tool for long-tail synthesis.
Tavily. Has a /research endpoint with pro and mini models, useful for lighter multi-hop questions. Built for "search then summarise," not the 100-source synthesis tier.
Firecrawl. /agent (FIRE-1) is better at multi-step browsing tasks involving interaction (navigating filters, multi-step forms) than at breadth-first source gathering. Complementary, not a substitute.
Perplexity Sonar Deep Research. Genuinely the fastest in the category: 3 to 5 minute runs against Parallel Ultra8x's potentially 30-minute runs, on a 200B+ URL proprietary index. Trade-off: speed at the cost of inspectability. You see citations but not the reasoning chain. For a daily competitive monitor, Sonar wins. For research where methodology has to be defensible, Parallel's Basis framework gives you more to audit.
Claude Research (native). Runs on Brave with Anthropic's orchestration, 5 to 15 minutes per run. Good for casual deep research inside Claude.ai. Less depth than Parallel, less speed than Sonar, less control than either.

Deep research is the job where failures are hardest to catch, and output polish is inversely correlated with audit difficulty. A Parallel Ultra8x report looks like a McKinsey deliverable. That polish is a feature when the underlying work is sound and a bug when you are tempted to skip verification because the report sounds confident. Treat every deep research output as the first draft of an investigation, not the investigation.

Job Four: Monitoring and Current Events

When you need to react to fresh content (news, releases, mentions, changes) within minutes or hours of publication.

Best tool: depends on what "monitoring" means.

This is the one job where there is no single winner among the four named tools, because "current events" and "monitoring" are actually two different jobs that sound alike.

For one-shot fresh-content queries ("what was published about X today"): Tavily with topic: "news" and days: 1 is the cleanest call shape. The recency filter is reliable, the index has good news coverage, and you get clean text back in one call. Latency is around 180 ms.

For continuous monitoring (watching sites or topics over time, getting notified when something changes): Parallel Monitor is purpose-built for this. You configure a topic or URL, set a cadence, and it watches and delivers via webhook. None of the other three named tools ships this as a native product.

For genuine real-time streaming (delivery in minutes after publication across thousands of sources): you are out of the four-tool comparison. The right tool is NewsCatcher, whose USP is exactly this: 140,000+ news sources, 100+ countries, delivery in minutes after publication via streaming or polling. NewsCatcher is enterprise-priced with no public tariffs, which is the reason it is not in most builders' default stack.

How the runners-up fail on this job:

Exa. Vector indexes update more slowly than keyword indexes. Exa's index is not real-time. You can use livecrawl: "always" to bypass the index and fetch a page right now, which is excellent when you know the URL. But for "what news broke today" with no URL in hand, Exa is not the right call. Note: livecrawl is API-only, not available through Exa's MCP connector.
Tavily for sustained monitoring. Tavily is a search tool, not a monitoring product. You can poll it on a schedule yourself, but you are building the monitoring layer that Parallel ships natively. For a one-off agent that does a fresh-news sweep, fine. For a production monitor, build it on Parallel Monitor or NewsCatcher.
Firecrawl. Not a search tool. You could schedule a recurring crawl of a known site, which is a different (and useful) shape of monitoring: watching a specific company's blog or pricing page for changes. But Firecrawl is not the tool for "what is happening in the world right now."
Parallel for one-shot news queries. Reverse of the Tavily problem. Parallel's strength is depth. Sending it a single "what happened today" query is paying for a depth tier you do not need.
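If you do build the polling layer yourself, the core of it is change detection between fetches. The following is a minimal sketch under stated assumptions: `fetch` is any callable that returns page text for a URL (a search, scrape, or extract client you supply), and `fingerprint` and `check_for_change` are hypothetical helper names. Scheduling, storage, and notification are left out.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash normalised page text so whitespace-only changes are ignored."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def check_for_change(url: str, fetch, seen: dict) -> bool:
    """Fetch a page, compare it with the last fingerprint, record the new one.

    Returns True only when the page was seen before and its content differs.
    """
    current = fingerprint(fetch(url))
    previous = seen.get(url)
    seen[url] = current
    return previous is not None and previous != current


# Illustration with an in-memory "site" instead of a real fetcher.
pages = {"https://example.com/pricing": "Price: $10"}
seen: dict = {}


def fake_fetch(url: str) -> str:
    return pages[url]


first = check_for_change("https://example.com/pricing", fake_fetch, seen)
pages["https://example.com/pricing"] = "Price: $12"
second = check_for_change("https://example.com/pricing", fake_fetch, seen)
print(first, second)
```

Everything around this function (cadence, retries, webhooks, deduplication across sources) is the part a native monitoring product ships for you.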

A note on what "real time" actually means. In agentic search it is minutes, not seconds. Even livecrawl is a synchronous HTTP fetch with rendering overhead. For news that broke 90 seconds ago, none of these tools reliably has it. NewsCatcher's "minutes after publication" claim is the strongest in the category and the most honest. Anyone selling sub-minute delivery is selling a story.

Job Five: Extract Structured Data from a Known Page

When you already have the URL (from a search, from the user, from another tool) and you need clean text or structured fields from that specific page.

Best tool: Firecrawl, with caveats.

The question splits into two cases depending on how the page is built.

Case A: the page is static HTML. Most documentation, blog posts, news articles, arXiv papers. Any of the four named tools can handle this, and Jina Reader is free for basic use and excellent here. Tavily's tavily_extract reads up to 20 URLs in one call and returns clean Markdown. Exa's web_fetch_exa does the same. There is no quality difference worth defending between them on static pages.

Case B: the page is JavaScript-rendered. This is most of the modern web: React, Vue, Angular, single-page apps where the content only appears after the JS executes. A static HTTP fetch (curl, the simple extractors) returns an empty page. Firecrawl is the tool that handles this reliably. It renders JS via headless Chrome by default, and where Tavily and Jina Reader return empty pages, Firecrawl returns the full content.

For structured extraction specifically (you want named fields like "price," "rating," "release date" from a product page) Firecrawl's /extract API takes a JSON schema and returns structured data via an LLM. Pass schema={name, price, rating} and get a typed object back. This is the cleanest way to extract structured fields from arbitrary pages in the category.
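As an illustration of what "pass a schema, get a typed object back" involves, here is a sketch of a JSON Schema for the three fields named above, plus a small local check on whatever object comes back. How the schema is submitted to Firecrawl is not shown and should be taken from its API docs; `PRODUCT_SCHEMA` and `missing_or_mistyped` are hypothetical names, and the checker covers only the two types used here, not the full JSON Schema specification.

```python
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "rating": {"type": "number"},
    },
    "required": ["name", "price", "rating"],
}

# Only the JSON Schema types used in PRODUCT_SCHEMA are mapped here.
_PY_TYPES = {"string": str, "number": (int, float)}


def missing_or_mistyped(obj: dict, schema: dict) -> list[str]:
    """List required fields that are absent and fields with the wrong type."""
    problems = []
    for field in schema["required"]:
        if field not in obj:
            problems.append(f"missing: {field}")
    for field, spec in schema["properties"].items():
        if field not in obj:
            continue
        value = obj[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[spec["type"]]):
            problems.append(f"wrong type: {field}")
    return problems


good = {"name": "Widget", "price": 9.99, "rating": 4.5}
bad = {"name": "Widget", "price": "9.99"}
print(missing_or_mistyped(good, PRODUCT_SCHEMA))
print(missing_or_mistyped(bad, PRODUCT_SCHEMA))
```

Checking the returned object locally is cheap insurance: an LLM-backed extractor can return a well-formed object with a field missing or a price as a string.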

How the runners-up fail on this job:

Tavily extract. Solid on static pages. Less reliable on JS-heavy sites. The failure mode is silent: you get back a page that looks valid but is missing the dynamically rendered content.
Exa web_fetch. Same story as Tavily. Excellent for clean static pages, weaker on JS-heavy ones. No structured extraction by schema.
Parallel Extract. Available, works, but Parallel is priced for deep research workloads. Using its extract endpoint for routine page reading is overpaying.
Jina Reader. Free, fast, excellent for static pages. Cannot handle JS-rendered content. Cannot do structured schema extraction.
Claude Code with Playwright. A consequence of Claude Code being able to write and run code. You can write a script that renders any page. The cost is engineering time and ongoing maintenance. Useful when you need behaviour Firecrawl cannot do (very custom selectors, specific session state). Not the default.
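Because the JS failure mode is silent, a cheap guard is worth having before content reaches the model. The heuristic below is a rough sketch, not a detector any of these vendors ship: it assumes you have the raw HTML from a static fetch, and it flags pages that carry scripts but almost no visible text. The 200-character threshold and the name `looks_unrendered` are arbitrary, illustrative choices.

```python
from html.parser import HTMLParser


class _TextCounter(HTMLParser):
    """Count visible text characters and <script> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.visible_chars = 0
        self.script_tags = 0
        self._skip_depth = 0  # >0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            if tag == "script":
                self.script_tags += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def looks_unrendered(html: str, min_visible_chars: int = 200) -> bool:
    """True when a page has scripts but very little visible text."""
    parser = _TextCounter()
    parser.feed(html)
    parser.close()
    return parser.script_tags > 0 and parser.visible_chars < min_visible_chars


spa_shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
static_page = "<html><body><p>" + "word " * 100 + "</p></body></html>"
print(looks_unrendered(spa_shell), looks_unrendered(static_page))
```

A page flagged this way is a candidate for re-fetching through a rendering extractor rather than proof that content is missing.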

One thing worth saying directly. Firecrawl is the most powerful page-interaction tool in this category, which means it is also the tool with the most potential to do things publishers would not consent to if asked. Be deliberate about what you point it at.

The Supporting Cast: Perplexity, Jina, Brave, Serper

These four tools are part of the live conversation but are not really in the same comparison as Tavily, Exa, Firecrawl, and Parallel. Quick honest descriptions.

Perplexity Sonar API. A proprietary 200B+ URL index with hybrid BM25 plus vector retrieval. Returns finished LLM answers with citations at 358 ms median latency. No MCP connector. Best when you want the fastest path from question to cited answer and do not need to inspect raw results. Valuation sits above $21B. A Cloudflare investigation surfaced evidence of stealth crawlers using forged user agents on Perplexity's side. If source consent matters to your stack, this matters.

Brave Search API. The only major independent keyword index outside Google and Microsoft with an open API. 40B+ pages, 100M+ daily updates. Pure index, no AI layer. $5 per 1,000 calls. After the August 2025 Google and Bing API shutdowns, half the agentic search ecosystem now leans on Brave for the "independent keyword index" slot. No MCP connector.

Serper. The cheapest way to access Google SERP at $0.30 to $1.00 per 1,000 calls. Pure proxy, no AI layer, no page contents. Tavily itself uses Serper as a backend. As a standalone, Serper makes sense for high-volume keyword scraping where you already have downstream extraction infrastructure. For most builders, Tavily is the better total cost once you account for downstream LLM tokens.

Jina AI. Not a search system but a model stack: Reader API (clean text extraction from any URL, free for basic use), Embeddings API (the v5 line is SOTA in class), Reranker API. The right components when building your own RAG pipeline. Acquired by Elastic in October 2025.

MCP versus API: The Access Mode That Quietly Changes the Answer

A pattern shows up across most of these tools that buyers often miss: the MCP version is significantly stripped down compared to the direct API. Not a defect. A deliberate product decision, and it affects which tool wins for which job.

MCP (the Model Context Protocol that Anthropic introduced) is what lets Claude (and other LLMs) talk to external tools through a standard interface. When you click "Connect Exa" or "Connect Parallel" inside Claude.ai, MCP is the plumbing. The benefit is real: one click, no API keys in your code, no infrastructure to manage. The trade-off is real too. Compare the surface areas:

| Tool | API surface | MCP surface | What is missing in MCP |
| --- | --- | --- | --- |
| Exa | `/search` with neural/keyword toggle, `/findsimilar`, `/answer`, `/research`, Websets, livecrawl, domain and date filters | Two tools: `web_search_exa`, `web_fetch_exa` | No filters, no livecrawl, no Websets, no `/research`, no explicit neural/keyword toggle |
| Parallel | 6 products: Search, Task (9 depth tiers), Extract, Monitor, Chat, FindAll | One tool: `web_search_preview`, agentic mode only | No processor selection (Lite, Standard, Pro, Ultra8x), no FindAll, no Monitor, no domain or date filters |
| Firecrawl | `/extract` with JSON schema, `/agent` (FIRE-1), Spark AI models | scrape, crawl, map, interact, search | No `/extract` with schema, no autonomous agent, no Spark |
| Tavily | Adds `include_answer`, `/research` endpoint, higher rate limits | tavily_search, tavily_extract, tavily_crawl, tavily_map | Mostly parity; missing `/research` |

The practical implication. For casual use, MCP is plenty. For production agents that need fine-grained control (date filters, domain scoping, livecrawl, depth selection, structured extraction), the API is the only way.
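The surface gap can be turned into a quick pre-flight check. The sketch below encodes the "missing in MCP" column from the comparison above as sets and reports which of your required features force a move to the direct API. The feature labels are shorthand invented for this example, not vendor identifiers, and the data is only as current as this guide.

```python
# Features this guide lists as absent from each tool's MCP surface.
# Labels are illustrative shorthand, not official names.
MISSING_IN_MCP = {
    "exa": {"filters", "livecrawl", "websets", "research", "mode_toggle"},
    "parallel": {"processor_selection", "findall", "monitor", "filters"},
    "firecrawl": {"schema_extract", "agent", "spark"},
    "tavily": {"research"},
}


def needs_direct_api(tool: str, required: set[str]) -> set[str]:
    """Return the required features that the tool's MCP surface lacks."""
    return required & MISSING_IN_MCP[tool]


blockers = needs_direct_api("exa", {"search", "livecrawl", "filters"})
print(sorted(blockers))
```

An empty result means MCP is enough for that job; anything else is the list of reasons to wire up the API.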

There is a subtler point. MCP tends to push tools toward an agentic mode: the tool decides how to do the work. APIs tend to expose explicit knobs. That trade-off (autonomy versus control) is the deeper story behind the surface gap. When Parallel exposes only web_search_preview through MCP and lets the tool itself choose depth and breadth, that is coherent with how MCP is meant to be used. But if you need Ultra8x specifically (say, for a research engagement where the methodology has to be defensible) you need the API.

The honest framing: choosing MCP is choosing convenience and giving up auditability. The tool is more autonomous, which means the failure modes are more autonomous. For research workflows where you have to defend the methodology, that trade is worth thinking about carefully, not defaulting through. For internal experimentation and "let me see what this tool can do" exploration, MCP is exactly right.

One more catch. Several of these tools mark MCP as the canonical "consumer" path and only document the full API in developer docs. If you sign up, install MCP, and conclude "this tool does not have livecrawl" or "this tool does not let me pick depth," check the API docs before you write it off. The feature is probably there. It is just not exposed through MCP.

Pricing, Honestly

A direct comparison of prices across these tools is misleading without a fixed workload. The pricing models diverge enough that the cheapest tool per call can be the most expensive total cost in your actual pipeline.

| Tool | Pricing model | Typical price | What you actually pay for |
| --- | --- | --- | --- |
| Exa | Per call, separate for search and contents | $5/1K search; $1/1K contents; Websets separate | Each search call; each page fetched |
| Tavily | Per credit | $0.005 to $0.008 per credit; 1 credit = 1 search; free tier 1,000/month | Each search call (with content included in advanced mode) |
| Firecrawl | Per credit; 1 credit = 1 page | From $16/month for 3,000 credits | Each page scraped or crawled |
| Parallel | Variable by product | Search $4 to $9/1K; Task $5 to $2,400/1K depending on processor | Depth, not just call count |
| Perplexity Sonar | Per token plus per search | $1/M input + $5/M output + $5/1K searches | Tokens flowing through the LLM answer |
| Brave | Per call | $5/1K; free tier 2,000/month | Raw search calls |
| Serper | Per call | $0.30 to $1.00/1K; free tier 2,500/month | Raw SERP calls only |
| Jina Reader | Per token; basic free | Free tier; token-based premium | Page extraction tokens |

Two things to internalise from this table.

Cheapest per call is not cheapest per workflow. Serper at $0.30/1K is the lowest sticker price in the category, but a RAG pipeline built on raw Serper consumes roughly 40% more downstream LLM tokens than the same pipeline built on Tavily, because Serper does not return page contents.

Depth costs more than breadth. Parallel's Ultra8x at up to $2.40 per single task is not in the same pricing category as Tavily at $0.008 per search. Sending the wrong volume of queries through the wrong depth tier is the most common cost mistake we see in production agents.

Pricing also shifts after acquisitions (Tavily by Nebius, February 2026; Jina by Elastic, October 2025). Architect your system so you can swap tools at the abstraction layer.
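The per-call versus per-workflow point is easy to check with a few lines of arithmetic. The search prices and the 40% token overhead below come from this guide; the baseline of 10,000 downstream tokens per query and the $3 per million input tokens are assumptions picked purely for illustration, so substitute your own workload numbers.

```python
def cost_per_1k_queries(
    search_price_per_call: float,
    tokens_per_query: float,
    llm_price_per_m_tokens: float,
) -> float:
    """Total cost in USD of 1,000 queries: search calls plus LLM input tokens."""
    search_cost = 1000 * search_price_per_call
    llm_cost = 1000 * tokens_per_query / 1_000_000 * llm_price_per_m_tokens
    return search_cost + llm_cost


BASE_TOKENS = 10_000  # assumed downstream tokens per query on the Tavily path
LLM_PRICE = 3.00      # assumed USD per million input tokens

tavily_total = cost_per_1k_queries(0.008, BASE_TOKENS, LLM_PRICE)
serper_total = cost_per_1k_queries(0.30 / 1000, BASE_TOKENS * 1.4, LLM_PRICE)

# Depth tiers: $2,400 per 1,000 Ultra8x tasks is $2.40 per task.
ultra8x_per_task = 2400 / 1000

print(f"Tavily: ${tavily_total:.2f}  Serper: ${serper_total:.2f}")
print(f"Ultra8x per task: ${ultra8x_per_task:.2f}")
```

Under these assumptions the Tavily pipeline costs about $38 per 1,000 queries and the Serper pipeline about $42.30, despite Serper's far lower sticker price. With a smaller token budget or a cheaper model the comparison flips, which is exactly why the workload has to be fixed before prices are compared.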

When This Comparison Is the Wrong Question

This is the section the rest of the internet skips. There are real cases where the answer is "none of these tools," and being honest about them saves builders months of misdirected effort.

You actually need a vector database, not a web search tool. If your data is your own (your company's documents, your customer support history, your internal wiki) none of the tools above is the answer. You need an embeddings pipeline plus a vector store (Pinecone, Weaviate, Qdrant, pgvector). Confusing "semantic search of the open web" with "semantic search of my own data" is the most common architectural mistake we see in audits. Exa indexes the public internet. It does not index your Notion.

Your primary sources are behind paywalls or in proprietary databases. Vector indexes of the open web cannot reach paywalled academic platforms, clinical trial databases, regulatory filings, financial databases, primary legal documents. If your research workflow lives in arXiv, PubMed, SEC EDGAR, FRED, patents, or clinical trials, Valyu is the only tool that gives unified API access to both open web and these proprietary databases. None of Tavily, Exa, Firecrawl, or Parallel solves this problem.

You need to interact with a logged-in product (Salesforce, Notion, Linear). None of the tools above does authenticated browser automation against your own SaaS stack. The right answer is purpose-built MCP servers for those products, or in-house Playwright.

The job is one specific website, scraped daily. If you just need one site (a competitor's pricing page, one e-commerce vendor) crawled cleanly, you do not need an agentic search stack. A small scraper or Firecrawl in single-site mode is enough.

You need sub-minute freshness. Tavily's 180 ms latency is the response time of a single search; the content underneath was indexed minutes to hours ago. If your job genuinely requires sub-minute freshness (algorithmic trading, breaking-news arbitrage) you are in a different infrastructure category entirely.

Your team is two people and you have not built anything yet. Do not start with the comparison. Pick one tool that fits your most pressing job, ship it, and only expand when you hit a specific limit. Builders who pre-optimise tool selection before they have a working system usually never ship.

The Decision-Tree Table

This is the table to bookmark. Match your job in the left column. The columns to its right name the tool we would reach for first, the runner-up worth knowing, the tools to skip for that job, and a quick note.

| Your job | First choice | Runner-up | Skip these for this job | Quick note |
| --- | --- | --- | --- | --- |
| Find by meaning (conceptual query) | Exa (neural mode) | Perplexity Sonar (if you want a finished answer) | Tavily, Brave, Serper, Claude native | Only Exa has a real semantic web index |
| Find by exact match (named entity, fact) | Tavily | Brave (if independence from Google matters) | Parallel (overkill), Exa neural (use keyword mode if you stay in Exa) | One call, clean text back |
| Deep research (multi-step synthesis, 20+ sources) | Parallel (Ultra8x for hardest tasks) | Perplexity Sonar Deep Research (3 to 5x faster) | Tavily, Firecrawl, Exa | Parallel wins on hard benchmarks; Sonar wins on speed |
| One-shot news lookup (today, this week) | Tavily with `topic: news` | Exa with livecrawl (API only) | Parallel (depth you do not need), Brave (works but no AI layer) | Tavily's `days` filter is reliable |
| Continuous monitoring (track topic or site over time) | Parallel Monitor | NewsCatcher (if scale and real-time matter) | Tavily, Exa, Firecrawl | Tavily can be polled, but you are rebuilding what Parallel ships |
| Real-time news streaming (140K+ sources) | NewsCatcher | n/a | All four named tools | Out of the four-tool comparison |
| Read a known URL (static HTML) | Tavily extract or Exa web_fetch | Jina Reader (free tier) | Firecrawl (works, but slower and pricier for static pages) | Any of them works; pick by stack |
| Read a known URL (JavaScript-heavy site) | Firecrawl | Claude Code with Playwright (custom cases) | Tavily, Exa, Jina | Firecrawl renders JS by default |
| Extract structured fields by schema | Firecrawl `/extract` | Parallel Extract (premium) | Tavily, Exa | Pass a JSON schema, get a typed object |
| Crawl an entire site | Firecrawl | Tavily crawl (less reliable on JS) | Exa, Parallel | Firecrawl is purpose-built for this |
| Interact with a page (click, fill, navigate) | Firecrawl `interact` | Claude Code with Playwright | All search tools | Only Firecrawl ships this natively |
| Build structured lists of entities (e.g. all SaaS companies in healthcare) | Exa Websets or Parallel FindAll | Firecrawl extract plus an LLM | Tavily, Perplexity | Both are purpose-built for entity collection |
| Semantic search over your own data | None of these tools | n/a | All of them | You need a vector DB (Pinecone, Weaviate, Qdrant, pgvector) |
| Open web plus paywalled databases (arXiv, PubMed, SEC) in one call | Valyu | n/a | All four named tools | Only tool with unified open + closed access |
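For teams that want the table inside their agent rather than in a bookmark, the first-choice column can be encoded as a small routing map. This is a sketch only: the job keys are hypothetical labels invented here, the values are copied from the table, and a real router would return a client object instead of a string.

```python
# First-choice tool per job, copied from the table above.
# The job keys are illustrative labels, not an established taxonomy.
FIRST_CHOICE = {
    "find_by_meaning": "Exa (neural mode)",
    "find_by_exact_match": "Tavily",
    "deep_research": "Parallel",
    "one_shot_news": "Tavily with topic: news",
    "continuous_monitoring": "Parallel Monitor",
    "read_static_url": "Tavily extract or Exa web_fetch",
    "read_js_heavy_url": "Firecrawl",
    "schema_extraction": "Firecrawl /extract",
    "crawl_site": "Firecrawl",
    "page_interaction": "Firecrawl interact",
}


def first_choice(job: str) -> str:
    """Look up the first-choice tool for a job, failing loudly on unknown jobs."""
    try:
        return FIRST_CHOICE[job]
    except KeyError:
        known = ", ".join(sorted(FIRST_CHOICE))
        raise ValueError(f"unknown job {job!r}; expected one of: {known}") from None


print(first_choice("read_js_heavy_url"))
```

Keeping the mapping in one place is also what makes tools swappable at the abstraction layer: changing a vendor means changing one entry, not every call site.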

Three cross-cutting principles that come out of this table:

Most teams need two tools, not one. The split between "find by meaning" and "find by exact match" alone forces a two-tool stack for any team doing both kinds of research. Add deep research or extraction and you are at three. Trying to make one tool cover all five jobs is the most common architectural mistake in this category.

Match the job to the tool's actual capability, not its marketing. Tavily's homepage emphasises "AI-optimised search." Exa's emphasises "semantic search." Parallel's emphasises "deep research." These are accurate but lossy descriptions. The fine-grained truth is in the API docs. Read those before you decide.

Verify outputs across all tools. Every tool here can fail silently. Every one of them returns confident output even when the underlying retrieval missed important sources. Build verification into your workflow: sample claims against known ground truth, spot-check citations, monitor for behaviour changes over time. The industry will not do this for you, because the demos look better without it.
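Sampling against known ground truth can be as simple as a recall check over URLs you already trust. The sketch below mirrors the shape of the Job One test (six of eight known essays found, four unfamiliar results); the URLs are placeholders and `retrieval_recall` is a hypothetical helper, not part of any vendor SDK.

```python
def retrieval_recall(returned_urls, known_relevant):
    """Return (recall against the known set, results outside the known set)."""
    returned = set(returned_urls)
    known = set(known_relevant)
    found = returned & known
    recall = len(found) / len(known) if known else 0.0
    unfamiliar = returned - known
    return recall, unfamiliar


# Placeholder data shaped like the Job One test: 8 known essays,
# 6 of them returned, plus 4 results not in the corpus.
known = [f"https://example.org/essay-{i}" for i in range(8)]
returned = known[:6] + [f"https://example.net/new-{i}" for i in range(4)]

recall, unfamiliar = retrieval_recall(returned, known)
print(f"recall={recall:.2f}, unfamiliar results to review={len(unfamiliar)}")
```

Results outside the known set are not automatically wins; they are the items a human still has to read before they count as discoveries rather than noise.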

FAQ

Which is better, Tavily or Exa? Different jobs. Tavily wins for exact-match queries where you know the words. Exa wins for conceptual queries where you do not. Most production agents need both. If you can only pick one, most teams overestimate how much of their work is conceptual. Tavily is the more common right answer for first-tool teams.

Is Firecrawl a search engine? No. It is an extraction and interaction tool. It scrapes pages, crawls sites, and drives browsers. It does have a search tool, but that is an aggregator, not a proprietary index. Use Firecrawl when you have a URL and need its contents, especially if the site is JavaScript-heavy.

Is Parallel worth the price for normal research tasks? Usually not. Parallel's depth tiers go up to $2.40 per Ultra8x task, built for genuinely hard multi-step questions. For day-to-day lookups, Tavily or Exa are an order of magnitude cheaper and faster.

Do I need a separate tool, or can Claude search the web on its own? Claude's native web_search runs on Brave (keyword index) and is enough for simple lookups. For semantic search, JS-heavy pages, deep research, or structured entity collection, the specialised tools are significantly better.

What is the difference between MCP and API for these tools? MCP gives you one-click Claude integration with a narrower feature surface. The direct API exposes filters, depth controls, and specialised endpoints not available through MCP. For Exa, Parallel, and Firecrawl in particular, MCP is significantly stripped down. For production work, you almost always need the API.

What changed in the market in 2025 and 2026? Google and Bing shut down their search APIs in August 2025, making independent indexes (Exa, Perplexity, Brave, Parallel, NewsCatcher) strategically more valuable. Tavily was acquired by Nebius in February 2026 for up to $400M. Jina was acquired by Elastic in October 2025. The infrastructure layer is consolidating.

Closing

One principle from this guide is load-bearing: the same team usually needs two of these tools, not one. The question is which two, and the answer depends on the job mix in your actual workflow.

Most builders we talk to have already picked a default (usually Tavily, because it was the first one with a clean Claude integration) and are about to hit the moment when that default does not work for a specific query type. Adding a second tool that fits the missing job (usually Exa for conceptual queries, Firecrawl for JS-heavy sites, or Parallel for deep research) takes their agent from "works for the easy half" to "actually does the job."

The harder choice, the one this guide cannot make for you, is whether you have correctly identified the jobs your agent actually does. That is what a Garden knowledge-layer audit is for. We map the queries your team runs, the sources they need to reach, the failures you have not noticed yet, and we tell you which tools fit which slot, without selling them. We do not take referral fees from any vendor in this guide.

If you are figuring out which combination of these tools your team needs, that is the conversation we have in a Garden audit. Email a@gardenresearch.eu.

This guide is part of Garden Research's investigation into how AI agents navigate the open web. The complete agentic AI search guide covers all eight jobs and eleven tools in the current infrastructure layer. The guide to semantic search goes deeper on Job One.