
Posted by mtricot 4 days ago

Show HN: Airbyte Agents – context for agents across multiple data sources

I’m Michel, co-founder and CEO of Airbyte (https://airbyte.com/). We’ve spent the last six years building data connectors. Today we're launching Airbyte Agents (https://docs.airbyte.com/ai-agents/), a unified data layer for agents to discover information and take action across operational systems.

Here’s a quick walkthrough: https://www.youtube.com/watch?v=ZosDytyf1fg

As agents move into real workflows, they need access to more tools (e.g. Slack, Salesforce, Linear). That means a ton of API plumbing: authentication, pagination, filtering, schema handling, and matching entities across systems.

Most MCPs don’t fix this. They’re thin wrappers over APIs, so agents inherit their weak primitives and still get it wrong most of the time, especially when working across tools.

An even deeper issue is that APIs assume you already know what to query (think endpoints, object IDs, fields), whereas agents usually start one step earlier: they first need to discover what matters before they can even start reasoning.

So we built Airbyte Agents to be a context layer between your Agents and all of your data. The core of this is something we call Context Store: a data index optimized for agentic search, populated by our replication connectors. All that work on data connectors the last six years comes in handy here!

This gives agents a structured way to discover data, while still allowing them to read and write directly to the upstream system when needed.
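To make the discovery-vs-paging distinction concrete, here's a toy sketch in Python (entirely hypothetical; none of these names are Airbyte's actual API) contrasting an agent that must page through a raw API with one that queries a pre-built index:

```python
# Hypothetical sketch: paging a raw API vs. querying a pre-built context index.

# Raw-API style: the agent must page through everything to find a record.
PAGES = [[{"id": i, "name": f"acct-{i}"} for i in range(p * 100, (p + 1) * 100)]
         for p in range(10)]

def find_via_api(name):
    calls = 0
    for page in PAGES:                      # one API call per page
        calls += 1
        for record in page:
            if record["name"] == name:
                return record, calls
    return None, calls

# Context-store style: records are already indexed for direct lookup.
INDEX = {r["name"]: r for page in PAGES for r in page}

def find_via_index(name):
    return INDEX.get(name), 1               # a single indexed query

record, api_calls = find_via_api("acct-777")
indexed, index_calls = find_via_index("acct-777")
assert record == indexed
print(api_calls, index_calls)   # the indexed path skips the paging loop entirely
```

Every page the agent walks through in the first path is tokens burned in its context window; the second path answers in one step.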

What got us working on this was an insane trace from an agent we were migrating to our new SDK. It was supposed to answer "which customers are at risk of leaving this quarter?" The trace had 47 steps, most of them API calls. The agent first had to find a bunch of accounts, then map them to the right customers, then look for tickets, bla bla... and when the agent finally responded, the answer sounded ok, but was wrong. Not only that, it was excruciatingly slow. So we had to do something about it.

That 47-step agent is one example of a question where Airbyte Agents does particularly well. Other examples:

- “Show me all enterprise deals closing this month with open support tickets.”
- “Find every support ticket that doesn’t have a GitHub issue opened.”

Some of these might sound simple, but the quality of the answer changes dramatically when the agent doesn’t have to assemble all that context at runtime.

Once we had an early version of the product, I spent a weekend building a benchmark harness to see if it worked (also for fun; I like writing benchmarks :)). I compared calling the Airbyte Agents MCP vs calling a bunch of vendor MCPs directly, testing both retrieval and search.

For the sake of simplicity, I used token consumption as the unit of measure. I think that’s a good proxy for how well agents are working: a failing agent (like the one that took 47 steps) will churn through lots of tokens while getting nowhere, while a successful one will get straight to the point.
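As a rough illustration of the measurement idea (a hedged sketch, not the actual harness; the record shapes here are made up), compare how much context a full raw API payload consumes versus a filtered record, using the common ~4-characters-per-token approximation:

```python
import json

# A made-up "full API response" style record vs. a slimmed-down one.
full_record = {"id": 42, "subject": "Login broken", "status": "open",
               "description": "x" * 2000,
               "raw_api_payload": {"k": list(range(200))}}
filtered_record = {"id": 42, "subject": "Login broken", "status": "open"}

def approx_tokens(obj):
    # ~4 characters per token is a common rule of thumb for English JSON
    return len(json.dumps(obj)) // 4

full = approx_tokens(full_record)
slim = approx_tokens(filtered_record)
savings = 1 - slim / full
print(f"full={full} slim={slim} savings={savings:.0%}")
```

The real harness measures actual model token usage across full agent runs, but the intuition is the same: bloated responses multiply across every step of a multi-step trace.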

Here's what I found when measuring: for Gong, it used up to 80% fewer tokens than their own MCP, for Zendesk up to 90% fewer, for Linear up to 75%, and for Salesforce up to 16% (Salesforce’s own SOQL does a good job here).

Of course there is the usual obvious bias: we are the builders of what we are benchmarking. So we made the test harness public: https://github.com/airbytehq/airbyte-agents-benchmarks. Feel free to poke at it, and please tell us what you find if you do!

It's still early and some parts are rough, but we wanted to share this with the community asap. We'd love to hear from people building agents:

- Are you indexing data ahead of time, or letting the agent call APIs live?
- How are you matching entities across systems?

Would also love to hear any thoughts, comments, or ideas of how we could make this better, and if there are obvious things we’re missing. For now, we’re excited to keep building!

149 points | 47 comments
ck_one 3 days ago|
More and more SaaS companies like ServiceNow or HubSpot are creating new tollgates for agent API calls. How do you think this will impact Airbyte Agents? I guess that replicating data locally will be harder, since the platforms will try to protect it or charge for it.
aaronsteers 2 days ago|
It's a good question and I won't pretend to predict the future on this one. I will say, I think Airbyte Agents is in a good position because our core Data Replication product has always had to mitigate the impacts of rate limiting and cumbersome upstream APIs. The new Agents toolset gives you the ability to query the upstream APIs directly (read: as a passthrough) while also letting you bypass them entirely when your agent can answer its question via the Context Store directly. Time and feedback from our users will confirm, but I do think this gives our customers a good balance of control - when to query upstream directly and when to utilize the Context Store to work around API limitations - whether inherent or artificially enforced by the vendor.
mtricot 4 days ago||
Just want to call out a couple of nuances in our methodology. In general, we tried our best to do apples-to-apples comparisons where we could, and gave ourselves a discount where we couldn’t. Unsurprisingly, it’s a challenge to find MCPs for various vendors (which is another reason we are trying to solve this). Here’s a video walkthrough of the benchmark harness: https://www.loom.com/share/9d96c8c64c1a4b7fad0356774fc54acc

Where the comparison wasn't valid or not apples-to-apples:

Gong and Zendesk: no official native MCP exists, so we used the most popular community implementations we could find. We were only able to benchmark Gong Search as the Gong MCP does not have a Get tool call.

While our Search testing yielded the same number of records on either path, vendor-specific search implementations mean results aren’t identical. Contents are similar in general, so the ratios remain directionally correct.

The general test set:

2 scenarios (Retrieval and Search) across 4 connectors isn’t a huge test set. While we hope to extend this over time, we’ve made the harness public so anyone can contribute in the meantime. Let us know if you find any MCP with better results!

Where the vendor MCP wins or ties:

Salesforce showed the smallest win at 16%. This is primarily because Salesforce, unlike many vendors, provides great search support out of the box with SOQL.

We see identical records for Get. As noted, Search returns different record sets of identical size. Airbyte uses fewer tokens because the Salesforce records contain mandatory metadata (type and url).

Where the vendor MCP is costly to context:

Zendesk is a great example of this. The extreme gap is because the Zendesk MCP (reminder - a community alternative) returns the entire API response in search results. This averages to 9KB per record against our production Zendesk account!

Airbyte’s implementation provides filtering, which allows agents to retrieve the minimal data needed to achieve the outcome, explaining the drastic gap.

ecares 4 days ago||
Did you find that some data model patterns were easier to detect for some LLMs? I'm curious how training might have made some agents better at graph navigation, for instance.
aaronsteers 4 days ago|
AJ here, from Airbyte.

Yes, we've definitely found that some API data models are easier for models to navigate than others.

The largest factors of agent inefficiency we've identified so far are:

1. Many APIs lack robust-enough search, forcing agents to page through hundreds or thousands of paginated responses until they find the record they're looking for (our Context Store addresses this).

2. Many APIs have huge response sets. Our MCP helps handle this by letting the agent decide exactly which fields to return.

3. With our SDK, you can literally build your own MCP on top of any source we support (50+ right now, and growing). This is super powerful, and allows you to build more ergonomic MCP servers and tools, even if the underlying data models are not intuitive or easy for the LLM to leverage directly.
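Point 2 (field selection) can be sketched in a few lines. This is a hypothetical illustration of the idea, not Airbyte's actual MCP code:

```python
# Toy sketch: let the caller project records onto only the fields it needs,
# instead of receiving the full API response every time.

def get_records(records, fields=None):
    """Return records projected onto the requested fields (all if None)."""
    if fields is None:
        return records
    return [{k: r[k] for k in fields if k in r} for r in records]

tickets = [
    {"id": 1, "subject": "Billing bug", "status": "open", "body": "..." * 500},
    {"id": 2, "subject": "Login issue", "status": "solved", "body": "..." * 500},
]

# The agent asks only for what it needs; the huge "body" field never
# enters its context window.
slim = get_records(tickets, fields=["id", "status"])
print(slim)  # [{'id': 1, 'status': 'open'}, {'id': 2, 'status': 'solved'}]
```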

Combining all three of these, we see that the vast majority of challenges can be addressed via a strong system prompt for guidance. Fine-tuning could get you further, but you'd still want your fine-tuned model to build on this same foundation, since the efficiencies will transfer across use cases and models.

@ecares - Does this answer your question? What do you think?

woeirua 4 days ago||
Your point about search being a bottleneck is spot on. IMO, search APIs should return guidance to agents to help them winnow down the results faster. For example, if your query returns 1000 results, then it should tell the agent, "too many results, we recommend you filter on column X because of Y to improve your search. Here are the possible values in column X: ..."
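That suggestion could look something like this sketch (all names hypothetical): the search tool detects an over-broad query and, instead of dumping rows, returns a low-cardinality column the agent can filter on along with its possible values:

```python
MAX_RESULTS = 10

def best_filter_column(hits, max_card=10):
    # Prefer a low-cardinality column the agent can actually enumerate;
    # skip unique-ish columns like IDs, which make useless filter hints.
    card = {c: len({r[c] for r in hits}) for c in hits[0]}
    usable = [c for c in card if 1 < card[c] <= max_card]
    return min(usable, key=lambda c: card[c]) if usable else None

def guided_search(rows, predicate):
    hits = [r for r in rows if predicate(r)]
    if len(hits) <= MAX_RESULTS:
        return {"results": hits}
    col = best_filter_column(hits)
    return {
        "results": [],
        "guidance": f"too many results ({len(hits)}); try filtering on '{col}'",
        "values": sorted({r[col] for r in hits}),
    }

rows = [{"id": i, "region": ["emea", "amer", "apac"][i % 3]} for i in range(100)]
resp = guided_search(rows, lambda r: True)
print(resp["guidance"], resp["values"])
```

The key design choice is that the error path is itself informative: the agent's next call can be precise instead of another blind retry.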
carefulfungi 4 days ago|||
There are a lot of APIs like this that I really wish would expose downloading a parquet file instead of trying to implement server-side filtering and reporting query features.
aaronsteers 3 days ago||
+1

Working with APIs is often frustrating, and the worst ones are terribly inefficient. Our Agent SDK and Agent Context Store insulate you and your agent from this headache, allowing you to query the synced datasets directly.

The feedback about wanting to download a parquet file is super interesting...

aaronsteers 3 days ago|||
Glad to hear this resonates with you also. We're aiming to give agents more control over their context, and easier access paths regardless of the source system.
afxuh 3 days ago||
Congrats, you built an ETL pipeline and called it an agent. The industry has come full circle.
davinchia 3 days ago|
Haha indeed!

On a more serious note, just as swyx mentioned in a comment further up, we do believe a lot of the challenges of reliably operationalising agents boil down to data. All of which is non-obvious to AI engineers (besides Frontier Labs gathering/generating data for model training).

What the right shape is - we are all figuring it out. Happy to trade notes.

xcf_seetan 3 days ago||
Shameless plug: I have written a paper about using the MCP server architecture to let agents overcome the knowledge cutoff and work with software released after their training data ends.

[https://zenodo.org/records/19925469]

ritonlajoie 4 days ago||
Hi Michel, congrats! I have nice memories of working with you on Lafayette Street! Keep up the good work on Airbyte! :)
mtricot 3 days ago|
Great to see you here!
smadam9 2 days ago||
What are the main differences to Glean? My company is evaluating Glean and I feel like Airbyte is a strong alternative (at least for some use cases).

How does Airbyte handle data authorization?

Tsarp 3 days ago||
Doesn't Skills solve all of this?

OpenClaw, Hermes and other agents have already made skill adoption mainstream?

Are you guys still seeing a future where people are dumping entire MCP tool defs into context?

aaronsteers 3 days ago|
Great question, @Tsarp - Skills and tools work great together. What we've found is that agents generally need both to achieve great results. We're actually not trying to replace skills, but to give them new superpowers.

Are there any examples you've run into where skills were missing tools (or data) that they needed for a specific task?

Tsarp 3 days ago||
Hmm, hoping this isn't a generic LLM generated response.

Skills have the scripts folder and you can precisely describe when and when not to use a script. This can end up directly wrapping API(s), CLIs, generic scripts or even other MCP servers.

CC and codex both have the skill creator and you can have them build the skill for you.

Haven't run into any scenarios where skills were missing tools. 1-2 iterations and it's usually taken care of quite quickly.

aaronsteers 3 days ago||
Hey, fair enough. (100% human here, btw.) I think I misread your original question as asking "why do we need a service (whether accessed via API/SDK/MCP/etc.) vs just having skills (markdown + scripts)?"

If you are already leveraging skills as scripts and APIs in your skills, then you understand the distinction. I'll attempt to re-answer your question with now hopefully a better understanding:

I think Airbyte Agents helps your agent by giving access to data across any and all of the systems it may need to get data from, or write data to. While you could hit the service APIs directly (via REST/CLI/etc.), in practice we find that not all use cases are amenable to this. Airbyte Agents does have REST APIs as well as SDKs and of course the MCP interface - so it's not really about MCP tools specifically, more about how you can access the data. The Airbyte Agents interface also reduces the number of creds that the agent needs to handle, giving a single portal (with logging and audit capabilities) for all the actions your agent is taking.

Sorry for the red herring of skills-v-tools. Let me know if you have any additional questions!

pjm331 4 days ago||
sounds very familiar to what I ended up doing on my internal system - especially anything to do with search - much better to just sync everything to a DB and give the agent access to the DB
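That sync-then-query pattern can be sketched with stdlib SQLite (a toy illustration, not pjm331's actual system):

```python
import sqlite3

# Imagine these records came from a periodic sync job against a vendor API.
synced = [
    (1, "Acme", "enterprise", "open"),
    (2, "Globex", "smb", "solved"),
    (3, "Initech", "enterprise", "open"),
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tickets (id INT, account TEXT, tier TEXT, status TEXT)")
db.executemany("INSERT INTO tickets VALUES (?, ?, ?, ?)", synced)

# The agent's "which enterprise accounts have open tickets?" becomes one query
# instead of a chain of paginated API calls:
rows = db.execute(
    "SELECT account FROM tickets WHERE tier = 'enterprise' AND status = 'open'"
).fetchall()
print(rows)  # [('Acme',), ('Initech',)]
```

The trade-off is freshness: the DB only knows what the last sync saw, which is why a passthrough path to the live API is still useful.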
aaronsteers 3 days ago|
That's great to hear - great minds think alike!

> give the agent access to the DB

This is where Airbyte really can shine, I think, and the whole can be more than the sum of its parts. Because Airbyte excels at data replication already, we can populate your Agent Context Store without users or agents ever needing to think about the words "ELT" or "ETL".

We're listening carefully to feedback so we hope you will give it a try and let us know how it goes! Thanks!

pjm331 3 days ago||
yeah this is one of the few AI-related products that I have seen that make sense to me

but I also wonder to what extent this needs to be its own thing, or whether it just looks like something we need when really people just need to shovel more stuff into their data warehouse / data lake than they ever had reason to before, because now that's all fodder for agentic search

aaronsteers 2 days ago||
Great point. Many of Airbyte's customers are doing just that - adding new sources to their warehouses - like Google Drive, Gong, and a ton of sources that weren't as interesting previously for data analytics. But this creates a ton of work for the data engineering teams - to not only load all that extra data, but to deal with rate limits and then to conform the schemas into a usable format after loading.

For now, I think it's 100% appropriate to think of the Context Store as complementing the warehouse, not replacing it per se. We're evaluating future integration options between the new Context Store and the traditional data warehouse, but nothing has been publicly announced as of now. I think both approaches have their strengths and killer use cases.

tomrod 3 days ago|
What actions do Agents enable that weren't already available from Airbyte?
aaronsteers 3 days ago|
The new Airbyte Agents offering brings a ton of new capabilities actually.

1. Programmatic interfaces: a new REST API, SDK, and MCP Server.

2. New action verbs: not just replication anymore. We have get/set/list/update/upload, and more!

3. Credentials passthrough: for all of the above, you OAuth to Airbyte and we OAuth on your behalf to the systems your agent needs. No need to provide your agents dozens of different secrets to access those systems.

4. Context Store: like your agents' own data warehouse, but completely automatic and hands-free. For those use cases that just aren't possible when calling the REST APIs directly.

Again - thanks for your comment and sorry for the longwinded response. More info here: https://docs.airbyte.com/ai-agents/
