
Posted by sbpayne 6 hours ago

If DSPy is so great, why isn't anyone using it?(skylarbpayne.com)
172 points | 104 comments | page 2
tcdent 3 hours ago|
DSPy is cool from an integrated perspective but as someone who extensively develops agents, there have been two phases to the workflow that prevented me from adopting it:

1. Up until about six months ago, modifying prompts by hand and incorporating terminology with very specific intent and observing edge cases and essentially directing the LLM in a direction to the intended outcome was somewhat meticulous and also somewhat tricky. This is what the industry was commonly referring to as prompt engineering.

2. With the current state of SOTA models like Opus 4.6, the agent that is developing my applications alongside of me often has a more intelligent and/or generalized view of the system that we're creating.

We've reached a point in the industry where smaller models can accomplish tasks that were reserved for only the largest models. And now that we use the most intelligent models to create those systems, the feedback loop which was patterned by DSPy has essentially become adopted as part of my development workflow.

I can write an agent and a prompt as a first pass using an agentic coder, and then, based on my agentic coder's observation of the agent's performance, continue to iterate on my prompts until I arrive at satisfactory results. This is further supported by all of the documentation, specifications, data structures, and other I/O aspects of the application that the agent integrates with, which the coding agent can take into account when constructing and evaluating agentic systems.

So DSPy was certainly onto something, but the level of abstraction, at least in my personal use case, has moved up a layer instead of being integrated into the actual system.

sbpayne 3 hours ago|
I think many people have the same experience! And that's the point I'm trying to make. There are patterns here that are worth adopting, whether or not you're using Dspy :)
ndr 5 hours ago||
It's not as ergonomic as they made it out to be.

The fact that you have to bundle input+output signatures and everything is dynamically typed (sometimes into the args) just makes it annoying to use in codebases that have type annotations everywhere.

Plus their out-of-the-box agent loop has been a joke for the longest time, and writing your own is feasible, but it's night and day compared with getting something done in pydantic-ai.

Too bad because it has a lot of nice things, I wish it were more popular.

sbpayne 5 hours ago|
Yeah! I can agree with this. There's some improved ergonomics to get here
verdverm 5 hours ago||
Have you looked at ADK? How does it compare? Does it even fit in the same place as Dspy?

https://google.github.io/adk-docs/

Disclaimer: I use ADK and haven't really looked at Dspy (though I have heard of it before). ADK certainly addresses all of the points you have in the post.

sbpayne 5 hours ago||
I personally haven't looked super closely at ADK. But I would love if someone more knowledgeable could do a sort of comparison. I imagine there are a lot of similar/shared ideas!
verdverm 5 hours ago||
There are dozens if not 100s of agent frameworks in use today, 1000s if you peruse /new. I'm curious what features will make for longevity. One thing about ADK is that it comes in four languages (Py, TS, Go, Java; so far), which means understanding can transfer over/between teams in larger orgs, and they can share the same backing services (like the db to persist sessions).
sethkim 4 hours ago||
We build a product that's somewhat similar in spirit to DSPy, but people come to us for different reasons than the OP listed here.

1) It's slow: you first have to get acquainted with DSPy and then get hand-labeled data for prompt optimization. This can be a slow process, so it's important to label only the cases that are ambiguous, not the obvious ones.

2) They know that manual prompt engineering is brittle, and want a prompt that's optimized and robust against a model they're invoking, which DSPy offers. However, it's really the optimizer (ex. GEPA) doing the heavy-lifting.

3) They don't actually want a model or prompt at all. They want a task completed, reliably, and they want that task to not regress in performance. Ideally, the task keeps improving in production.

Curious if folks in this thread feel more of these pains than the ones in the article.
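
Point 1 above (label the ambiguous cases, not the obvious ones) can be sketched roughly as follows. This is only an illustration under stated assumptions: `pick_ambiguous`, the `band` parameter, and the idea of a per-example confidence score in [0, 1] are hypothetical and not taken from DSPy or any product mentioned in this thread.

```python
from typing import Dict, List


def pick_ambiguous(scores: Dict[str, float], band: float = 0.2) -> List[str]:
    """Return example ids whose score lies within `band` of 0.5, closest first.

    `scores` maps an example id to a hypothetical confidence in [0, 1],
    e.g. a classifier probability or an LLM-judge score (an assumption).
    """
    near_boundary = [
        (abs(score - 0.5), example_id)
        for example_id, score in scores.items()
        if abs(score - 0.5) <= band
    ]
    # Sort by distance from the decision boundary so the least certain come first.
    return [example_id for _, example_id in sorted(near_boundary)]


# Made-up scores for illustration only.
scores = {"a": 0.98, "b": 0.55, "c": 0.10, "d": 0.35, "e": 0.48}
to_label = pick_ambiguous(scores)
```

Here only "e", "b", and "d" would be sent for hand-labeling; the confident cases "a" and "c" are skipped.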

sbpayne 4 hours ago|
I think in some sense, this is the real thing everyone wants. Everything else is kind of an implementation detail! Would be really curious to see what you're building!
sethkim 4 hours ago||
Feel free to shoot me a note at seth@sutro.sh if you want to check it out!
CraftingLinks 5 hours ago||
I used dspy in production, then reverted the bloat as it literally gave me nothing of added value in practice but a lot of friction when I needed precise control over the context. Avoid!
matusp 4 hours ago|
I enjoy working with it. I mostly just use it to define the input and outputs more programmatically compared to raw prompts.
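
For readers who want to see what defining inputs and outputs programmatically might look like without any framework, here is a minimal sketch using only the standard library. It is not DSPy's API: `TicketInput`, `TicketOutput`, and `render_prompt` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TicketInput:
    # Hypothetical input fields for a support-ticket classifier.
    subject: str
    body: str


@dataclass(frozen=True)
class TicketOutput:
    # Hypothetical output fields the model is asked to produce.
    category: str
    urgent: bool


def render_prompt(inp: TicketInput) -> str:
    """Build the prompt text from the declared input and output fields."""
    given = "\n".join(f"{f.name}: {getattr(inp, f.name)}" for f in fields(inp))
    wanted = ", ".join(
        f"{f.name} ({getattr(f.type, '__name__', f.type)})"
        for f in fields(TicketOutput)
    )
    return f"{given}\nReturn: {wanted}"


prompt = render_prompt(TicketInput(subject="Refund", body="Charged twice"))
```

The prompt wording then lives in one function, while the fields stay visible to type checkers.
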
Silamoth 2 hours ago||
Am I the only one disappointed this was about some LLM slop and not digital signal processing? DSP is a well-established technical acronym, so I expected to hear about a new Python DSP library. Oh well.
whinvik 3 hours ago||
I don't get it. All these are provided by many different agent libs like langgraph, Pydantic AI etc. I thought DSPy was for prompt optimization but I could never wrap my head around that aspect since like Langchain, DSPy seems to hide stuff a bit too much.

So this article seems surprising since it emphasizes more the non prompt optimization aspects. If that was the selling point I would rather use something like Pydantic AI when I already use Pydantic for so much of the rest.

sbpayne 3 hours ago|
I think the reality is that prompt optimization is one of the only "legible benefits" (i.e., easy to understand why it's valuable).

But I think it misses the point of what Dspy "is". It's less that Dspy is about prompt optimization and more that Dspy encourages you to design your systems in a way that better _enables_ optimization.

You can apply the same principles without Dspy too :)

panelcu 5 hours ago||
https://www.tensorzero.com/docs has similar abstractions but doesn't require Python and doesn't require committing to the framework or a language. It's also pretty hard to onboard, but solves the same problems better and makes evaluating changes to models / prompts much easier to reason about.
TheTaytay 1 hour ago||
Yes, I was more impressed with their decoupling of prompts from parameters!
sbpayne 5 hours ago||
I saw this some time ago! I personally have a distaste for external DSLs as I think it generally introduces complexity that I don't think is actually worthwhile, so I skipped over it. Also why I'm very "meh" on BAML.
GabrielBianconi 4 hours ago||
TensorZero works with the OpenAI SDK out of the box:

```
from openai import OpenAI

# Point the client to the TensorZero Gateway
client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    # Call any model provider (or TensorZero function)
    model="tensorzero::model_name::anthropic::claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": "Share a fun fact about TensorZero.",
        }
    ],
)
```

You can layer additional features only as needed (fallbacks, templates, A/B testing, etc).

pjmlp 5 hours ago||
Never heard of it, that is already a reason.
sbpayne 5 hours ago|
hahaha this is true!
deepsquirrelnet 4 hours ago||
Good article, and I think the "evolution of every AI system" is spot on.

In my opinion, the reason people don't use DSPy is because DSPy aims to be a machine learning platform. And like the article says -- this feels different or hard to people who are not used to engineering with probabilistic outputs. But these days, many more people are programming with probability machines than ever before.

The absolute biggest time sink and 'here be dragons' of using LLMs is poke and hope prompt "engineering" without proper evaluation metrics.

> You don’t have to use DSPy. But you should build like someone who understands why it exists.

And this is the salient point, and I think it's very well stated. It's not about the framework per se, but about the methodology.
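
As a rough illustration of the methodology (score prompt variants against labeled examples instead of poking and hoping), here is a minimal sketch. Everything in it is an assumption for illustration: `variant_a` and `variant_b` are keyword-based stand-ins for two prompt variants that would normally wrap an LLM call, and the four labeled examples are made up.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)


def accuracy(predict: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples where the prediction exactly matches the label."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, expected in examples if predict(text) == expected)
    return correct / len(examples)


# Stand-ins for two prompt variants; in practice each would call a model.
def variant_a(text: str) -> str:
    return "positive" if "good" in text else "negative"


def variant_b(text: str) -> str:
    return "negative" if "not" in text else variant_a(text)


examples = [
    ("good service", "positive"),
    ("not good at all", "negative"),
    ("terrible", "negative"),
    ("really good", "positive"),
]

# Score every variant on the same examples and keep the best one.
results = {name: accuracy(fn, examples) for name, fn in [("a", variant_a), ("b", variant_b)]}
best = max(results, key=results.get)
```

With a fixed example set and metric like this, a prompt change becomes a measurable comparison rather than a guess, whether or not DSPy is involved.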

sbpayne 4 hours ago|
yeah this is the main point I wanted to get across! I rarely recommend people to use Dspy; but I think Dspy is often so polarizing that people "throw out the baby with the bathwater". They decide not to use Dspy, but also don't learn from the great ideas it has!
_andrei_ 4 hours ago|
Almost all the points are not about what DSPy is mainly supposed to offer. What it's supposedly great at is automatic optimization; for everything else... who the hell puts Python in production just to make some API calls? There are "frameworks" available in all the better languages, but the constructs behind them are not that complicated. And why does DSPy even try to compete with LangChain/Graph/crap?
sbpayne 3 hours ago|
I think automatic optimization is valuable, but it's not what Dspy "is"; you can see this consistently through @lateinteraction's tweets.

And hopefully it's clear enough from the post: I'm not necessarily suggesting people use Dspy, just that there are important lessons to take with you, even if you don't use it :)
