Posted by nurimamedov 2 hours ago

Auto-compact not triggering on Claude.ai despite being marked as fixed(github.com)
164 points | 119 comments
btown 48 minutes ago|
Claude Code's only saving grace is that it's pretty good from a fresh session - it can largely find and re-load into context what it needs to load. If I see my context ticking down, I ask it to give me a summary and TODO list, and either copy it, or have it put that into a docstring of what it's working on. Then just start a fresh session on that file. Shouldn't need to do this, for sure, but it gets it done in a pinch.

My largest gripe with Claude Code, and with encouraging my team to use it, is that checkpoints/rollbacks are still not implemented in the VS Code GUI, leading to a wildly inconsistent experience between terminal and GUI users: https://github.com/anthropics/claude-code/issues/10352

nojs 22 minutes ago||
> checkpoints/rollbacks are still not implemented in the VS Code GUI

Rollbacks have been broken for me in the terminal for over a month. It just didn’t roll back the code most of the time. I’ve totally stopped using the feature and instead just rely on git. Is this the case for others?

gpm 15 minutes ago||
I've been using /rewind in claude code (the terminal, not using vscode at all) quite a bit recently without issue - if that's the feature you're asking about.

Not discounting at all that you might "hold it" differently and have a different experience. E.g. I basically avoid letting claude code interact with the VCS at all - and I could easily see VCS interaction being a source of bugs with this sort of feature.

nojs 12 minutes ago||
I mean double tapping escape, going back up the history, and choosing the “restore conversation and code” option. Sometimes bits of code are restored, but rarely all changes.

It worked when first released but hasn’t for ages now.

kuboble 44 minutes ago||
Anecdotal evidence of course but I have one long-running session in a terminal for over a month now. I work with it daily, compacts several times a day, I rollback conversation sometimes. All with no issues.
system2 31 minutes ago||
Unsure what your use case is, but compaction makes it lose an enormous amount of context. Claude Code is better used on a task-by-task basis; otherwise things get bad. The whole purpose of init and CLAUDE.md is to prevent long chats from losing context and to approach things more surgically.
kuboble 10 minutes ago||
I'm fully aware of that.

For the last month I've been working on a relatively big feature in a larger project.

I often compact the session when starting a new feature, often have to remind claude to read the claude.md etc. I still use it as if it was a new session regularly, it frequently doesn't remember what it did an hour ago, etc.

But the compact seems to work which is a very different experience than the one of the GP, who kills the session when it reaches the context limit and writes explicit summary files.

jampa 1 hour ago||
Slightly off topic, but does anyone feel that they nerfed Claude Opus?

It's screwing up even in very simple rebases. I got a bug where a value wasn't being retrieved correctly, and Claude's solution was to create an endpoint and use an HTTP GET from within the same back-end! Now it feels worse than Sonnet.

All the engineers I asked today have said the same thing. Something is not right.

eterm 1 hour ago||
That is a well recognised part of the LLM cycle.

A model or new model version X is released, everyone is really impressed.

3 months later, "Did they nerf X?"

It's been this way since the original chatGPT release.

The answer is typically no, it's just your expectations have risen. What was previously mind-blowing improvement is now expected, and any mis-steps feel amplified.

quentindanjou 1 hour ago|||
This is not always true. LLMs do get nerfed, and quite regularly: usually because the provider discovers users are using them more than expected, because of user abuse, or simply because the model attracts a larger user base. One recent nerf was Gemini's context window, which was drastically reduced.

What we need is an open and independent way of testing LLMs and stricter regulation on the disclosure of a product change when it is paid under a subscription or prepaid plan.

landl0rd 1 hour ago|||
There's at least one site doing this: https://aistupidlevel.info/

Unfortunately, it has paywalled most of the historical data since I last looked at it, but it's interesting that Opus has dipped below Sonnet on overall performance.

Analemma_ 1 hour ago|||
> What we need is an open and independent way of testing LLMs

I mean, that's part of the problem: as far as I know, no claim of "this model has gotten worse since release!" has ever been validated by benchmarks. Obviously benchmarking models is an extremely hard problem, and you can try and make the case that the regressions aren't being captured by the benchmarks somehow, but until we have a repeatable benchmark which shows the regression, none of these companies are going to give you a refund based on your vibes.

jampa 1 hour ago||||
I usually agree with this. But I am using the same workflows and skills that were a breeze for Claude, but are causing it to run in cycles and require intervention.

This is not the same thing as an "omg vibes are off": it's reproducible. I am using the same prompts and files, and getting way worse results than with any other model.

eterm 1 hour ago||
When I once had that happen in a really bad way, I discovered I had written something wildly incorrect into the readme.

It has a habit of trusting documentation over the actual code itself, causing no end of trouble.

Check your claude.md files (both local and ~user ) too, there could be something lurking there.
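The check eterm suggests can be sketched as a small shell loop. The paths are assumptions based on a default Claude Code install (the user-level memory file typically lives at `~/.claude/CLAUDE.md`); adjust to your setup.

```shell
# Hypothetical helper: surface every CLAUDE.md that could be steering the model,
# i.e. project-level files near the repo root plus the user-level one.
for f in $(find . -maxdepth 3 -name 'CLAUDE.md' 2>/dev/null) "$HOME/.claude/CLAUDE.md"; do
  [ -f "$f" ] && echo "found: $f"
done
true  # keep the exit status clean even when nothing is found
```

Reading through whatever this prints is a quick way to spot a stale or wildly incorrect instruction the model may be trusting over the code.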

Or maybe it has horribly regressed, but that hasn't been my experience, certainly not back to Sonnet levels of needing constant babysitting.

spike021 29 minutes ago||||
Eh, I've definitely had issues where Claude can no longer easily do what it's previously done. That's with constant documenting things in appropriate markdown files well and resetting context here and there to keep confusion minimal.
kachapopopow 37 minutes ago|||
They're A/B testing on the latest Opus model; sometimes it's good, sometimes it's worse than Sonnet, which is annoying as hell. I think they trigger it when you have excessive usage or high context use.
landl0rd 1 hour ago|||
I've observed the same random foreign-language characters (I believe chinese or japanese?) interspersed without rhyme or reason that I've come to expect from low-quality, low-parameter-count models, even while using "opus 4.5".

An upcoming IPO increases pressure to make financials look prettier.

epolanski 1 hour ago||
Not really.

In fact as my prompts and documents get better it seems it does increasingly better.

Still, it can't replace a human. I really do need to correct it a lot, and if I try to one-shot a feature I always end up spending more time refactoring it a few days later.

Still, it's a huge boost to productivity, but the day it can take over without detailed info and oversight is far away.

paulhebert 1 hour ago||
I tried using Claude Code this week because I have a free account from my work.

However when I try to log in via CLI it takes me to a webpage with an “Authorize” button. Clicking the button does nothing. An error is logged to the console but nothing displays in the UI.

We reached out to support who have not helped.

Not a great first impression

hobofan 1 hour ago||
Sadly their whole frontend seems to be built without QC and mostly blindly assuming a happy path.

For the claude.ai UI, I've never had a single deep research properly transition (and I've done probably 50 or so) to its finished state. I just know to refresh the page after ~10mins to make the report show up.

roywiggins 1 hour ago||
It's had enormous problems in Firefox. For me it would reliably hang the entire tab.

https://github.com/anthropics/claude-code/issues/14222

attheicearcade 1 hour ago||
Do you have API access (platform.claude.com) rather than Claude code (claude.ai)? I had similar issues trying to get Claude CLI working via the second method, not knowing there’s a difference
smithkl42 15 minutes ago||
This is an N of 1, of course, but I can relate to the other folks who've been expressing their frustration with the state of Claude over the last couple weeks. Maybe it's just that I have higher expectations, but... I dunno, it really seems like Claude Code is just a lot WORSE right now than it was a couple weeks ago. It has constant bugs in the app itself, I have to babysit it a lot tighter, and it just seems ... dumber somehow. For instance, at the moment, it's literally trying to tell me, "No, it's fine that we've got 500 failing tests on our feature branch, because those same tests are passing in development."
copirate 1 hour ago||
There's also this issue[1] with about 300 participants about limits being reached much more quickly since they stopped the 2x limit for the holidays. A few people from Anthropic joined the conversation but didn't say much. Some users say they solved the issue by creating a new account or changing their plan.

[1] https://github.com/anthropics/claude-code/issues/16157

codazoda 1 hour ago||
Something to check is whether you’re opted into the test for the 1M context window. A co-worker told me this happened to them: they were burning a lot more tokens in the beta. Creating a new account tracking with this makes sense (but it is obviously the nuclear option).

I recently put a little money on the API for my personal account. I seem to burn more tokens on my personal account than my day job, in spite of using AI for 4x as long at work, and I’m trying to figure out why.

cheschire 32 minutes ago|||
Super, and for those of us on the annual plan, I guess I just accept my new reality
MicKillah 1 hour ago||
I can definitely vouch for that being the case, for me.
boringg 1 hour ago||
Oh, is this what's been happening? I've been trying to ask questions on a fairly long context window and history, but it fails. No response: it kind of acknowledges it received the input, but then reprints the last output, and then that whole dialogue is essentially dead ... same issue? It's happened multiple times; quite frustrating.

Just a pro sub - not max.

Most of the time it gives me a heads up that I'm at 90%, but a lot of the time it just fails with no warning, and I assumed I had hit the max.

kilroy123 1 hour ago||
Same here. Very bad and frustrating user experience to see nothing.
kingkawn 1 hour ago||
I’ve also been encountering this behavior, coupled with rapidly declining length of use for a pro account now below an hour, and weekly limits getting hit by Wednesday despite achieving very little other than fixing its own mistakes after compressions.
jimnotgym 12 minutes ago||
I have not been coding for a few years and wondered if vibe coding would help me get past procrastination. Is Claude Code the best option this week?
jvanderbot 11 minutes ago|
Depends, do you like CLI tools? Or IDE integration?

I like cli tools, and claude is generally considered a very good option for that.

I have a coworker who likes codex better.

jimnotgym 4 minutes ago||
I like a light editor with syntax highlighting and basic linting. Last time I was coding regularly I used VS code, but had only the default plugins. I only used it for basic text input. I always ran git and my code from the terminal. Does that help?
VerifiedReports 1 hour ago||
The VS Code plug-in is broken on Windows. The command-line interface is broken on Windows.

I just signed up as a paying customer, only to find that Claude is totally unusable for my purposes at the moment. There's also no support (shocker), despite their claims that you'll be E-mailed by the support team if you file a report.

brookst 39 minutes ago|
I use the CLI on two different Windows machines for many hours a day and have seen no sign of it being broken.

What symptoms do you see? There are some command line parameters for reinstall / update that might be worth trying.

Retr0id 28 minutes ago||
Tangentially related: I would like to report a low-severity security vulnerability in Claude (web version), but I can't be bothered to go through the Hackerone formalities, since I don't care about a bounty.

Right now I'm defaulting to "do nothing" because I'm lazy, but if any Anthropic staff are reading this I'm happy to explain the details informally somewhere.

cheriot 1 hour ago|
I love CC, but there are so many bugs. Even the intended behavior is a mess: CC's VS Code UI bash tool stopped using my .zshrc, so now it runs the wrong version of everything.
kilroy123 41 minutes ago|
This is the case for all AI tools right now. Sooo bad.

Cursor, Claude code, Claude in the browser, and don't even get me started on Gemini.

mbm 27 minutes ago||
Codex is a bit better bug-wise but less enjoyable to use than CC. The larger context window and the superiority of GPT 5.2 over Opus make it mostly worth it to switch.