I don't think LLMs for coding productivity are all hype, but I think that for the people who "see the magic" there are many illusions at work, similar to those that catch people who fall prey to an MLM pitch.
You can see all the claims aren't necessarily unfounded, but the lack of guaranteed reproducibility leaves the door open for many caveats in favor of belief for the believer and cynicism for everybody else.
For the believers, if it's not working for one person, it's a skill issue related to providing the best prompt, the right rules, the perfect context and so forth. At what point is this a roundabout way of doing it yourself anyway?
It's important to always maintain the developer role. Don't ever surrender it.
Edit: In concrete terms the workflow is to allow Copilot to make changes, see what's broken, fix those, review the diff against the goal, simplify the changes, etc, and repeat, until the overall task is done. All hands off.
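As a rough sketch of that loop, here it is in Python. Every name below (`Step`, `run_workflow`, the individual callbacks, `max_rounds`) is made up for illustration and is not part of Copilot or any other tool; the callbacks stand in for whatever you actually use to run checks and review a diff.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """Callbacks for one pass of the loop. All names are hypothetical."""
    propose: Callable[[str], None]          # let the assistant change the code toward the goal
    find_breakage: Callable[[], List[str]]  # e.g. run the tests or build, return failure messages
    fix: Callable[[str], None]              # address a single failure
    meets_goal: Callable[[str], bool]       # review the diff against the goal
    simplify: Callable[[], None]            # trim the changes back down


def run_workflow(goal: str, step: Step, max_rounds: int = 10) -> int:
    """Repeat propose -> fix -> review -> simplify until the goal is met.

    Returns the number of rounds used; raises if max_rounds runs out first.
    """
    for round_no in range(1, max_rounds + 1):
        step.propose(goal)
        for failure in step.find_breakage():
            step.fix(failure)
        done = step.meets_goal(goal)
        step.simplify()
        if done:
            return round_no
    raise RuntimeError(f"goal not met after {max_rounds} rounds")
```

The `max_rounds` cap is my own addition: without some limit, "repeat until the overall task is done" has no way to stop when the task never converges.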
The hard parts of engineering have always been decision making, socializing, and validating ideas against cold hard reality. But writing code just got easier so let's do that instead.
Prior to LLMs writing 10 lines of code might have been a really productive day, especially if we were able to thoughtfully avoid writing 1,000 unnecessary lines. LLMs do not change this.
At first I was very enthusiastic and thought Codex was helping me multiplex myself. But you actually spend so much time trying to explain the most obvious things to Codex, and it gets them wrong all the time in some nuanced way, that in the end you spend more time doing things via Codex than by hand.
So I also dialed back Codex usage and went back to doing many more things by hand, because it's just so much faster and much more predictable time-wise.
I don’t have it write any of my Python firmware or Elixir backend stuff.
What I do let it rough in is web front end stuff. I view the need for and utility of LLMs in the html/css/tailwind/js space as an indictment of complexity and inconsistency. It’s amazing that the web front end stuff has just evolved over the years, organically morphing from one thing to another, but a sound well engineered simple-is-best set of software it is not. And in a world where my efforts will probably work in most browser contexts, no surprise that I’m willing to mix in a tool that will make results that will probably work. A mess is still a mess.
I wrote about it: https://kamens.com/blog/code-with-ai-the-hard-way
Whether someone’s litmus test is well-developed is another matter.