Posted by shmublu 2 hours ago
The new discovery here is that gearbox problems mess up a machine learning system: it tries to track gearbox noise and uses up all its learning capacity on that. This means that robotics people can tap machine learning funding for motor and gearbox development. Robotics labs used to be really low-budget operations. No longer.
What you really want is a direct drive motor, but those have to be large-diameter. They can be flat; that's a pancake motor. That's too large for fingers. So their compromise moves partly in that direction: the rotor is flatter, torques are higher, speeds are slower, and gearbox ratios are lower. As they point out, reflected inertia scales with the square of the gear ratio, because the gear ratio applies once going out and once coming back. So this is a bigger than linear win.
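To put a number on the square law, here is a minimal sketch assuming an ideal, lossless gearbox. The ratios 288 and 15 are borrowed from the article's "From 288 to 15" heading on the assumption that they are gear ratios; the rotor inertia is a made-up placeholder and is held constant only so the comparison isolates the ratio.

```python
def reflected_inertia(rotor_inertia: float, gear_ratio: float) -> float:
    """Rotor inertia as felt at the output of an ideal gearbox.

    Speed is divided by gear_ratio and torque is multiplied by it,
    so the inertia seen from the output side picks up the ratio twice.
    """
    return rotor_inertia * gear_ratio ** 2


# Hypothetical rotor inertia in kg*m^2 (placeholder, not from the article).
ROTOR_INERTIA = 1e-7

# Assumed gear ratios, taken from the "From 288 to 15" heading.
high = reflected_inertia(ROTOR_INERTIA, 288)
low = reflected_inertia(ROTOR_INERTIA, 15)
improvement = high / low

print(f"288:1 -> {high:.3e} kg*m^2")
print(f" 15:1 -> {low:.3e} kg*m^2")
print(f"reduction factor: {improvement:.1f}")
```

The ratio itself drops by a factor of 19.2, but the reflected inertia drops by 19.2 squared, roughly 369. A real redesign would also change the rotor, so treat this as the scaling argument only.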
Good back-drivability means much less risk of gear breakage on overload. Some of the academic designs, such as harmonic drives and series elastic actuators, have huge gear ratios in a small space. That's OK for prototypes but not production. As I've mentioned before, "you cannot strip the teeth of a magnetic field", a line from a GE electric locomotive salesman around 1900. If an overload forces a motor backwards, nothing breaks.
Would have been nice to hear more about the motor design. That's the real achievement here. There are CAD tools which understand electromagnetic fields now, so designing strange motor geometries is not as much of a trial-and-error, experience-driven process as it once was. It's also respectable for an EE to work on rotating machinery again. That field matured around the 1960s, and until computers took over motor control, didn't change much.
Key concept: force-based motor control works quite well. Preserve that property through the gear train and force-based hand control works.
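A rough sketch of what "preserving that property through the gear train" means, assuming a motor whose torque is proportional to current and a simple Coulomb friction term lumped on the motor side. The function names, the torque constant, and the friction value are all hypothetical, picked for illustration only.

```python
def output_torque(current_a: float, kt: float, gear_ratio: float,
                  efficiency: float = 1.0) -> float:
    """Joint torque (N*m) from motor current, for torque constant kt (N*m/A)."""
    return kt * current_a * gear_ratio * efficiency


def breakaway_torque(motor_friction: float, gear_ratio: float) -> float:
    """Smallest external joint torque (N*m) that can back-drive the motor.

    Assumes Coulomb friction lumped on the motor side; seen from the joint,
    that friction is multiplied by the gear ratio.
    """
    return motor_friction * gear_ratio


# Hypothetical values: kt = 0.05 N*m/A, 2 mN*m of motor-side friction.
KT = 0.05
FRICTION = 0.002

for ratio in (15, 288):
    tau = output_torque(2.0, KT, ratio)
    floor = breakaway_torque(FRICTION, ratio)
    print(f"{ratio}:1  torque at 2 A = {tau:.2f} N*m, "
          f"forces below {floor:.3f} N*m are invisible to the motor")
```

Under these assumptions, the higher the ratio, the larger the band of contact forces that never shows up in the motor current, which is the property a force-controlled hand needs to keep.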
What? An ideal capstan drive can be backdriven perfectly fine. You only run into problems once it stops being ideal (e.g. built out of heavy parts, high gear ratio, etc.)
Is it? The title is "The Robotic Dexterity Deadlock". For all I know, it's a joke about what deadlock looks like for robots, showing what could be interpreted as a deadlock in a webserver. At a glance, I can't tell if the site is down, or if it's up and correctly showing its very short message.
So, yeah, in reality, I'm 99% sure it really is an error message. That's only because I've seen similar error messages in the past and can infer how to interpret it.
"This deployment is temporarily paused", if anything, sounds like the people who put the site up took it down again. That sends the wrong message.
Personally, if my hosting provider took my post down, I'd want them to make that obvious to my visitors. Or at the very least make it look like a technical issue. Not make it look like I took it down.
Robotics doesn't have a single silver bullet - the design space is vast and underexplored.
Multiple times, over and over.
We need to stop with the AI stuff.
I now scroll through any AI-adjacent article I see and just read the headings, and if I see this I know what I'm getting into:
The Dexterity Deadlock
The Problem
The Geometric Curse
The Sim-to-Real Gap
The Structural Gap f(⋅)
Seeing It in Motion
The N^2 Impedance Mismatch
The Chaos Term ϵ_chaos
The Information Wall
The Weakest Link
Why Manipulation Needs Better
What We Built
From 288 to 15
Does It Work?
Hardware Validation
Robot Hand Landscape
The Take-Home
The fundamental job of an LLM is to statistically match its output to the corpus. The tics they have are really common in natural human usage too.
In this day and age, I wish people would ask any model OTHER than ChatGPT to rewrite their shit. At least we'd get a different flavor of slop.
I continue to be amazed that the wrong form factor keeps being pursued. Though I suppose I shouldn't be too surprised given the parade of failed "AI devices."
It's a bit like choosing JS / Python: of course performance is inferior to a compiled language with highly tailored code, but they are flexible and have an ecosystem that might do 99% of the lifting for you.
But in isolation, I agree with your idea that specialized robots with a form fitted specifically to the task will likely outperform a more generalized solution in a specific domain of behavior. The more generalized one will likely win on flexibility and reusability (e.g. it can reuse the human ecosystem).
You don’t need a human-like hand to hold a tool made for humans. As an extreme example, you can make a robot operate a power drill with a strap to hold it and a servo with a small bit of wood to operate the trigger mechanism.
But for a robot operating in a space made for humans there certainly are some physical requirements which are based on the human form: maximum volume and clearances, stairs, fragile fixtures that can’t be operated with too much force, etc.
Ever walk through some over-crowded antique shop where you need to twist and lean your body to avoid knocking into things?
What makes human hands especially suitable for e.g. assembling a phone or installing a door handle onto a car?
yes. do you think it's safe to just plug usb into some hole and type? the safest option for a robot is typing with fingers
I personally am not bullish on 1:1 human hands either, but IMO the question shouldn't be $100k 2 ton KUKA arm vs biped with hands, it's overactuated robotics (build it from the floor with hard coded operations) vs underactuated (build it from the contact point of the work backwards with ML and sensors). We shall see which form factors prevail, but the type of robotics development posted here seems like the way forwards regardless: an ecosystem of small, power dense, reliable, accurate QDD actuators will lead to many general purpose robot applications. I recognize I am not using underactuated vs overactuated in their strict definition here, but if you are familiar with robots I think you'll understand where I am coming from as far as a robot design ethos.
I will say, though, that in designing robots of this type without necessarily being bound by trying to make a robot look like a human, I have often found myself accidentally recreating human arm DOF in a roundabout way. It does just end up being well packaged, beyond the "world designed for humans" talking point. Maybe hands will end up being a similar situation.
Not to dismiss the value of LLMs in those cases as an interface/interpretation layer.
If grandma goes into the windowless surgery factory, I just want the best bots working on her. There is value in having Dr. Bot the replicant give me the face-to-face status updates. We are not breaking out those layers as much, anymore, as the focus becomes minimizing FOMO.
Similar to how Claude Code gained so much traction in the terminal by just leveraging the command line interface that already exists for humans: no need to invent a domain specific MCP just to run shell commands.
I agree with you that it's far from the most efficient approach for specific tasks. But the analogy would be that you also generally don't want to use LLMs to do something you can "just" write a script for... that doesn't make LLMs useless though.