I started working on a similarly sized arm. I've got a use case: I'm long-time friends with a glass blower, and I was thinking of using it to make faceted glass pendants. They've got a faceting machine but it's manually operated.
The hard part is repeatability. You need tight tolerances and each joint in the arm adds inaccuracy the further you get from the base. If the base has 1mm of wiggle, the 20cm arm has 4mm wiggle at the end, and the arm beyond it has even more.
You also, for faceting purposes, need much finer resolution than an ungeared servo will have. Gearing it is tricky because you want minimal backlash to keep the joint tight, but not so much preload that it has high friction when moving. You don't really want to use a worm gear because they're both slow and overly rigid. So a cycloidal gear is the best bet for the gears in the arm. You also need real servos with some amount of feedback because grabbing at glass is sketchy at best.
I was estimating 1-2k build cost, bulk of that is in the gearboxes.
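To put rough numbers on the error-stacking point above (a back-of-the-envelope sketch, all values invented): each joint's angular play sweeps everything beyond it through an arc, so the same play costs more the longer the lever after it.

```python
import math

# (angular play in degrees, lever from that joint to the tool tip in mm)
# -- all invented numbers, just to show how the errors stack.
joints = [
    (0.05, 500),  # base joint: smallest play, but the longest lever
    (0.05, 300),
    (0.05, 150),
]

# Worst case the errors add linearly: each joint's play sweeps the
# remaining arm through an arc of roughly angle * lever.
tip_error = sum(math.radians(play) * lever for play, lever in joints)
print(f"worst-case tip error: {tip_error:.2f} mm")  # ~0.83 mm
```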
One thing that is amazing about industrial robots is how rigid they are at a standstill. Braking systems start to be a challenge too at high speeds and loads.
And once you manage to get the hardware working, getting a kinematic solver to really work is a massive challenge. Tons of edge cases, real-time feedback to handle, and the need to balance usability with reliability. That's where robot companies charge a lot, and rightfully so.
Whenever you can avoid building a robot arm and replace it with simpler kinematics, you should. Hats off to you if you build that thing!
Why aren't all these applications that have only recently been examined for automation absolutely flush with cheaper, lower-axis-count robots like SCARAs or parallel manipulators?
I've done similar projects in the past (robot arms pushing performance limits in the few-thousand-dollar range), and I got pretty good results with stepper motors and gearboxes with sufficiently low backlash. For reference, these designs got to approx 1mm repeatability with 2.5kg payload at ~80cm reach, meant to model a human arm somewhat.
Here's some specifics if you're interested. Depending on the end effector payload requirements, a mix of NEMA 34/24/17 steppers can do this (bigger ones for the earlier joints). You can go cycloidal/harmonic gears if you have the budget; otherwise each actuator (motor + driver + gearbox + shaft coupling) would run you something like $100-$200 depending heavily on supplier and exact requirements (+$50 or so for closed-loop systems). So not terrible on the price front. Then for the base joint you'd want some wider cylindrical gearbox that distributes the load better.
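To see why the earlier joints get the bigger motors, here's a crude static-torque estimate (gravity load only; the arm-mass figure and its distribution are guesses, roughly matching the 2.5kg/~80cm numbers above):

```python
G = 9.81          # m/s^2
payload = 2.5     # kg at the tool tip
arm_mass = 4.0    # kg, rough guess, treated as acting at half the reach

for joint, reach_m in [("shoulder", 0.80), ("elbow", 0.45), ("wrist", 0.15)]:
    torque = G * (payload * reach_m + arm_mass * reach_m / 2)
    print(f"{joint}: ~{torque:.0f} N*m at the joint output")
# shoulder: ~35 N*m, elbow: ~20 N*m, wrist: ~7 N*m -- a NEMA 17 is
# around 0.5 N*m, which is why the cycloidal/harmonic reduction matters.
```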
If you're able to work with a machine shop I think you can put together something really high quality. Here's some example design inspirations, some of them even better than what I was able to put together as a hobbyist:
> The hard part is repeatability. You need tight tolerances and each joint in the arm adds inaccuracy the further you get from the base. If the base has 1mm of wiggle, the 20cm arm has 4mm wiggle at the end, and the arm beyond it has even more.
Could this be solved by software instead of expensive hardware?
Some idea I had a while ago was to build an arm out of cheap, "wobbly" components for the large-scale movements, but then add some stages at the end that have a small movement range, but can be controlled very precisely.
Finally, add a way to track the deviation of the tool's actual position from the desired position very precisely, maybe with a tool-mounted camera.
Then you could have a feedback loop in software which tracks the tool's deviation from the desired position and uses the "corrective" stages at the end to counteract it.
I'm not sure if this would work, however.
(There is also the question of how long the "counteracting" would take. It's one thing to "eventually" arrive at the desired position at the end of the path - e.g. for pick-and-place - and another to stay below some maximum deviation for the entire path, e.g. for etching or welding.)
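The loop itself is simple once you have the measurement; the measurement is the hard part. A toy version with everything simulated (all numbers, tolerances, and travel limits invented):

```python
import random

def camera_deviation(true_pos, target):
    """Tool-mounted camera: reports deviation from target, with noise."""
    return (target - true_pos) + random.uniform(-0.02, 0.02)  # mm

target = 100.0                                   # mm
coarse_pos = target + random.uniform(-2.0, 2.0)  # wobbly arm lands nearby
fine = 0.0                                       # precise corrective stage

for step in range(50):
    dev = camera_deviation(coarse_pos + fine, target)
    if abs(dev) < 0.05:                          # within tolerance
        break
    fine += 0.5 * dev                            # proportional correction
    fine = max(-5.0, min(5.0, fine))             # tiny movement range!

print(f"converged in {step} steps, residual {dev:+.3f} mm")
```

The clamped travel on the fine stage is the catch: the wobbly stage still has to land within the corrective stage's small range.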
I'm a software engineer myself so I don't know a lot about this, but there are a few patterns that are not far off of what you're describing (roughly what the robotics folks call visual servoing). For instance:
But do you need AI? How about this for a screw-in-hole machine:
Assuming the hole's axis is along Z: a camera looking along X observes the screw against a high-contrast background, and a camera looking along Y does the same. The motors need not know their exact position, just be of controllable speed. As the screw gets close to alignment the motor speed is stepped down, and it stops when it's aligned. When both cameras report that it's in position, a motor on the Z axis pushes the screw towards the hole, stopping when a plunger next to the screw reports the correct depth.
If you have to be concerned with alignment about the Z axis as well, you make the X and Y backgrounds striped; the screw's orientation is measured against those stripes and it's rotated accordingly.
This is how a human would handle it--we do not have anything like the motor precision to get the screw in the hole directly, but we can use our eyes to refine it without *needing* the motor precision. Reliably identifying the screw from the background is hard, but this approach doesn't require *identifying* anything. You're just finding the bounding box of an object whose color stands out sharply from the background.
If you have a large movement field and a high precision requirement you might need two cameras, the second with a much narrower field of view.
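A toy version of that loop, with the cameras and motors simulated (in real life each "camera reading" is just the pixel offset of the off-color screw's bounding box; the motors only ever receive speed commands, never positions):

```python
TOL = 2     # "aligned" threshold, in pixels
SLOW = 20   # step the speed down inside this zone

def speed_cmd(err_px):
    """Fast when far off, slow when close, stop inside tolerance."""
    if abs(err_px) <= TOL:
        return 0.0
    mag = 5.0 if abs(err_px) > SLOW else 1.0
    return mag if err_px > 0 else -mag

pos = [137.0, -42.0]   # simulated screw offset in the X and Y views
while any(abs(p) > TOL for p in pos):
    for axis in (0, 1):
        pos[axis] -= speed_cmd(pos[axis])   # motor runs for one tick
print(f"aligned at {pos} px; feed Z until the depth plunger trips")
```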
I'd consider it more of a fundamental problem - the lack of a way to introduce new data to a model without repeating training runs and waiting for the error to converge. Humans seem to gain an understanding of things from a one-shot learning run (this is perhaps due to our vast experience with the world and ability to run simulations in our heads, but a subset of that should be possible for ML quite easily).
If you solve this problem, teaching a robot arm to be accurate should be pretty easy. You would just have stereoscopic cameras that map to a 3d world, and "program" in a trajectory of the object, and the model should use that trajectory to figure out where to move and how to compensate based on visual feedback.
I basically agree; I don't think you understood my comment.
If you look at transformers in LLMs, you have an input matrix, some math in the middle (all linear), and an output matrix. If you take a single value of the output matrix and write the algebraic expression for it, you will get something that looks like a linear layer transformation on the input.
So a transformer is simply a more efficient simplification of n connected layers, and thus is faster to train. But it's not applicable to all things.
For the following examples, let's say you hypothetically had cheap power with good infrastructure to deliver it, A100s that cost a dollar each, and the same budget as OpenAI.
First, you could train GPT models as just a shitload of fully connected, massively wide deep layers.
Secondly, you could also do 3d mapping quite easily with fully connected deep layers.
First you would train an Observer model to take images from two cameras and reconstruct a virtual 3D scene with an autoencoder/decoder, probably by generating photorealistic training images with raytracing.
Then you would train a Predictor model to predict the physics in that 3d scene given a set of historical frames. Since compute is so cheap, you just have rng initialization of initial conditions with velocities and accelerations, and just run training until the huge model converges.
Then you would train a Controller model to move a robotic arm, with input being the start and final orientation, and output being the motion.
Then hook them all together. For every cycle in the robot controller, the Controller sends commands to move along a path, the robot moves, the Observer computes the 3D scene, the history of scenes is fed to the Predictor which generates the future position, that gets compared to the target to produce an error, and the Controller adjusts accordingly.
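As a sketch of just that per-cycle data flow (all three "models" are stubs here; the actual training is the hard part):

```python
import numpy as np

class Observer:
    def scene_from(self, left_img, right_img):
        return np.zeros(64)        # latent 3D scene (stub)

class Predictor:
    def next_scene(self, history):
        return history[-1]         # "nothing moves" physics (stub)

class Controller:
    def command(self, error):
        return -0.1 * error        # corrective motion (stub)

obs, pred, ctrl = Observer(), Predictor(), Controller()
target = np.zeros(64)              # desired scene latent
history = []

for tick in range(100):
    left, right = np.zeros((8, 8)), np.zeros((8, 8))  # camera frames (stub)
    history.append(obs.scene_from(left, right))
    error = pred.next_scene(history) - target
    arm_cmd = ctrl.command(error)
    # send arm_cmd to the arm here; the cameras close the loop next tick
```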
My point is, until we reach that point with power and hardware, there have to be simplification discoveries made along the way, like the transformer. One of those is how to one-shot or few-shot adjust parameters for a set of new data. If we can do that, we can basically fine-tune shitty models on specific data quite fast to make them behave well on a very limited data set.
I think about this regularly, I just don't have the time to pursue it:
Couldn't you build your arm in Nvidia Omniverse, add feedback like a cheap high-resolution distance or angle sensor, and train an ML model to compensate for the error?
Making and animating a 3D graphics robot arm is trivial compared to building it in real life. So not so much Omniverse; you would want to use a proper simulator like Gazebo.
But beyond that, the kinematics as well as the force dynamics for controlling a serial manipulator are very well understood, so there aren't too many gains to be made by AI. It is difficult to implement in software due to some tricky situations in motion planning - discontinuities around certain orientations in 6-DOF systems (singularities), for instance. But widespread use of serial manipulators is proof that, although challenging, they are relatively solved. It is always interesting to watch an AI model or genetic algorithm do some path planning, but this is a pretty well-trodden area of research at this point.
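"Well understood" in miniature: closed-form inverse kinematics for a 2-link planar arm. Even this toy case shows the seams you hit in software - two solutions (elbow up/down) and unreachable targets:

```python
import math

def ik_2link(x, y, l1, l2, elbow_up=True):
    """Joint angles that put a 2-link planar arm's tip at (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)  # law of cosines
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target out of reach")
    s2 = math.sqrt(1 - c2 * c2) * (1 if elbow_up else -1)
    theta2 = math.atan2(s2, c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return theta1, theta2

# 0.3m links, target at (0.3, 0.4): ~19.6 and ~67.1 degrees
print([round(math.degrees(t), 1) for t in ik_2link(0.3, 0.4, 0.3, 0.3)])
```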
Now, when you want a robot to walk and pick things up at the same time... that is when AI becomes something to consider in order to figure out how the dynamics should work.
I'm not super familiar, but they say on the webpage that it is specifically for Universal Scene Description, which is primarily for graphics. Although, after a quick google, it looks like they do have a simulation package which then runs on top of Omniverse (Isaac Sim?), so I guess that is Nvidia's robotics offering.
My general experience with other commercial offerings for simulation... is not great. In my experience, people usually end up migrating to Gazebo, but I have been away from the field for a while now so it could be different. It is probably a situation where Nvidia will have a few corporate clients that they prioritize, and you are on your own to get it set up if you aren't on that list. Pretty normal.
But motor and motion control isn't exactly so mysterious that we need AI for it. Inductance in electric motors can have some odd effects in the acceleration domain, but it generally boils down to a second- or third-order differential equation. Even when linking multiple motors together in a serial manipulator, the math is really well understood for modeling the motion output. Maybe there are some gains to be had implementing different drive trains in arbitrary circumstances and monitoring how they fail, things like that. At that point you are really getting into the weeds of operations and maintenance more than actual motion control.
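For a sense of what that differential equation looks like in practice, a crudely Euler-integrated brushed DC motor model (all parameters invented):

```python
R, L = 1.0, 0.5e-3       # winding resistance (ohm), inductance (H)
Kt = Ke = 0.05           # torque constant (N*m/A) = back-EMF constant
J, b = 1e-4, 1e-5        # rotor inertia (kg*m^2), viscous friction
V = 12.0                 # step voltage input

i = w = 0.0              # current (A), angular velocity (rad/s)
dt = 1e-5
for _ in range(int(0.2 / dt)):        # simulate 200 ms
    di = (V - R * i - Ke * w) / L     # electrical dynamics
    dw = (Kt * i - b * w) / J         # mechanical dynamics
    i += di * dt
    w += dw * dt
print(f"steady state: {i:.2f} A, {w:.0f} rad/s")   # ~0.05 A, ~239 rad/s
```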
The situation that turns into a very complex n-dimensional problem, the kind you would want AI to search through, is the coordinated motion of multiple actuators to achieve a very complex output. Like picking something up of unknown weight, running while carrying it up a steep hill, waving it around while doing all this. We take it for granted as humans with brains that can perform all this stuff trivially, but it is extremely complex motion.
Well, that doesn't really work. There is only so much electrical efficiency you can crank out of these motors. Essentially, there is a relationship between the current you pump through them, which is limited by their thermal characteristics, and their inductance. So you are trading off between building more inductive motors that are more powerful but less reactive, and current draw, which you can increase by putting more material in the motor and making it dissipate heat much better, but also bulkier. There are diminishing returns in many places in this process, and at a certain point you have to consider switching to hydraulics if you need more force at high reactivity, under essentially much less energy-efficient conditions.
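The thermal side of that trade-off fits in a few lines (numbers invented): continuous torque is capped by how much I²R heat the frame can shed, and torque scales with current while heat scales with its square.

```python
import math

Kt = 0.08       # N*m/A torque constant (more turns: higher Kt, higher R)
R = 1.2         # ohm, winding resistance
P_max = 40.0    # W of copper loss the frame can dissipate continuously

I_max = math.sqrt(P_max / R)    # thermal current limit: P = I^2 * R
print(f"continuous torque ~ {Kt * I_max:.2f} N*m at {I_max:.1f} A")
# Doubling the torque at the same Kt means 4x the heat -- hence bigger,
# heavier frames, or switching actuator technology entirely.
```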
Maybe you could make a model that sizes motors correctly per application? But you are still much better off hiring an engineer who knows what they are doing and can explain what is going on and troubleshoot things when they go wrong. At a certain point you are trying to figure out how to completely replace an engineer with a machine learning model, which I would like to think is a bad idea.
Uneducated in hardware, mostly a software guy for perspective, so I could be way off.
Would using something like a stepper motor geared way down with a cycloidal gearbox work for a situation like this? (In my mind) it would give you a very controllable and repeatable way to position, with the backlash handled mainly by the gearbox.
Would love to know if I'm wrong though, like I said mostly a software guy trying to venture into hardware!
Servo + cycloidal or harmonic gears are usually the way to go, but getting them backlash-free is hard (or expensive, if you're buying). Once you've got that down, challenges include:
* How rigid are the links between my joints? Plastic will wobble, metal is better
* How heavy is my arm and how does that limit its movement? If you go for stiff metal castings, you add weight you need to move. The lever arm relative to the base can get really long
* Motors are heavy! Ideally you can mount them towards the base, but then you need drive shafts or belts, which again add flex. (See KUKA arms, which often have motors 4, 5 and 6 on the elbow.)
* How much payload do you need to move? 5kg is already challenging in ~1m arms, and if you need to move it fast the problem gets even bigger (rough numbers in the sketch after this list).
* Where do you run your cables? Internal is tricky to build, external can get you tangled.
And so on. When approaching this you get a totally new appreciation for biological arms, which are insane in most aspects except for repeatability. And on the software side you can enjoy inverse kinematics :)
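To put rough numbers on the payload point (arm mass is a pure guess): just holding 5kg still at 1m takes serious shoulder torque, before any acceleration at all.

```python
G = 9.81
payload, reach = 5.0, 1.0    # kg, m
arm_mass = 8.0               # kg, guessed, treated as centered at mid-arm

hold = G * (payload * reach + arm_mass * reach / 2)
print(f"holding torque at the shoulder: ~{hold:.0f} N*m")   # ~88 N*m
# Moving fast adds inertial torque on top -- and every kg of motor or
# gearbox you add to cope becomes arm mass itself.
```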
Do you think there is a way to take out backlash with sensors and software? Something like how additive manufacturing systems can use accelerometers to smooth artifacts from motor movement. [0] Let's say two cheaper cycloidal geared motors running in opposition with a load cell between them to maintain the materially compatible force.
I don't want to say no, however it seems very hard to do. You get feedback about motor position via an encoder, which is usually located on the motor's axis and not the output element. Since the motor axis spins a lot more, you get more resolution. Backlash happens on the output, so you could add a second encoder there (but now you've got more complexity + cost). An oldschool CNC solution is to add brakes to lock an axis out, but this makes your system less flexible and doesn't prevent backlash during motion. A more modern solution might be to factor backlash into your motion software, so that you tell the kinematic solver what compliance is acceptable in some direction.
> Let's say two cheaper cycloidal geared motors running in opposition with a load cell between them to maintain the materially compatible force.
This might work, but now you have twice the number of motors.
The problem with backlash comes into play when the direction of force on an axis changes. If you are applying force in one direction and all the backlash has been taken up, everything is fine -- any force you apply or movement you make will be transmitted to the tool like you'd expect. However, if you have to decelerate, or you've gone over-center, or the tool/load pulls harder than you're pushing, now you have to apply force in the other direction, which you can't do until you take up the backlash.
If your axis has high enough friction, then nothing will move when your actuator is in the decoupled backlash region, so you can compensate by adding the backlash amount to your target position whenever you switch directions. But that means you need more friction than tool force, with bigger motors and drivetrain to compensate. It's often easier just to build a system with zero backlash, then you can focus on tuning for system rigidity/resonance (as shown in your link).
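A sketch of that compensation (backlash value and interface invented; only valid when axis friction holds the output still while the input crosses the dead zone):

```python
BACKLASH = 0.12   # mm of measured play in this hypothetical axis

def motor_commands(targets, start=0.0):
    """Offset commanded positions by the backlash on direction changes."""
    cmds, pos, last_dir, offset = [], start, 0, 0.0
    for t in targets:
        d = t - pos
        if d > 0 and last_dir < 0:
            offset += BACKLASH        # reversing: take up the play
        elif d < 0 and last_dir > 0:
            offset -= BACKLASH
        if d != 0:
            last_dir = 1 if d > 0 else -1
        pos = t
        cmds.append(t + offset)       # what the motor is actually told
    return cmds

print(motor_commands([10.0, 20.0, 15.0, 18.0]))
# -> [10.0, 20.0, 14.88, 18.0]: overshoots by 0.12 on the reversal
```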
That was why OP suggested having 2 motors on each joint, driving in opposite directions. The problem with this is that you now have twice the number of motors.
Oof, I appreciate you pointing that out because somehow I got the first part and skipped that one. Yeah, I could see that working, but it sounds inefficient.
What kind of manipulation is needed?
If a 3-axis or 4-axis system can do the job, that is always much cheaper than a robot arm for a given precision and load capacity.
Tight tolerances and repeatability are mostly a combined rigidity and resolution issue, which is functionally equivalent to a cost issue. Add more money. Unfortunately there is a point where programming and hardware costs are higher than a skilled artisan ... hence the profession!