Command gear torque A3/A2 error when using kuka_rsi_hw_interface #126

Open
RhysMcK opened this issue Apr 17, 2018 · 40 comments

@RhysMcK

RhysMcK commented Apr 17, 2018

KUKA System:

Controller: KRC4 compact
Robot: Kr3-540 Agilus

Environment

ROS Distro: Kinetic
MoveIt

Problem

Hi all,
I have been struggling with this issue for a while now. While controlling the KUKA robot through the RSI hardware interface, every now and again I receive a command gear torque error, typically for axis A3 or A2. This issue has been reported before (#89), but none of the comments there have solved the problem for me personally. I don't believe this to be a communication issue, as it happens even when I have not lost any communication packets (according to the KUKA RSI monitor). I have played with the HOLD_ON parameter and that does not seem to make any difference.

A possible suggestion that was brought up in #89 was the tool load data not being set correctly. It is possible that this could be the cause, as I must admit I have not set the payload 100% accurately (due to not having any CAD program currently available to me). I have, however, entered the mass and estimated the location of the CoM and the moments of inertia. The reason I don't believe this to be at fault is that I have tried both big and small values for the mass/inertias and have seen no difference in the behaviour. I would have thought that setting values significantly far from the true values would cause this error to happen instantly or more frequently. To me it doesn't seem as if this data has any effect, but I could be very wrong.

A more likely possibility, in my opinion, is that the motion plans from MoveIt are not abiding by the acceleration limits, as previously mentioned in moveit/moveit#416. I will try to get some trajectory plots in the next day or so, but in the meantime I just wanted to get this discussion going again and see if people have any other insights into this issue.

Cheers!

@RhysMcK
Author

RhysMcK commented Apr 18, 2018

Update: this issue persists after I have physically removed the tool from the robot arm, which surely must indicate that it is not an issue with inaccurate tool load data.

@RhysMcK
Author

RhysMcK commented Apr 19, 2018

I have had a chance to do some further testing on this issue. It seems the problem is most likely to do with the trajectories that are being sent to the KUKA robot. I recorded all of the position commands that were being set from within the KukaHardwareInterface::write function. From this data I was then able to estimate the commanded velocities by taking the difference between the last two commanded positions and assuming a cycle time of 12 ms. From the figures below, it can be seen that just before I receive this error on A2 there is a noticeable velocity spike (a sketch of the calculation is given after the plots).
Trial 1: [plot trail1_a2]
Trial 2: [plot trail2_a2]
Trial 3: [plot trail3_a2]
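For reference, here is a minimal sketch of that finite-difference calculation, assuming a fixed 12 ms interpolation period and a logged series of commanded positions for one axis (the numbers below are made up):

```cpp
// Finite-difference estimate of the commanded velocity, as described above.
// Assumes the position commands were logged at the fixed 12 ms RSI cycle.
#include <cstdio>
#include <vector>

int main()
{
  const double dt = 0.012;  // RSI interpolation period in seconds (12 ms)
  const std::vector<double> cmd_pos = {0.000, 0.001, 0.003, 0.006, 0.010};  // example log

  for (std::size_t i = 1; i < cmd_pos.size(); ++i)
  {
    // Backward difference of consecutive position commands.
    const double vel = (cmd_pos[i] - cmd_pos[i - 1]) / dt;
    std::printf("sample %zu: commanded velocity = %.4f rad/s\n", i, vel);
  }
  return 0;
}
```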

The MoveIt! motion planners definitely seem to be generating non-ideal trajectories, which leads to this issue. There has been previous discussion about the OMPL library (which I am using) potentially causing this issue on Kinetic (moveit/moveit#416), but I'm not sure whether it has been discussed further or fixed.

A simple first attempt at a fix might be to adjust the commanded position whenever the change in velocity (i.e. the acceleration) is too great, in other words to set a threshold on the acceleration. A very basic moving average filter could be used to determine the commanded position if this limit is exceeded. I will try this and see how it goes. Any further ideas on this issue would be greatly appreciated!
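As an illustration of that idea (purely a sketch; the class name, threshold and window size are all hypothetical and not part of the driver), clamping the implied acceleration and falling back to a moving average could look like this:

```cpp
// Sketch: if the acceleration implied by a new position command exceeds a
// threshold, replace the command with the moving average of recent commands.
#include <cmath>
#include <deque>
#include <numeric>

class CommandSmoother
{
public:
  CommandSmoother(double dt, double max_accel, std::size_t window)
    : dt_(dt), max_accel_(max_accel), window_(window) {}

  double filter(double cmd)
  {
    if (history_.size() >= 2)
    {
      const std::size_t n = history_.size();
      const double v_prev = (history_[n - 1] - history_[n - 2]) / dt_;
      const double v_new = (cmd - history_[n - 1]) / dt_;
      const double accel = (v_new - v_prev) / dt_;
      if (std::fabs(accel) > max_accel_)
      {
        // Acceleration limit exceeded: fall back to the moving average.
        cmd = std::accumulate(history_.begin(), history_.end(), 0.0) / n;
      }
    }
    history_.push_back(cmd);
    if (history_.size() > window_)
      history_.pop_front();
    return cmd;
  }

private:
  double dt_;
  double max_accel_;
  std::size_t window_;
  std::deque<double> history_;
};
```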

Note that adjusting your payload data on the KUKA controller will affect when you receive this error, as the KUKA motion planner looks at the commanded motion relative to the current inertial conditions and predicts whether the commanded motion will exceed torque limits; $LOAD is taken into account when making these predictions [1]. However, this does not really address the root of the problem.

Cheers!

[1] https://www.robot-forum.com/robotforum/kuka-robot-forum/command-gear-torque-a23-with-rsi-control/

@gavanderhoorn
Member

Hi. Thanks for the detailed description.

I believe your analysis could be correct: the OMPL version used in Kinetic MoveIt is known to (sometimes) generate discontinuous paths, leading to time parameterisation artefacts such as the one you've plotted.

Unfortunately there isn't much we can do about that, other than implementing something like the filter you mention. I'm not entirely sure I'd be willing to do that in the driver though: perhaps an intermediate node should do the filtering before the trajectory gets passed to the driver (the driver (actually: RSI) just requires smooth trajectories, it is the producer's responsibility to provide those).

moveit/moveit#416 has not been resolved, and in fact, the linked issue on the OMPL bitbucket repository (ompl/ompl#309) is also still open. Perhaps commenting on that might help get this some more attention.

In the meantime MoveIt has also seen some changes to the time parameterisation components. moveit/moveit#441 got merged into Kinetic. The interplay with the OMPL issue is not clear to me, but perhaps one is related to the other.

An interesting PR that has not been merged yet could be moveit/moveit#809: this adds a parameterisation plugin based on TOTG. In my experience this produces rather impressively smooth trajectories, but I'm again not sure how the OMPL issue influences this.

From this data I was then able to estimate the commanded velocities

You're probably aware, but the driver only uses position control. Any velocities are only the distance travelled in one interpolation period.

@gavanderhoorn
Member

@rtonnaer

@gavanderhoorn
Member

Unfortunately there isn't much we can do about that, other than implementing something like the filter you mention. I'm not entirely sure I'd be willing to do that in the driver though: perhaps an intermediate node should do the filtering before the trajectory gets passed to the driver (the driver (actually: RSI) just requires smooth trajectories, it is the producer's responsibility to provide those).

I think I'll change my opinion on this: I would like to add this to the driver, but not to filter (i.e. change) the trajectory; rather, to check it before accepting it for execution.

That way, we can at least avoid executing trajectories that will (probably) lead to problems some time later after we've started executing them.

A complicating factor is that we only provide a ros_control hardware_interface, and do not control any of the infrastructure that accepts and processes the trajectories. That would be joint_trajectory_controller.

Note: the JTC actually does the interpolation of the received trajectory. MoveIt et al influence the initial parameterisation, but the JTC is ultimately the one that interpolates the trajectories and sends the values to the hardware_interface.
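As an illustration of the kind of pre-acceptance check meant here (a hedged sketch only, not existing driver code; the acceleration limit would have to come from the robot's specs or configuration):

```cpp
// Reject a received trajectory whose point-to-point accelerations exceed a
// given limit, before passing it on for execution.
#include <cmath>
#include <trajectory_msgs/JointTrajectory.h>

bool trajectoryWithinAccelLimit(const trajectory_msgs::JointTrajectory& traj,
                                double max_accel)
{
  for (std::size_t i = 2; i < traj.points.size(); ++i)
  {
    const auto& p0 = traj.points[i - 2];
    const auto& p1 = traj.points[i - 1];
    const auto& p2 = traj.points[i];
    const double dt1 = (p1.time_from_start - p0.time_from_start).toSec();
    const double dt2 = (p2.time_from_start - p1.time_from_start).toSec();
    if (dt1 <= 0.0 || dt2 <= 0.0)
      return false;  // non-monotonic timing is also a reason to reject

    for (std::size_t j = 0; j < p2.positions.size(); ++j)
    {
      // Finite-difference velocities and acceleration per joint.
      const double v1 = (p1.positions[j] - p0.positions[j]) / dt1;
      const double v2 = (p2.positions[j] - p1.positions[j]) / dt2;
      if (std::fabs((v2 - v1) / dt2) > max_accel)
        return false;
    }
  }
  return true;
}
```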

@RhysMcK
Author

RhysMcK commented Apr 22, 2018

Possibly a more robust solution would be to also add an RSI FILTER object on the KUKA side. I have not experimented with this before and the documentation is a bit scarce, but it is possible to implement a number of standard filters (Butterworth, low-pass, high-pass, etc.). A simple low-pass filter should help remove any random jerks and inconsistencies output by the JTC.
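For reference (and without claiming anything about the actual RSI FILTER configuration syntax), the underlying behaviour of such a first-order low-pass filter, whether applied on the controller or in an intermediate host-side node, is roughly this:

```cpp
// First-order low-pass filter: y[k] = y[k-1] + alpha * (x[k] - y[k-1]).
// Illustrative only; cutoff and sample period are example values.
#include <cmath>

class LowPassFilter
{
public:
  // cutoff_hz: cutoff frequency, dt: sample period (e.g. 0.012 s for RSI)
  LowPassFilter(double cutoff_hz, double dt)
  {
    const double rc = 1.0 / (2.0 * M_PI * cutoff_hz);
    alpha_ = dt / (rc + dt);
  }

  double update(double x)
  {
    if (!initialized_)
    {
      y_ = x;  // start from the first sample to avoid an initial jump
      initialized_ = true;
    }
    y_ += alpha_ * (x - y_);
    return y_;
  }

private:
  double alpha_ = 1.0;
  double y_ = 0.0;
  bool initialized_ = false;
};
```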

@cschindlbeck
Contributor

So apparently this issue has been solved; see moveit/moveit#416 and moveit/moveit#869.

Can you confirm that this solves your problem?

@gavanderhoorn
Member

@cschindlbeck: I'm not sure it has really been solved. The MoveIt changes are related, but do not address the cause afaict. Changes in OMPL are needed as well, and until those are released into Kinetic the situation remains as it is.

See also the bitbucket issue.

@gavanderhoorn
Member

See moveit/moveit#416 (comment) if you can't wait :)

@RhysMcK
Author

RhysMcK commented May 3, 2018

Thanks @gavanderhoorn, I've been keeping a close eye on that thread! I'll try this out and see if it makes any improvements, although I now believe the weird velocity spikes seen in the plots above are more likely due to the current default method of time parameterization used in MoveIt. I'm eager to try TOPP and/or TOTG when it's ready for Kinetic :) moveit/moveit#809

@gavanderhoorn
Member

Getting moveit/moveit#809 to work on Kinetic should not be too difficult. It's fairly stand-alone. You would just need to add the plugin and update your MoveIt config to load that parameterisation adapter instead of the default one (it's configured in the OMPL planning adapter section).

@gavanderhoorn
Member

@RhysMcK: have you had any opportunity to test the fixes in OMPL+MoveIt?

@RhysMcK
Author

RhysMcK commented May 15, 2018

Haven't had a chance yet, @gavanderhoorn. This is at the top of my to-do list though; hopefully I'll get around to it in the next couple of days.

@RhysMcK
Author

RhysMcK commented May 18, 2018

Had a chance to test today. Unfortunately the problem still persists after using the latest OMPL + MoveIt updates. As I stated above, I expect this problem is most likely due to the current method of time parameterization, which I will confirm once I get a spare moment!

[plot: plot_after_update]

Despite the velocity spikes between consecutive points, the latest updates definitely remove some of the jerkiness from the planned paths. In general, the trajectories are a lot smoother from a "macro" view, i.e. fewer zero-crossings.

@gavanderhoorn
Member

gavanderhoorn commented May 18, 2018

@RhysMcK wrote:

Unfortunately the problem still persists after using the latest OMPL + MoveIt updates.

Just making sure: you compiled everything from source, correct? There hasn't been an updated release yet.

As I stated above, I expect this problem is most likely due to the current method of time parameterization, which I will confirm once I get a spare moment!

I would suggest trying moveit/moveit#809 to see whether that improves things.

One thing we keep ignoring in this context (or at least, I believe we've not given it much attention) is that ros_control's joint_trajectory_controller also interpolates. The data you're plotting is what comes out of that, not out of MoveIt itself. It could be that the two agree with each other, but it would still be good to check.

@RhysMcK
Author

RhysMcK commented May 18, 2018

Not everything: I have MoveIt! installed from source, and I am using the Debian binary for OMPL, which as far as I can see is up to date and includes the above-mentioned fixes... or am I wrong?

Yep, moveit/moveit#809 is my next step :)

And yes, good point about the joint_trajectory_controller interpolation... I will investigate this.

@gavanderhoorn
Member

ros/rosdistro#17747 was merged two weeks ago. The Kinetic sync was yesterday, so you'll only have the updated OMPL if you updated your ROS pkgs just now (or at least: after the sync).

@RhysMcK
Author

RhysMcK commented May 18, 2018

I did update my ROS packages, so I am using the latest OMPL! Thanks for clarifying.

@RhysMcK
Author

RhysMcK commented May 18, 2018

sigh.....

[screenshot from 2018-05-18 20-35-37]

This is TOTG (moveit/moveit#809) implemented in the kinetic/devel branch. Admittedly, the profiles in general look smoother... but still, the bug persists.

@gavanderhoorn
Member

gavanderhoorn commented May 18, 2018

But this is still taken from the hardware_interface, right?

If so, we're looking at JTC output, which is influenced by TOTG/IPTP, but does not have a 1-to-1 relationship with it.

I would be really interested in a plot of velocities coming out of MoveIt (that is hard(er) to get hold of, but would help diagnosing whether this is still a MoveIt/OMPL/IPTP/TOTG issue, or is in JTC).

@gavanderhoorn
Member

Also: I'm assuming you're using a real-time kernel and that the driver/hardware_interface has real-time priority. If not, then scheduling could be an issue here.

@RhysMcK
Author

RhysMcK commented May 18, 2018

But this is still taken from the hardware_interface, right?

Yep, that's correct.

I would be really interested in a plot of velocities coming out of MoveIt (that is hard(er) to get hold of, but would help diagnosing whether this is still a MoveIt/OMPL/IPTP/TOTG issue, or is in JTC).

Yes, I agree. I will attempt to do this ASAP.

Also: I'm assuming you're using a real-time kernel and that the driver/hardware_interface has real-time priority. If not, then scheduling could be an issue here.

I am not at the moment; however, I have previously, and this problem still persisted. I believe the need for an RT / low-latency kernel would only really be an issue if I were experiencing communication issues, i.e. packet losses. According to the diagnostic monitor I always have a connection quality of 100, with no more than about 1 or 2 consecutive packets lost.

@cschindlbeck
Contributor

cschindlbeck commented May 23, 2018

I switched to MoveIt from source and upgraded the ROS packages via apt-get.
I am a little bit confused: I still see libompl.so.1.2.1 in /opt/ros/kinetic/lib/x86_64-linux-gnu. Shouldn't that be libompl.so.1.2.3?

@RhysMcK Can you tell me how you obtained the velocity profiles? I might be interested in recording these myself. Is this a numerically differentiated ROS topic?

@cschindlbeck
Contributor

cschindlbeck commented May 24, 2018

So today I tested my setup again as in #123, where I got the same error with MoveIt from source and (hopefully) the new OMPL (I used the default time parameterization).

So regarding your comment, @RhysMcK:

i fear you might once again encounter this error when you speed things back up to "normal" speed.

Now I am able to execute motions at normal speed without getting an error. However, I still hear "knocking" during the motions, which I believe is likely caused by "jerky" motion, as you already showed in the velocity profiles... so I can confirm that there is still an issue.

@RhysMcK
Author

RhysMcK commented May 24, 2018

@cschindlbeck I just did a quick-and-dirty printout of the position commands I was sending from within kuka_hardware_interface.cpp, and then used a Python script to collect the data and plot it. The velocity profile is just the difference between the last two sent position commands.

@gavanderhoorn
Member

gavanderhoorn commented May 25, 2018

@RhysMcK wrote:

I just did a quick-and-dirty printout of the position commands

Just a note: both printf(..) and std stream output are two of the worst cases when it comes to predictive scheduling of tasks. In a real-time system they are absolutely off-limits; on a soft real-time system they'll probably influence scheduling in a negative way.

If you have the option, and want to implement this nicely, use an internal buffer that stores the last n samples, and use a ROS service (or topic) to retrieve the contents of that buffer after the trajectory has been executed.

That way we avoid any kind of timing influence that may be caused by console IO.
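A minimal sketch of that buffered-logging approach (class and topic names are hypothetical, not part of the driver):

```cpp
// Keep the last N commanded positions in a pre-allocated ring buffer (no
// console IO in the control loop) and publish the whole buffer on a latched
// topic once the trajectory has finished executing.
#include <ros/ros.h>
#include <std_msgs/Float64MultiArray.h>
#include <vector>

class CommandLogger
{
public:
  CommandLogger(ros::NodeHandle& nh, std::size_t capacity)
    : buffer_(capacity, 0.0), capacity_(capacity)
  {
    pub_ = nh.advertise<std_msgs::Float64MultiArray>("command_log", 1, true);
  }

  // Called from the real-time write() loop: just store, never print.
  void record(double cmd)
  {
    buffer_[head_] = cmd;
    head_ = (head_ + 1) % capacity_;
    if (count_ < capacity_)
      ++count_;
  }

  // Called from a non-real-time context after the trajectory has executed.
  void publish()
  {
    std_msgs::Float64MultiArray msg;
    msg.data.reserve(count_);
    for (std::size_t i = 0; i < count_; ++i)
      msg.data.push_back(buffer_[(head_ + capacity_ - count_ + i) % capacity_]);
    pub_.publish(msg);
  }

private:
  ros::Publisher pub_;
  std::vector<double> buffer_;
  std::size_t capacity_;
  std::size_t head_ = 0;
  std::size_t count_ = 0;
};
```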

@hartmanndennis
Contributor

The velocity profile is just the difference between the last two sent position commands.

Without dividing by the duration? The spikes in the TOTG graph always come in pairs: a point that is too fast, followed by a point that is too slow. This would happen if the first call to the JTC is late and the second therefore has a smaller duration.

@gavanderhoorn Related: Should the call to the controller manager's update() be changed to use a fixed duration and increment the time by a fixed 4 ms/12 ms, since the robot expects points interpolated at this rate? Currently, if scheduling is a few ms late, the JTC will return a point too far ahead, which could introduce micro-stutter even if the packet reaches the robot in time.

@gavanderhoorn
Member

gavanderhoorn commented May 29, 2018

@hartmanndennis wrote:

Related: Should the call to the controller manager's update() be changed to use a fixed duration and increment the time by a fixed 4 ms/12 ms, since the robot expects points interpolated at this rate? Currently, if scheduling is a few ms late, the JTC will return a point too far ahead, which could introduce micro-stutter even if the packet reaches the robot in time.

This could be something to try. However, I think I can imagine a situation where it would be possible for a 'desync' to build up between real time and trajectory execution.

Implementation-wise I would still use the pkts from the controller to dictate control flow in the node, but then instead of using real time assume a fixed delta.

Is this something you could test?


Edit: it would be nice if we could use IPOC-deltas for the dt, instead of having to configure the dt somewhere.
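A rough sketch of what that could look like, under the assumption that IPOC is a millisecond counter so that the delta between two consecutive packets equals the interpolation period (hypothetical code, not part of the driver):

```cpp
// Advance the controllers by a period derived from the controller's own IPOC
// timestamps instead of the host's wall clock, so JTC interpolation stays in
// sync with the robot even when host-side scheduling jitters.
#include <cstdint>
#include <ros/ros.h>
#include <controller_manager/controller_manager.h>

void controlStep(controller_manager::ControllerManager& cm,
                 uint64_t ipoc, uint64_t last_ipoc)
{
  // Assumed: IPOC is in milliseconds, so the delta is 4 or 12 ms per cycle.
  const ros::Duration period(static_cast<double>(ipoc - last_ipoc) * 1e-3);

  // ros::Time::now() is still used for anything published with a Header;
  // only the integration period comes from the controller's clock.
  cm.update(ros::Time::now(), period);
}
```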

@hartmanndennis
Contributor

I'll make a PR soon. I already tested a bit and updated #95 with velocity graphs.
I'm still unsure whether only the duration parameter should be set constant, or whether the time parameter should also be increased by the constant IPOC. I don't know about possible consequences for the controllers if the clocks do diverge.

@gavanderhoorn
Member

If possible, I would vote for using IPOC_(t_n+1) - IPOC_(t_n). That would seem to keep things in sync with the controller.

@hartmanndennis
Contributor

That should work fine, but doesn't address the time parameter in the update() call, which expects absolute time. I don't know if ros_control has a defined standard usage for absolute time and duration. Does time_now - time_last need to be equal to duration? If so, why have duration at all...

@gavanderhoorn
Member

gavanderhoorn commented May 30, 2018

In general it would be a problem, but in the case of kuka_rsi_hw_interface we're not using time: the period (not duration) is calculated using std::chrono::steady_clock, which doesn't necessarily sync with wall time anyway.

The only really important thing is that any msgs broadcast with Headers in them use something that is globally comparable (i.e. to other ros::Time instances, for instance), and not some logical clock. So we probably can't use IPOC for everything, but we don't have to: we're only interested in making sure position control is synced with the controller's internal clock.

Controller manager passes the time on to any loaded controllers (such as joint_state_controller) to publish msgs with.

If I understand everything correctly.

@cschindlbeck
Contributor

Would it make sense to append a low-pass filter node to the JTC as a temporary workaround until the real culprit has been found?
I imagine that these jerky movements cannot be too healthy for the robotic system.

@gavanderhoorn
Member

The JTC is not under our control, as it's a component in ros_controllers. You could add something there, but that would require a source build. If you're comfortable with that, then that could be an option.

Note that I believe that what @hartmanndennis suggests in his comments would probably be more straightforward.

@gavanderhoorn
Member

Additionally: I'd be surprised if it's actually a problem in the JTC. I suspect that it's actually jitter on the scheduling of the node, leading to different dts being used.

@gavanderhoorn
Member

We've recently run into this (again) here at TUD as well (@rtonnaer).

A different axis though (A1), but that is probably inconsequential.

Right now I'm plotting the values sent to the controller from a Wireshark capture. Initial results don't look too bad (not very nice either, but nothing like the spikes that @RhysMcK shows in his plots). Unfortunately, the way I process the data right now is synced with IPOC, which makes detecting late packets difficult or impossible. I'll have to process the raw pcap itself to check wall-time progress and match it up with IPOC.

This is using #132 btw.

@gavanderhoorn
Member

Have any of you (@RhysMcK, @cschindlbeck, @destogl) ever used the RSI filtering components?

@RhysMcK
Author

RhysMcK commented Jun 13, 2018

No I haven't, @gavanderhoorn. There doesn't seem to be much information out there about how to configure them either. I was, however, recommended to try implementing an RSI low-pass filter when I first came across this issue, by the folks over at robot-forum, as I previously mentioned (https://www.robot-forum.com/robotforum/kuka-robot-forum/command-gear-torque-a23-with-rsi-control/).

Unfortunately, my time is being drained in other areas at work at the moment, so I haven't had a chance to continue this investigation.

@BrettHemes
Member

Out of curiosity, do these problems also exist in Melodic?

@jbeck28

jbeck28 commented May 15, 2024

I know this is very old, but I thought I'd share. The issue persists in the ROS 2 version; what you're seeing is likely the result of latency spikes. Since the robot is position controlled, this results in sudden zero-velocity segments and tons of jerk. See this issue.

I haven't tested the fix as extensively as I'd like to, but I certainly saw better performance after merging the pugixml branch into my fork.
