Slow performance with prpy 0.4.0 #26

Closed
Stefanos19 opened this issue Apr 5, 2015 · 41 comments

@Stefanos19
Contributor

The two videos below run identical code, apart from the first using prpy 0.3.1 and the second using 0.4.0. For the first, the duration of grasp_bite - serve - look - ready-to-grasp is 25 sec; for the second, 49 sec. https://www.youtube.com/watch?v=YkwINudZnOw https://www.youtube.com/watch?v=LbgmtMS4kpo
I took time stamps before and after each PlanToConfiguration and PlanToEndEffectorOffset call:

| 0.4.0   | 0.3.1   | Call                       |
|---------|---------|----------------------------|
| 935486  | 11590   | PlanToLookingAtFace        |
| 905313  | 330854  | PlanToLookingAtPlate       |
| 362034  | 82758   | PlanToEndEffectorPose      |
| 937087  | 702920  | PlanToEndEffectorOffset    |
| 263716  | 783048  | PlanToEndEffectorOffset    |
| 166682  | 586631  | PlanToServingConfiguration |
| 3570318 | 2497801 | Total                      |

It appears, though, that the actual delays in the video are much larger than those in the timestamps.
@siddhss5 @mkoval @jeking04

@mkoval
Member

mkoval commented Apr 5, 2015

We recently fixed a bug (see personalrobotics/prpy#79) that was causing environment cloning to leak a significant amount of memory. This could be the culprit if you're running the demo in a loop in one process. It doesn't look like that commit made it into the 0.4.0 release. Can you try the latest master version of PrPy?

If that doesn't fix the problem, then you'll need to do some lower-level profiling to see which part is actually slow. Is it planning or retiming/smoothing?
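As a concrete sketch of that kind of lower-level profiling (the planning call in the trailing comment is a hypothetical stand-in, not a specific PrPy API), Python's built-in cProfile can break a single call down by cumulative time:

```python
import cProfile
import io
import pstats

def profile_call(fn, *args, **kwargs):
    """Run fn once under cProfile and print the 10 biggest
    contributors by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = fn(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
    print(stream.getvalue())
    return result

# e.g. profile_call(robot.PlanToConfiguration, goal_config)
# where robot and goal_config come from the demo script (hypothetical names)
```

The printed table makes it easy to see whether the time is going into planning, smoothing/retiming, or execution.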

@mkoval
Member

mkoval commented Apr 5, 2015

BTW, here is some documentation for how to make a proper table on GitHub using Markdown. The original timing information was very difficult to read, so I edited your post accordingly. 😄
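For reference, a GitHub-flavored Markdown table (like the one the timing data was edited into) is just pipe-separated cells plus a separator row:

```markdown
| 0.4.0  | 0.3.1 | Call                |
|--------|-------|---------------------|
| 935486 | 11590 | PlanToLookingAtFace |
```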

@Stefanos19
Contributor Author

Btw, here are the results from the collision and kinematics benchmarks of or_benchmarks:

[CollisionCheckingBenchmark.cpp:214 RunCollisionBenchmark] [CollisionCheckingBenchmark] Results:
[CollisionCheckingBenchmark.cpp:215 RunCollisionBenchmark]  Total collision checks: 50000
[CollisionCheckingBenchmark.cpp:216 RunCollisionBenchmark]  Total collisions found: 0
[CollisionCheckingBenchmark.cpp:217 RunCollisionBenchmark]  Total elapsed time: 0.15509
[CollisionCheckingBenchmark.cpp:218 RunCollisionBenchmark]  Average time per check: 0.0000031
[CollisionCheckingBenchmark.cpp:219 RunCollisionBenchmark]  Average checks per second: 322401.76

[DataUtils.cpp:62 GenerateRandomConfigurations] [CollisionCheckingBenchmark] Generating 50000 poses
[KinematicBenchmarks.cpp:123 RunForwardKinematicsBenchmark] [KinematicBenchmarks] Results for FK benchmark:
[KinematicBenchmarks.cpp:124 RunForwardKinematicsBenchmark]     Total queries: 50000
[KinematicBenchmarks.cpp:125 RunForwardKinematicsBenchmark]     Total elapsed time: 0.25551
[KinematicBenchmarks.cpp:126 RunForwardKinematicsBenchmark]     Average time per check: 0.0000051
[KinematicBenchmarks.cpp:127 RunForwardKinematicsBenchmark]     Average checks per second: 195684.76
[KinematicBenchmarks.cpp:128 RunForwardKinematicsBenchmark]     Average calls to SetDOFValues per second: 197768.39
[KinematicBenchmarks.cpp:129 RunForwardKinematicsBenchmark]     Average calls to GetEndEffectorTransform per second: 18573550.40
[INFO] [rospy.core:core.py:387]:signal_shutdown: signal_shutdown [atexit]
[BenchmarksModule.cpp:62 DestroyPlugin] [CollisionCheckingModule] Destroying plugin.

@mkoval
Member

mkoval commented Apr 5, 2015

Those results don't mean much because there is (likely) nothing in the environment. It would be more informative to run the self-collision benchmark. I believe the run_all script runs all three benchmarks (environment collision, self collision, and kinematics).

@Stefanos19
Contributor Author

Hmm, the run_all script relies on some datasets that appear to have been generated for HERB:
https://github.com/personalrobotics/or_benchmarks/tree/master/datasets
Running them for ADA produces joint-limit errors.

Thanks for the reference on git-tables :)

@Stefanos19
Contributor Author

Also, checking out the latest prpy results in the following error:

environment-core.h:221 Destroy] destroy module
Traceback (most recent call last):
  File "/homes/snikolai/catkin_demo040/src/ada_meal_scenario/src/bite_serving_FSM_timeTest.py", line 280, in <module>
    adaManipulationTester.initSimple()
  File "/homes/snikolai/catkin_demo040/src/ada_meal_scenario/src/bite_serving_FSM_timeTest.py", line 162, in initSimple
    self.manip.PlanToConfiguration(lookingAtPlateConfiguration)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/manipulator.py", line 64, in wrapper_method
    return self._PlanWrapper(delegate_method, args, kw_args)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/manipulator.py", line 156, in _PlanWrapper
    cloned_env.Cloned(self).SetActive()
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/bind.py", line 93, in intercept
    clone_method(canonical_parent)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/mico.py", line 129, in CloneBindings
    self.__init__(True, None)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/mico.py", line 90, in __init__
    dof_indices=self.GetArmIndices(), affine_dofs=0, simulated=sim)        
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/robot.py", line 155, in AttachController
    self.multicontroller.AttachController(delegate_controller, dof_indices, affine_dofs)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/bind.py", line 113, in intercept
    return canonical_instance.__getattr__(name)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/robot.py", line 101, in __getattr__
    elif self.actions.has_action(name):
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/bind.py", line 113, in intercept
    return canonical_instance.__getattr__(name)
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/base/robot.py", line 101, in __getattr__
    elif self.actions.has_action(name):
  ...
  File "/homes/snikolai/catkin_demo040/src/prpy/src/prpy/bind.py", line 51, in intercept
    canonical_instance = InstanceDeduplicator.get_canonical(self)
RuntimeError: maximum recursion depth exceeded
[INFO] [rospy.core:core.py:387]:signal_shutdown: signal_shutdown [atexit]

@mkoval
Member

mkoval commented Apr 5, 2015

In that case, you can just run the standalone run_collision_benchmark.py script with the --self option.

@jeking04 Could you take a look at making run_collision_benchmark.py less HERB-specific? It would be handy to run this script on other robots, too.

@mkoval
Member

mkoval commented Apr 5, 2015

@Stefanos19 You should file an issue on PrPy. Even if the error is in AdaPy, there is no excuse for PrPy throwing a maximum recursion depth exceeded exception with a completely opaque stack trace.

@Stefanos19
Contributor Author

Here is the output of run_collision_benchmark with the --self option:

[BenchmarksModule.cpp:40 CreateInterfaceValidated] [CollisionCheckingModule] Creating interface.
[BenchmarksModule.cpp:42 CreateInterfaceValidated] [CollisionCheckingModule] Generating collision checking benchmark plugin.
[CollisionCheckingBenchmark.cpp:36 RunBenchmark] [CollisionCheckingBenchmark] Running collision checking for body ADA
[CollisionCheckingBenchmark.cpp:60 RunBenchmark] [CollisionCheckingBenchmark] Sampling poses from cube with extent 2.000
[DataUtils.cpp:62 GenerateRandomConfigurations] [CollisionCheckingBenchmark] Generating 50000 poses
[CollisionCheckingBenchmark.cpp:281 RunSelfCollisionBenchmark] [CollisionCheckingBenchmark] Results:
[CollisionCheckingBenchmark.cpp:282 RunSelfCollisionBenchmark]  Total self collision checks: 50000
[CollisionCheckingBenchmark.cpp:283 RunSelfCollisionBenchmark]  Total collisions found: 2156
[CollisionCheckingBenchmark.cpp:284 RunSelfCollisionBenchmark]  Total elapsed time: 16.94463
[CollisionCheckingBenchmark.cpp:285 RunSelfCollisionBenchmark]  Average time per check: 0.0003389
[CollisionCheckingBenchmark.cpp:286 RunSelfCollisionBenchmark]  Average checks per second: 2950.79

@Stefanos19
Contributor Author

@mkoval ok, done.

@mkoval
Member

mkoval commented Apr 5, 2015

That is 2950.79 self-collision checks per second. This is quite a bit faster than HERB using ODE.

It's hard to draw any concrete conclusion from that, since this is partially expected:

  1. much more of HERB's configuration space is in self-collision
    • on HERB, 50% of the C-space is in self-collision per arm, and over 75% for bimanual manipulation
    • on ADA, only about 4% of sampled poses (2156 / 50000) are in self-collision
    • it's much more expensive to do a positive check (or a near miss) than a check in free space
  2. ADA has many fewer links to collision-check

We may want to try the latest version of PrPy (once personalrobotics/prpy#89 is fixed) and do some more profiling of the demo script before putting too much time into simplifying the model.

@cdellin

cdellin commented Apr 5, 2015

it's much more expensive to do a positive check (or a near miss) than a check in free space

This may be true, but sometimes the opposite can be true for a different reason -- the checker can return False on the first collision it finds, whereas if the configuration is valid, it must finish checking all links before it returns True. I have no idea what the tradeoff is for the Ada model, though.
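The early-exit behavior cdellin describes can be sketched abstractly (the helper names here are hypothetical, not the OpenRAVE checker API): the checker bails out on the first colliding pair, so collision-free configurations are the ones that pay the full scan cost:

```python
def check_self_collision(link_pairs, pair_in_collision):
    """Return True at the first colliding link pair (early exit).
    A collision-free configuration must scan every pair before
    the function can return False, so 'valid' answers cost the
    most in this sketch."""
    for pair in link_pairs:
        if pair_in_collision(pair):
            return True  # early exit on the first hit
    return False  # full scan completed: configuration is valid
```

Which effect dominates (expensive near-misses vs. early exits) depends on the model and the checker, as noted above.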

@mkoval
Member

mkoval commented Apr 5, 2015

@cdellin Yes, that is definitely true. The worst case is definitely a near-miss, but it's not clear how "in collision" versus "not in collision" compare.

@Stefanos19
Contributor Author

I ran the bite-serving code with the latest version of prpy. It is slightly faster than 0.4.0 (a bite cycle takes on average 37 secs with the latest master vs. 41 secs for 0.4.0), but it is still much slower than prpy 0.3.1 (26 secs).

@mkoval
Member

mkoval commented Apr 6, 2015

There shouldn't be any changes in PrPy between 0.3.1 and master that can account for that type of performance difference. The difference must be in which planners or smoothers are being called (or an inadvertent change in one of them).

Looking at your table of times, it looks like the big increase in time was split between PlanToLookingAtFace, PlanToLookingAtPlate, and PlanToEndEffectorPose. I can't find these calls in the demo script. Which planning method are you calling in each of these cases (e.g. PlanToConfiguration, PlanToEndEffectorPose, PlanToTSR, etc)? Also, do those times include execution time?

@mkoval
Member

mkoval commented Apr 6, 2015

@Stefanos19 Re: the PlanToTSR error in CBiRRT, it looks like tsr_chains is None. This shouldn't happen in normal operation - is it possible that you're passing an invalid argument?

@Stefanos19
Contributor Author

I uploaded the file that I used for testing.
https://github.com/personalrobotics/ada_meal_scenario/blob/master/src/bite_serving_FSM_timeTest.py
Note that this file does not have the latest changes (removing the configuration definitions, etc.), but I use it for testing since it does not have any cached trajectories.

In all tests I used only the cbirrt planner for everything, as well as the openravepy smoother, for consistency. To ensure that only the cbirrt planner is called, I changed the Sequence in adarobot.py, as follows:

    actual_planner = Sequence(
        self.birrt_planner,
        MethodMask(
            FirstSupported(
                self.cbirrt_planner,
            ),
            methods=['PlanToIK', 'PlanToTSR', 'PlanToEndEffectorPose', 'PlanToEndEffectorOffset']
        )
    )
    self.planner = FirstSupported(
        actual_planner,
        # Special purpose meta-planner.
    )

For the prpy 0.3.1 test, I also changed the sequence in adarobot.py so that
self.robot.planner = prpy.planning.Sequence(self.robot.cbirrt_planner)

The times are measured from the first PlanToEndEffectorOffset, when the robot starts to grab a bite, until it reaches the same point again after serving the bite to the person.

@Stefanos19
Contributor Author

@mkoval On the PlanToTSR error, I found the bug and deleted the comment, but it was too late :) Thanks!!

@mkoval
Member

mkoval commented Apr 6, 2015

It's hard to make much of those numbers. Could you:

  • Measure the time spent in each planning call with execute=False, e.g.

        with prpy.util.Timer('PlanningToXXX'):
            path = robot.PlanToXXX(..., execute=False)

  • Separately measure execution time:

        with prpy.util.Timer('MovingToXXX'):
            robot.ExecutePath(path)

Also, I would expect CBiRRT to be very slow with PlanToEndEffectorOffset. VectorFieldPlanner, GreedyIkPlanner, or even MKPlanner should be much faster. Are you sure you were exclusively using CBiRRT in 0.3?

@mkoval
Member

mkoval commented Apr 6, 2015

@Stefanos19 Was the PlanToTSR error in your code or in PrPy?

@Stefanos19
Contributor Author

@mkoval It was in my code; I used the "old" sequence formatting, which caused the error.
Yeah, I am positive. VectorFieldPlanner and GreedyIKPlanner are not supported in 0.3.1.
Sure, I will do the tests that you mention.

@mkoval
Member

mkoval commented Apr 6, 2015

@Stefanos19 What is the old sequence formatting? I don't think Sequence has changed in ages.

@mkoval
Member

mkoval commented Apr 6, 2015

In general, we need to figure out which specific function call(s) in PrPy got slower in master. The only slow operations are really planning, smoothing, retiming, and trajectory execution. So it's just a matter of timing all of those calls and figuring out where all the time is going.

@Stefanos19
Contributor Author

Sorry, I did not phrase this well. I meant that in 0.3.1, I did:
self.planner = Sequence(self.cbirrt_planner)
But the way we set the planner in master prpy now is slightly different, and doing the above caused the error.
(That was only for testing, since I want the same planner in both conditions.)

@mkoval
Member

mkoval commented Apr 6, 2015

That should definitely not cause an error. This most likely means that CBiRRT is always failing because of a syntax error and another planner is picking up the slack. Can you file an issue on PrPy?

BTW: There is no reason to use Sequence if there is only one planner. Just set robot.planner = self.cbirrt_planner. See the PrPy documentation for more information.

@Stefanos19
Contributor Author

Ok, I will replicate the error to make sure that this was it. It is possible that it was something else.
I will repeat the tests by setting robot.planner = self.cbirrt_planner.
Still, if the robot plans and the only planner in the sequence is cbirrt, then there should not be any other planner picking up the slack, right?

@mkoval
Member

mkoval commented Apr 6, 2015

Yes, that is correct. But self.cbirrt_planner and Sequence(self.cbirrt_planner) are also definitely identical. So there is something else going on here (maybe robot.planner is being set elsewhere)?

@Stefanos19
Contributor Author

Now that I think about it, it is possible that the TSR error was caused because I did not set up robot.planner at all.
Note that this TSR error has nothing to do with the delay: when that error occurred, the planner failed and the program exited. The discussion of the delays using the latest prpy is from after I fixed the error by setting the Sequence as

    actual_planner = Sequence(
        self.birrt_planner,
        MethodMask(
            FirstSupported(
                self.cbirrt_planner,
            ),
            methods=['PlanToIK', 'PlanToTSR', 'PlanToEndEffectorPose', 'PlanToEndEffectorOffset']
        )
    )
    self.planner = FirstSupported(
        actual_planner,
        # Special purpose meta-planner.
    )

@mkoval
Member

mkoval commented Apr 6, 2015

In any case, it is a bug if CBiRRT is crashing with that error. You should report it on PrPy. I also don't see why that sequence would fix anything - CBiRRT is the only planner in the list that implements PlanToTSR (which seems to be the source of the error).

BTW: That list is equivalent to:

self.planner = FirstSupported(
    self.birrt_planner,
    self.cbirrt_planner
)

@Stefanos19
Contributor Author

Got it. I will replicate the error and file the bug, and redo the comparison using self.planner = self.cbirrt_planner in both and with the time-tests that you have above.

@mkoval
Member

mkoval commented Apr 6, 2015

Make sure you get timing information for the individual planning calls. We really do need to figure out what part of PrPy got slower.

@Stefanos19
Contributor Author

Right.

@mkoval
Member

mkoval commented Apr 6, 2015

I think I fixed the CBiRRT bug here: personalrobotics/prpy@e7b48b8

@Stefanos19
Contributor Author

That's interesting: Replacing

self.manip.PlanToConfiguration(servingConfiguration)

with

path = self.robot.PlanToConfiguration(servingConfiguration, execute=False)
self.robot.ExecutePath(path)

solved the timing issue. Aren't the two identical?

@mkoval
Member

mkoval commented Apr 6, 2015

They should be identical. Passing execute=False just calls ExecutePath (see here). This shortcuts, smooths, and times the path and calls ExecuteTrajectory on the resulting trajectory (see here).
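The control flow described here can be sketched in a few lines (all function names below are illustrative stand-ins, not the actual PrPy implementation): the only intended difference between the two call patterns is who invokes the execute step.

```python
def smooth_and_retime(path):
    # Stand-in for shortcutting, smoothing, and timing the raw path.
    return {"timed": True, "waypoints": path}

def execute_trajectory(trajectory):
    # Stand-in for sending the timed trajectory to the controller.
    return trajectory

def execute_path(path):
    """ExecutePath in this sketch: post-process, then execute."""
    return execute_trajectory(smooth_and_retime(path))

def plan_to_configuration(planner, config, execute=True):
    """PlanToXXX in this sketch: plan, then execute unless told not to."""
    path = planner(config)
    if execute:
        return execute_path(path)  # default: one combined call
    return path  # execute=False: caller runs execute_path(path) later
```

Under this model, PlanToXXX(config) and PlanToXXX(config, execute=False) followed by ExecutePath(path) produce the same trajectory, which is why the observed timing difference is surprising.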

I don't know why there is a difference if you call it manually.

I am a bit surprised that this works at all. There is a bunch of HERB-specific code in the MicoRobot implementation of ExecuteTrajectory. I've been cleaning this up in personalrobotics/prpy#97.

@mkoval
Member

mkoval commented Apr 6, 2015

BTW: Your first snippet calls the planner on self.manip and the second calls it on self.robot. This will only work if the manipulator DOFs happen to be active (and in the correct order!).

@siddhss5

siddhss5 commented Apr 7, 2015

What?! How does manip even have any of those planning methods? Aren't they all inherited from robot base?

@mkoval
Member

mkoval commented Apr 7, 2015

They're magically added to Manipulator using __getattr__ and __dir__. This is the same mechanism we use to add planning methods to Robot. Here is the relevant code in Robot and in Manipulator.

The only difference is that calling a planning method on a manipulator first sets itself as active.
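The forwarding mechanism can be sketched as follows (class and attribute names are hypothetical simplifications, not the PrPy source): __getattr__ is only consulted when normal lookup fails, so unknown method names fall through to the robot, with the manipulator activating itself first.

```python
class Robot(object):
    """Stand-in for the PrPy robot: owns the planning methods."""
    def PlanToConfiguration(self, config):
        return ("planned", config, self.active_manipulator)

class Manipulator(object):
    """Stand-in manipulator that forwards unknown attributes to the
    robot, activating itself first (mirrors the behavior described)."""
    def __init__(self, robot, name):
        self.robot = robot
        self.name = name

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails on the instance.
        robot_method = getattr(self.robot, attr)
        def wrapper(*args, **kwargs):
            self.robot.active_manipulator = self.name  # set self as active
            return robot_method(*args, **kwargs)
        return wrapper
```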

@siddhss5

siddhss5 commented Apr 7, 2015

Aha. So what is with this slowdown then?

@mkoval
Member

mkoval commented Apr 13, 2015

Is this fixed? What was the problem?

@Stefanos19
Contributor Author

I am not sure what the problem was. I no longer see the slowdown between calling path = PlanToConfiguration(config, execute=False), ExecutePath(path), and PlanToConfiguration(config). It might have been the ancient or_ompl version I used, or the smoothing/shortcutting parameters.
