ImportError: No module named catkin.environment_cache #806

Closed
mikepurvis opened this issue Jun 6, 2016 · 17 comments

Comments

@mikepurvis
Member

We're getting an intermittent issue affecting our CI builds. It pops up at random on different packages, but always takes this form:

00:12:49.053   File "/tmp/buildd/workspace/debian/tmp/build/<pkg>/catkin_generated/generate_cached_setup.py", line 20, in <module>
00:12:49.053     from catkin.environment_cache import generate_environment_script
00:12:49.053 ImportError: No module named catkin.environment_cache

This is running inside a cowbuilder, on vmware, with no additional containerization or isolation. The overall build is a large parallel affair being managed by catkin_tools. I've reported this there too (catkin/catkin_tools#378), but @jbohren suggests (and I'm inclined to agree) that this may only be fixable within catkin itself.

Looking at the templated logic here, I'm wondering if we're looking at a race condition with the copy-on-write filesystem?

It's not clear to me where an additional sync call is required, or even how to reliably reproduce this problem, but I'd be delighted to accept counsel on either of those points. In the meantime, would we consider a workaround which, say, pauses for 100ms and then retries the whole import/try-block in the event of the second import failing?
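The pause-and-retry workaround floated above could look roughly like the sketch below. It assumes Python 3; the helper name `import_with_retry`, its defaults, and the use of `importlib` are illustrative assumptions and are not taken from catkin's template.

```python
import importlib
import time


def import_with_retry(module_name, attempts=2, delay=0.1):
    """Import module_name, pausing `delay` seconds between failed attempts.

    Hypothetical helper: the name and default values are illustrative only.
    """
    for attempt in range(attempts):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            if attempt == attempts - 1:
                raise  # out of retries, so surface the original error
            # Clear the import system's finder caches so that files which
            # appeared after the first attempt can be discovered.
            importlib.invalidate_caches()
            time.sleep(delay)
```

A call such as `import_with_retry("catkin.environment_cache")` would then stand in for the plain import. A retry of this kind can only paper over a timing problem; it does nothing if the package is never placed on `sys.path`.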

@dirk-thomas
Member

dirk-thomas commented Jun 6, 2016

The error message indicated that the Python package catkin is there. Otherwise the error would be:

ImportError: No module named 'catkin'

But the module environment_cache seems to not exist.

Please provide some logs about what happens in the CI build. What is in the workspace?

How is catkin_pkg being installed? (this shouldn't be relevant, sorry)

If you have control over the setup, it might be valuable to add some debug logic like:

import catkin
print(catkin.__file__)

@mikepurvis
Member Author

catkin_pkg (and friends) are installed from debs inside the cowbuilder.

The workspace is several hundred packages; it builds a nightly SDK bundle of our software.

I'll fork catkin and add the suggested lines; then we'll hopefully get some additional info the next time this triggers.

@dirk-thomas
Member

Are you using the devel space or actually doing an install?

@mikepurvis
Member Author

mikepurvis commented Jun 6, 2016

It's catkin config --install, so each package is installed into a merged install-space, which is the environment for successive builds. Write access to the merged install space is protected by a lock, but it's conceivable that reads could occur concurrent with re-writes.

That said, the environment_cache module is generated on a per-package basis, each to their own build space, so why would there be a collision happening there?
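One general way to keep readers from ever seeing a half-written file while it is being rewritten is to write to a temporary file and rename it into place. The sketch below only illustrates that idea; `atomic_write` is a hypothetical helper, and nothing here is taken from catkin or catkin_tools.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` without exposing partial content.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This protects a single file only. A whole install space that is rewritten file by file would still need a lock covering readers as well as writers.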

@dirk-thomas
Member

Since catkin is only installed once, I don't see how that can happen. Either the Python package catkin can't be found (e.g. if it's not on the Python path) or the Python package, including its modules, should be completely available.

@mikepurvis
Member Author

Given that it's generated only a single time, some kind of problem with the PYTHONPATH does seem likely. Will continue to dig.

@mikepurvis
Member Author

mikepurvis commented Jun 7, 2016

I'm not sure that this is a correct assertion:

"The error message indicated that the Python package catkin is there."

Simple test:

$ python -c "from doesnt.exist import thing"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named doesnt.exist

I'll have it trap the error and dump PYTHONPATH (mikepurvis@aaa7a3c); we'll see what comes up.
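Because the Python 2 message names the full dotted path whether the parent package or only the submodule is missing, it can help to check each component separately. The following is a minimal sketch assuming Python 3.4 or newer (where `importlib.util.find_spec` exists); `first_missing_component` is a hypothetical helper name.

```python
import importlib.util


def first_missing_component(dotted_name):
    """Return the shortest prefix of dotted_name that cannot be found, or None.

    Hypothetical diagnostic helper. Note that find_spec imports parent
    packages as a side effect when given a dotted name.
    """
    parts = dotted_name.split(".")
    for i in range(1, len(parts) + 1):
        prefix = ".".join(parts[:i])
        try:
            spec = importlib.util.find_spec(prefix)
        except (ImportError, ValueError):
            # Raised when the parent is not a package or has no usable spec.
            spec = None
        if spec is None:
            return prefix
    return None
```

For the simple test above, `first_missing_component("doesnt.exist")` reports `doesnt`, showing that the top-level package itself is absent rather than just the submodule.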

@dirk-thomas
Member

Sorry, I was only testing with Python 3. Looks like Python 2 behaves differently.

That makes much more sense: not being available at all, rather than being only partially available.

@mikepurvis
Member Author

Caught one:

00:08:55.823 Errors << rosgraph_msgs:cmake /tmp/buildd/workspace/debian/tmp/build/logs-nimbus/rosgraph_msgs/build.cmake.000.log
00:08:55.823 Traceback (most recent call last):
00:08:55.823   File "/tmp/buildd/workspace/debian/tmp/build/build-nimbus/rosgraph_msgs/catkin_generated/generate_cached_setup.py", line 23, in <module>
00:08:55.823     raise ImportError("Unable to import catkin.environment_cache.", "PYTHONPATH=%s" % ";".join(sys.path), "Looked for lib/python2.7/dist-packages/catkin in ")
00:08:55.823 ImportError: ('Unable to import catkin.environment_cache.', 'PYTHONPATH=/tmp/buildd/workspace/debian/tmp/build/build-nimbus/rosgraph_msgs/catkin_generated;/usr/lib/python2.7;/usr/lib/python2.7/plat-x86_64-linux-gnu;/usr/lib/python2.7/lib-tk;/usr/lib/python2.7/lib-old;/usr/lib/python2.7/lib-dynload;/usr/local/lib/python2.7/dist-packages;/usr/lib/python2.7/dist-packages;/usr/lib/python2.7/dist-packages/PILcompat;/usr/lib/pymodules/python2.7;/usr/lib/python2.7/dist-packages/wx-2.8-gtk2-unicode', 'Looked for lib/python2.7/dist-packages/catkin in ')
00:08:55.823 CMake Error at /tmp/buildd/workspace/debian/tmp/opt/clearpath/2.1devel/nimbus/share/catkin/cmake/safe_execute_process.cmake:11 (message):
00:08:55.823   execute_process(/usr/bin/python
00:08:55.823   "/tmp/buildd/workspace/debian/tmp/build/build-nimbus/rosgraph_msgs/catkin_generated/generate_cached_setup.py")
00:08:55.823   returned error code 1
00:08:55.823 Call Stack (most recent call first):
00:08:55.823   /tmp/buildd/workspace/debian/tmp/opt/clearpath/2.1devel/nimbus/share/catkin/cmake/all.cmake:186 (safe_execute_process)
00:08:55.823   /tmp/buildd/workspace/debian/tmp/opt/clearpath/2.1devel/nimbus/share/catkin/cmake/catkinConfig.cmake:20 (include)
00:08:55.823   CMakeLists.txt:4 (find_package)

Looks like @CATKIN_WORKSPACES@ evaluated to an empty string at the time this template was expanded, whereas if it had included the path to the current workspace, then the catkin/python/catkin path would have been found and added to the $PYTHONPATH. Indeed, when I run a similarly-configured local build, and inspect the contents of, say, build/genmsg/catkin_generated/generate_cached_setup.py, I see the installspace is included in the @CATKIN_WORKSPACES@ expansion:

# find the import for catkin's python package - either from source space or from an installed underlay
if os.path.exists(os.path.join('/Users/mikepurvis/isolate_install_experiments/install/share/catkin/cmake', 'catkinConfig.cmake.in')):
    sys.path.insert(0, os.path.join('/Users/mikepurvis/isolate_install_experiments/install/share/catkin/cmake', '..', 'python'))
try:
    from catkin.environment_cache import generate_environment_script
except ImportError:
    # search for catkin package in all workspaces and prepend to path
    for workspace in "/Users/mikepurvis/isolate_install_experiments/install".split(';'):
        python_path = os.path.join(workspace, 'lib/python2.7/site-packages')
        if os.path.isdir(os.path.join(python_path, 'catkin')):
            sys.path.insert(0, python_path)
            break
    from catkin.environment_cache import generate_environment_script

@jbohren Thoughts on all this?
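To see why an empty @CATKIN_WORKSPACES@ expansion defeats the fallback search, note that splitting an empty string on ';' in Python yields a single empty entry, so the loop runs once with a relative path. The sketch below mimics that loop; `candidate_python_paths` is a hypothetical stand-in, and `/opt/ros/indigo` is just an example workspace value.

```python
import os


def candidate_python_paths(workspaces, subdir="lib/python2.7/dist-packages"):
    """Return one candidate path per ';'-separated workspace entry.

    Hypothetical stand-in for the fallback loop in generate_cached_setup.py.
    """
    return [os.path.join(workspace, subdir) for workspace in workspaces.split(";")]


# A populated expansion gives a path inside the workspace.
print(candidate_python_paths("/opt/ros/indigo"))

# An empty expansion gives only the bare relative subdirectory, which is
# resolved against whatever the current working directory happens to be.
print(candidate_python_paths(""))
```

With the empty string, the only candidate is the relative `lib/python2.7/dist-packages`, so the `os.path.isdir` check fails unless the current directory happens to contain a catkin package, and the second import raises.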

@davetcoleman

I just got something really similar in a freshly downloaded and setup Docker from today:

Starting  >>> moveit_msgs                                                                                             
Starting  >>> moveit_resources                                                                                        
Finished  <<< moveit_resources                      [ 0.1 seconds ]                                                   
______________________________________________________________________________________________________________________
Errors     << moveit_msgs:cmake /root/ws_moveit/logs/moveit_msgs/build.cmake.006.log                                  
Traceback (most recent call last):
  File "/root/ws_moveit/build/moveit_msgs/catkin_generated/generate_cached_setup.py", line 20, in <module>
    from catkin.environment_cache import generate_environment_script
ImportError: No module named catkin.environment_cache
CMake Error at /opt/ros/indigo/share/catkin/cmake/safe_execute_process.cmake:11 (message):
  execute_process(/usr/bin/python
  "/root/ws_moveit/build/moveit_msgs/catkin_generated/generate_cached_setup.py")
  returned error code 1
Call Stack (most recent call first):
  /opt/ros/indigo/share/catkin/cmake/all.cmake:186 (safe_execute_process)
  /opt/ros/indigo/share/catkin/cmake/catkinConfig.cmake:20 (include)
  CMakeLists.txt:14 (find_package)


cd /root/ws_moveit/build/moveit_msgs; catkin build --get-env moveit_msgs | catkin env -si  /usr/bin/cmake /root/ws_moveit/src/moveit/moveit_msgs --no-warn-unused-cli -DCATKIN_DEVEL_PREFIX=/root/ws_moveit/devel/.private/moveit_msgs -DCMAKE_INSTALL_PREFIX=/root/ws_moveit/install; cd -
......................................................................................................................
Failed     << moveit_msgs:cmake                     [ Exited with code 1 ]     

I ran catkin clean on the workspace and now it's building properly. Haven't looked any further into it.

@mikepurvis
Member Author

@davetcoleman Whoa, interesting, especially as you hit this a) with catkin_tools, and b) in whatever Docker's COW filesystem is.

Did you happen to try re-invoking catkin build? I'd love to know if this would go away on a rebuild, or if cleaning (either the package or whole workspace) is necessary to recover from it.

@davetcoleman

Yes, I tried running catkin build a bunch of times after seeing your post, but it didn't change anything. Cleaning did.

@dirk-thomas
Member

Since this looks like an environment problem with catkin_tools and not a problem in catkin itself, I suggest closing this ticket. There is already catkin/catkin_tools#378 to keep track of it.

@mikepurvis
Member Author

mikepurvis commented Jun 28, 2016

Okay, will do. If it emerges that a change is required in ros/catkin to resolve it, I'll open a new ticket.

@mikepurvis
Member Author

For any future travellers, this has been confirmed as a catkin_tools bug around inadequate filesystem mutexing for large parallel builds. The fix is in catkin/catkin_tools#391.
