Improve timeout handling on POSIX and Windows for #521 #600

Merged
merged 2 commits into from
Apr 20, 2017
6 changes: 5 additions & 1 deletion src/scancode/cli.py
@@ -68,6 +68,7 @@
from scancode.interrupt import DEFAULT_TIMEOUT

from scancode import utils
from scancode.interrupt import TimeoutError

echo_stderr = partial(click.secho, err=True)

@@ -225,7 +226,7 @@ def validate_formats(ctx, param, value):
@click.option('--version', is_flag=True, is_eager=True, callback=print_version, help='Show the version and exit.')

@click.option('--diag', is_flag=True, default=False, help='Include additional diagnostic information such as error messages or result details.')
@click.option('--timeout', is_flag=False, default=DEFAULT_TIMEOUT, type=int, show_default=True, help='Stop scanning a file if scanning takes longer than a timeout in seconds.')
@click.option('--timeout', is_flag=False, default=DEFAULT_TIMEOUT, type=float, show_default=True, help='Stop scanning a file if scanning takes longer than a timeout in seconds.')

def scancode(ctx,
input, output_file,
@@ -626,13 +627,16 @@ def scan_one(input_file, scanners, diag=False):
            if isinstance(scan_details, GeneratorType):
                scan_details = list(scan_details)
            scan_result[scan_name] = scan_details
        except TimeoutError:
            raise
        except Exception as e:
            # never fail but instead add an error message and keep an empty scan:
            scan_result[scan_name] = []
            messages = ['ERROR: ' + scan_name + ': ' + e.message]
            if diag:
                messages.append('ERROR: ' + scan_name + ': ' + traceback.format_exc())
            scan_errors.extend(messages)

    # put errors last, after scans proper
    scan_result['scan_errors'] = scan_errors
    return scan_result
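
The order of the two `except` clauses in this hunk matters: `TimeoutError` subclasses `Exception`, so it has to be caught and re-raised first, otherwise the generic handler would record it as an ordinary scan error and the timeout would never reach the caller. A minimal sketch of that behavior follows. `run_scanner`, `slow_scanner` and `broken_scanner` are hypothetical stand-ins for illustration, not ScanCode functions, and the local `TimeoutError` class only mimics the one imported from `scancode.interrupt`.

```python
class TimeoutError(Exception):
    """Stand-in for scancode.interrupt.TimeoutError (assumption for this sketch)."""


def run_scanner(scanner):
    """Run one scanner: let timeouts propagate, record any other failure."""
    errors = []
    try:
        result = scanner()
    except TimeoutError:
        # Re-raise so the code enforcing the timeout sees the interruption.
        raise
    except Exception as e:
        # Any other failure is recorded and the scan result stays empty.
        result = []
        errors.append('ERROR: ' + str(e))
    return result, errors


def slow_scanner():
    # Hypothetical scanner that gets interrupted by the timeout machinery.
    raise TimeoutError()


def broken_scanner():
    # Hypothetical scanner that fails for an unrelated reason.
    raise ValueError('bad input')
```

With the clauses swapped or the first one removed, `run_scanner(slow_scanner)` would return an empty result with an error message instead of raising.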
@@ -1,16 +1,18 @@
about_resource: robotframework-3.0.2-py2-none-any.whl
about_resource: interrup.py
version: 3.0.2
download_url: https://pypi.python.org/packages/3e/79/d8b9a7ea833cf4f33d51c0d5f24b825ac72105bf30c147b472da10895143/robotframework-3.0.2.tar.gz#md5=ea49a54b9d7e38302712194e85c37eaa

name: robotframework
home_url: http://robotframework.org/
owner: Robot Framework Foundation
dje_license: apache-2.0
notice_file: robotframework.NOTICE
notice_file: interrupt-robotframework.NOTICE
license_text_file: apache-2.0.LICENSE

copyright: Copyright 2008-2015 Nokia Networks
Copyright 2016- Robot Framework Foundation

vcs_tool: git
vcs_repository: https://github.com/robotframework/robotframework.git

notes: some snippets and ideas were borrowed from the timeout implementations.
7 changes: 7 additions & 0 deletions src/scancode/interrupt-thread2.ABOUT
@@ -0,0 +1,7 @@
about_resource: interrupt.py
author: Tomer Filiba
homepage_url: http://tomerfiliba.com/recipes/Thread2/
license: public-domain
notes: Per http://tomerfiliba.com/recipes/
All of the following are published as "public domain", or, if you prefer,
under the MIT license.
111 changes: 111 additions & 0 deletions src/scancode/interrupt-thread2.README
@@ -0,0 +1,111 @@
Killable Threads
August 13, 2006
By Tomer Filiba

from http://tomerfiliba.com/recipes/Thread2/

Per http://tomerfiliba.com/recipes/

> All of the following are published as "public domain", or, if you prefer,
under the MIT license.

# Killable Threads

The `thread2` module is an extension of the standard `threading` module, and provides
the means to raise exceptions at the context of the given thread. You can use
`raise_exc()` to raise an arbitrary exception, or call `terminate()` to raise
`SystemExit` automatically.

It uses the unexposed `PyThreadState_SetAsyncExc` function (via `ctypes`) to raise an
exception in the context of the given thread. Inspired by the code of Antoon Pardon
at http://mail.python.org/pipermail/python-list/2005-December/316143.html

### Issues

* The exception will be raised only when executing python bytecode. If your thread
calls a native/built-in blocking function, the exception will be raised only when
execution returns to the python code.

* There is also an issue if the built-in function internally calls `PyErr_Clear()`,
which would effectively cancel your pending exception. You can try to raise it again.

* Only exception **types** can be raised safely. Exception instances are likely to
cause unexpected behavior, and are thus restricted.

* For example: `t1.raise_exc(TypeError)` and not `t1.raise_exc(TypeError("blah"))`.
* IMHO it's a bug, and I reported it as one. For more info
http://mail.python.org/pipermail/python-dev/2006-August/068158.html

* I asked to expose this function in the built-in `thread` module, but since `ctypes`
has become a standard library (as of 2.5), and this feature is not likely to be
implementation-agnostic, it may be kept unexposed.

## Code

    import threading
    import inspect
    import ctypes

    def _async_raise(tid, exctype):
        """raises the exception, performs cleanup if needed"""
        if not inspect.isclass(exctype):
            raise TypeError("Only types can be raised (not instances)")
        res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, ctypes.py_object(exctype))
        if res == 0:
            raise ValueError("invalid thread id")
        elif res != 1:
            # """if it returns a number greater than one, you're in trouble,
            # and you should call it again with exc=NULL to revert the effect"""
            ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, 0)
            raise SystemError("PyThreadState_SetAsyncExc failed")

    class Thread(threading.Thread):
        def _get_my_tid(self):
            """determines this (self's) thread id"""
            if not self.isAlive():
                raise threading.ThreadError("the thread is not active")

            # do we have it cached?
            if hasattr(self, "_thread_id"):
                return self._thread_id

            # no, look for it in the _active dict
            for tid, tobj in threading._active.items():
                if tobj is self:
                    self._thread_id = tid
                    return tid

            raise AssertionError("could not determine the thread's id")

        def raise_exc(self, exctype):
            """raises the given exception type in the context of this thread"""
            _async_raise(self._get_my_tid(), exctype)

        def terminate(self):
            """raises SystemExit in the context of the given thread, which should
            cause the thread to exit silently (unless caught)"""
            self.raise_exc(SystemExit)


## Example


    >>> import time
    >>> from thread2 import Thread
    >>>
    >>> def f():
    ...     try:
    ...         while True:
    ...             time.sleep(0.1)
    ...     finally:
    ...         print "outta here"
    ...
    >>> t = Thread(target = f)
    >>> t.start()
    >>> t.isAlive()
    True
    >>> t.terminate()
    >>> t.join()
    outta here
    >>> t.isAlive()
    False
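
The recipe above targets Python 2 (`print` statement, `isAlive()`). As a rough illustration of the same idea on CPython 3, the sketch below raises `SystemExit` in a running thread through `PyThreadState_SetAsyncExc`. It assumes CPython 3.7 or later, where the thread id argument is an unsigned long. `async_raise`, `terminate_demo` and `worker` are names chosen for this sketch and are not part of the `thread2` module.

```python
import ctypes
import threading
import time


def async_raise(tid, exctype):
    """Ask CPython to raise `exctype` in the thread whose id is `tid`."""
    if not isinstance(exctype, type):
        raise TypeError('Only types can be raised (not instances)')
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(tid), ctypes.py_object(exctype))
    if res == 0:
        raise ValueError('invalid thread id')
    if res != 1:
        # More than one thread state was affected: revert with exc=NULL.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
        raise SystemError('PyThreadState_SetAsyncExc failed')


def terminate_demo():
    """Start a looping thread, terminate it, and report what happened."""
    events = []

    def worker():
        try:
            while True:
                time.sleep(0.05)
        finally:
            events.append('outta here')

    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.2)  # give the worker time to enter its loop
    async_raise(t.ident, SystemExit)
    t.join(timeout=5)
    return t.is_alive(), events
```

As the Issues list notes, the exception is only delivered once the thread is back in Python bytecode, so the worker exits after its current `time.sleep()` call returns.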
162 changes: 123 additions & 39 deletions src/scancode/interrupt.py
@@ -2,8 +2,6 @@
# Copyright (c) 2017 nexB Inc. and others. All rights reserved.
# http://nexb.com and https://github.com/nexB/scancode-toolkit/
# The ScanCode software is licensed under the Apache License version 2.0.
# Data generated with ScanCode require an acknowledgment.
# ScanCode is a trademark of nexB Inc.
#
# You may not use this software except in compliance with the License.
# You may obtain a copy of the License at: http://apache.org/licenses/LICENSE-2.0
@@ -12,58 +10,144 @@
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
# When you publish or redistribute any data created with ScanCode or any ScanCode
# derivative work, you must accompany this data with the following acknowledgment:
#
# Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
# OR CONDITIONS OF ANY KIND, either express or implied. No content created from
# ScanCode should be considered or used as legal advice. Consult an Attorney
# for any legal advice.
# ScanCode is a free software code scanning tool from nexB Inc. and others.
# Visit https://github.com/nexB/scancode-toolkit/ for support and download.

from __future__ import print_function
from __future__ import absolute_import

from scancode.timeouts import Timeout

from commoncode.system import on_windows

DEFAULT_TIMEOUT = 120 # seconds


"""
This module provides an interruptible() function to run a callable and
stop it after a timeout, with a Windows and a POSIX implementation.

Call `func` function with `args` and `kwargs` arguments and return a
tuple of (success, return value). `func` is invoked through an OS-
specific wrapper and will be interrupted if it does not return within
`timeout` seconds.

`func` returned results must be picklable.
`timeout` in seconds defaults to DEFAULT_TIMEOUT.

`args` and `kwargs` are passed to `func` as *args and **kwargs.

In the returned tuple of (success, value), success is True or False. If
success is True, the call was successful and the second item in the
tuple is the returned value of `func`.

If success is False, the call did not complete within `timeout`
seconds and was interrupted. In this case, the second item in the
tuple is an error message string.
"""

class TimeoutError(Exception):
pass


def interruptible(func, args=(), kwargs={}, timeout=DEFAULT_TIMEOUT):
if not on_windows:
"""
Call `func` function with `args` and `kwargs` arguments and return a tuple of
(success, return value). `func` is invoked through an OS-specific wrapper and
will be interrupted if it does not return within `timeout` seconds.

`func` returned results must be picklable.
`timeout` in seconds defaults to DEFAULT_TIMEOUT.

`args` and `kwargs` are passed to `func` as *args and **kwargs.

In the returned tuple of (success, value), success is True or False.
If success is True, the call was successful and the second item in the tuple is
the returned value of `func`.

If success is False, the call did not complete within `timeout`
seconds and was interrupted. In this case, the second item in the
tuple is an error message string.
Some code based in part and inspired from the RobotFramework and
heavily modified.

Copyright 2008-2015 Nokia Networks
Copyright 2016- Robot Framework Foundation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License.
"""

runner = Timeout(timeout, TimeoutError)
import signal

def interruptible(func, args=None, kwargs=None, timeout=DEFAULT_TIMEOUT):
"""
POSIX, signals-based interruptible runner.
"""

def runnable():
return func(*args, **kwargs)
def handler(signum, frame):
raise TimeoutError

try:
signal.signal(signal.SIGALRM, handler)
signal.setitimer(signal.ITIMER_REAL, timeout)
return True, func(*(args or ()), **(kwargs or {}))
except TimeoutError:
return False, ('ERROR: Processing interrupted: timeout after '
'%(timeout)d seconds.' % locals())
finally:
signal.setitimer(signal.ITIMER_REAL, 0)

else:
"""
Run a function in an interruptible thread with a timeout.
Based on an idea of dano "Dan O'Reilly"
http://stackoverflow.com/users/2073595/dano
But no code has been reused from this post.
"""

import ctypes
import multiprocessing
import Queue
try:
return True, runner.execute(runnable)
except:
import traceback
traceback.print_exc()
return False, ('ERROR: Processing interrupted: timeout after '
'%(timeout)d seconds.' % locals())
import thread
except ImportError:
import _thread as thread


def interruptible(func, args=None, kwargs=None, timeout=DEFAULT_TIMEOUT):
"""
Windows, threads-based interruptible runner. It can work also on
POSIX, but is not reliable and works only if everything is picklable.
"""
# We run `func` in a thread and run a loop until timeout
results = Queue.Queue()

def runner():
results.put(func(*(args or ()), **(kwargs or {})))

tid = thread.start_new_thread(runner, ())

try:
res = results.get(timeout=timeout)
return True, res
except (Queue.Empty, multiprocessing.TimeoutError):
return False, ('ERROR: Processing interrupted: timeout after '
'%(timeout)d seconds.' % locals())
finally:
try:
async_raise(tid, Exception)
except (SystemExit, ValueError):
pass


def async_raise(tid, exctype=Exception):
    """
    Raise an Exception in the Thread with id `tid`. Perform cleanup if
    needed.

    Based on Killable Threads By Tomer Filiba
    from http://tomerfiliba.com/recipes/Thread2/
    license: public domain.
    """
    assert isinstance(tid, int), 'Invalid thread id: must be an integer'

    tid = ctypes.c_long(tid)
    # raise the requested exception type rather than a hard-coded Exception
    exception = ctypes.py_object(exctype)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, exception)
    if res == 0:
        raise ValueError('Invalid thread id.')
    elif res != 1:
        # if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, 0)
        raise SystemError('PyThreadState_SetAsyncExc failed.')
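
To make the POSIX branch of this file easier to follow in isolation, here is a minimal, self-contained sketch of a signal-based runner that returns the `(success, value)` tuple described in the module docstring. It is written for Python 3 and must run in the main thread of a POSIX process. Two details are assumptions of this sketch and not part of the PR: the previous `SIGALRM` handler is restored on exit, and the message is formatted with `%s` so that a fractional timeout (now allowed by the `float` option type) is reported as given.

```python
import signal
import time


class TimeoutError(Exception):
    """Stand-in for the TimeoutError class defined in interrupt.py."""


def interruptible(func, args=None, kwargs=None, timeout=120):
    """Run `func` and return (True, result) or (False, error message)."""

    def handler(signum, frame):
        # Runs in the main thread when the interval timer expires.
        raise TimeoutError

    # Assumption of this sketch: remember the old handler to restore it later.
    previous = signal.signal(signal.SIGALRM, handler)
    try:
        # setitimer accepts float seconds, unlike signal.alarm().
        signal.setitimer(signal.ITIMER_REAL, timeout)
        return True, func(*(args or ()), **(kwargs or {}))
    except TimeoutError:
        return False, ('ERROR: Processing interrupted: timeout after '
                       '%s seconds.' % timeout)
    finally:
        # Always disarm the timer, whether func finished or was interrupted.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

For example, `interruptible(sum, args=([1, 2, 3],), timeout=1)` completes normally, while `interruptible(time.sleep, args=(5,), timeout=0.2)` is interrupted and returns `False` with the error message.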