Error on windows when trying to deploy - [WinError 32] #494

Closed
amartins2imp opened this issue Sep 20, 2023 · 17 comments

Comments

@amartins2imp

amartins2imp commented Sep 20, 2023

I have successfully installed and run Scrapyd on Windows. However, when I try to deploy to Scrapyd, I get the following error:

Traceback (most recent call last):
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapyd\runner.py", line 35, in project_environment
    yield
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapyd\runner.py", line 45, in main
    execute()
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapy\cmdline.py", line 162, in execute
    sys.exit(cmd.exitcode)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapyd\runner.py", line 49, in <module>
    main()
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapyd\runner.py", line 43, in main
    with project_environment(project):
  File "C:\Users\myuser\.pyenv\pyenv-win\versions\3.11.5\Lib\contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\Users\myuser\AppData\Local\pypoetry\Cache\virtualenvs\myproject-J0q5INJf-py3.11\Lib\site-packages\scrapyd\runner.py", line 38, in project_environment
    os.remove(eggpath)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\myuser\\AppData\\Local\\Temp\\myproject-r23-8_8_ghhi.egg'

I have tried both the plain Scrapyd API (curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg) and scrapyd-deploy from scrapyd-client.

I am using Windows 11 with Python 3.11.4.

Any help will be appreciated!
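For anyone who wants to reproduce the upload without curl, here is a rough Python sketch of the same addversion.json request using only the standard library. The endpoint and the project, version and egg fields come from the curl command above; the function name build_addversion_request and the filename given to the egg part are assumptions made for illustration, and the request is only built, not sent.

```python
import urllib.request
import uuid


def build_addversion_request(base_url, project, version, egg_bytes):
    """Build (but do not send) a multipart POST for addversion.json.

    Hypothetical helper: it mirrors the three -F fields of the curl call.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields: project and version.
    for name, value in (("project", project), ("version", version)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # File field: the egg itself, sent as raw bytes.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="egg"; filename="{project}.egg"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + egg_bytes
        + b'\r\n'
    )
    parts.append(f'--{boundary}--\r\n'.encode())
    return urllib.request.Request(
        f'{base_url}/addversion.json',
        data=b''.join(parts),
        headers={'Content-Type': f'multipart/form-data; boundary={boundary}'},
        method='POST',
    )
```

Passing the returned object to urllib.request.urlopen() would send it to a running Scrapyd instance.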

@jpmckinney
Contributor

Hmm

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\myuser\AppData\Local\Temp\myproject-r23-8_8_ghhi.egg'

Can you check that you aren't running multiple Scrapyd processes?

@jpmckinney added the "type: question" and "topic: deployment" labels on Sep 20, 2023
@amartins2imp
Author

Hmm

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\myuser\AppData\Local\Temp\myproject-r23-8_8_ghhi.egg'

Can you check that you aren't running multiple Scrapyd processes?

Yes, I checked, and I wasn't. I even rebooted my PC and retried everything in a clean environment.

@jpmckinney
Contributor

Does Windows have a utility to determine which processes are using a given file?

On Linux, lsof can be used for this purpose.

I don't think Scrapyd is causing the issue, as only one process would be trying to access the egg.

@sanzenwin

I have the same issue on Windows 11. I inserted import time; time.sleep(5000) before os.remove(eggpath) and then used Microsoft PowerToys File Locksmith to check the egg file. This shows that Scrapyd does cause the issue.
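For context, the underlying behavior can be reproduced without Scrapyd. The sketch below tries to delete a file while the current process still holds an open handle to it. On Windows this typically fails with PermissionError (WinError 32), while on POSIX systems the removal usually succeeds. The function name remove_while_open is hypothetical.

```python
import os
import tempfile


def remove_while_open():
    """Try to delete a file that this process still has open.

    Returns True if os.remove() succeeded while the handle was open
    (typical on POSIX), False if it raised PermissionError (typical on
    Windows).
    """
    fd, path = tempfile.mkstemp(suffix=".egg")
    handle = os.fdopen(fd, "rb")
    try:
        try:
            os.remove(path)
            return True
        except PermissionError:
            return False
    finally:
        # Close the handle first, then clean up if the file is still there.
        handle.close()
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    print("removed while open:", remove_while_open())
```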

@jpmckinney
Contributor

I've committed a fix to HEAD. Can you test with the version of Scrapyd from GitHub?

@jpmckinney added the "type: bug" label and removed the "type: question" and "topic: deployment" labels on Sep 22, 2023
@sanzenwin

It seems to be unrelated to tempfile and instead related to try/finally:

import os
import shutil
import sys
import tempfile
# from contextlib import contextmanager

from scrapy.utils.misc import load_object

from scrapyd import Config
from scrapyd.eggutils import activate_egg


def project_environment(project):
    eggversion = os.environ.get('SCRAPYD_EGG_VERSION', None)
    config = Config()
    eggstorage_path = config.get(
        'eggstorage', 'scrapyd.eggstorage.FilesystemEggStorage'
    )
    eggstorage_cls = load_object(eggstorage_path)
    eggstorage = eggstorage_cls(config)

    version, eggfile = eggstorage.get(project, eggversion)
    if eggfile:
        prefix = '%s-%s-' % (project, version)
        f = tempfile.NamedTemporaryFile(suffix='.egg', prefix=prefix, delete=False)
        shutil.copyfileobj(eggfile, f)
        f.close()
        activate_egg(f.name)
    else:
        f = None
    return f
    # try:
    #     assert 'scrapy.conf' not in sys.modules, "Scrapy settings already loaded"
    #     yield
    # finally:
    #     if f:
    #         os.remove(f.name)


def main_finally():
    project = os.environ['SCRAPY_PROJECT']
    f = None
    try:
        f = project_environment(project)
        from scrapy.cmdline import execute
        execute()
    finally:
        if f:
            os.remove(f.name)


def main():
    project = os.environ['SCRAPY_PROJECT']
    f = None
    f = project_environment(project)
    from scrapy.cmdline import execute
    execute()
    if f:
        os.remove(f.name)


if __name__ == '__main__':
    main()  # works fine
    # main_finally()  # raises PermissionError

@jpmckinney
Contributor

jpmckinney commented Sep 23, 2023

They are connected - if the tempfile is never created, then it can never be removed.

Anyway, can you add eggfile.close() after f.close() to see what happens?
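A rough sketch of that suggestion, using context managers so that both the source stream and the temporary copy are closed before any later os.remove(). The helper name copy_egg_to_tempfile is hypothetical, and this is not the change that was committed to Scrapyd.

```python
import io
import os
import shutil
import tempfile


def copy_egg_to_tempfile(eggfile, prefix=""):
    """Copy an open egg stream to a named temp file and close both handles."""
    with eggfile, tempfile.NamedTemporaryFile(
        suffix=".egg", prefix=prefix, delete=False
    ) as f:
        shutil.copyfileobj(eggfile, f)
    # Both eggfile and f are closed here; only the path is returned.
    return f.name


if __name__ == "__main__":
    path = copy_egg_to_tempfile(io.BytesIO(b"fake egg"), prefix="myproject-r23-")
    try:
        print("copied to", path)
    finally:
        os.remove(path)
```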

@jpmckinney
Contributor

The error doesn’t occur when you remove the exception handling, because Scrapy raises SystemExit, which ends the process before os.remove() is reached. But we’re trying to handle that case.
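To illustrate, here is a minimal sketch, assuming only what the traceback shows: sys.exit() raises SystemExit, the exception travels through the context manager, and the finally block still runs and removes the file. The names temporary_egg and run are hypothetical stand-ins, not Scrapyd functions.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_egg():
    # Hypothetical stand-in for the runner's project_environment().
    fd, path = tempfile.mkstemp(suffix=".egg")
    os.close(fd)
    try:
        yield path
    finally:
        # Runs even when the body of the with block raises SystemExit.
        os.remove(path)


def run(seen_paths):
    try:
        with temporary_egg() as path:
            seen_paths.append(path)
            sys.exit(0)  # stands in for scrapy.cmdline.execute()
    except SystemExit as exc:
        return exc.code


if __name__ == "__main__":
    paths = []
    print("exit code:", run(paths))
    print("file still exists:", os.path.exists(paths[0]))
```

Without the try/finally, the SystemExit would simply end the process and the removal would never be attempted, which matches the observation that the version without exception handling does not raise.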

@sanzenwin

@jpmckinney

import os
import sys
import tempfile
import shutil
import operator
import functools
import pkg_resources
import itertools
from importlib.metadata._itertools import unique_everseen
from importlib.metadata import distributions


def activate_egg(eggpath):
    """Activate a Scrapy egg file. This is meant to be used from egg runners
    to activate a Scrapy egg file. Don't use it from other code as it may
    leave unwanted side effects.
    """
    try:
        d = next(pkg_resources.find_distributions(eggpath))
    except StopIteration:
        raise ValueError("Unknown or corrupt egg")
    d.activate()
    settings_module = d.get_entry_info('scrapy', 'settings').module_name
    os.environ.setdefault('SCRAPY_SETTINGS_MODULE', settings_module)


def main():
    eggfile = open('./0_1_0.egg', 'rb')

    f = None
    try:
        f = tempfile.NamedTemporaryFile(suffix='.egg', delete=False)
        shutil.copyfileobj(eggfile, f)
        activate_egg(f.name)
        f.close()

        # from scrapy.cmdline import execute
        # execute(['C:\\Users\\Sanze\\AppData\\Local\\pdm\\pdm\\Cache\\packages\\scrapyd-1.4.2-py2.py3-none-any\\lib\\scrapyd\\runner.py', 'list', '-s', 'LOG_STDOUT=0'])

        # traceback
        #
        # scrapy.cmdline.execute
        # |
        # scrapy.cmdline._get_commands_dict
        # |
        # scrapy.cmdline._get_commands_from_entry_points
        # |
        # importlib.metadata.entry_points

        norm_name = operator.attrgetter('_normalized_name')
        unique = functools.partial(unique_everseen, key=norm_name)

        list(dist.entry_points for dist in unique(distributions()))
        # The line above causes PermissionError: [WinError 32] The process cannot access the file because it is being used by another process

    finally:
        if f:
            os.remove(f.name)


if __name__ == '__main__':
    main()

@jpmckinney
Contributor

@sanzenwin

As I requested, please test by adding eggfile.close() after f.close(), to see what happens.

@jpmckinney
Contributor

Please test HEAD again – I added that line myself, and also got rid of the temporary file.

@sanzenwin

@sanzenwin

As I requested, please test by add eggfile.close() after f.close(), to see what happens.

I had already tried it, and it still threw that error. The key is list(dist.entry_points for dist in unique(distributions())), which is used by scrapy.cmdline.execute. Please check the code I posted.

@jpmckinney
Contributor

jpmckinney commented Sep 25, 2023

That line must be opening the file a second time. But the error must be that it’s opened a first time somewhere else. I doubt the Python standard library (importlib) has a Windows error that opens files twice on its own.

Can you try the new HEAD from GitHub?

@sanzenwin

I checked HEAD, and there are no changes. Did you commit any code?

@jpmckinney
Contributor

Ah, sorry, I forgot to push: you can try now.

@sanzenwin

Ah, sorry, I forgot to push: you can try now.

I have tried it, and it works.

@jpmckinney
Contributor

Thank you for confirming!
