[XPU] fix the dataloader problem in RDMA env #54150

Merged (7 commits) on Jun 28, 2023

Commits on May 27, 2023

  1. [kunlun] fix the dataloader problem in RDMA env

    When running multi-machine training with the Paddle DataLoader, an
    unexpected segmentation fault is raised in the DataLoader worker
    process. The traceback leads back to a runtime error reporting that
    the DataLoader workers exited unexpectedly. Similar problems have
    been discussed elsewhere and attributed to OpenCV misbehaving in a
    multiprocessing environment.
    See
    https://stackoverflow.com/questions/54013846/pytorch-dataloader-stucked-if-using-opencv-resize-method
    XiaociZhang committed May 27, 2023
    5496cda
  2. code style

    XiaociZhang committed May 27, 2023
    5ee1b12
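
The first commit message above ties the worker crash to known problems with OpenCV in a multiprocessing environment. The PR's diff is not reproduced here, so what follows is only a general sketch, using Python's standard `multiprocessing` module, of one commonly suggested mitigation: creating worker processes with a start method other than `fork`, so that workers do not inherit the parent's thread and device state. The helper name `choose_start_method` is hypothetical and is not part of Paddle.

```python
import multiprocessing as mp


def choose_start_method(preferred=("forkserver", "spawn")):
    """Return the first non-fork start method available on this platform.

    Hypothetical helper for illustration only; not a Paddle API.
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError("no non-fork start method available")


def make_worker_context():
    """Build a multiprocessing context that does not fork the parent."""
    # get_context() returns an object with the same API as the
    # multiprocessing module, bound to the chosen start method.
    return mp.get_context(choose_start_method())
```

A context obtained this way can be used wherever the `multiprocessing` module itself would be, for example `ctx.Process(...)` or `ctx.Queue()`. Scripts that start processes with `spawn` or `forkserver` should do so under an `if __name__ == "__main__":` guard.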

Commits on May 28, 2023

  1. 9154033

Commits on Jun 26, 2023

  1. cea52df
  2. Update dataloader_iter.py

    The spawn start method raises the error 'Can't pickle local object' in some situations
    XiaociZhang authored Jun 26, 2023
    1646bb8
  3. code format check

    XiaociZhang authored Jun 26, 2023
    6d00310
  4. code style

    XiaociZhang committed Jun 26, 2023
    16b7522
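
The 'Can't pickle local object' error mentioned in the "Update dataloader_iter.py" commit is a general limitation of Python's `pickle`, which start methods such as `spawn` rely on to send objects to worker processes. The sketch below reproduces the limitation with the standard library alone. The function names are hypothetical stand-ins for user callbacks and are not taken from the PR.

```python
import pickle


def module_level_collate(batch):
    # Defined at module scope, so pickle can normally refer to it by its
    # qualified name when the module is importable.
    return list(batch)


def make_local_collate():
    # The inner function is a "local object": it cannot be looked up by
    # name from the module, so pickle refuses to serialize it.
    def local_collate(batch):
        return list(batch)

    return local_collate


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError, TypeError):
        # The exception type for local objects varies across Python versions.
        return False
```

Calling `is_picklable(make_local_collate())` returns `False`. Moving such callbacks to module scope is the usual way to make them usable with `spawn`.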