feat: Error handling and multi-file support in File Component #4353

Merged
erichare merged 21 commits into main from feat-file-component-improvements
Nov 5, 2024

Conversation

erichare
Collaborator

@erichare erichare commented Nov 1, 2024

This Pull Request adds better error handling and a more robust interface for managing file uploads, including support for zip files.

WIP Changes:

  • ZIP Archive Support - Added the ability to process multiple text files within a zip archive. If a zip archive is provided, the files inside are extracted automatically, and supported files are processed sequentially or, optionally, in parallel.
  • PDF File Support - Uses the PyMuPDF library to extract text from PDFs.
  • Parallel Processing Support - Files inside a zip archive can optionally be processed in parallel.
  • Detailed Logging for Processing Stages - Added logging throughout the file loading and processing stages.
  • Graceful Error Handling with Silent Mode - Reuses Langflow's silent_errors option for non-blocking error handling, so a failure on one file in a multi-file workflow can be skipped instead of stopping the remaining files.
  • Modularized Processing Logic - Split single and zip file processing into distinct methods, for better readability and maintainability.
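
For readers who want a feel for the zip handling described above, here is a minimal sketch using only the Python standard library. It is not the code in file.py: the names `SUPPORTED_EXTENSIONS`, `parse_member`, and `process_zip`, the extension list, and the `parallel` and `max_workers` parameters are all hypothetical. Only the `silent_errors` name comes from this PR, and its behavior here (log and skip the failing file) is an assumption about what "non-blocking" means.

```python
import io
import logging
import zipfile
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger(__name__)

# Hypothetical list of supported text extensions; the real component may differ.
SUPPORTED_EXTENSIONS = (".txt", ".md", ".csv", ".json")


def parse_member(archive_bytes, name, silent_errors=False):
    """Read one archive member as UTF-8 text.

    Returns (name, text), or None when the file fails and silent_errors is set.
    """
    try:
        # Each call opens its own ZipFile so worker threads never share a handle.
        with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
            return name, zf.read(name).decode("utf-8")
    except (KeyError, UnicodeDecodeError, zipfile.BadZipFile) as exc:
        if silent_errors:
            # Assumed silent-mode behavior: log the failure and skip the file.
            logger.warning("Skipping %s: %s", name, exc)
            return None
        raise ValueError(f"Could not process {name!r}: {exc}") from exc


def process_zip(archive_bytes, *, parallel=False, silent_errors=False, max_workers=4):
    """Return {member name: text} for every supported file in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        names = [
            n for n in zf.namelist()
            if not n.endswith("/") and n.lower().endswith(SUPPORTED_EXTENSIONS)
        ]

    def worker(name):
        return parse_member(archive_bytes, name, silent_errors)

    if parallel:
        # Threads are enough for this sketch because the work is mostly I/O.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(worker, names))
    else:
        results = [worker(n) for n in names]

    # Drop the files that were skipped in silent mode.
    return dict(r for r in results if r is not None)
```

Calling `process_zip(data, parallel=True, silent_errors=True)` on the raw bytes of an archive returns the text of each readable supported file and logs a warning for any file that cannot be decoded. With `silent_errors=False`, the first failure raises a `ValueError` instead.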

@erichare erichare changed the title from "FEAT: Error handling and multi-file support in File Component" to "feat: Error handling and multi-file support in File Component" Nov 1, 2024
@github-actions github-actions bot added the enhancement New feature or request label Nov 1, 2024
@erichare erichare marked this pull request as ready for review November 4, 2024 15:14
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 4, 2024
@erichare erichare marked this pull request as draft November 4, 2024 15:24
@erichare erichare self-assigned this Nov 4, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Nov 4, 2024
@erichare erichare marked this pull request as ready for review November 4, 2024 21:03
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Nov 5, 2024

codspeed-hq bot commented Nov 5, 2024

CodSpeed Performance Report

Merging #4353 will not alter performance

Comparing feat-file-component-improvements (0d5b770) with main (a018436)

Summary

✅ 16 untouched benchmarks

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 5, 2024
@erichare erichare enabled auto-merge (squash) November 5, 2024 22:34
@erichare erichare merged commit 07d8f2e into main Nov 5, 2024
27 of 28 checks passed
@erichare erichare deleted the feat-file-component-improvements branch November 5, 2024 23:10
joaoguilhermeS pushed a commit that referenced this pull request Nov 7, 2024
* FEAT: Error handling and multi-file support in File Component

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes

* Update file.py

* [autofix.ci] apply automated fixes

* Add support for multithreading in zips

* [autofix.ci] apply automated fixes

* Add PDF processing support

* [autofix.ci] apply automated fixes

* Update file.py

* [autofix.ci] apply automated fixes

* Update file.py

* [autofix.ci] apply automated fixes

* Update file.py

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
diogocabral pushed a commit to headlinevc/langflow that referenced this pull request Nov 26, 2024
…ow-ai#4353)
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.