1 validation error for Blob
Traceback (most recent call last):
  File "/home/lumu/Documents/Projects/trv/Document-Processing-Backend/modules/process_document.py", line 96, in process_document
    doc_list, doc_texts = load_files_ensemble(document.doc_name)
  File "/home/lumu/Documents/Projects/trv/Document-Processing-Backend/modules/ocr/textract/ensemble.py", line 83, in load_files_ensemble
    return load_and_split(file_name)
  File "/home/lumu/Documents/Projects/trv/Document-Processing-Backend/modules/ocr/textract/ensemble.py", line 32, in load_and_split
    pdf_pages = loader.load()
  File "/home/lumu/.local/lib/python3.10/site-packages/langchain_community/document_loaders/pdf.py", line 747, in load
    return list(self.lazy_load())
  File "/home/lumu/.local/lib/python3.10/site-packages/langchain_community/document_loaders/pdf.py", line 758, in lazy_load
    blob = Blob(path=self.web_path)  # type: ignore[call-arg] # type: ignore[misc]
  File "/home/lumu/.local/lib/python3.10/site-packages/langchain_core/load/serializable.py", line 112, in __init__
    super().__init__(*args, **kwargs)
  File "/home/lumu/.local/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Blob
data
  Field required [type=missing, input_value={'path': 's3://realviewde...77_Walter_Hardwick.pdf'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
Description
I'm trying to load a document with the Textract PDF loader, but a bug prevents it from loading. I've already confirmed that the document exists in S3, so the error comes from a pydantic validation failure (the `data` field of `Blob` is reported as missing), not from a missing document.
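For context, here is a minimal sketch of what the `Field required [type=missing]` message appears to describe. It assumes pydantic 2 and assumes that `Blob` declares `data` with an optional type but no default value; I have not confirmed that against the langchain_core source. `BlobSketch`, `BlobSketchWithDefault`, `missing_fields`, and the S3 path are hypothetical names made up for illustration, not the real classes or my real file.

```python
from typing import Optional, Union

from pydantic import BaseModel, ValidationError


class BlobSketch(BaseModel):
    """Hypothetical stand-in for Blob; not the real langchain_core class."""

    # Optional *type* but no default: pydantic 2 treats this field as required.
    data: Union[bytes, str, None]
    path: Optional[str] = None


class BlobSketchWithDefault(BaseModel):
    """Same fields, but `data` has an explicit default of None."""

    data: Union[bytes, str, None] = None
    path: Optional[str] = None


def missing_fields(model_cls, **kwargs):
    """Return the names of the fields pydantic reports as missing."""
    try:
        model_cls(**kwargs)
    except ValidationError as exc:
        return [err["loc"][0] for err in exc.errors() if err["type"] == "missing"]
    return []


# Mirrors the failing call shape from the traceback: only `path` is passed.
print(missing_fields(BlobSketch, path="s3://example-bucket/example.pdf"))
print(missing_fields(BlobSketchWithDefault, path="s3://example-bucket/example.pdf"))
```

Under pydantic 2, the first call should report `data` as missing and the second should report nothing, which matches the shape of the error above. If that assumption about `Blob` holds, the failure happens before any request is made to S3.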
System Info
System Information
OS: Linux
OS Version: #41~22.04.2-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 3 11:32:55 UTC 2
Python Version: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
Checked other resources
Example Code
The following code causes the exception: