Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OneDriveLoader #3860

Merged
Merged

Conversation

netoferraz
Copy link
Contributor

Issue: #2153

  • OneDriveLoader to extract documents from OneDrive.
  • Files Supported: [doc, docx, pdf]

@netoferraz
Copy link
Contributor Author

Hi @hwchase17 ! First of all, thanks for this great project. This is my first contribution, and I'm trying to dive deep into the details. This loader relies on O-365 package. Should I keep this as an optional dependency or include it in the pyproject.toml ?

@netoferraz netoferraz marked this pull request as ready for review May 2, 2023 02:55
@hwchase17
Copy link
Contributor

Hi @hwchase17 ! First of all, thanks for this great project. This is my first contribution, and I'm trying to dive deep into the details. This loader relies on O-365 package. Should I keep this as an optional dependency or include it in the pyproject.toml ?

this is great! lets include as an optional dependency in the pyproject.toml file. that will mean both putting something like python-o365 = {version = "...", optional =True} as well as adding it to the list of packages in the extra_installs all section

i am happy to do these changes (along with others i point out in the pr) if you want - just let me know! it can be a bit tedious/confusing, but happy to let you slog through it if you want (you said you wanted to dive deep on the details). but, im very happy to do it for this pr so you can see the usual pr - let me know!

from typing import Dict, List, Optional, Type, Union

try:
from O365 import Account, FileSystemTokenBackend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because this is optional, we don't want to do this at top of file (we only want to do it when needed).

if this is needed for linting, you can do

# top of file
from __future__ import annotations

from typing import TYPE_CHECKING
...

if TYPE_CHECKING:
     from O365 import...

if this is not for type hinting only, but rather you want to use it, you should do the imports in the functions themselves where they are used

Copy link
Contributor Author

@netoferraz netoferraz May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these imports are used within the functions and in the function signature. I've tried to move them into a _init_ method, but the interpreter still complains it cannot import it. I cannot make this work by moving those imports from the top. Would you happen to have any suggestions here?

return mime_types_mapping


class OneDriveLoader(BaseLoader, BaseModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets expose this (and the file loader) in document_loaders/init.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@netoferraz
Copy link
Contributor Author

Hi @hwchase17 ! First of all, thanks for this great project. This is my first contribution, and I'm trying to dive deep into the details. This loader relies on O-365 package. Should I keep this as an optional dependency or include it in the pyproject.toml ?

this is great! lets include as an optional dependency in the pyproject.toml file. that will mean both putting something like python-o365 = {version = "...", optional =True} as well as adding it to the list of packages in the extra_installs all section

i am happy to do these changes (along with others i point out in the pr) if you want - just let me know! it can be a bit tedious/confusing, but happy to let you slog through it if you want (you said you wanted to dive deep on the details). but, im very happy to do it for this pr so you can see the usual pr - let me know!

Done!

Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - i can fix issues

@hwchase17 hwchase17 changed the base branch from master to harrison/one-drive-loader May 4, 2023 03:46
@hwchase17 hwchase17 merged commit b719352 into langchain-ai:harrison/one-drive-loader May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants