feat: schema validation in langsmith sdk #922
    monkeypatch: pytest.MonkeyPatch, langchain_client: Client
) -> None:
    """Test persisting runs and adding feedback."""
    monkeypatch.setenv("LANGCHAIN_ENDPOINT", "https://dev.api.smith.langchain.com")
This was a quirk where we were overriding a single test to use dev. If we want to test against dev, we should configure the whole suite to run against dev as well.
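One way to read this suggestion: resolve the endpoint once from the environment for the whole suite, so no individual test has to monkeypatch it. A minimal sketch, assuming a hypothetical helper `resolve_endpoint` and an assumed default URL; neither is part of this PR.

```python
import os
from typing import Mapping, Optional

# Assumed default; the diff only shows the dev URL explicitly.
DEFAULT_ENDPOINT = "https://api.smith.langchain.com"


def resolve_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the endpoint the whole test suite should target.

    Hypothetical helper: reads LANGCHAIN_ENDPOINT and falls back to the
    assumed default when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    return env.get("LANGCHAIN_ENDPOINT") or DEFAULT_ENDPOINT
```

With something like this, a CI job could export `LANGCHAIN_ENDPOINT` to point at dev and run the same suite unchanged.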
python/langsmith/client.py
Outdated
@@ -2529,6 +2529,8 @@ def create_dataset(
    *,
    description: Optional[str] = None,
    data_type: ls_schemas.DataType = ls_schemas.DataType.kv,
    inputs_schema_definition: Optional[Dict[str, Any]] = None,
is "definition" necessary? My fingers hurt just looking at this
I matched the API exactly. Do we mismatch ever otherwise?
We do mismatch; our APIs are pretty confusing sometimes.
python/langsmith/schemas.py
Outdated
    class Config:
        """Configuration class for the schema."""

        allow_population_by_field_name = True
@hinthornw I had to do some pydantic magic to make all this work. Do we do this in the SDK? I see this pattern in runtree, but I know pydantic stuff is frowned upon in other parts of the code base.
Maybe we just don't use pydantic in the create_dataset method anymore?
we'll need to remove it from all dataset related areas then, because we'll need to do conversion on any read/create/etc
python/langsmith/client.py
Outdated
        description=description,
        data_type=data_type,
    )
    dataset = {
byebye pydantic
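For context, dropping pydantic on the create path roughly means building the request body as a plain dict keyed by the API's own field names. A minimal sketch under that assumption; `build_dataset_payload` and its `name` parameter are hypothetical, and omitting the schema key when it is `None` is a guess about the wire format, not something the diff confirms.

```python
from typing import Any, Dict, Optional


def build_dataset_payload(
    name: str,
    *,
    description: Optional[str] = None,
    data_type: str = "kv",
    inputs_schema_definition: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a create-dataset request body as a plain dict (hypothetical helper)."""
    payload: Dict[str, Any] = {
        "name": name,
        "description": description,
        "data_type": data_type,
    }
    # Assumption: only send the schema when the caller supplied one.
    if inputs_schema_definition is not None:
        payload["inputs_schema_definition"] = inputs_schema_definition
    return payload
```

Because the dict already uses the API's key, no alias configuration is needed on the way out; the renaming only has to happen when a response is read back.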
@@ -163,6 +158,12 @@ def __init__(
        **kwargs: Any,
    ) -> None:
        """Initialize a Dataset object."""
        if "inputs_schema_definition" in kwargs:
this is disgusting
Would this actually be applied? Can we just not support?
wdym? yeah, there's a new integration test showing this works
So this is for the dataset copying case ya?
I think it's fine. I also wouldn't resist if we just marked it as "inputs_schema_definition" itself here
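To make the pattern under discussion concrete: the constructor accepts the API's key (`inputs_schema_definition`) in `kwargs` and exposes it under the shorter attribute name (`inputs_schema`). A rough sketch only; `DatasetSketch` is a hypothetical stand-in rather than the SDK's `Dataset` class, and letting an explicit `inputs_schema` argument win over the API key is an assumption.

```python
from typing import Any, Dict, Optional


class DatasetSketch:
    """Hypothetical stand-in for the Dataset model; not the SDK class."""

    def __init__(
        self,
        name: str,
        inputs_schema: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        # Accept the API's field name and map it onto the SDK attribute.
        if "inputs_schema_definition" in kwargs:
            api_value = kwargs.pop("inputs_schema_definition")
            # Assumption: an explicitly passed inputs_schema takes precedence.
            if inputs_schema is None:
                inputs_schema = api_value
        self.name = name
        self.inputs_schema = inputs_schema
        self.extra = kwargs  # any remaining, unrecognized fields
```

This covers the case where a raw API response is splatted into the constructor, for example `DatasetSketch(**response_json)`.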

    # assert read API includes the schema definition
    read_dataset = langchain_client.read_dataset(dataset_id=dataset.id)
    assert read_dataset.inputs_schema == InputSchema.model_json_schema()
@hinthornw here's the integration test for reading the input schema back out
No description provided.