Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple: pydantic 2 compatibility, v0.3 #26443

Merged
merged 213 commits into from
Sep 13, 2024
Merged

multiple: pydantic 2 compatibility, v0.3 #26443

merged 213 commits into from
Sep 13, 2024

Conversation

efriis
Copy link
Member

@efriis efriis commented Sep 13, 2024

This PR upgrades the entire langchain mono-repository to use pydantic 2 internally.

We made heavy usage of gritql to generate automatic migrations using the following patterns: https://github.com/eyurtsev/migrate-pydantic/tree/main

In addition to the automation, a lot of additional changes had to be made manually to handle various breaking changes introduced by Pydantic between the 1 and 2 versions.

efriis and others added 30 commits September 3, 2024 11:06
This PR upgrades core to pydantic 2.

It involves a combination of manual changes together with automated code
mods using gritql.

Changes and known issues:

1. Current models override __repr__ to be consistent with pydantic 1
(this will be removed in a follow up PR)
Related:
https://github.com/langchain-ai/langchain/pull/25986/files#diff-e5bd296179b7a72fcd4ea5cfa28b145beaf787da057e6d122aa76ee0bb8132c9R74
2. Issue with decorator for BaseChatModel
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-932bf3b314b268754ef640a5b8f52da96f9024fb81dd388dcd166b5713ecdf66R202)
-- cc @baskaryan
3. `name` attribute in Base Runnable does not have a default -- was
raising a pydantic warning due to override. We need to see if there's a
way to fix to avoid making a breaking change for folks with custom
runnables.
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-836773d27f8565f4dd45e9d6cf828920f89991a880c098b7511e0d3bb78a8a0dR238)
4. Likely can remove hard-coded RunnableBranch name
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-72894b94f70b1bfc908eb4d53f5ff90bb33bf8a4240a5e34cae48ddc62ac313aR147)
5. `model_*` namespace is reserved in pydantic. We'll need to specify
`protected_namespaces`
6. create_model does not have a cached path yet
7. get_input_schema() in many places has been updated to be explicit
about whether parameters are required or optional
8. injected tool args aren't picked up properly (losing type annotation)

For posterity the following gritql migrations were used:

```
engine marzano(0.1)
language python

or {
    `from $IMPORT import $...` where {
        $IMPORT <: contains `pydantic_v1`,
        $IMPORT => `pydantic`
    },
    `$X.update_forward_refs` => `$X.model_rebuild`,
  // This pattern still needs fixing as it fails (populate_by_name vs.
  // allow_populate_by_name)
  class_definition($name, $body) as $C where {
      $name <: `Config`,
      $body <: block($statements),
      $t = "",
      $statements <: some bubble($t) assignment(left=$x, right=$y) as $A where {    
        or {
            $x <: `allow_population_by_field_name` where {
                $t += `populate_by_name=$y,`
            },
            $t += `$x=$y,`
        }
      },
      $C => `model_config = ConfigDict($t)`,
      add_import(source="pydantic", name="ConfigDict")
  }
}

```



```
engine marzano(0.1)
language python

`@root_validator(pre=True)` as $decorator where {
    $decorator <: before function_definition($body, $return_type),
    $decorator => `@model_validator(mode="before")\n@classmethod`,
    add_import(source="pydantic", name="model_validator"),
    $return_type => `Any`
}
```

```
engine marzano(0.1)
language python

`@root_validator(pre=False, skip_on_failure=True)` as $decorator where {
    $decorator <: before function_definition($body, $parameters, $return_type) where {
        $body <: contains bubble or {
            `values["$Q"]` => `self.$Q`,
            `values.get("$Q")` => `(self.$Q or None)`,
            `values.get($Q, $...)` as $V where {
                $Q <: contains `"$QName"`,
                $V => `self.$QName`,
            },
            `return $Q` => `return self`
        }
    },
    $decorator => `@model_validator(mode="after")`,
    // Silly work around a bug in grit
    // Adding Self to pydantic and then will replace it with one from typing
    add_import(source="pydantic", name="model_validator"),
    $parameters => `self`,
    $return_type => `Self`
}

```

```
grit apply --language python '`Self` where { add_import(source="typing_extensions", name="Self")}'
```
Drop python 3.8 support as EOL is 2024 October
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- Fix injected args in tool signature
- Fix another unit test that was using the wrong namespace import in
pydantic
ccurme and others added 4 commits September 13, 2024 15:26
- `BaseMedia.id`
- `BaseMessage.id`
- `ToolMessage.tool_call_id`

Note: with this change we are actually less restrictive on types for
initializing `id` than in pydantic V1-- e.g., pydantic V1 will error if
you pass a dict or tuple to `id`). Let me know if we should restrict
type coercion. For tool_call_id I just enumerated supported types.
Copy link

vercel bot commented Sep 13, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 13, 2024 9:36pm

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Sep 13, 2024
@efriis efriis added the partner label Sep 13, 2024
@efriis efriis self-assigned this Sep 13, 2024
@dosubot dosubot bot added langchain Related to the langchain package 🤖:release Bumping package version for release labels Sep 13, 2024
@efriis efriis enabled auto-merge (squash) September 13, 2024 20:10
baskaryan and others added 6 commits September 13, 2024 13:17
**Description**
Replaces the libCST integration with GritQL. This has a few advantages:
* There's less code to maintain, as Grit handles the complexities of
actually replacing imports and remapping correctly (notice the big red
reduction).
* Grit includes advanced features like `--interactive` to allow
interactive review of changes.

**Dependencies**
- Added gritQL

**Twitter handle**
https://x.com/morgantepell

**Testing**
Existing tests continue to run correctly for the cli_runner.

---------

Co-authored-by: Morgante Pell <morgantep@google.com>
@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Sep 13, 2024
@efriis efriis merged commit c2a3021 into master Sep 13, 2024
280 of 311 checks passed
@efriis efriis deleted the v0.3rc branch September 13, 2024 21:40
Sheepsta300 pushed a commit to Sheepsta300/langchain that referenced this pull request Oct 1, 2024
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
langchain Related to the langchain package lgtm PR looks good. Use to confirm that a PR is ready for merging. partner 🤖:release Bumping package version for release size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants