
sea-orm-cli generate entity PoolTimedOut error with large schema #742

Closed
wbew opened this issue May 17, 2022 · 12 comments

wbew commented May 17, 2022

First off, thanks for all the hard work!

Description

When trying to use sea-orm-cli generate entity on a large Postgres 9.6.22 database (>400 tables of varying shapes, sizes, constraints, etc.), I encounter the following error:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: PoolTimedOut', /Users/will.wang/dev-projects/sea-schema/src/postgres/discovery/executor/real.rs:30:55

Similar to: #555

Steps to Reproduce

Unfortunately, I can't provide exact details around the schema. I think any sufficiently large database should work. I believe this occurs because the connection times out in between executing the many introspection queries.

Expected Behavior

Entities are automatically generated

Actual Behavior

Connection fails, panic occurs

Reproduces How Often

Always

Versions

sea-orm-cli: 0.8.1 (built locally on master at SeaQL/sea-schema@00429f3)
postgres: 9.6.22
os: macOS Monterey 12.3.1

Additional Information

I think the default connect_timeout of 30 seconds for the underlying sqlx::PoolOptions is too short. When I changed this locally to 120 seconds, I was able to avoid this issue. It would be great if we could configure a connect_timeout as an argument in sea-orm-cli generate entity similar to max_connections.
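For context, a minimal sketch of the kind of local change described above, using sqlx's pool builder (this is illustrative, not the actual sea-orm-cli code; the setting is called `connect_timeout` in sqlx 0.5 and was renamed `acquire_timeout` in 0.6, and the connection string is a placeholder):

```rust
// Sketch only, assuming sqlx 0.5-era APIs as referenced in this issue.
use std::time::Duration;
use sqlx::postgres::PgPoolOptions;

// ...inside an async fn, with `database_url` supplied by the caller:
let pool = PgPoolOptions::new()
    .max_connections(1)
    .connect_timeout(Duration::from_secs(120)) // default is 30s
    .connect(&database_url)
    .await?;
```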

@billy1624 (Member)

Hey @001wwang, thanks for the investigations! Increasing the connect_timeout seems like a good temporary fix. However, I want to know why it takes so long to fetch the info of all db tables? Does anyone have a large sample schema for testing?

@billy1624 moved this to Triage in SeaQL Dev Tracker May 18, 2022
@billy1624 moved this from Triage to Next Up in SeaQL Dev Tracker May 18, 2022

wbew commented May 18, 2022

No problem, thanks for the prompt reply! I'll see if I can produce a large sample schema for recreating this issue later.

@billy1624 moved this from Next Up to Open for Contributions in SeaQL Dev Tracker Jun 28, 2022
@billy1624 moved this from Open for Contributions to Triage in SeaQL Dev Tracker Jul 6, 2022

aprchen commented Aug 3, 2022

> I think the default connect_timeout of 30 seconds for the underlying sqlx::PoolOptions is too short. When I changed this locally to 120 seconds, I was able to avoid this issue. It would be great if we could configure a connect_timeout as an argument in sea-orm-cli generate entity similar to max_connections.

How did you solve it? I also have this problem now


wbew commented Aug 3, 2022

> How did you solve it? I also have this problem now

Sorry, this issue slipped my mind. Perhaps you could provide more details about your schema to help @billy1624 reproduce this locally.

For my fix, I modified the sea-orm-cli codebase locally to use a timeout of 120 seconds, then built and ran it locally. Disclaimer: it's been a few months since then, so this isn't guaranteed to work, and it may not be the problem you're facing.


cemremengu commented Oct 27, 2022

I encountered this error for tables with partitions (postgres). If a table has a lot of partitions, the generator will see those as separate tables and try to generate for all of them. Deleting those partitions solved this problem for me.

This is actually expected and JOOQ also has the same behavior. You can solve this in JOOQ by providing an exclude regex like
<excludes>.*_p[0-9_]+</excludes> so that it skips those.

In an ideal scenario you should not have partitions in your target/seed schema, but a regex exclude option could be added to the CLI to help in these kinds of situations. In other cases, a connect_timeout option would be useful as well.
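As a rough illustration of the exclude idea, here is a minimal Rust sketch (a hypothetical helper, not part of sea-orm-cli) that drops partition-like table names such as `events_p2022_10` using plain std string checks in place of the `.*_p[0-9_]+` regex:

```rust
// Hypothetical filter sketch: treat names ending in "_p<digits/underscores>"
// as partitions, mirroring the jOOQ-style exclude pattern mentioned above.
fn is_partition(name: &str) -> bool {
    match name.rfind("_p") {
        Some(idx) => {
            let tail = &name[idx + 2..];
            !tail.is_empty() && tail.chars().all(|c| c.is_ascii_digit() || c == '_')
        }
        None => false,
    }
}

fn main() {
    let tables = ["users", "events_p2022_10", "events", "metrics_p3"];
    // Keep only non-partition tables before handing them to generation.
    let kept: Vec<&str> = tables.iter().copied().filter(|t| !is_partition(t)).collect();
    println!("{kept:?}"); // ["users", "events"]
}
```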



billy1624 commented Dec 30, 2022

Hey everyone, I created 3,000 tables on my local Postgres but still couldn't reproduce the error. Could someone show me how to reproduce it?

CREATE TABLE IF NOT EXISTS t_793593555976192 (col1 integer, col2 integer, col3 integer);
... 3,000 create table statements


wbew commented Jan 5, 2023

I'll go ahead and close this issue given the above comments. I haven't looked into this any time recently, and it seems like it might be working at this point. Thanks all!

@wbew closed this as completed Jan 5, 2023
@option-greek

Just for anyone else facing the issue: this seems to happen intermittently on slow or lossy networks. Make sure you have a low-latency connection to the database.

@billy1624 moved this from Triage to Done in SeaQL Dev Tracker Jan 13, 2023

jared-mackey commented Apr 6, 2023

I am having the same issue with partitioned tables as well. +1 for regex exclusion.

I tried writing a script to pull them one table at a time, and it still hits the same issue. This is against a local Docker container.

data=$(psql "${DATABASE_URL}" -AXqtc "SELECT table_name FROM information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE';")
data=$(echo "$data" | rg -v -e "p_hash_p" -e "schema_migrations")
for TABLE in $data
do
  echo "$TABLE"
  sea-orm-cli generate entity -o entity/src --lib --tables "$TABLE"
done

Edit:

I realize the issue here is not which tables I'm requesting; the schema discovery queries themselves are taking a long time. Would it be possible to take the approved table list (given to the CLI above) and use it to filter the discovery step?

@jared-mackey (Contributor)

Ah, I think I tracked it down.

In sea-schema, we call future::join_all() on a potentially very large list of futures. They all rush the executor to get a connection for their query, but the default CLI config has only a single connection, so they all sit waiting for that one connection to become available, and the pool times out for some of the later ones. The fix was to set --max-connections 50 to allow higher throughput and avoid the timeout. A lower number probably would have worked too, but that's what did it for me.

@tyt2y3
Copy link
Member

tyt2y3 commented Apr 7, 2023

@jared-mackey thank you for pointing it out. It seems like we should not join_all more than the pool size in SeaSchema.

I imagine we can chunk the vector by the pool size.

// sketch: issue at most `pool_size` queries at a time
for chunk in vec.chunks(pool_size) {
    join_all(chunk.iter().map(op)).await;
}
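A runnable sketch of that chunking idea, with illustrative names rather than the actual sea-schema internals (the "queries" are simulated synchronously; in the real async code the inner loop would be the join_all over one chunk):

```rust
// Simulated per-table discovery query (stand-in for the real async op).
fn discover(table: &str) -> String {
    format!("discovered {table}")
}

fn main() {
    let tables: Vec<String> = (0..10).map(|i| format!("t_{i}")).collect();
    let pool_size = 4;
    let mut results = Vec::new();
    // Process in chunks no larger than the pool size, so at most
    // `pool_size` queries would ever be in flight at once.
    for chunk in tables.chunks(pool_size) {
        // async equivalent: join_all(chunk.iter().map(op)).await
        for t in chunk {
            results.push(discover(t));
        }
    }
    println!("processed {} tables in chunks of {}", results.len(), pool_size);
}
```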

@jared-mackey (Contributor)

@tyt2y3 Seems like a good approach.

A couple of thoughts on the issue after looking into it a bit.

There is the join_all problem that should be solved. However, it seems to be exacerbated by the following:

  • It seems that tables set to ignore via the CLI argument are still pulled via the discovery process, they're just not generated in the output. There is probably a good reason for this, but thought I'd mention it.
  • And to tie it back to Entity generation for partitioned tables #1582 a bit: in my case all the partitions share the same schema and I only care about the parent table. But I believe there are cases where a partition's schema can differ from the parent's, so maybe we can't optimize those out. It would take some research to see whether there is a flag telling us that a partition matches its parent; if there is, I don't believe we should pull those partitions at all.
