Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index on short code does not work if domain is null. #2285

Closed
marijnvandevoorde opened this issue Nov 25, 2024 · 5 comments · Fixed by #2287
Closed

Index on short code does not work if domain is null. #2285

marijnvandevoorde opened this issue Nov 25, 2024 · 5 comments · Fixed by #2287
Labels
Milestone

Comments

@marijnvandevoorde
Copy link
Contributor

Shlink version

4.3.0

PHP version

8.4

How do you serve Shlink

Self-hosted RoadRunner

Database engine

MicrosoftSQL

Database version

12.0

Current behavior

Our Shlink instance holds over 1.6M records right now and redirects became really, really slow. We did some digging and noticed that we hadn't been setting a domain to the short urls. That means that the short_urls.domain_id was always null. The index that is used for the redirect spans both columns. This is correct, but because the domain_id column is first, the index is not used in our case.

Expected behavior

The index should also be used in case no domain is set. This could be achieved by switching the two columns around, this fixes our issue. It does not degrade performance for other usecases. You now first check on the column with high cardinality (very unlikely to have multiple identical values, almost impossible). In the edge case where two identical shortcodes are present, only a few rows will be left (same shortcode but for different domains).

(Which makes me realise, by the way: What if the same shortcode is presetn with null as domain and then a specific domain. I would expect it to take the explicit one over the non explicit one as behviour, and that's probably how it works. But we don't have that usecase, so not an issue for us)

Minimum steps to reproduce

  • Enter a ton of records in your DB with null as domain_id
  • do a search and check the execution plan. Notice how the index is not used.
@acelaya
Copy link
Member

acelaya commented Nov 25, 2024

Good spot! I'll try to get it sorted.

(Which makes me realise, by the way: What if the same shortcode is presetn with null as domain and then a specific domain. I would expect it to take the explicit one over the non explicit one as behviour, and that's probably how it works. But we don't have that usecase, so not an issue for us)

It will pick up the one corresponding to the visited domain. If the default domain is visited, it will pick up the entry where domain is null, if the other domain is visited, then it will pick the one explicitly set for that domain.

There's a more detailed explanation here to know how different combinations would work, in case you are curious https://shlink.io/documentation/advanced/multiple-domains/#visits

@acelaya acelaya moved this to Todo in Shlink Nov 25, 2024
@acelaya acelaya added this to the 4.3.1 milestone Nov 25, 2024
@marijnvandevoorde
Copy link
Contributor Author

Makes 100% sense, it's what I would expect, if we had that usecase. All good! And thanks for picking this up, much appreciated!

@acelaya
Copy link
Member

acelaya commented Nov 25, 2024

Apparently this only affects Microsoft SQL (I was getting confused as I was seeing the index used in the other engines).

The index defines the columns in the right order here:

$builder->addUniqueConstraint(['short_code', 'domain_id'], 'unique_short_code_plus_domain');

But then this migration deletes and recreates it in the wrong order, only for Microsoft SQL:

$this->connection->executeStatement(
'CREATE INDEX unique_short_code_plus_domain ON short_urls (domain_id, short_code);',
);

I guess it should be a matter of recreating it again with the columns in the right order, but I need to check how long would that take for instances with a lot of short URLs like in your case.

@acelaya
Copy link
Member

acelaya commented Nov 25, 2024

I need to check how long would that take for instances with a lot of short URLs like in your case.

It takes milliseconds for 1M short URLs, so I think it's fine.

@github-project-automation github-project-automation bot moved this from Todo to Done in Shlink Nov 25, 2024
@acelaya
Copy link
Member

acelaya commented Nov 25, 2024

I have just released v4.3.1 which includes a fix for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants