Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#count doesn't honor distinct attributes in select clause #5554

Closed
mtalcott opened this issue Mar 22, 2012 · 15 comments · Fixed by #10710
Closed

#count doesn't honor distinct attributes in select clause #5554

mtalcott opened this issue Mar 22, 2012 · 15 comments · Fixed by #10710

Comments

@mtalcott
Copy link

Calling count on an ActiveRecord::Relation with more than 1 distinct attribute in the select clause doesn't honor the select conditions. For example, this works:

User.select('distinct username').all    #=> SELECT distinct username FROM "users"
User.select('distinct username').count  #=> SELECT COUNT(distinct username) FROM "users"

However, the distinct clause disappears from the count SQL when another distinct attribute is specified:

User.select('distinct username, email').all     #=> SELECT distinct username, email FROM "users"
User.select('distinct username, email').count   #=> SELECT count(*) FROM "users"

Same thing with distinct users.*:

User.select('distinct users.*').all     #=> SELECT distinct users.* FROM "users"
User.select('distinct users.*').count   #=> SELECT count(*) FROM "users"

That means for a relation with multiple distinct columns, relation.count is not necessarily the same as relation.all.count.

For distinct users.*, count(:distinct => true) works (it generates SELECT COUNT(DISTINCT "users"."id") FROM "users"), but that won't work for distinct username, email.

I've verified this behavior on Postgres and SQLite.

@Empact
Copy link
Contributor

Empact commented Mar 22, 2012

FYI, as a workaround you can use the :select option to replace the "*" with the columns, &c. of your choosing.

User.count(select: 'distinct users.*') #=> SELECT count(distinct users.*) FROM "users"

@mtalcott
Copy link
Author

I'm seeing the same behavior with that workaround as well:

User.count(select: 'distinct users.*') #=> SELECT COUNT(*) FROM "users"

@Empact
Copy link
Contributor

Empact commented Mar 23, 2012

Ah you're right. Unfortunately I don't have a record of what I originally ran that set me astray.

@kranzky
Copy link

kranzky commented May 16, 2012

It seems Kaminari is working around this bug, with massive performance consequences for us. I've written a post describing the issue here: http://devblog.agworld.com.au/post/22194500063/distinctly-uncountable

I've replicated the issue here: https://github.com/agworld/distinctly_uncountable

It seems that #sum is also broken by DISTINCT, as are probably the #min, #max, #avg and other calculations. This could have dire consequences for anyone generating mission-critical reports and wondering why the numbers don't come out right. Heck, it'd be nicer if AR just raised a "too hard" exception, rather than give incorrect results.

BTW, this may be related to #1003.

@mtalcott
Copy link
Author

Nice writeup of the issue and performance in different cases. It seems that for now, if you need an accurate count/sum/etc. and there is a distinct clause on the query, you have to instantiate the AR object array. Not very performant, but that's better than getting inaccurate results.

1003 is definitely related.

@frodsan
Copy link
Contributor

frodsan commented Jul 4, 2012

Hi, i think this is expected as you see here.

If you remove that validation, you will get:

>> User.select('distinct username, email').count
   (0.2ms)  SELECT COUNT(distinct username, email) FROM "users" 
SQLite3::SQLException: wrong number of arguments to function COUNT(): SELECT COUNT(distinct username, email) FROM "users" 

>> User.select('distinct users.*').count
   (0.2ms)  SELECT COUNT(distinct users.*) FROM "users" 
SQLite3::SQLException: near "*": syntax error: SELECT COUNT(distinct users.*) FROM "users" 

/cc @carlosantoniodasilva @rafaelfranca I'm not sure about this, but i'd prefer to raise an error to avoid this kind of issues.

@frodsan
Copy link
Contributor

frodsan commented Jul 4, 2012

Something like:

def select_for_count
  if select_values.present?
    select = select_values.join(", ")
    raise IDontFigureOutAnErrorForThisYet if select = ~/[,*]/
    select
  end
end

What do you think?

@kranzky
Copy link

kranzky commented Jul 5, 2012

Yes, raising an exception is much more preferable to returning an incorrect result.

@rafaelfranca
Copy link
Member

Related with #6865

@senny
Copy link
Member

senny commented Mar 7, 2013

Is this still happening? I might have fixed this by working on a different issue. Let me know if it's still a problem on master and I'll take a look.

@kranzky
Copy link

kranzky commented Mar 12, 2013

I've updated https://github.com/agworld/distinctly_uncountable to work with rails master, and the tests still fail, so would be great if you can fix this @senny.

You should also note that this bug affects everything in ActiveRecord::Calculations (so sum, average, minimum, maximum and so on, as well as count).

@senny
Copy link
Member

senny commented Mar 12, 2013

I'll take a look

@senny
Copy link
Member

senny commented Mar 12, 2013

not every database supports counting on multiple distinct columns. The fallback is to use subqueries.

The code that silently falls back to count(*) feels hacky at best. I think we should let the user know what's happening and if a count can't be performed.

@dangrahn
Copy link

dangrahn commented Apr 4, 2013

@mtalcott, would you mind to show an example of an AR object array that would work around this issue?

Following is generating correct count for me:
User.find(:all, :select => "DISTINCT username, email").count

However, since I need this in a model scope it wont work since this returns an array instead of an AR relation.

@senny
Copy link
Member

senny commented May 21, 2013

I submitted a PR to solve this issue: #10710

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants