Counting DISTINCT over multiple columns
Is there a better way of doing a query like this:
SELECT COUNT(*)
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
FROM DocumentOutputItems) AS internalQuery
I need to count the number of distinct items from this table but the distinct is over two columns.
My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)
sql
add a comment |
Is there a better way of doing a query like this:
SELECT COUNT(*)
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
FROM DocumentOutputItems) AS internalQuery
I need to count the number of distinct items from this table but the distinct is over two columns.
My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)
sql
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49
add a comment |
Is there a better way of doing a query like this:
SELECT COUNT(*)
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
FROM DocumentOutputItems) AS internalQuery
I need to count the number of distinct items from this table but the distinct is over two columns.
My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)
sql
Is there a better way of doing a query like this:
SELECT COUNT(*)
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
FROM DocumentOutputItems) AS internalQuery
I need to count the number of distinct items from this table but the distinct is over two columns.
My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)
sql
sql
edited Mar 29 '16 at 2:54
Jeff
9,36352449
9,36352449
asked Sep 24 '09 at 12:03
NovitzkyNovitzky
1,83931826
1,83931826
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49
add a comment |
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49
add a comment |
17 Answers
17
active
oldest
votes
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
add a comment |
Edit: Altered from the less-than-reliable checksum-only query
I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable
SELECT COUNT(DISTINCT (CHECKSUM(DocumentId,DocumentSessionId)) + CHECKSUM(REVERSE(DocumentId),REVERSE(DocumentSessionId)) )
FROM DocumentOutPutItems
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
|
show 2 more comments
What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations why not try it?
It certainly works as you might expect in Oracle.
SQL> select distinct deptno, job from emp
2 order by deptno, job
3 /
DEPTNO JOB
---------- ---------
10 CLERK
10 MANAGER
10 PRESIDENT
20 ANALYST
20 CLERK
20 MANAGER
30 CLERK
30 MANAGER
30 SALESMAN
9 rows selected.
SQL> select count(*) from (
2 select distinct deptno, job from emp
3 )
4 /
COUNT(*)
----------
9
SQL>
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
SQL> select count(distinct concat(deptno,job)) from emp
2 /
COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
9
SQL>
edit 2
Given the following data the concatenating solution provided above will miscount:
col1 col2
---- ----
A AA
AA A
So we to include a separator...
select col1 + '*' + col2 from t23
/
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
add a comment |
How about something like:
select count(*)
from
(select count(*) cnt
from DocumentOutputItems
group by DocumentId, DocumentSessionId) t1
Probably just does the same as you are already though but it avoids the DISTINCT.
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
Depending on the complexity of the original query, solving this withGROUP BYmay introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already hadGROUP BYorHAVINGclauses...)
– Lukas Eder
Dec 17 '13 at 9:08
add a comment |
To run as a single query, concatenate the columns, then get the distinct count of instances of the concatenated string.
SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
In MySQL you can do the same thing without the concatenation step as follows:
SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;
This feature is mentioned in the MySQL documentation:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
– ijoseph
Jun 25 '18 at 20:12
add a comment |
Here's a shorter version without the subselect:
SELECT COUNT(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems
It works fine in MySQL, and I think that the optimizer has an easier time understanding this one.
Edit: Apparently I misread MSSQL and MySQL - sorry about that, but maybe it helps anyway.
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.count ( distinct CHECKSUM ([Field1], [Field2])
– Custodio
Nov 16 '12 at 14:53
add a comment |
There's nothing wrong with your query, but you could also do it this way:
WITH internalQuery (Amount)
AS
(
SELECT (0)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
FROM internalQuery
add a comment |
I found this when I Googled for my own issue, found that if you count DISTINCT objects, you get the correct number returned (I'm using MySQL)
SELECT COUNT(DISTINCT DocumentID) AS Count1,
COUNT(DISTINCT DocumentSessionId) AS Count2
FROM DocumentOutputItems
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations ofDocumentIdandDocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.
– Anthony Geoghegan
Jul 28 '14 at 9:21
add a comment |
if you had only one field to "DISTINCT", you could use:
SELECT COUNT(DISTINCT DocumentId)
FROM DocumentOutputItems
and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However you are using two fields so you could try something crazy like:
SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId))
FROM DocumentOutputItems
but you'll have issues if NULLs are involved. I'd just stick with the original query.
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
add a comment |
Hope this works i am writing on prima vista
SELECT COUNT(*)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
add a comment |
I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.
At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.
Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using
SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))
but this is also not conforting.
You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.
I would suggest using
SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems
add a comment |
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId))
FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.
.
COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
A related use of tuples is performing IN queries such as:
SELECT * FROM DocumentOutputItems
WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
what databases supportselect count(distinct(a, b))? :D
– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
add a comment |
It works for me. In oracle:
SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;
In jpql:
SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;
add a comment |
How about this,
Select DocumentId, DocumentSessionId, count(*) as c
from DocumentOutputItems
group by DocumentId, DocumentSessionId;
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
add a comment |
You can just use the Count Function Twice.
In this case, it would be:
SELECT COUNT (DISTINCT DocumentId), COUNT (DISTINCT DocumentSessionId)
FROM DocumentOutputItems
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
add a comment |
select DISTINCT DocumentId as i, DocumentSessionId as s , count(*)
from DocumentOutputItems
group by i ,s;
This code uses distinct on 2 parameters and provides count of number of rows specific to those distinct values row count. It worked for me in mysql like a charm.
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
add a comment |
This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):
select col1, col2, col3, count(*)
from table
group by col1, col2, col3
I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f1471250%2fcounting-distinct-over-multiple-columns%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
17 Answers
17
active
oldest
votes
17 Answers
17
active
oldest
votes
active
oldest
votes
active
oldest
votes
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
add a comment |
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
add a comment |
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.
Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.
I believe a distinct count of the computed column would be equivalent to your query.
edited Jul 27 '16 at 19:18
Scott Beeson
7,6882178149
7,6882178149
answered Sep 26 '09 at 3:42
JasonHornerJasonHorner
2,61111827
2,61111827
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
add a comment |
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
3
3
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Excellent suggestion! The more I read, the more I am realizing that SQL is less about knowing syntax and functions and more about applying pure logic.. I wish I had 2 upvotes!
– tumchaaditya
Oct 4 '13 at 22:48
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
Too good suggestion. It avoided me to write unnecessary code to this.
– Avrajit Roy
Mar 9 '16 at 6:39
add a comment |
Edit: Altered from the less-than-reliable checksum-only query
I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable
SELECT COUNT(DISTINCT (CHECKSUM(DocumentId,DocumentSessionId)) + CHECKSUM(REVERSE(DocumentId),REVERSE(DocumentSessionId)) )
FROM DocumentOutPutItems
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
|
show 2 more comments
Edit: Altered from the less-than-reliable checksum-only query
I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable
SELECT COUNT(DISTINCT (CHECKSUM(DocumentId,DocumentSessionId)) + CHECKSUM(REVERSE(DocumentId),REVERSE(DocumentSessionId)) )
FROM DocumentOutPutItems
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
|
show 2 more comments
Edit: Altered from the less-than-reliable checksum-only query
I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable
SELECT COUNT(DISTINCT (CHECKSUM(DocumentId,DocumentSessionId)) + CHECKSUM(REVERSE(DocumentId),REVERSE(DocumentSessionId)) )
FROM DocumentOutPutItems
Edit: Altered from the less-than-reliable checksum-only query
I've discovered a way to do this (in SQL Server 2005) that works pretty well for me and I can use as many columns as I need (by adding them to the CHECKSUM() function). The REVERSE() function turns the ints into varchars to make the distinct more reliable
SELECT COUNT(DISTINCT (CHECKSUM(DocumentId,DocumentSessionId)) + CHECKSUM(REVERSE(DocumentId),REVERSE(DocumentSessionId)) )
FROM DocumentOutPutItems
edited Sep 4 '14 at 14:01
answered Jul 6 '12 at 23:01
JayTeeJayTee
2,23521825
2,23521825
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
|
show 2 more comments
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
+1 Nice one, works perfect (when you have the right column types to perform a CheckSum on... ;)
– Bernoulli IT
Nov 7 '12 at 12:54
7
7
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
With hashes like Checksum(), there is small chance that the same hash will be returned for different inputs so the count may be very slightly off. HashBytes() is an even smaller chance but still not zero. If those two Ids were int's (32b) then a "lossless hash" could combine them into an bigint (64b) like Id1 << 32 + Id2.
– crokusek
Jan 31 '14 at 19:35
1
1
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
the chance is not so small even, especially when you start combining columns (which is what it was supposed to be meant for). I was curious about this approach and in a particular case the checksum ended up with a count 10% smaller. If you think of it a bit longer, Checksum just returns an int, so if you'd checksum a full bigint range you'll end up with a distinct count about 2 billion times smaller than there actually is. -1
– pvolders
Jul 23 '14 at 7:53
3
3
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
Could we avoid CHECKSUM -- could we just concatenate the two values together? I suppose that risks considering as the same thing: ('he', 'art') == 'hear', 't'). But I think that can be solved with a delimiter as @APC proposes (some value that doesn't appear in either column), so 'he|art' != 'hear|t' Are there other problems with a simple "concatenation" approach?
– The Red Pea
Jan 9 '16 at 15:40
1
1
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
this answer is better than good: "good enough"
– scottlittle
Oct 9 '17 at 17:44
|
show 2 more comments
What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations why not try it?
It certainly works as you might expect in Oracle.
SQL> select distinct deptno, job from emp
2 order by deptno, job
3 /
DEPTNO JOB
---------- ---------
10 CLERK
10 MANAGER
10 PRESIDENT
20 ANALYST
20 CLERK
20 MANAGER
30 CLERK
30 MANAGER
30 SALESMAN
9 rows selected.
SQL> select count(*) from (
2 select distinct deptno, job from emp
3 )
4 /
COUNT(*)
----------
9
SQL>
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
SQL> select count(distinct concat(deptno,job)) from emp
2 /
COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
9
SQL>
edit 2
Given the following data the concatenating solution provided above will miscount:
col1 col2
---- ----
A AA
AA A
So we to include a separator...
select col1 + '*' + col2 from t23
/
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
add a comment |
What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations why not try it?
It certainly works as you might expect in Oracle.
SQL> select distinct deptno, job from emp
2 order by deptno, job
3 /
DEPTNO JOB
---------- ---------
10 CLERK
10 MANAGER
10 PRESIDENT
20 ANALYST
20 CLERK
20 MANAGER
30 CLERK
30 MANAGER
30 SALESMAN
9 rows selected.
SQL> select count(*) from (
2 select distinct deptno, job from emp
3 )
4 /
COUNT(*)
----------
9
SQL>
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
SQL> select count(distinct concat(deptno,job)) from emp
2 /
COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
9
SQL>
edit 2
Given the following data the concatenating solution provided above will miscount:
col1 col2
---- ----
A AA
AA A
So we to include a separator...
select col1 + '*' + col2 from t23
/
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
add a comment |
What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations why not try it?
It certainly works as you might expect in Oracle.
SQL> select distinct deptno, job from emp
2 order by deptno, job
3 /
DEPTNO JOB
---------- ---------
10 CLERK
10 MANAGER
10 PRESIDENT
20 ANALYST
20 CLERK
20 MANAGER
30 CLERK
30 MANAGER
30 SALESMAN
9 rows selected.
SQL> select count(*) from (
2 select distinct deptno, job from emp
3 )
4 /
COUNT(*)
----------
9
SQL>
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
SQL> select count(distinct concat(deptno,job)) from emp
2 /
COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
9
SQL>
edit 2
Given the following data the concatenating solution provided above will miscount:
col1 col2
---- ----
A AA
AA A
So we to include a separator...
select col1 + '*' + col2 from t23
/
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations why not try it?
It certainly works as you might expect in Oracle.
SQL> select distinct deptno, job from emp
2 order by deptno, job
3 /
DEPTNO JOB
---------- ---------
10 CLERK
10 MANAGER
10 PRESIDENT
20 ANALYST
20 CLERK
20 MANAGER
30 CLERK
30 MANAGER
30 SALESMAN
9 rows selected.
SQL> select count(*) from (
2 select distinct deptno, job from emp
3 )
4 /
COUNT(*)
----------
9
SQL>
edit
I went down a blind alley with analytics but the answer was depressingly obvious...
SQL> select count(distinct concat(deptno,job)) from emp
2 /
COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
9
SQL>
edit 2
Given the following data the concatenating solution provided above will miscount:
col1 col2
---- ----
A AA
AA A
So we to include a separator...
select col1 + '*' + col2 from t23
/
Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
edited Apr 19 '12 at 6:15
answered Sep 24 '09 at 12:41
APCAPC
118k15117229
118k15117229
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
add a comment |
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
+1 from me. Thanks for your answer. My query works fine but I was wondering if I can get the final result using just one query (without using a subquery)
– Novitzky
Sep 24 '09 at 13:57
add a comment |
How about something like:
select count(*)
from
(select count(*) cnt
from DocumentOutputItems
group by DocumentId, DocumentSessionId) t1
Probably just does the same as you are already though but it avoids the DISTINCT.
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
Depending on the complexity of the original query, solving this withGROUP BYmay introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already hadGROUP BYorHAVINGclauses...)
– Lukas Eder
Dec 17 '13 at 9:08
add a comment |
How about something like:
select count(*)
from
(select count(*) cnt
from DocumentOutputItems
group by DocumentId, DocumentSessionId) t1
Probably just does the same as you are already though but it avoids the DISTINCT.
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
Depending on the complexity of the original query, solving this withGROUP BYmay introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already hadGROUP BYorHAVINGclauses...)
– Lukas Eder
Dec 17 '13 at 9:08
add a comment |
How about something like:
select count(*)
from
(select count(*) cnt
from DocumentOutputItems
group by DocumentId, DocumentSessionId) t1
Probably just does the same as you are already though but it avoids the DISTINCT.
How about something like:
select count(*)
from
(select count(*) cnt
from DocumentOutputItems
group by DocumentId, DocumentSessionId) t1
Probably just does the same as you are already though but it avoids the DISTINCT.
edited Sep 24 '09 at 12:57
answered Sep 24 '09 at 12:46
Trevor TippinsTrevor Tippins
2,667107
2,667107
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
Depending on the complexity of the original query, solving this withGROUP BYmay introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already hadGROUP BYorHAVINGclauses...)
– Lukas Eder
Dec 17 '13 at 9:08
add a comment |
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
Depending on the complexity of the original query, solving this withGROUP BYmay introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already hadGROUP BYorHAVINGclauses...)
– Lukas Eder
Dec 17 '13 at 9:08
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
Yes, you right. It does the same job as my original one.
– Novitzky
Sep 24 '09 at 13:35
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
in my tests (using SET SHOWPLAN_ALL ON), it had the same execution plan and exact same TotalSubtreeCost
– KM.
Sep 24 '09 at 13:43
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
+1 for a nice try and the explanation.
– Novitzky
Sep 24 '09 at 13:50
1
1
Depending on the complexity of the original query, solving this with
GROUP BY may introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already had GROUP BY or HAVING clauses...)– Lukas Eder
Dec 17 '13 at 9:08
Depending on the complexity of the original query, solving this with
GROUP BY may introduce a couple of additional challenges to the query transformation to achieve the desired output (e.g. when the original query already had GROUP BY or HAVING clauses...)– Lukas Eder
Dec 17 '13 at 9:08
add a comment |
To run as a single query, concatenate the columns, then get the distinct count of instances of the concatenated string.
SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
In MySQL you can do the same thing without the concatenation step as follows:
SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;
This feature is mentioned in the MySQL documentation:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
– ijoseph
Jun 25 '18 at 20:12
add a comment |
To run as a single query, concatenate the columns, then get the distinct count of instances of the concatenated string.
SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
In MySQL you can do the same thing without the concatenation step as follows:
SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;
This feature is mentioned in the MySQL documentation:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
– ijoseph
Jun 25 '18 at 20:12
add a comment |
To run as a single query, concatenate the columns, then get the distinct count of instances of the concatenated string.
SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
In MySQL you can do the same thing without the concatenation step as follows:
SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;
This feature is mentioned in the MySQL documentation:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct
To run as a single query, concatenate the columns, then get the distinct count of instances of the concatenated string.
SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
In MySQL you can do the same thing without the concatenation step as follows:
SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;
This feature is mentioned in the MySQL documentation:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct
answered Jul 28 '16 at 20:21
spelunk1spelunk1
12914
12914
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
– ijoseph
Jun 25 '18 at 20:12
add a comment |
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;
– ijoseph
Jun 25 '18 at 20:12
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
This was a SQL Server question, and both options you posted have already been mentioned in the following answers to this question: stackoverflow.com/a/1471444/4955425 and stackoverflow.com/a/1471713/4955425.
– sstan
Jul 28 '16 at 20:31
FWIW, this almost works in PostgreSQL; just need extra parentheses:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;– ijoseph
Jun 25 '18 at 20:12
FWIW, this almost works in PostgreSQL; just need extra parentheses:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId)) FROM DocumentOutputItems;– ijoseph
Jun 25 '18 at 20:12
add a comment |
Here's a shorter version without the subselect:
SELECT COUNT(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems
It works fine in MySQL, and I think that the optimizer has an easier time understanding this one.
Edit: Apparently I misread MSSQL and MySQL - sorry about that, but maybe it helps anyway.
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.count ( distinct CHECKSUM ([Field1], [Field2])
– Custodio
Nov 16 '12 at 14:53
add a comment |
Here's a shorter version without the subselect:
SELECT COUNT(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems
It works fine in MySQL, and I think that the optimizer has an easier time understanding this one.
Edit: Apparently I misread MSSQL and MySQL - sorry about that, but maybe it helps anyway.
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.count ( distinct CHECKSUM ([Field1], [Field2])
– Custodio
Nov 16 '12 at 14:53
add a comment |
Here's a shorter version without the subselect:
SELECT COUNT(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems
It works fine in MySQL, and I think that the optimizer has an easier time understanding this one.
Edit: Apparently I misread MSSQL and MySQL - sorry about that, but maybe it helps anyway.
Here's a shorter version without the subselect:
SELECT COUNT(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems
It works fine in MySQL, and I think that the optimizer has an easier time understanding this one.
Edit: Apparently I misread MSSQL and MySQL - sorry about that, but maybe it helps anyway.
edited Dec 1 '16 at 10:40
starbeamrainbowlabs
2,99052947
2,99052947
answered Sep 24 '09 at 13:33
Alexander KjällAlexander Kjäll
3,02332145
3,02332145
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.count ( distinct CHECKSUM ([Field1], [Field2])
– Custodio
Nov 16 '12 at 14:53
add a comment |
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.count ( distinct CHECKSUM ([Field1], [Field2])
– Custodio
Nov 16 '12 at 14:53
6
6
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
in SQL Server you get: Msg 102, Level 15, State 1, Line 1 Incorrect syntax near ','.
– KM.
Sep 24 '09 at 13:35
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
This is what I was thinking of. I want do similar thing in MSSQL if possible.
– Novitzky
Sep 24 '09 at 13:38
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
@Kamil Nowicki, in SQL Server, you can only have one field in a COUNT(), in my answer I show that you can concatenate the two fields into one and try this approach. However, I'd just stick with the original since the query plans would end up the same.
– KM.
Sep 24 '09 at 13:45
Please give a look in @JayTee answer. It works like a charm.
count ( distinct CHECKSUM ([Field1], [Field2])– Custodio
Nov 16 '12 at 14:53
Please give a look in @JayTee answer. It works like a charm.
count ( distinct CHECKSUM ([Field1], [Field2])– Custodio
Nov 16 '12 at 14:53
add a comment |
There's nothing wrong with your query, but you could also do it this way:
WITH internalQuery (Amount)
AS
(
SELECT (0)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
FROM internalQuery
add a comment |
There's nothing wrong with your query, but you could also do it this way:
WITH internalQuery (Amount)
AS
(
SELECT (0)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
FROM internalQuery
add a comment |
There's nothing wrong with your query, but you could also do it this way:
WITH internalQuery (Amount)
AS
(
SELECT (0)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
FROM internalQuery
There's nothing wrong with your query, but you could also do it this way:
WITH internalQuery (Amount)
AS
(
SELECT (0)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
)
SELECT COUNT(*) AS NumberOfDistinctRows
FROM internalQuery
answered Sep 24 '09 at 13:37
BliekBliek
40637
40637
add a comment |
add a comment |
I found this when I Googled for my own issue, found that if you count DISTINCT objects, you get the correct number returned (I'm using MySQL)
SELECT COUNT(DISTINCT DocumentID) AS Count1,
COUNT(DISTINCT DocumentSessionId) AS Count2
FROM DocumentOutputItems
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations ofDocumentIdandDocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.
– Anthony Geoghegan
Jul 28 '14 at 9:21
add a comment |
I found this when I Googled for my own issue, found that if you count DISTINCT objects, you get the correct number returned (I'm using MySQL)
SELECT COUNT(DISTINCT DocumentID) AS Count1,
COUNT(DISTINCT DocumentSessionId) AS Count2
FROM DocumentOutputItems
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations ofDocumentIdandDocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.
– Anthony Geoghegan
Jul 28 '14 at 9:21
add a comment |
I found this when I Googled for my own issue, found that if you count DISTINCT objects, you get the correct number returned (I'm using MySQL)
SELECT COUNT(DISTINCT DocumentID) AS Count1,
COUNT(DISTINCT DocumentSessionId) AS Count2
FROM DocumentOutputItems
I found this when I Googled for my own issue, found that if you count DISTINCT objects, you get the correct number returned (I'm using MySQL)
SELECT COUNT(DISTINCT DocumentID) AS Count1,
COUNT(DISTINCT DocumentSessionId) AS Count2
FROM DocumentOutputItems
answered Apr 12 '13 at 16:31
tehaugmentertehaugmenter
6311
6311
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations ofDocumentIdandDocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.
– Anthony Geoghegan
Jul 28 '14 at 9:21
add a comment |
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations ofDocumentIdandDocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.
– Anthony Geoghegan
Jul 28 '14 at 9:21
5
5
The above query will return a different set of results than what the OP was looking for (the distinct combinations of
DocumentId and DocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.– Anthony Geoghegan
Jul 28 '14 at 9:21
The above query will return a different set of results than what the OP was looking for (the distinct combinations of
DocumentId and DocumentSessionId). Alexander Kjäll already posted the correct answer if the OP was using MySQL and not MS SQL Server.– Anthony Geoghegan
Jul 28 '14 at 9:21
add a comment |
if you had only one field to "DISTINCT", you could use:
SELECT COUNT(DISTINCT DocumentId)
FROM DocumentOutputItems
and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However you are using two fields so you could try something crazy like:
SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId))
FROM DocumentOutputItems
but you'll have issues if NULLs are involved. I'd just stick with the original query.
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
add a comment |
if you had only one field to "DISTINCT", you could use:
SELECT COUNT(DISTINCT DocumentId)
FROM DocumentOutputItems
and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However you are using two fields so you could try something crazy like:
SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId))
FROM DocumentOutputItems
but you'll have issues if NULLs are involved. I'd just stick with the original query.
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
add a comment |
if you had only one field to "DISTINCT", you could use:
SELECT COUNT(DISTINCT DocumentId)
FROM DocumentOutputItems
and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However you are using two fields so you could try something crazy like:
SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId))
FROM DocumentOutputItems
but you'll have issues if NULLs are involved. I'd just stick with the original query.
if you had only one field to "DISTINCT", you could use:
SELECT COUNT(DISTINCT DocumentId)
FROM DocumentOutputItems
and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However you are using two fields so you could try something crazy like:
SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId))
FROM DocumentOutputItems
but you'll have issues if NULLs are involved. I'd just stick with the original query.
answered Sep 24 '09 at 13:34
KM.KM.
82.4k27147189
82.4k27147189
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
add a comment |
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
+1 from me. Thanks but I will stick with my query as you suggested. Using "convert" can decrease performance even more.
– Novitzky
Sep 24 '09 at 13:58
add a comment |
Hope this works i am writing on prima vista
SELECT COUNT(*)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
add a comment |
Hope this works i am writing on prima vista
SELECT COUNT(*)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
add a comment |
Hope this works i am writing on prima vista
SELECT COUNT(*)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
Hope this works i am writing on prima vista
SELECT COUNT(*)
FROM DocumentOutputItems
GROUP BY DocumentId, DocumentSessionId
edited Mar 7 '13 at 12:49
Alexander
20.4k74667
20.4k74667
answered Sep 24 '09 at 12:10
IordanTanevIordanTanev
4,55533146
4,55533146
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
add a comment |
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
7
7
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
In order for this to give the final answer, you would have to wrap it in another SELECT COUNT(*) FROM ( ... ). Essentially this answer is just giving you another way to list the distinct values you want to count. It's no better than your original solution.
– Dave Costa
Sep 24 '09 at 13:19
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
Thanks Dave. I know you can use group by instead of distinct in my case. I was wondering if you get the final result using just one query. I think is impossible but I might be wrong.
– Novitzky
Sep 24 '09 at 13:32
add a comment |
I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.
At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.
Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using
SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))
but this is also not conforting.
You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.
I would suggest using
SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems
add a comment |
I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.
At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.
Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using
SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))
but this is also not conforting.
You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.
I would suggest using
SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems
add a comment |
I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.
At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.
Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using
SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))
but this is also not conforting.
You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.
I would suggest using
SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems
I wish MS SQL could also do something like COUNT(DISTINCT A, B). But it can't.
At first JayTee's answer seemed like a solution to me bu after some tests CHECKSUM() failed to create unique values. A quick example is, both CHECKSUM(31,467,519) and CHECKSUM(69,1120,823) gives the same answer which is 55.
Then I made some research and found that Microsoft does NOT recommend using CHECKSUM for change detection purposes. In some forums some suggested using
SELECT COUNT(DISTINCT CHECKSUM(value1, value2, ..., valueN) + CHECKSUM(valueN, value(N-1), ..., value1))
but this is also not conforting.
You can use HASHBYTES() function as suggested in TSQL CHECKSUM conundrum. However this also has a small chance of not returning unique results.
I would suggest using
SELECT COUNT(DISTINCT CAST(DocumentId AS VARCHAR)+'-'+CAST(DocumentSessionId AS VARCHAR)) FROM DocumentOutputItems
edited May 23 '17 at 11:47
Community♦
11
11
answered Jan 24 '13 at 10:34
Oncel Umut TUREROncel Umut TURER
430812
430812
add a comment |
add a comment |
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId))
FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.
.
COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
A related use of tuples is performing IN queries such as:
SELECT * FROM DocumentOutputItems
WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
what databases supportselect count(distinct(a, b))? :D
– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
add a comment |
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId))
FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.
.
COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
A related use of tuples is performing IN queries such as:
SELECT * FROM DocumentOutputItems
WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
what databases supportselect count(distinct(a, b))? :D
– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
add a comment |
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId))
FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.
.
COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
A related use of tuples is performing IN queries such as:
SELECT * FROM DocumentOutputItems
WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
Many (most?) SQL databases can work with tuples like values so you can just do:
SELECT COUNT(DISTINCT (DocumentId, DocumentSessionId))
FROM DocumentOutputItems;
If your database doesn't support this, it can be simulated as per @oncel-umut-turer's suggestion of CHECKSUM or other scalar function providing good uniqueness e.g.
.
COUNT(DISTINCT CONCAT(DocumentId, ':', DocumentSessionId))
A related use of tuples is performing IN queries such as:
SELECT * FROM DocumentOutputItems
WHERE (DocumentId, DocumentSessionId) in (('a', '1'), ('b', '2'));
answered Jun 2 '18 at 16:42
karmakazekarmakaze
21.7k11922
21.7k11922
what databases supportselect count(distinct(a, b))? :D
– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
add a comment |
what databases supportselect count(distinct(a, b))? :D
– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
what databases support
select count(distinct(a, b))? :D– Vytenis Bivainis
Oct 11 '18 at 21:39
what databases support
select count(distinct(a, b))? :D– Vytenis Bivainis
Oct 11 '18 at 21:39
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
@VytenisBivainis I know PostgreSQL does--not sure since which version.
– karmakaze
Oct 17 '18 at 0:35
add a comment |
It works for me. In oracle:
SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;
In jpql:
SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;
add a comment |
It works for me. In oracle:
SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;
In jpql:
SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;
add a comment |
It works for me. In oracle:
SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;
In jpql:
SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;
It works for me. In oracle:
SELECT SUM(DECODE(COUNT(*),1,1,1))
FROM DocumentOutputItems GROUP BY DocumentId, DocumentSessionId;
In jpql:
SELECT SUM(CASE WHEN COUNT(i)=1 THEN 1 ELSE 1 END)
FROM DocumentOutputItems i GROUP BY i.DocumentId, i.DocumentSessionId;
answered Mar 29 '18 at 7:59
NataNata
11
11
add a comment |
add a comment |
How about this,
Select DocumentId, DocumentSessionId, count(*) as c
from DocumentOutputItems
group by DocumentId, DocumentSessionId;
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
add a comment |
How about this,
Select DocumentId, DocumentSessionId, count(*) as c
from DocumentOutputItems
group by DocumentId, DocumentSessionId;
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
add a comment |
How about this,
Select DocumentId, DocumentSessionId, count(*) as c
from DocumentOutputItems
group by DocumentId, DocumentSessionId;
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
How about this,
Select DocumentId, DocumentSessionId, count(*) as c
from DocumentOutputItems
group by DocumentId, DocumentSessionId;
This will get us the count of all possible combinations of DocumentId, and DocumentSessionId
edited May 1 '18 at 10:57
ADyson
23.6k112445
23.6k112445
answered May 1 '18 at 10:45
Nikhil SinghNikhil Singh
11
11
add a comment |
add a comment |
You can just use the Count Function Twice.
In this case, it would be:
SELECT COUNT (DISTINCT DocumentId), COUNT (DISTINCT DocumentSessionId)
FROM DocumentOutputItems
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
add a comment |
You can just use the Count Function Twice.
In this case, it would be:
SELECT COUNT (DISTINCT DocumentId), COUNT (DISTINCT DocumentSessionId)
FROM DocumentOutputItems
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
add a comment |
You can just use the Count Function Twice.
In this case, it would be:
SELECT COUNT (DISTINCT DocumentId), COUNT (DISTINCT DocumentSessionId)
FROM DocumentOutputItems
You can just use the Count Function Twice.
In this case, it would be:
SELECT COUNT (DISTINCT DocumentId), COUNT (DISTINCT DocumentSessionId)
FROM DocumentOutputItems
edited Nov 16 '18 at 8:17
Nick
27k111939
27k111939
answered Aug 14 '18 at 16:11
BibekBibek
91
91
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
add a comment |
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
this doesn't do as require in the question, it counts the distinct in separate for each column
– naviram
Jan 10 at 22:17
add a comment |
select DISTINCT DocumentId as i, DocumentSessionId as s , count(*)
from DocumentOutputItems
group by i ,s;
This code uses distinct on 2 parameters and provides count of number of rows specific to those distinct values row count. It worked for me in mysql like a charm.
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
add a comment |
select DISTINCT DocumentId as i, DocumentSessionId as s , count(*)
from DocumentOutputItems
group by i ,s;
This code uses distinct on 2 parameters and provides count of number of rows specific to those distinct values row count. It worked for me in mysql like a charm.
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
add a comment |
select DISTINCT DocumentId as i, DocumentSessionId as s , count(*)
from DocumentOutputItems
group by i ,s;
This code uses distinct on 2 parameters and provides count of number of rows specific to those distinct values row count. It worked for me in mysql like a charm.
select DISTINCT DocumentId as i, DocumentSessionId as s , count(*)
from DocumentOutputItems
group by i ,s;
This code uses distinct on 2 parameters and provides count of number of rows specific to those distinct values row count. It worked for me in mysql like a charm.
edited Nov 22 '18 at 6:13
answered Nov 16 '18 at 7:17
rishi jainrishi jain
2519
2519
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
add a comment |
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
This is the part of review answer in stackoverflow. Add some explanation though code is explanatory.
– Harsha B
Nov 16 '18 at 13:10
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Can somebody please explain why my answer got a -1. Sorry, I am a newbie and I tried different answers and none of them was working. I made the changes and posted the working code?
– rishi jain
Nov 20 '18 at 6:45
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
Answer was not formatted properly. May be that's why someone downvoted. Don't worry, I have formatted the same.
– Harsha B
Nov 20 '18 at 7:04
add a comment |
This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):
select col1, col2, col3, count(*)
from table
group by col1, col2, col3
I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
add a comment |
This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):
select col1, col2, col3, count(*)
from table
group by col1, col2, col3
I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
add a comment |
This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):
select col1, col2, col3, count(*)
from table
group by col1, col2, col3
I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.
This was posed and answered on Quora (https://www.quora.com/In-SQL-how-to-I-count-DISTINCT-over-multiple-columns):
select col1, col2, col3, count(*)
from table
group by col1, col2, col3
I was working on this in SAS, and SAS Proc SQL does not like DISTINCT with more than one column.
answered Sep 20 '16 at 20:12
Barry DeCiccoBarry DeCicco
12
12
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
add a comment |
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
The original query in the question returns the number of combinations in given columns. This answer instead returns the number of occurrences for each combination in given columns.
– jumxozizi
Jan 27 '17 at 14:27
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f1471250%2fcounting-distinct-over-multiple-columns%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
IordanTanev, Mark Brackett, RC - thanks for replies, it was a nice try, but you need to check what you doing before posting to SO. The queries you provided are not equivalent to my query. You can easily see I always have a scalar a result but your query returns multiple rows.
– Novitzky
Sep 24 '09 at 12:30
Just updated the question to include your clarifying comment from one of the answers
– Jeff
Mar 29 '16 at 2:54
FYI: community.oracle.com/ideas/18664
– quetzalcoatl
Sep 6 '18 at 14:49