Postgres clustering using multi-column indexes











up vote
2
down vote

favorite












I have a table which includes a multi-column index defined as



CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)


The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this



WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;


The table has been clustered on this index in a effort to improve performance.



CLUSTER tab_a USING tab_a_idx1;


Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.



SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
JOIN pg_class x ON i.indexrelid = x.oid
JOIN pg_indexes z ON x.relname = z.indexname
WHERE c.relkind = 'r' AND c.relhasindex AND i.indisclustered


And I've been using the pg_stats table to check the correlation of the indexed columns.



The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.



Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).




  • If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

  • There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?


Thanks





UPDATE - the parent table was created as follows....



CREATE TABLE tab_a 
( device CHAR(6),
fixTime TIMESTAMP,
....lots more fields.....
)
PARTITION BY RANGE (fixTime);


The individual partitions were created like this



CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )


And the index used for the clustering like this....



CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);


And the command to do the cluster....



CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;









share|improve this question
























  • Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
    – Laurenz Albe
    Nov 18 at 13:32















up vote
2
down vote

favorite












I have a table which includes a multi-column index defined as



CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)


The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this



WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;


The table has been clustered on this index in a effort to improve performance.



CLUSTER tab_a USING tab_a_idx1;


Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.



SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
JOIN pg_class x ON i.indexrelid = x.oid
JOIN pg_indexes z ON x.relname = z.indexname
WHERE c.relkind = 'r' AND c.relhasindex AND i.indisclustered


And I've been using the pg_stats table to check the correlation of the indexed columns.



The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.



Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).




  • If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

  • There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?


Thanks





UPDATE - the parent table was created as follows....



CREATE TABLE tab_a 
( device CHAR(6),
fixTime TIMESTAMP,
....lots more fields.....
)
PARTITION BY RANGE (fixTime);


The individual partitions were created like this



CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )


And the index used for the clustering like this....



CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);


And the command to do the cluster....



CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;









share|improve this question
























  • Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
    – Laurenz Albe
    Nov 18 at 13:32













up vote
2
down vote

favorite









up vote
2
down vote

favorite











I have a table which includes a multi-column index defined as



CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)


The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this



WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;


The table has been clustered on this index in a effort to improve performance.



CLUSTER tab_a USING tab_a_idx1;


Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.



SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
JOIN pg_class x ON i.indexrelid = x.oid
JOIN pg_indexes z ON x.relname = z.indexname
WHERE c.relkind = 'r' AND c.relhasindex AND i.indisclustered


And I've been using the pg_stats table to check the correlation of the indexed columns.



The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.



Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).




  • If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

  • There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?


Thanks





UPDATE - the parent table was created as follows....



CREATE TABLE tab_a 
( device CHAR(6),
fixTime TIMESTAMP,
....lots more fields.....
)
PARTITION BY RANGE (fixTime);


The individual partitions were created like this



CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )


And the index used for the clustering like this....



CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);


And the command to do the cluster....



CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;









share|improve this question















I have a table which includes a multi-column index defined as



CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)


The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this



WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;


The table has been clustered on this index in a effort to improve performance.



CLUSTER tab_a USING tab_a_idx1;


Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.



SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.oid
JOIN pg_class x ON i.indexrelid = x.oid
JOIN pg_indexes z ON x.relname = z.indexname
WHERE c.relkind = 'r' AND c.relhasindex AND i.indisclustered


And I've been using the pg_stats table to check the correlation of the indexed columns.



The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.



Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).




  • If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

  • There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?


Thanks





UPDATE - the parent table was created as follows....



CREATE TABLE tab_a 
( device CHAR(6),
fixTime TIMESTAMP,
....lots more fields.....
)
PARTITION BY RANGE (fixTime);


The individual partitions were created like this



CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )


And the index used for the clustering like this....



CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);


And the command to do the cluster....



CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;






postgresql query-performance






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 19 at 18:19

























asked Nov 17 at 13:59









Hemel

1948




1948












  • Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
    – Laurenz Albe
    Nov 18 at 13:32


















  • Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
    – Laurenz Albe
    Nov 18 at 13:32
















Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
– Laurenz Albe
Nov 18 at 13:32




Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?
– Laurenz Albe
Nov 18 at 13:32

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53351910%2fpostgres-clustering-using-multi-column-indexes%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53351910%2fpostgres-clustering-using-multi-column-indexes%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Costa Masnaga

Fotorealismo

Sidney Franklin