PostgreSQL - Ordering Jumbled Data (Recursive?)












1















I have some unordered data in a PostgreSQL database that looks like this:



ID  PATH    START   END
7 A123 B C
4 B456 D E
9 A123 A B
2 B456 A B
6 B456 B C
21 A123 C D
3 B456 C D
5 B456 E F


START and END values can't be ordered alphabetically, these have just been used to illustrate the issue



I am trying to achieve this:



id  path    sequence    start   end
9 A123 1 A B
7 A123 2 B C
21 A123 3 C D
2 B456 1 A B
6 B456 2 B C
3 B456 3 C D
4 B456 4 D E
5 B456 5 E F


The logic I am thinking of to achieve this to determine the start value (represented as A) for each path (A123 / B456). Then determine the sequence AB, BC, CD etc). Which needs to be repeated for all paths.



I've written recursive query that iterates through a given path name (See WHERE path = 'B456')



WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
WITH path AS (SELECT id, "path", "start", "end"
FROM unordered
WHERE path = 'B456'),
startofpath AS (SELECT p1.id
FROM unordered p1
LEFT JOIN unordered p2 ON p1.start = p2.end
WHERE p2.start IS NULL)

--find start of path (A)
SELECT path.id, path.path, 1, path.start, path.end
FROM path, startofpath
WHERE path.id = startofpath.id
UNION ALL
--add on next path (B -> C)
SELECT path.id, path.path, ordered.sequence + 1, path.start, path.end FROM
path
INNER JOIN ordered
ON path.start = ordered."end")
SELECT * FROM ordered


Sample Data:



CREATE table unordered (
id INT PRIMARY KEY,
path TEXT NOT NULL,
"start" TEXT NOT NULL,
"end" TEXT NOT NULL);

INSERT INTO unordered (id, path, "start", "end") VALUES (7,'A123','B','C');
INSERT INTO unordered (id, path, "start", "end") VALUES (4,'B456','D','E');
INSERT INTO unordered (id, path, "start", "end") VALUES (9,'A123','A','B');
INSERT INTO unordered (id, path, "start", "end") VALUES (2,'B456','A','B');
INSERT INTO unordered (id, path, "start", "end") VALUES (6,'B456','B','C');
INSERT INTO unordered (id, path, "start", "end") VALUES (21,'A123','C','D');
INSERT INTO unordered (id, path, "start", "end") VALUES (3,'B456','C','D');
INSERT INTO unordered (id, path, "start", "end") VALUES (5,'B456','E','F');


What I am having trouble solving is then how to iterate through all paths (A123, then B456 etc)



Is anyone able to assist with this next step?
(or reworking my query from scratch if I've got the complete wrong idea)



Many thanks!










share|improve this question





























    1















    I have some unordered data in a PostgreSQL database that looks like this:



    ID  PATH    START   END
    7 A123 B C
    4 B456 D E
    9 A123 A B
    2 B456 A B
    6 B456 B C
    21 A123 C D
    3 B456 C D
    5 B456 E F


    START and END values can't be ordered alphabetically, these have just been used to illustrate the issue



    I am trying to achieve this:



    id  path    sequence    start   end
    9 A123 1 A B
    7 A123 2 B C
    21 A123 3 C D
    2 B456 1 A B
    6 B456 2 B C
    3 B456 3 C D
    4 B456 4 D E
    5 B456 5 E F


    The logic I am thinking of to achieve this to determine the start value (represented as A) for each path (A123 / B456). Then determine the sequence AB, BC, CD etc). Which needs to be repeated for all paths.



    I've written recursive query that iterates through a given path name (See WHERE path = 'B456')



    WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
    WITH path AS (SELECT id, "path", "start", "end"
    FROM unordered
    WHERE path = 'B456'),
    startofpath AS (SELECT p1.id
    FROM unordered p1
    LEFT JOIN unordered p2 ON p1.start = p2.end
    WHERE p2.start IS NULL)

    --find start of path (A)
    SELECT path.id, path.path, 1, path.start, path.end
    FROM path, startofpath
    WHERE path.id = startofpath.id
    UNION ALL
    --add on next path (B -> C)
    SELECT path.id, path.path, ordered.sequence + 1, path.start, path.end FROM
    path
    INNER JOIN ordered
    ON path.start = ordered."end")
    SELECT * FROM ordered


    Sample Data:



    CREATE table unordered (
    id INT PRIMARY KEY,
    path TEXT NOT NULL,
    "start" TEXT NOT NULL,
    "end" TEXT NOT NULL);

    INSERT INTO unordered (id, path, "start", "end") VALUES (7,'A123','B','C');
    INSERT INTO unordered (id, path, "start", "end") VALUES (4,'B456','D','E');
    INSERT INTO unordered (id, path, "start", "end") VALUES (9,'A123','A','B');
    INSERT INTO unordered (id, path, "start", "end") VALUES (2,'B456','A','B');
    INSERT INTO unordered (id, path, "start", "end") VALUES (6,'B456','B','C');
    INSERT INTO unordered (id, path, "start", "end") VALUES (21,'A123','C','D');
    INSERT INTO unordered (id, path, "start", "end") VALUES (3,'B456','C','D');
    INSERT INTO unordered (id, path, "start", "end") VALUES (5,'B456','E','F');


    What I am having trouble solving is then how to iterate through all paths (A123, then B456 etc)



    Is anyone able to assist with this next step?
    (or reworking my query from scratch if I've got the complete wrong idea)



    Many thanks!










    share|improve this question



























      1












      1








      1








      I have some unordered data in a PostgreSQL database that looks like this:



      ID  PATH    START   END
      7 A123 B C
      4 B456 D E
      9 A123 A B
      2 B456 A B
      6 B456 B C
      21 A123 C D
      3 B456 C D
      5 B456 E F


      START and END values can't be ordered alphabetically, these have just been used to illustrate the issue



      I am trying to achieve this:



      id  path    sequence    start   end
      9 A123 1 A B
      7 A123 2 B C
      21 A123 3 C D
      2 B456 1 A B
      6 B456 2 B C
      3 B456 3 C D
      4 B456 4 D E
      5 B456 5 E F


      The logic I am thinking of to achieve this to determine the start value (represented as A) for each path (A123 / B456). Then determine the sequence AB, BC, CD etc). Which needs to be repeated for all paths.



      I've written recursive query that iterates through a given path name (See WHERE path = 'B456')



      WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
      WITH path AS (SELECT id, "path", "start", "end"
      FROM unordered
      WHERE path = 'B456'),
      startofpath AS (SELECT p1.id
      FROM unordered p1
      LEFT JOIN unordered p2 ON p1.start = p2.end
      WHERE p2.start IS NULL)

      --find start of path (A)
      SELECT path.id, path.path, 1, path.start, path.end
      FROM path, startofpath
      WHERE path.id = startofpath.id
      UNION ALL
      --add on next path (B -> C)
      SELECT path.id, path.path, ordered.sequence + 1, path.start, path.end FROM
      path
      INNER JOIN ordered
      ON path.start = ordered."end")
      SELECT * FROM ordered


      Sample Data:



      CREATE table unordered (
      id INT PRIMARY KEY,
      path TEXT NOT NULL,
      "start" TEXT NOT NULL,
      "end" TEXT NOT NULL);

      INSERT INTO unordered (id, path, "start", "end") VALUES (7,'A123','B','C');
      INSERT INTO unordered (id, path, "start", "end") VALUES (4,'B456','D','E');
      INSERT INTO unordered (id, path, "start", "end") VALUES (9,'A123','A','B');
      INSERT INTO unordered (id, path, "start", "end") VALUES (2,'B456','A','B');
      INSERT INTO unordered (id, path, "start", "end") VALUES (6,'B456','B','C');
      INSERT INTO unordered (id, path, "start", "end") VALUES (21,'A123','C','D');
      INSERT INTO unordered (id, path, "start", "end") VALUES (3,'B456','C','D');
      INSERT INTO unordered (id, path, "start", "end") VALUES (5,'B456','E','F');


      What I am having trouble solving is then how to iterate through all paths (A123, then B456 etc)



      Is anyone able to assist with this next step?
      (or reworking my query from scratch if I've got the complete wrong idea)



      Many thanks!










      share|improve this question
















      I have some unordered data in a PostgreSQL database that looks like this:



      ID  PATH    START   END
      7 A123 B C
      4 B456 D E
      9 A123 A B
      2 B456 A B
      6 B456 B C
      21 A123 C D
      3 B456 C D
      5 B456 E F


      START and END values can't be ordered alphabetically, these have just been used to illustrate the issue



      I am trying to achieve this:



      id  path    sequence    start   end
      9 A123 1 A B
      7 A123 2 B C
      21 A123 3 C D
      2 B456 1 A B
      6 B456 2 B C
      3 B456 3 C D
      4 B456 4 D E
      5 B456 5 E F


      The logic I am thinking of to achieve this to determine the start value (represented as A) for each path (A123 / B456). Then determine the sequence AB, BC, CD etc). Which needs to be repeated for all paths.



      I've written recursive query that iterates through a given path name (See WHERE path = 'B456')



      WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
      WITH path AS (SELECT id, "path", "start", "end"
      FROM unordered
      WHERE path = 'B456'),
      startofpath AS (SELECT p1.id
      FROM unordered p1
      LEFT JOIN unordered p2 ON p1.start = p2.end
      WHERE p2.start IS NULL)

      --find start of path (A)
      SELECT path.id, path.path, 1, path.start, path.end
      FROM path, startofpath
      WHERE path.id = startofpath.id
      UNION ALL
      --add on next path (B -> C)
      SELECT path.id, path.path, ordered.sequence + 1, path.start, path.end FROM
      path
      INNER JOIN ordered
      ON path.start = ordered."end")
      SELECT * FROM ordered


      Sample Data:



      CREATE table unordered (
      id INT PRIMARY KEY,
      path TEXT NOT NULL,
      "start" TEXT NOT NULL,
      "end" TEXT NOT NULL);

      INSERT INTO unordered (id, path, "start", "end") VALUES (7,'A123','B','C');
      INSERT INTO unordered (id, path, "start", "end") VALUES (4,'B456','D','E');
      INSERT INTO unordered (id, path, "start", "end") VALUES (9,'A123','A','B');
      INSERT INTO unordered (id, path, "start", "end") VALUES (2,'B456','A','B');
      INSERT INTO unordered (id, path, "start", "end") VALUES (6,'B456','B','C');
      INSERT INTO unordered (id, path, "start", "end") VALUES (21,'A123','C','D');
      INSERT INTO unordered (id, path, "start", "end") VALUES (3,'B456','C','D');
      INSERT INTO unordered (id, path, "start", "end") VALUES (5,'B456','E','F');


      What I am having trouble solving is then how to iterate through all paths (A123, then B456 etc)



      Is anyone able to assist with this next step?
      (or reworking my query from scratch if I've got the complete wrong idea)



      Many thanks!







      postgresql






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 22 '18 at 5:09







      Josh Flynn

















      asked Nov 22 '18 at 4:51









      Josh FlynnJosh Flynn

      83




      83
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Is this what you are looking for ?



          WITH RECURSIVE
          get_path(id, path, sequence, starting, ending) AS (
          SELECT u.id, u.path, 1, u.starting, u.ending
          FROM unordered AS u
          WHERE u.starting NOT IN (SELECT u.ending FROM unordered AS u) -- a starting point has no entry in ending column

          UNION

          SELECT u.id, u.path, g.sequence + 1, g.ending, u.ending
          FROM get_path AS g, unordered AS u
          WHERE u.starting = g.ending
          )
          TABLE get_path ORDER BY path, sequence;


          Please note that i changed "start" into starting and "end" into ending.






          share|improve this answer


























          • Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

            – Josh Flynn
            Nov 22 '18 at 22:08













          • Please check the edit.

            – Tony
            Nov 22 '18 at 22:24











          • That's great! Thanks Tony!

            – Josh Flynn
            Nov 22 '18 at 22:44











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53424090%2fpostgresql-ordering-jumbled-data-recursive%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0














          Is this what you are looking for ?



          WITH RECURSIVE
          get_path(id, path, sequence, starting, ending) AS (
          SELECT u.id, u.path, 1, u.starting, u.ending
          FROM unordered AS u
          WHERE u.starting NOT IN (SELECT u.ending FROM unordered AS u) -- a starting point has no entry in ending column

          UNION

          SELECT u.id, u.path, g.sequence + 1, g.ending, u.ending
          FROM get_path AS g, unordered AS u
          WHERE u.starting = g.ending
          )
          TABLE get_path ORDER BY path, sequence;


          Please note that i changed "start" into starting and "end" into ending.






          share|improve this answer


























          • Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

            – Josh Flynn
            Nov 22 '18 at 22:08













          • Please check the edit.

            – Tony
            Nov 22 '18 at 22:24











          • That's great! Thanks Tony!

            – Josh Flynn
            Nov 22 '18 at 22:44
















          0














          Is this what you are looking for ?



          WITH RECURSIVE
          get_path(id, path, sequence, starting, ending) AS (
          SELECT u.id, u.path, 1, u.starting, u.ending
          FROM unordered AS u
          WHERE u.starting NOT IN (SELECT u.ending FROM unordered AS u) -- a starting point has no entry in ending column

          UNION

          SELECT u.id, u.path, g.sequence + 1, g.ending, u.ending
          FROM get_path AS g, unordered AS u
          WHERE u.starting = g.ending
          )
          TABLE get_path ORDER BY path, sequence;


          Please note that i changed "start" into starting and "end" into ending.






          share|improve this answer


























          • Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

            – Josh Flynn
            Nov 22 '18 at 22:08













          • Please check the edit.

            – Tony
            Nov 22 '18 at 22:24











          • That's great! Thanks Tony!

            – Josh Flynn
            Nov 22 '18 at 22:44














          0












          0








          0







          Is this what you are looking for ?



          WITH RECURSIVE
          get_path(id, path, sequence, starting, ending) AS (
          SELECT u.id, u.path, 1, u.starting, u.ending
          FROM unordered AS u
          WHERE u.starting NOT IN (SELECT u.ending FROM unordered AS u) -- a starting point has no entry in ending column

          UNION

          SELECT u.id, u.path, g.sequence + 1, g.ending, u.ending
          FROM get_path AS g, unordered AS u
          WHERE u.starting = g.ending
          )
          TABLE get_path ORDER BY path, sequence;


          Please note that i changed "start" into starting and "end" into ending.






          share|improve this answer















          Is this what you are looking for ?



          WITH RECURSIVE
          get_path(id, path, sequence, starting, ending) AS (
          SELECT u.id, u.path, 1, u.starting, u.ending
          FROM unordered AS u
          WHERE u.starting NOT IN (SELECT u.ending FROM unordered AS u) -- a starting point has no entry in ending column

          UNION

          SELECT u.id, u.path, g.sequence + 1, g.ending, u.ending
          FROM get_path AS g, unordered AS u
          WHERE u.starting = g.ending
          )
          TABLE get_path ORDER BY path, sequence;


          Please note that i changed "start" into starting and "end" into ending.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 22 '18 at 22:23

























          answered Nov 22 '18 at 10:22









          TonyTony

          384211




          384211













          • Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

            – Josh Flynn
            Nov 22 '18 at 22:08













          • Please check the edit.

            – Tony
            Nov 22 '18 at 22:24











          • That's great! Thanks Tony!

            – Josh Flynn
            Nov 22 '18 at 22:44



















          • Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

            – Josh Flynn
            Nov 22 '18 at 22:08













          • Please check the edit.

            – Tony
            Nov 22 '18 at 22:24











          • That's great! Thanks Tony!

            – Josh Flynn
            Nov 22 '18 at 22:44

















          Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

          – Josh Flynn
          Nov 22 '18 at 22:08







          Close, in my query I fixed the path as a constant value, in your query you've fixed the start as 'A'. What I'm having trouble with is the combination of the two. For each path find the correct order of it's child records when the start is unknown (but can be derived because B -> C joins to A -> B but A -> B does not join to anything) Thanks for taking a look.

          – Josh Flynn
          Nov 22 '18 at 22:08















          Please check the edit.

          – Tony
          Nov 22 '18 at 22:24





          Please check the edit.

          – Tony
          Nov 22 '18 at 22:24













          That's great! Thanks Tony!

          – Josh Flynn
          Nov 22 '18 at 22:44





          That's great! Thanks Tony!

          – Josh Flynn
          Nov 22 '18 at 22:44


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53424090%2fpostgresql-ordering-jumbled-data-recursive%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Costa Masnaga

          Fotorealismo

          Sidney Franklin