Updating Dataset with Some Corrected Information, Some New Information












1















I have an old dataset with yearly record high temperatures for a number of cities. Each observation has seven variables: City State Year1 Year2 Year3 Year4 Year5



I also have a new dataset with yearly record high temperatures for an overlapping range of years. Each observation has seven variables: City State Year3 Year4 Year5 Year6 Year7



Some of the cities in the new dataset are the same as those in the old dataset, some are entirely new. I would like to combine the two datasets into one with nine variables: City State Year1 Year2 Year3 Year4 Year5 Year6 Year7



For those cities that appear in both datasets, the situation is complicated. For the overlapping years, Year3 Year4 Year5, non-missing values in the new dataset indicate a correction to the old dataset, and these non-missing values in the new dataset should appear in the merged dataset; missing values in the new dataset indicate no correction to the old dataset, and the values in the old dataset should appear in the merged dataset.



How best to merge the two datasets under these circumstances?










share|improve this question



























    1















    I have an old dataset with yearly record high temperatures for a number of cities. Each observation has seven variables: City State Year1 Year2 Year3 Year4 Year5



    I also have a new dataset with yearly record high temperatures for an overlapping range of years. Each observation has seven variables: City State Year3 Year4 Year5 Year6 Year7



    Some of the cities in the new dataset are the same as those in the old dataset, some are entirely new. I would like to combine the two datasets into one with nine variables: City State Year1 Year2 Year3 Year4 Year5 Year6 Year7



    For those cities that appear in both datasets, the situation is complicated. For the overlapping years, Year3 Year4 Year5, non-missing values in the new dataset indicate a correction to the old dataset, and these non-missing values in the new dataset should appear in the merged dataset; missing values in the new dataset indicate no correction to the old dataset, and the values in the old dataset should appear in the merged dataset.



    How best to merge the two datasets under these circumstances?










    share|improve this question

























      1












      1








      1








      I have an old dataset with yearly record high temperatures for a number of cities. Each observation has seven variables: City State Year1 Year2 Year3 Year4 Year5



      I also have a new dataset with yearly record high temperatures for an overlapping range of years. Each observation has seven variables: City State Year3 Year4 Year5 Year6 Year7



      Some of the cities in the new dataset are the same as those in the old dataset, some are entirely new. I would like to combine the two datasets into one with nine variables: City State Year1 Year2 Year3 Year4 Year5 Year6 Year7



      For those cities that appear in both datasets, the situation is complicated. For the overlapping years, Year3 Year4 Year5, non-missing values in the new dataset indicate a correction to the old dataset, and these non-missing values in the new dataset should appear in the merged dataset; missing values in the new dataset indicate no correction to the old dataset, and the values in the old dataset should appear in the merged dataset.



      How best to merge the two datasets under these circumstances?










      share|improve this question














      I have an old dataset with yearly record high temperatures for a number of cities. Each observation has seven variables: City State Year1 Year2 Year3 Year4 Year5



      I also have a new dataset with yearly record high temperatures for an overlapping range of years. Each observation has seven variables: City State Year3 Year4 Year5 Year6 Year7



      Some of the cities in the new dataset are the same as those in the old dataset, some are entirely new. I would like to combine the two datasets into one with nine variables: City State Year1 Year2 Year3 Year4 Year5 Year6 Year7



      For those cities that appear in both datasets, the situation is complicated. For the overlapping years, Year3 Year4 Year5, non-missing values in the new dataset indicate a correction to the old dataset, and these non-missing values in the new dataset should appear in the merged dataset; missing values in the new dataset indicate no correction to the old dataset, and the values in the old dataset should appear in the merged dataset.



      How best to merge the two datasets under these circumstances?







      sas






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 25 '18 at 22:12









      soandrewsoandrew

      82




      82
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Update should work for you. below example should work. 2nd dataset updates records of first one when only 2nd dataset has value(it does not update with missing value). it also add's new cities from 2nd data please try running the example below.



          data old_dataset;
          input City $ State $ Year1 Year2 Year3 Year4 Year5;
          datalines;
          city1 state1 90 80 70 60 50
          city3 state3 90 80 . 60 50
          ;

          data new_dataset;
          input City $ State $ Year1 Year2 Year3 Year4 Year5 Year6 Year7;
          datalines;
          city1 state1 . . . . . . .
          city2 state2 90 80 70 60 50 90 80
          city3 state3 90 80 90 60 . 60 50
          ;

          proc sort data=old_dataset;
          by city;
          run;

          proc sort data=new_dataset;
          by city;
          run;

          data want;
          update old_dataset new_dataset;
          by city;
          run;


          After update following things happen
          1. new columns are added
          2. Non missing values of 2nd dataset updates column of first dataset .






          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53472522%2fupdating-dataset-with-some-corrected-information-some-new-information%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            0














            Update should work for you. below example should work. 2nd dataset updates records of first one when only 2nd dataset has value(it does not update with missing value). it also add's new cities from 2nd data please try running the example below.



            data old_dataset;
            input City $ State $ Year1 Year2 Year3 Year4 Year5;
            datalines;
            city1 state1 90 80 70 60 50
            city3 state3 90 80 . 60 50
            ;

            data new_dataset;
            input City $ State $ Year1 Year2 Year3 Year4 Year5 Year6 Year7;
            datalines;
            city1 state1 . . . . . . .
            city2 state2 90 80 70 60 50 90 80
            city3 state3 90 80 90 60 . 60 50
            ;

            proc sort data=old_dataset;
            by city;
            run;

            proc sort data=new_dataset;
            by city;
            run;

            data want;
            update old_dataset new_dataset;
            by city;
            run;


            After update following things happen
            1. new columns are added
            2. Non missing values of 2nd dataset updates column of first dataset .






            share|improve this answer




























              0














              Update should work for you. below example should work. 2nd dataset updates records of first one when only 2nd dataset has value(it does not update with missing value). it also add's new cities from 2nd data please try running the example below.



              data old_dataset;
              input City $ State $ Year1 Year2 Year3 Year4 Year5;
              datalines;
              city1 state1 90 80 70 60 50
              city3 state3 90 80 . 60 50
              ;

              data new_dataset;
              input City $ State $ Year1 Year2 Year3 Year4 Year5 Year6 Year7;
              datalines;
              city1 state1 . . . . . . .
              city2 state2 90 80 70 60 50 90 80
              city3 state3 90 80 90 60 . 60 50
              ;

              proc sort data=old_dataset;
              by city;
              run;

              proc sort data=new_dataset;
              by city;
              run;

              data want;
              update old_dataset new_dataset;
              by city;
              run;


              After update following things happen
              1. new columns are added
              2. Non missing values of 2nd dataset updates column of first dataset .






              share|improve this answer


























                0












                0








                0







                Update should work for you. below example should work. 2nd dataset updates records of first one when only 2nd dataset has value(it does not update with missing value). it also add's new cities from 2nd data please try running the example below.



                data old_dataset;
                input City $ State $ Year1 Year2 Year3 Year4 Year5;
                datalines;
                city1 state1 90 80 70 60 50
                city3 state3 90 80 . 60 50
                ;

                data new_dataset;
                input City $ State $ Year1 Year2 Year3 Year4 Year5 Year6 Year7;
                datalines;
                city1 state1 . . . . . . .
                city2 state2 90 80 70 60 50 90 80
                city3 state3 90 80 90 60 . 60 50
                ;

                proc sort data=old_dataset;
                by city;
                run;

                proc sort data=new_dataset;
                by city;
                run;

                data want;
                update old_dataset new_dataset;
                by city;
                run;


                After update following things happen
                1. new columns are added
                2. Non missing values of 2nd dataset updates column of first dataset .






                share|improve this answer













                Update should work for you. below example should work. 2nd dataset updates records of first one when only 2nd dataset has value(it does not update with missing value). it also add's new cities from 2nd data please try running the example below.



                data old_dataset;
                input City $ State $ Year1 Year2 Year3 Year4 Year5;
                datalines;
                city1 state1 90 80 70 60 50
                city3 state3 90 80 . 60 50
                ;

                data new_dataset;
                input City $ State $ Year1 Year2 Year3 Year4 Year5 Year6 Year7;
                datalines;
                city1 state1 . . . . . . .
                city2 state2 90 80 70 60 50 90 80
                city3 state3 90 80 90 60 . 60 50
                ;

                proc sort data=old_dataset;
                by city;
                run;

                proc sort data=new_dataset;
                by city;
                run;

                data want;
                update old_dataset new_dataset;
                by city;
                run;


                After update following things happen
                1. new columns are added
                2. Non missing values of 2nd dataset updates column of first dataset .







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Nov 25 '18 at 23:43









                Kiran Kiran

                2,96531020




                2,96531020
































                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53472522%2fupdating-dataset-with-some-corrected-information-some-new-information%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Costa Masnaga

                    Fotorealismo

                    Sidney Franklin