How to create high quality reproducible examples in Stata












7















After having spent more than a year answering questions in the stata, it is clear to me that the vast majority of people asking questions here are mostly new or occasional users of Stack Overflow. Such users are more likely to confuse this site with a traditional forum such as Statalist and post unclear or off-topic questions.



These users typically:




  1. Have little familiarity with Stata's interface and language conventions asking questions that can easily be answered by consulting a help file

  2. Ask how to do something without providing any context

  3. Dump an unformatted code snippet with no comments, cryptic variable names and typos

  4. Fail to provide realistic data examples

  5. Neglect to describe what results they expect


The motivation for this question has been the success of similar questions for
Pandas and R in providing a one-stop beginner's resource. The examples provided there from experienced users appear to have increased the chance of (new) askers getting answers to their questions.



What is your advice for creating a high quality stata reproducible example?



Simple datasets can be entered with the edit command, which opens the
variables editor and allows the user to manually type or paste data.



Programmatically, the input command can also be used:



clear

input id str5 name income
1 "Tracy" 90000
2 "Ramon" 70000
3 "Kevin" 80000
end

list

+---------------------+
| id name income |
|---------------------|
1. | 1 Tracy 90000 |
2. | 2 Ramon 70000 |
3. | 3 Kevin 80000 |
+---------------------+


Are there any additional tips to using input for providing example data?
When should one use the trace and pause commands? What other information
should be included?









share















migrated from stackoverflow.com Nov 24 '18 at 16:17


This question came from our site for professional and enthusiast programmers.























    7















    After having spent more than a year answering questions in the stata, it is clear to me that the vast majority of people asking questions here are mostly new or occasional users of Stack Overflow. Such users are more likely to confuse this site with a traditional forum such as Statalist and post unclear or off-topic questions.



    These users typically:




    1. Have little familiarity with Stata's interface and language conventions asking questions that can easily be answered by consulting a help file

    2. Ask how to do something without providing any context

    3. Dump an unformatted code snippet with no comments, cryptic variable names and typos

    4. Fail to provide realistic data examples

    5. Neglect to describe what results they expect


    The motivation for this question has been the success of similar questions for
    Pandas and R in providing a one-stop beginner's resource. The examples provided there from experienced users appear to have increased the chance of (new) askers getting answers to their questions.



    What is your advice for creating a high quality stata reproducible example?



    Simple datasets can be entered with the edit command, which opens the
    variables editor and allows the user to manually type or paste data.



    Programmatically, the input command can also be used:



    clear

    input id str5 name income
    1 "Tracy" 90000
    2 "Ramon" 70000
    3 "Kevin" 80000
    end

    list

    +---------------------+
    | id name income |
    |---------------------|
    1. | 1 Tracy 90000 |
    2. | 2 Ramon 70000 |
    3. | 3 Kevin 80000 |
    +---------------------+


    Are there any additional tips to using input for providing example data?
    When should one use the trace and pause commands? What other information
    should be included?









    share















    migrated from stackoverflow.com Nov 24 '18 at 16:17


    This question came from our site for professional and enthusiast programmers.





















      7












      7








      7


      1






      After having spent more than a year answering questions in the stata, it is clear to me that the vast majority of people asking questions here are mostly new or occasional users of Stack Overflow. Such users are more likely to confuse this site with a traditional forum such as Statalist and post unclear or off-topic questions.



      These users typically:




      1. Have little familiarity with Stata's interface and language conventions asking questions that can easily be answered by consulting a help file

      2. Ask how to do something without providing any context

      3. Dump an unformatted code snippet with no comments, cryptic variable names and typos

      4. Fail to provide realistic data examples

      5. Neglect to describe what results they expect


      The motivation for this question has been the success of similar questions for
      Pandas and R in providing a one-stop beginner's resource. The examples provided there from experienced users appear to have increased the chance of (new) askers getting answers to their questions.



      What is your advice for creating a high quality stata reproducible example?



      Simple datasets can be entered with the edit command, which opens the
      variables editor and allows the user to manually type or paste data.



      Programmatically, the input command can also be used:



      clear

      input id str5 name income
      1 "Tracy" 90000
      2 "Ramon" 70000
      3 "Kevin" 80000
      end

      list

      +---------------------+
      | id name income |
      |---------------------|
      1. | 1 Tracy 90000 |
      2. | 2 Ramon 70000 |
      3. | 3 Kevin 80000 |
      +---------------------+


      Are there any additional tips to using input for providing example data?
      When should one use the trace and pause commands? What other information
      should be included?









      share
















      After having spent more than a year answering questions in the stata, it is clear to me that the vast majority of people asking questions here are mostly new or occasional users of Stack Overflow. Such users are more likely to confuse this site with a traditional forum such as Statalist and post unclear or off-topic questions.



      These users typically:




      1. Have little familiarity with Stata's interface and language conventions asking questions that can easily be answered by consulting a help file

      2. Ask how to do something without providing any context

      3. Dump an unformatted code snippet with no comments, cryptic variable names and typos

      4. Fail to provide realistic data examples

      5. Neglect to describe what results they expect


      The motivation for this question has been the success of similar questions for
      Pandas and R in providing a one-stop beginner's resource. The examples provided there from experienced users appear to have increased the chance of (new) askers getting answers to their questions.



      What is your advice for creating a high quality stata reproducible example?



      Simple datasets can be entered with the edit command, which opens the
      variables editor and allows the user to manually type or paste data.



      Programmatically, the input command can also be used:



      clear

      input id str5 name income
      1 "Tracy" 90000
      2 "Ramon" 70000
      3 "Kevin" 80000
      end

      list

      +---------------------+
      | id name income |
      |---------------------|
      1. | 1 Tracy 90000 |
      2. | 2 Ramon 70000 |
      3. | 3 Kevin 80000 |
      +---------------------+


      Are there any additional tips to using input for providing example data?
      When should one use the trace and pause commands? What other information
      should be included?







      discussion





      share














      share












      share



      share








      edited Dec 3 '18 at 11:10









      Nick Cox

      25.1k66




      25.1k66










      asked Nov 22 '18 at 21:22









      Pearly SpencerPearly Spencer

      10.6k1520




      10.6k1520




      migrated from stackoverflow.com Nov 24 '18 at 16:17


      This question came from our site for professional and enthusiast programmers.









      migrated from stackoverflow.com Nov 24 '18 at 16:17


      This question came from our site for professional and enthusiast programmers.
























          1 Answer
          1






          active

          oldest

          votes


















          16














          Congratulations on seeking help for your Stata programming problem!



          Acknowledging that you need the input of others is the first step towards
          resolving it and maturing as a Stata user/programmer. Writing a good question is not a trivial task and requires experience. The latter comes with practice, which in turn requires perseverance.



          The secret in writing a high quality reproducible Stata question is the
          successful creation of a sandboxed example. This should use the shortest
          possible snippet of code and the minimal amount of example data required to
          replicate your problem.



          Stack Overflow's Stata volunteers are always happy to help but they do not spoon-feed. Lack of effort from your part will make it less likely that you get an answer and increase the chance that your Stata question will be closed and ultimately deleted.







          Tips for improving your question




          You can add this thread in your list of favorite questions by clicking the star ⭐ on the left hand side above. This will allow you to easily access these tips again from your profile's favorites tab.






          • I am new in Stata and I do not know its commands yet



          Before posting a question, please make sure that you have read Getting Started
          with Stata
          . You can access these introductory manuals by typing help gs from Stata's command prompt.



          There is simply no replacement for acquainting yourself with the basic concepts
          and syntax of Stata. This is particularly important as effective communication requires you to be able to speak the same language as the other more experienced Stata users on Stack Overflow.



          Do not forget that these users want to answer interesting programming problems,
          rather than act as tutors for teaching the basics. A more general forum such as Statalist or reddit may be more appropriate for problems relating to basic command usage.



          • I want to do [something] in Stata and I need the code



          Stack Overflow is geared towards solutions for specific programming problems. It is thus important that you explain as clearly as possible your situation and show us what you have tried.




          1. Start by clearly stating your question and telling us your Stata version and platform (Windows, Mac, Linux).


          2. Then give some context. This should focus on succinctly describing in words
            both your dataset and what you are trying to do.


          3. Next tell us how you attempted to accomplish your goal. This stage includes
            attaching the code that you used and the produced output. You should also link to any similar questions that you consulted on-line.


          4. Finally provide us with example data to run the code and reproduce your problem. These data should not be shown using a screenshot! See further down for help on this step.


          5. Always respond to comments requesting clarification from potential answerers.



          • I have this code in R/Python/SAS/SPSS and need the same in Stata



          Questions asking how to translate code from other languages to Stata's ado or
          mata languages are only valid if and only if there is a specific problem to be
          addressed in the attempted Stata code. Consequently, all the items listed in the previous and next sections are also relevant here.



          • This is the Stata code I used but it does not work





          1. Check for typos both in the script and in the code snippet provided in your question. The Stata interpreter is unforgiving: what might seem a straightforward programming task can thus quickly become an exercise in frustration. Stata volunteers on Stack Overflow are not here to hunt down typos arising from careless typing.



            Example:



            locla mymacro HELLO
            genrate var = 5




            local mymacro HELLO
            generate var = 5



          2. It is best to not abbreviate commands and avoid eliminating all
            white-space. This makes code harder to read and it is more error-prone. Other
            inexperienced users may also find it difficult to recognize even basic commands.



            Example:



            forval i=1/5 {
            loc mymacro`i' HELLO `i'
            g var`i'=`i'
            }




            forvalues i = 1 / 5 {
            local mymacro`i' HELLO `i'
            generate var`i' = `i'
            }



          3. Do not post your entire do file or code segment, but only the problematic part. In addition, make sure you properly format your code using code blocks. If your code snippet is more than five or six lines, break this into sections if necessary. Where the names of the variables you use are not self-explanatory, please provide comments.



            Example:



            sysuse auto

            des

            sum mpg

            gen mempg=r(mean)

            gen smpg=r(sum)

            reg mpg weight length





            /* load data */

            sysuse auto
            describe

            /* descriptive statistics */

            summarize mpg
            generate mean_mpg = r(mean)
            generate sum_education = r(sum)

            /* regression analysis */

            regress mpg weight length







          1. Make sure to check the help file for clues on why your code fails. Problems are often caused by invalid syntax. You can access the help files for commands and functions by typing help command/function name in Stata's command prompt.



            Example:



            list, separate(0)
            option separate() not allowed
            r(198);


            Here, typing help list reveals that this is not legal syntax. Indeed, the name of the option is separator(#) and not separate(#).








          1. Try to debug the code on your own before you ask here. Stata has useful debugging commands such as set trace (see help trace for more details), which shows how the code executes in real-time. Another useful debugging command is pause, which temporarily suspends execution of the code (help pause for more information).



            Example:



            set trace on

            forvalues i = 1 / 2 {
            display `i'
            }

            - forvalues i = 1 / 2 {
            - display `i'
            = display 1
            1
            - }
            - display `i'
            = display 2
            2
            - }


            Use these commands if your problem is not an obvious syntax error and include in your question selected relevant output, which is likely to shed more light on the causes of the problem. In addition, always include the full error code and message that Stata reports.







          1. If you are using a community-contributed command that you have
            downloaded from SSC, the Stata Journal or another source, it is important that you indicate this early on in your question. In this way, people who might answer do not waste time looking for it in external sites and can more quickly identify problems related specifically to this command.




          • Here is a screenshot of my Stata dataset



          Please do not upload screenshots!



          enter image description here



          Screenshots are not as helpful as you hope, primarily because they do not allow people who might answer to copy and paste data into their own Stata and try to reproduce the problem.



          In addition to the edit and input commands outlined in the question part of this thread, there are four other ways you can provide example data for your Stata question.





          1. If your data is confidential, you can demonstrate the problem using the sysuse command to load one of Stata's toy datasets:



            sysuse dir
            auto.dta bplong.dta brand2.dta bsexper3.dta census.dta
            auto2.dta bpwide.dta bsexper1.dta cancer.dta citytemp.dta
            autornd.dta brand1.dta bsexper2.dta cearep.dta citytemp4.dta

            sysuse census, clear

            list state region pop marriage in 1 / 5

            +---------------------------------------------+
            | state region pop marriage |
            |---------------------------------------------|
            1. | Alabama South 3,893,888 49,018 |
            2. | Alaska West 401,851 5,361 |
            3. | Arizona West 2,718,215 30,223 |
            4. | Arkansas South 2,286,435 26,513 |
            5. | California West 23,667,902 210,864 |
            +---------------------------------------------+



          2. Alternatively, you can directly download an online example dataset with the use command:



            clear
            use http://fmwww.bc.edu/ec-p/data/wooldridge/vote1

            list district voteA expendA shareA in 1 / 5

            +-------------------------------------+
            | district voteA expendA shareA |
            |-------------------------------------|
            1. | 7 68 328.3 97.41 |
            2. | 1 62 626.38 60.88 |
            3. | 2 73 99.61 97.01 |
            4. | 3 69 319.69 92.4 |
            5. | 3 75 159.22 72.61 |
            +-------------------------------------+



          3. For examples with your current dataset use the dataex command:



            dataex mpg price foreign in 1 / 5, elsewhere 

            ----------------------- copy starting from the next line -----------------------
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(mpg price) byte foreign
            22 4099 0
            17 4749 0
            22 3799 0
            20 4816 0
            15 7827 0
            end
            label values foreign origin
            label def origin 0 "Domestic", modify
            ------------------ copy up to and including the previous line ------------------


            In this case, the first five observations of variables mpg, price and foreign are requested. Note the option elsewhere, which is explained in the help file for dataex.



            Copy and paste everything between the end lines and use the {} button in
            the Stack Overflow question editor to format the snippet.



            The dataex command is especially needed when:




            • We need to be clear on whether a variable shown as text is really a string variable or a numeric variable with value labels.

            • You have date variables, which otherwise can be very awkward for people who might answer to handle.








          1. You can also use several other Stata commands and functions to simulate data:



            /* generate data in wide form */

            // discard data in memory
            clear

            // set the number of observations in dataset
            set obs 6

            // create a simple identifier
            generate id = _n

            // set the random-number seed for reproducibility
            set seed 12345

            // create a uniformly distributed random variable with values between 0 and 1
            generate var1 = runiform()

            // create a normally-distributed random variable with mean 20 and standard deviation 5
            generate var2 = rnormal(20, 5)

            // create random indicator variable 0/1
            generate var3 = rbinomial(1, 0.5)

            // see the results
            list, separator(0)

            +---------------------------------+
            | id var1 var2 var3 |
            |---------------------------------|
            1. | 1 .3576297 22.72038 0 |
            2. | 2 .4004426 20.00814 1 |
            3. | 3 .6893833 21.7884 1 |
            4. | 4 .5597356 29.39434 0 |
            5. | 5 .5744513 33.77373 0 |
            6. | 6 .2076905 16.93702 1 |
            +---------------------------------+




            // optionally create value labels for numeric variables such as id above

            label define idlabel 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 6 "six"
            label values id idlabel

            list id, separator(0)

            +-------+
            | id |
            |-------|
            1. | one |
            2. | two |
            3. | three |
            4. | four |
            5. | five |
            6. | six |
            +-------+




            // create (random) date variables
            clear
            set obs 6

            // a daily date numeric variable
            display date("25/11/2018", "DMY")
            21513

            generate var1 = 21513 + _n

            // a random date variable within a specified interval
            generate var2 = floor( ( mdy(12,31,2018) - mdy(1,1,2017)+1 ) * ///
            runiform() + mdy(1,1,2017) )

            // a half-yearly date numeric variable
            display yh(2018, 1)
            116

            generate var3 = 116 + _n

            // see the raw results
            list var1 var2 var3, separator(0)

            +----------------------+
            | var1 var2 var3 |
            |----------------------|
            1. | 21514 21004 117 |
            2. | 21515 21351 118 |
            3. | 21516 21529 119 |
            4. | 21517 21532 120 |
            5. | 21518 21104 121 |
            6. | 21519 21523 122 |
            +----------------------+




            // see formatted results

            format %tdDD/NN/CCYY var1
            format %tdDD/NN/CCYY var2
            format %th var3

            list var1 var2 var3, separator(0)

            +----------------------------------+
            | var1 var2 var3 |
            |----------------------------------|
            1. | 26/11/2018 04/07/2017 2018h2 |
            2. | 27/11/2018 16/06/2018 2019h1 |
            3. | 28/11/2018 11/12/2018 2019h2 |
            4. | 29/11/2018 14/12/2018 2020h1 |
            5. | 30/11/2018 12/10/2017 2020h2 |
            6. | 01/12/2018 05/12/2018 2021h1 |
            +----------------------------------+




            /* generate data in long form */

            clear
            set obs 9

            // create an identifier increasing every three observations
            egen id = seq(), block(3)

            // create a year variable within each id
            bysort id: generate year = 2015 + _n

            // create a normally distributed random variable within each id
            bysort id: generate var = rnormal()

            // calculate the sum of var within each id
            bysort id: generate sum_var = sum(var)

            // note here the use of the `bysort` prefix, which sorts data and repeats
            // the command for each group of observations

            // see the results by id
            list, sepby(id)

            +-----------------------------------+
            | id year var sum_var |
            |-----------------------------------|
            1. | 1 2016 .1973079 .1973079 |
            2. | 1 2017 1.610224 1.807532 |
            3. | 1 2018 -.8034225 1.004109 |
            |-----------------------------------|
            4. | 2 2016 1.096012 1.096012 |
            5. | 2 2017 -.4407027 .6553089 |
            6. | 2 2018 -1.011427 -.3561177 |
            |-----------------------------------|
            7. | 3 2016 1.019227 1.019227 |
            8. | 3 2017 1.871976 2.891204 |
            9. | 3 2018 .4235664 3.31477 |
            +-----------------------------------+



          • Stata does not produce the results I want



          You should always include the output from Stata in your question. This should be copied and pasted directly from the Stata console. Then select the pasted output and click {} in the question editor.



          Example:



          enter image description here



          . sysuse auto
          (1978 Automobile Data)

          . regress price mpg i.foreign

          Source | SS df MS Number of obs = 74
          -------------+---------------------------------- F(2, 71) = 14.07
          Model | 180261702 2 90130850.8 Prob > F = 0.0000
          Residual | 454803695 71 6405685.84 R-squared = 0.2838
          -------------+---------------------------------- Adj R-squared = 0.2637
          Total | 635065396 73 8699525.97 Root MSE = 2530.9

          ------------------------------------------------------------------------------
          price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          mpg | -294.1955 55.69172 -5.28 0.000 -405.2417 -183.1494
          |
          foreign |
          Foreign | 1767.292 700.158 2.52 0.014 371.2169 3163.368
          _cons | 11905.42 1158.634 10.28 0.000 9595.164 14215.67
          ------------------------------------------------------------------------------


          If Stata gets back to you with unexpected results, it is most likely because it was not programmed correctly. Stata simply does what the user instructs it to do.



          That said, users on Stack Overflow are not mind-readers. Providing an example of the desired output will greatly increase your chances of getting a helpful response.



          If it is a graph, you can post a picture that illustrates the outcome.
          Otherwise, a table with an adequate amount of expected results is best.
          This can be generated using an online table creator (such as Table Generator or ASCII Table Generator) and pasted in your question appropriately formatted in code blocks.



          Example:



          Country Population Mean_age Sex_Ratio GDP

          United States of America 3999 23 1.01 5000

          Afghanistan 544 19 0.97 457

          China 10000 27 0.96 3400



          +--------------------------+------------+----------+-----------+------+
          | | Population | Mean_Age | Sex_Ratio | GDP |
          | Country | | | | |
          +--------------------------+------------+----------+-----------+------+
          | United States of America | 3999 | 23 | 1.01 | 5000 |
          +--------------------------+------------+----------+-----------+------+
          | Afghanistan | 544 | 19 | 0.97 | 457 |
          +--------------------------+------------+----------+-----------+------+
          | China | 10000 | 27 | 0.96 | 3400 |
          +--------------------------+------------+----------+-----------+------+




          Examples of high quality questions



          The following questions can be considered as good examples of how you should structure your own Stata-related programming question:




          • Change order of text in variable names


          • Axis labels off-center when broken over multiple lines


          • Equivalent of Stata macros in Python


          • Getting coefplot to use estimate titles or outcome labels


          • One-liner for wrapping some text around members of a list of numbers





          Further advice



          It is crucial that you also read the following pages on Stack Overflow:




          1. Site tour

          2. How to ask

          3. Minimal, Complete, and Verifiable example


          Finally, you may also find helpful the information on the Statalist FAQ.





          share





















          • 1





            Excellent. Some small suggestions edited in.

            – Nick Cox
            Dec 3 '18 at 11:21



















          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          16














          Congratulations on seeking help for your Stata programming problem!



          Acknowledging that you need the input of others is the first step towards
          resolving it and maturing as a Stata user/programmer. Writing a good question is not a trivial task and requires experience. The latter comes with practice, which in turn requires perseverance.



          The secret in writing a high quality reproducible Stata question is the
          successful creation of a sandboxed example. This should use the shortest
          possible snippet of code and the minimal amount of example data required to
          replicate your problem.



          Stack Overflow's Stata volunteers are always happy to help but they do not spoon-feed. Lack of effort from your part will make it less likely that you get an answer and increase the chance that your Stata question will be closed and ultimately deleted.







          Tips for improving your question




          You can add this thread in your list of favorite questions by clicking the star ⭐ on the left hand side above. This will allow you to easily access these tips again from your profile's favorites tab.






          • I am new in Stata and I do not know its commands yet



          Before posting a question, please make sure that you have read Getting Started
          with Stata
          . You can access these introductory manuals by typing help gs from Stata's command prompt.



          There is simply no replacement for acquainting yourself with the basic concepts
          and syntax of Stata. This is particularly important as effective communication requires you to be able to speak the same language as the other more experienced Stata users on Stack Overflow.



          Do not forget that these users want to answer interesting programming problems,
          rather than act as tutors for teaching the basics. A more general forum such as Statalist or reddit may be more appropriate for problems relating to basic command usage.



          • I want to do [something] in Stata and I need the code



          Stack Overflow is geared towards solutions for specific programming problems. It is thus important that you explain as clearly as possible your situation and show us what you have tried.




          1. Start by clearly stating your question and telling us your Stata version and platform (Windows, Mac, Linux).


          2. Then give some context. This should focus on succinctly describing in words
            both your dataset and what you are trying to do.


          3. Next tell us how you attempted to accomplish your goal. This stage includes
            attaching the code that you used and the produced output. You should also link to any similar questions that you consulted on-line.


          4. Finally provide us with example data to run the code and reproduce your problem. These data should not be shown using a screenshot! See further down for help on this step.


          5. Always respond to comments requesting clarification from potential answerers.



          • I have this code in R/Python/SAS/SPSS and need the same in Stata



          Questions asking how to translate code from other languages to Stata's ado or
          mata languages are only valid if and only if there is a specific problem to be
          addressed in the attempted Stata code. Consequently, all the items listed in the previous and next sections are also relevant here.



          • This is the Stata code I used but it does not work





          1. Check for typos both in the script and in the code snippet provided in your question. The Stata interpreter is unforgiving: what might seem a straightforward programming task can thus quickly become an exercise in frustration. Stata volunteers on Stack Overflow are not here to hunt down typos arising from careless typing.



            Example:



            locla mymacro HELLO
            genrate var = 5




            local mymacro HELLO
            generate var = 5



          2. It is best to not abbreviate commands and avoid eliminating all
            white-space. This makes code harder to read and it is more error-prone. Other
            inexperienced users may also find it difficult to recognize even basic commands.



            Example:



            forval i=1/5 {
            loc mymacro`i' HELLO `i'
            g var`i'=`i'
            }




            forvalues i = 1 / 5 {
            local mymacro`i' HELLO `i'
            generate var`i' = `i'
            }



          3. Do not post your entire do file or code segment, but only the problematic part. In addition, make sure you properly format your code using code blocks. If your code snippet is more than five or six lines, break this into sections if necessary. Where the names of the variables you use are not self-explanatory, please provide comments.



            Example:



            sysuse auto

            des

            sum mpg

            gen mempg=r(mean)

            gen smpg=r(sum)

            reg mpg weight length





            /* load data */

            sysuse auto
            describe

            /* descriptive statistics */

            summarize mpg
            generate mean_mpg = r(mean)
            generate sum_education = r(sum)

            /* regression analysis */

            regress mpg weight length







          1. Make sure to check the help file for clues on why your code fails. Problems are often caused by invalid syntax. You can access the help files for commands and functions by typing help command/function name in Stata's command prompt.



            Example:



            list, separate(0)
            option separate() not allowed
            r(198);


            Here, typing help list reveals that this is not legal syntax. Indeed, the name of the option is separator(#) and not separate(#).








          1. Try to debug the code on your own before you ask here. Stata has useful debugging commands such as set trace (see help trace for more details), which shows how the code executes in real-time. Another useful debugging command is pause, which temporarily suspends execution of the code (help pause for more information).



            Example:



            set trace on

            forvalues i = 1 / 2 {
            display `i'
            }

            - forvalues i = 1 / 2 {
            - display `i'
            = display 1
            1
            - }
            - display `i'
            = display 2
            2
            - }


            Use these commands if your problem is not an obvious syntax error and include in your question selected relevant output, which is likely to shed more light on the causes of the problem. In addition, always include the full error code and message that Stata reports.







          1. If you are using a community-contributed command that you have
            downloaded from SSC, the Stata Journal or another source, it is important that you indicate this early on in your question. In this way, people who might answer do not waste time looking for it in external sites and can more quickly identify problems related specifically to this command.




          • Here is a screenshot of my Stata dataset



          Please do not upload screenshots!



          enter image description here



          Screenshots are not as helpful as you hope, primarily because they do not allow people who might answer to copy and paste data into their own Stata and try to reproduce the problem.



          In addition to the edit and input commands outlined in the question part of this thread, there are four other ways you can provide example data for your Stata question.





          1. If your data is confidential, you can demonstrate the problem using the sysuse command to load one of Stata's toy datasets:



            sysuse dir
            auto.dta bplong.dta brand2.dta bsexper3.dta census.dta
            auto2.dta bpwide.dta bsexper1.dta cancer.dta citytemp.dta
            autornd.dta brand1.dta bsexper2.dta cearep.dta citytemp4.dta

            sysuse census, clear

            list state region pop marriage in 1 / 5

            +---------------------------------------------+
            | state region pop marriage |
            |---------------------------------------------|
            1. | Alabama South 3,893,888 49,018 |
            2. | Alaska West 401,851 5,361 |
            3. | Arizona West 2,718,215 30,223 |
            4. | Arkansas South 2,286,435 26,513 |
            5. | California West 23,667,902 210,864 |
            +---------------------------------------------+



          2. Alternatively, you can directly download an online example dataset with the use command:



            clear
            use http://fmwww.bc.edu/ec-p/data/wooldridge/vote1

            list district voteA expendA shareA in 1 / 5

            +-------------------------------------+
            | district voteA expendA shareA |
            |-------------------------------------|
            1. | 7 68 328.3 97.41 |
            2. | 1 62 626.38 60.88 |
            3. | 2 73 99.61 97.01 |
            4. | 3 69 319.69 92.4 |
            5. | 3 75 159.22 72.61 |
            +-------------------------------------+



          3. For examples with your current dataset use the dataex command:



            dataex mpg price foreign in 1 / 5, elsewhere 

            ----------------------- copy starting from the next line -----------------------
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(mpg price) byte foreign
            22 4099 0
            17 4749 0
            22 3799 0
            20 4816 0
            15 7827 0
            end
            label values foreign origin
            label def origin 0 "Domestic", modify
            ------------------ copy up to and including the previous line ------------------


            In this case, the first five observations of variables mpg, price and foreign are requested. Note the option elsewhere, which is explained in the help file for dataex.



            Copy and paste everything between the end lines and use the {} button in
            the Stack Overflow question editor to format the snippet.



            The dataex command is especially needed when:




            • We need to be clear on whether a variable shown as text is really a string variable or a numeric variable with value labels.

            • You have date variables, which otherwise can be very awkward for people who might answer to handle.








          1. You can also use several other Stata commands and functions to simulate data:



            /* generate data in wide form */

            // discard data in memory
            clear

            // set the number of observations in dataset
            set obs 6

            // create a simple identifier
            generate id = _n

            // set the random-number seed for reproducibility
            set seed 12345

            // create a uniformly distributed random variable with values between 0 and 1
            generate var1 = runiform()

            // create a normally-distributed random variable with mean 20 and standard deviation 5
            generate var2 = rnormal(20, 5)

            // create random indicator variable 0/1
            generate var3 = rbinomial(1, 0.5)

            // see the results
            list, separator(0)

            +---------------------------------+
            | id var1 var2 var3 |
            |---------------------------------|
            1. | 1 .3576297 22.72038 0 |
            2. | 2 .4004426 20.00814 1 |
            3. | 3 .6893833 21.7884 1 |
            4. | 4 .5597356 29.39434 0 |
            5. | 5 .5744513 33.77373 0 |
            6. | 6 .2076905 16.93702 1 |
            +---------------------------------+




            // optionally create value labels for numeric variables such as id above

            label define idlabel 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 6 "six"
            label values id idlabel

            list id, separator(0)

            +-------+
            | id |
            |-------|
            1. | one |
            2. | two |
            3. | three |
            4. | four |
            5. | five |
            6. | six |
            +-------+




            // create (random) date variables
            clear
            set obs 6

            // a daily date numeric variable
            display date("25/11/2018", "DMY")
            21513

            generate var1 = 21513 + _n

            // a random date variable within a specified interval
            generate var2 = floor( ( mdy(12,31,2018) - mdy(1,1,2017)+1 ) * ///
            runiform() + mdy(1,1,2017) )

            // a half-yearly date numeric variable
            display yh(2018, 1)
            116

            generate var3 = 116 + _n

            // see the raw results
            list var1 var2 var3, separator(0)

            +----------------------+
            | var1 var2 var3 |
            |----------------------|
            1. | 21514 21004 117 |
            2. | 21515 21351 118 |
            3. | 21516 21529 119 |
            4. | 21517 21532 120 |
            5. | 21518 21104 121 |
            6. | 21519 21523 122 |
            +----------------------+




            // see formatted results

            format %tdDD/NN/CCYY var1
            format %tdDD/NN/CCYY var2
            format %th var3

            list var1 var2 var3, separator(0)

            +----------------------------------+
            | var1 var2 var3 |
            |----------------------------------|
            1. | 26/11/2018 04/07/2017 2018h2 |
            2. | 27/11/2018 16/06/2018 2019h1 |
            3. | 28/11/2018 11/12/2018 2019h2 |
            4. | 29/11/2018 14/12/2018 2020h1 |
            5. | 30/11/2018 12/10/2017 2020h2 |
            6. | 01/12/2018 05/12/2018 2021h1 |
            +----------------------------------+




            /* generate data in long form */

            clear
            set obs 9

            // create an identifier increasing every three observations
            egen id = seq(), block(3)

            // create a year variable within each id
            bysort id: generate year = 2015 + _n

            // create a normally distributed random variable within each id
            bysort id: generate var = rnormal()

            // calculate the sum of var within each id
            bysort id: generate sum_var = sum(var)

            // note here the use of the `bysort` prefix, which sorts data and repeats
            // the command for each group of observations

            // see the results by id
            list, sepby(id)

            +-----------------------------------+
            | id year var sum_var |
            |-----------------------------------|
            1. | 1 2016 .1973079 .1973079 |
            2. | 1 2017 1.610224 1.807532 |
            3. | 1 2018 -.8034225 1.004109 |
            |-----------------------------------|
            4. | 2 2016 1.096012 1.096012 |
            5. | 2 2017 -.4407027 .6553089 |
            6. | 2 2018 -1.011427 -.3561177 |
            |-----------------------------------|
            7. | 3 2016 1.019227 1.019227 |
            8. | 3 2017 1.871976 2.891204 |
            9. | 3 2018 .4235664 3.31477 |
            +-----------------------------------+



          • Stata does not produce the results I want



          You should always include the output from Stata in your question. This should be copied and pasted directly from the Stata console. Then select the pasted output and click {} in the question editor.



          Example:



          enter image description here



          . sysuse auto
          (1978 Automobile Data)

          . regress price mpg i.foreign

          Source | SS df MS Number of obs = 74
          -------------+---------------------------------- F(2, 71) = 14.07
          Model | 180261702 2 90130850.8 Prob > F = 0.0000
          Residual | 454803695 71 6405685.84 R-squared = 0.2838
          -------------+---------------------------------- Adj R-squared = 0.2637
          Total | 635065396 73 8699525.97 Root MSE = 2530.9

          ------------------------------------------------------------------------------
          price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          mpg | -294.1955 55.69172 -5.28 0.000 -405.2417 -183.1494
          |
          foreign |
          Foreign | 1767.292 700.158 2.52 0.014 371.2169 3163.368
          _cons | 11905.42 1158.634 10.28 0.000 9595.164 14215.67
          ------------------------------------------------------------------------------


          If Stata gets back to you with unexpected results, it is most likely because it was not programmed correctly. Stata simply does what the user instructs it to do.



          That said, users on Stack Overflow are not mind-readers. Providing an example of the desired output will greatly increase your chances of getting a helpful response.



          If it is a graph, you can post a picture that illustrates the outcome.
          Otherwise, a table with an adequate amount of expected results is best.
          This can be generated using an online table creator (such as Table Generator or ASCII Table Generator) and pasted in your question appropriately formatted in code blocks.



          Example:



          Country Population Mean_age Sex_Ratio GDP

          United States of America 3999 23 1.01 5000

          Afghanistan 544 19 0.97 457

          China 10000 27 0.96 3400



          +--------------------------+------------+----------+-----------+------+
          | | Population | Mean_Age | Sex_Ratio | GDP |
          | Country | | | | |
          +--------------------------+------------+----------+-----------+------+
          | United States of America | 3999 | 23 | 1.01 | 5000 |
          +--------------------------+------------+----------+-----------+------+
          | Afghanistan | 544 | 19 | 0.97 | 457 |
          +--------------------------+------------+----------+-----------+------+
          | China | 10000 | 27 | 0.96 | 3400 |
          +--------------------------+------------+----------+-----------+------+




          Examples of high quality questions



          The following questions can be considered as good examples of how you should structure your own Stata-related programming question:




          • Change order of text in variable names


          • Axis labels off-center when broken over multiple lines


          • Equivalent of Stata macros in Python


          • Getting coefplot to use estimate titles or outcome labels


          • One-liner for wrapping some text around members of a list of numbers





          Further advice



          It is crucial that you also read the following pages on Stack Overflow:




          1. Site tour

          2. How to ask

          3. Minimal, Complete, and Verifiable example


          Finally, you may also find helpful the information on the Statalist FAQ.





          share





















          • 1





            Excellent. Some small suggestions edited in.

            – Nick Cox
            Dec 3 '18 at 11:21
















          16














          Congratulations on seeking help for your Stata programming problem!



          Acknowledging that you need the input of others is the first step towards
          resolving it and maturing as a Stata user/programmer. Writing a good question is not a trivial task and requires experience. The latter comes with practice, which in turn requires perseverance.



          The secret in writing a high quality reproducible Stata question is the
          successful creation of a sandboxed example. This should use the shortest
          possible snippet of code and the minimal amount of example data required to
          replicate your problem.



          Stack Overflow's Stata volunteers are always happy to help but they do not spoon-feed. Lack of effort from your part will make it less likely that you get an answer and increase the chance that your Stata question will be closed and ultimately deleted.







          Tips for improving your question




          You can add this thread in your list of favorite questions by clicking the star ⭐ on the left hand side above. This will allow you to easily access these tips again from your profile's favorites tab.






          • I am new in Stata and I do not know its commands yet



          Before posting a question, please make sure that you have read Getting Started
          with Stata
          . You can access these introductory manuals by typing help gs from Stata's command prompt.



          There is simply no replacement for acquainting yourself with the basic concepts
          and syntax of Stata. This is particularly important as effective communication requires you to be able to speak the same language as the other more experienced Stata users on Stack Overflow.



          Do not forget that these users want to answer interesting programming problems,
          rather than act as tutors for teaching the basics. A more general forum such as Statalist or reddit may be more appropriate for problems relating to basic command usage.



          • I want to do [something] in Stata and I need the code



          Stack Overflow is geared towards solutions for specific programming problems. It is thus important that you explain as clearly as possible your situation and show us what you have tried.




          1. Start by clearly stating your question and telling us your Stata version and platform (Windows, Mac, Linux).


          2. Then give some context. This should focus on succinctly describing in words
            both your dataset and what you are trying to do.


          3. Next tell us how you attempted to accomplish your goal. This stage includes
            attaching the code that you used and the produced output. You should also link to any similar questions that you consulted on-line.


          4. Finally provide us with example data to run the code and reproduce your problem. These data should not be shown using a screenshot! See further down for help on this step.


          5. Always respond to comments requesting clarification from potential answerers.



          • I have this code in R/Python/SAS/SPSS and need the same in Stata



          Questions asking how to translate code from other languages to Stata's ado or
          mata languages are only valid if and only if there is a specific problem to be
          addressed in the attempted Stata code. Consequently, all the items listed in the previous and next sections are also relevant here.



          • This is the Stata code I used but it does not work





          1. Check for typos both in the script and in the code snippet provided in your question. The Stata interpreter is unforgiving: what might seem a straightforward programming task can thus quickly become an exercise in frustration. Stata volunteers on Stack Overflow are not here to hunt down typos arising from careless typing.



            Example:



            locla mymacro HELLO
            genrate var = 5




            local mymacro HELLO
            generate var = 5



          2. It is best to not abbreviate commands and avoid eliminating all
            white-space. This makes code harder to read and it is more error-prone. Other
            inexperienced users may also find it difficult to recognize even basic commands.



            Example:



            forval i=1/5 {
            loc mymacro`i' HELLO `i'
            g var`i'=`i'
            }




            forvalues i = 1 / 5 {
            local mymacro`i' HELLO `i'
            generate var`i' = `i'
            }



          3. Do not post your entire do file or code segment, but only the problematic part. In addition, make sure you properly format your code using code blocks. If your code snippet is more than five or six lines, break this into sections if necessary. Where the names of the variables you use are not self-explanatory, please provide comments.



            Example:



            sysuse auto

            des

            sum mpg

            gen mempg=r(mean)

            gen smpg=r(sum)

            reg mpg weight length





            /* load data */

            sysuse auto
            describe

            /* descriptive statistics */

            summarize mpg
            generate mean_mpg = r(mean)
            generate sum_education = r(sum)

            /* regression analysis */

            regress mpg weight length







          1. Make sure to check the help file for clues on why your code fails. Problems are often caused by invalid syntax. You can access the help files for commands and functions by typing help command/function name in Stata's command prompt.



            Example:



            list, separate(0)
            option separate() not allowed
            r(198);


            Here, typing help list reveals that this is not legal syntax. Indeed, the name of the option is separator(#) and not separate(#).








          1. Try to debug the code on your own before you ask here. Stata has useful debugging commands such as set trace (see help trace for more details), which shows how the code executes in real-time. Another useful debugging command is pause, which temporarily suspends execution of the code (help pause for more information).



            Example:



            set trace on

            forvalues i = 1 / 2 {
            display `i'
            }

            - forvalues i = 1 / 2 {
            - display `i'
            = display 1
            1
            - }
            - display `i'
            = display 2
            2
            - }


            Use these commands if your problem is not an obvious syntax error and include in your question selected relevant output, which is likely to shed more light on the causes of the problem. In addition, always include the full error code and message that Stata reports.







          1. If you are using a community-contributed command that you have
            downloaded from SSC, the Stata Journal or another source, it is important that you indicate this early on in your question. In this way, people who might answer do not waste time looking for it in external sites and can more quickly identify problems related specifically to this command.




          • Here is a screenshot of my Stata dataset



          Please do not upload screenshots!



          enter image description here



          Screenshots are not as helpful as you hope, primarily because they do not allow people who might answer to copy and paste data into their own Stata and try to reproduce the problem.



          In addition to the edit and input commands outlined in the question part of this thread, there are four other ways you can provide example data for your Stata question.





          1. If your data is confidential, you can demonstrate the problem using the sysuse command to load one of Stata's toy datasets:



            sysuse dir
            auto.dta bplong.dta brand2.dta bsexper3.dta census.dta
            auto2.dta bpwide.dta bsexper1.dta cancer.dta citytemp.dta
            autornd.dta brand1.dta bsexper2.dta cearep.dta citytemp4.dta

            sysuse census, clear

            list state region pop marriage in 1 / 5

            +---------------------------------------------+
            | state region pop marriage |
            |---------------------------------------------|
            1. | Alabama South 3,893,888 49,018 |
            2. | Alaska West 401,851 5,361 |
            3. | Arizona West 2,718,215 30,223 |
            4. | Arkansas South 2,286,435 26,513 |
            5. | California West 23,667,902 210,864 |
            +---------------------------------------------+



          2. Alternatively, you can directly download an online example dataset with the use command:



            clear
            use http://fmwww.bc.edu/ec-p/data/wooldridge/vote1

            list district voteA expendA shareA in 1 / 5

            +-------------------------------------+
            | district voteA expendA shareA |
            |-------------------------------------|
            1. | 7 68 328.3 97.41 |
            2. | 1 62 626.38 60.88 |
            3. | 2 73 99.61 97.01 |
            4. | 3 69 319.69 92.4 |
            5. | 3 75 159.22 72.61 |
            +-------------------------------------+



          3. For examples with your current dataset use the dataex command:



            dataex mpg price foreign in 1 / 5, elsewhere 

            ----------------------- copy starting from the next line -----------------------
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(mpg price) byte foreign
            22 4099 0
            17 4749 0
            22 3799 0
            20 4816 0
            15 7827 0
            end
            label values foreign origin
            label def origin 0 "Domestic", modify
            ------------------ copy up to and including the previous line ------------------


            In this case, the first five observations of variables mpg, price and foreign are requested. Note the option elsewhere, which is explained in the help file for dataex.



            Copy and paste everything between the end lines and use the {} button in
            the Stack Overflow question editor to format the snippet.



            The dataex command is especially needed when:




            • We need to be clear on whether a variable shown as text is really a string variable or a numeric variable with value labels.

            • You have date variables, which otherwise can be very awkward for people who might answer to handle.








          1. You can also use several other Stata commands and functions to simulate data:



            /* generate data in wide form */

            // discard data in memory
            clear

            // set the number of observations in dataset
            set obs 6

            // create a simple identifier
            generate id = _n

            // set the random-number seed for reproducibility
            set seed 12345

            // create a uniformly distributed random variable with values between 0 and 1
            generate var1 = runiform()

            // create a normally-distributed random variable with mean 20 and standard deviation 5
            generate var2 = rnormal(20, 5)

            // create random indicator variable 0/1
            generate var3 = rbinomial(1, 0.5)

            // see the results
            list, separator(0)

            +---------------------------------+
            | id var1 var2 var3 |
            |---------------------------------|
            1. | 1 .3576297 22.72038 0 |
            2. | 2 .4004426 20.00814 1 |
            3. | 3 .6893833 21.7884 1 |
            4. | 4 .5597356 29.39434 0 |
            5. | 5 .5744513 33.77373 0 |
            6. | 6 .2076905 16.93702 1 |
            +---------------------------------+




            // optionally create value labels for numeric variables such as id above

            label define idlabel 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 6 "six"
            label values id idlabel

            list id, separator(0)

            +-------+
            | id |
            |-------|
            1. | one |
            2. | two |
            3. | three |
            4. | four |
            5. | five |
            6. | six |
            +-------+




            // create (random) date variables
            clear
            set obs 6

            // a daily date numeric variable
            display date("25/11/2018", "DMY")
            21513

            generate var1 = 21513 + _n

            // a random date variable within a specified interval
            generate var2 = floor( ( mdy(12,31,2018) - mdy(1,1,2017)+1 ) * ///
            runiform() + mdy(1,1,2017) )

            // a half-yearly date numeric variable
            display yh(2018, 1)
            116

            generate var3 = 116 + _n

            // see the raw results
            list var1 var2 var3, separator(0)

            +----------------------+
            | var1 var2 var3 |
            |----------------------|
            1. | 21514 21004 117 |
            2. | 21515 21351 118 |
            3. | 21516 21529 119 |
            4. | 21517 21532 120 |
            5. | 21518 21104 121 |
            6. | 21519 21523 122 |
            +----------------------+




            // see formatted results

            format %tdDD/NN/CCYY var1
            format %tdDD/NN/CCYY var2
            format %th var3

            list var1 var2 var3, separator(0)

            +----------------------------------+
            | var1 var2 var3 |
            |----------------------------------|
            1. | 26/11/2018 04/07/2017 2018h2 |
            2. | 27/11/2018 16/06/2018 2019h1 |
            3. | 28/11/2018 11/12/2018 2019h2 |
            4. | 29/11/2018 14/12/2018 2020h1 |
            5. | 30/11/2018 12/10/2017 2020h2 |
            6. | 01/12/2018 05/12/2018 2021h1 |
            +----------------------------------+




            /* generate data in long form */

            clear
            set obs 9

            // create an identifier increasing every three observations
            egen id = seq(), block(3)

            // create a year variable within each id
            bysort id: generate year = 2015 + _n

            // create a normally distributed random variable within each id
            bysort id: generate var = rnormal()

            // calculate the sum of var within each id
            bysort id: generate sum_var = sum(var)

            // note here the use of the `bysort` prefix, which sorts data and repeats
            // the command for each group of observations

            // see the results by id
            list, sepby(id)

            +-----------------------------------+
            | id year var sum_var |
            |-----------------------------------|
            1. | 1 2016 .1973079 .1973079 |
            2. | 1 2017 1.610224 1.807532 |
            3. | 1 2018 -.8034225 1.004109 |
            |-----------------------------------|
            4. | 2 2016 1.096012 1.096012 |
            5. | 2 2017 -.4407027 .6553089 |
            6. | 2 2018 -1.011427 -.3561177 |
            |-----------------------------------|
            7. | 3 2016 1.019227 1.019227 |
            8. | 3 2017 1.871976 2.891204 |
            9. | 3 2018 .4235664 3.31477 |
            +-----------------------------------+



          • Stata does not produce the results I want



          You should always include the output from Stata in your question. This should be copied and pasted directly from the Stata console. Then select the pasted output and click {} in the question editor.



          Example:



          enter image description here



          . sysuse auto
          (1978 Automobile Data)

          . regress price mpg i.foreign

          Source | SS df MS Number of obs = 74
          -------------+---------------------------------- F(2, 71) = 14.07
          Model | 180261702 2 90130850.8 Prob > F = 0.0000
          Residual | 454803695 71 6405685.84 R-squared = 0.2838
          -------------+---------------------------------- Adj R-squared = 0.2637
          Total | 635065396 73 8699525.97 Root MSE = 2530.9

          ------------------------------------------------------------------------------
          price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          mpg | -294.1955 55.69172 -5.28 0.000 -405.2417 -183.1494
          |
          foreign |
          Foreign | 1767.292 700.158 2.52 0.014 371.2169 3163.368
          _cons | 11905.42 1158.634 10.28 0.000 9595.164 14215.67
          ------------------------------------------------------------------------------


          If Stata gets back to you with unexpected results, it is most likely because it was not programmed correctly. Stata simply does what the user instructs it to do.



          That said, users on Stack Overflow are not mind-readers. Providing an example of the desired output will greatly increase your chances of getting a helpful response.



          If it is a graph, you can post a picture that illustrates the outcome.
          Otherwise, a table with an adequate amount of expected results is best.
          This can be generated using an online table creator (such as Table Generator or ASCII Table Generator) and pasted in your question appropriately formatted in code blocks.



          Example:



          Country Population Mean_age Sex_Ratio GDP

          United States of America 3999 23 1.01 5000

          Afghanistan 544 19 0.97 457

          China 10000 27 0.96 3400



          +--------------------------+------------+----------+-----------+------+
          | | Population | Mean_Age | Sex_Ratio | GDP |
          | Country | | | | |
          +--------------------------+------------+----------+-----------+------+
          | United States of America | 3999 | 23 | 1.01 | 5000 |
          +--------------------------+------------+----------+-----------+------+
          | Afghanistan | 544 | 19 | 0.97 | 457 |
          +--------------------------+------------+----------+-----------+------+
          | China | 10000 | 27 | 0.96 | 3400 |
          +--------------------------+------------+----------+-----------+------+




          Examples of high quality questions



          The following questions can be considered as good examples of how you should structure your own Stata-related programming question:




          • Change order of text in variable names


          • Axis labels off-center when broken over multiple lines


          • Equivalent of Stata macros in Python


          • Getting coefplot to use estimate titles or outcome labels


          • One-liner for wrapping some text around members of a list of numbers





          Further advice



          It is crucial that you also read the following pages on Stack Overflow:




          1. Site tour

          2. How to ask

          3. Minimal, Complete, and Verifiable example


          Finally, you may also find helpful the information on the Statalist FAQ.





          share





















          • 1





            Excellent. Some small suggestions edited in.

            – Nick Cox
            Dec 3 '18 at 11:21














          16












          16








          16







          Congratulations on seeking help for your Stata programming problem!



          Acknowledging that you need the input of others is the first step towards
          resolving it and maturing as a Stata user/programmer. Writing a good question is not a trivial task and requires experience. The latter comes with practice, which in turn requires perseverance.



          The secret in writing a high quality reproducible Stata question is the
          successful creation of a sandboxed example. This should use the shortest
          possible snippet of code and the minimal amount of example data required to
          replicate your problem.



          Stack Overflow's Stata volunteers are always happy to help but they do not spoon-feed. Lack of effort from your part will make it less likely that you get an answer and increase the chance that your Stata question will be closed and ultimately deleted.







          Tips for improving your question




          You can add this thread in your list of favorite questions by clicking the star ⭐ on the left hand side above. This will allow you to easily access these tips again from your profile's favorites tab.






          • I am new in Stata and I do not know its commands yet



          Before posting a question, please make sure that you have read Getting Started
          with Stata
          . You can access these introductory manuals by typing help gs from Stata's command prompt.



          There is simply no replacement for acquainting yourself with the basic concepts
          and syntax of Stata. This is particularly important as effective communication requires you to be able to speak the same language as the other more experienced Stata users on Stack Overflow.



          Do not forget that these users want to answer interesting programming problems,
          rather than act as tutors for teaching the basics. A more general forum such as Statalist or reddit may be more appropriate for problems relating to basic command usage.



          • I want to do [something] in Stata and I need the code



          Stack Overflow is geared towards solutions for specific programming problems. It is thus important that you explain as clearly as possible your situation and show us what you have tried.




          1. Start by clearly stating your question and telling us your Stata version and platform (Windows, Mac, Linux).


          2. Then give some context. This should focus on succinctly describing in words
            both your dataset and what you are trying to do.


          3. Next tell us how you attempted to accomplish your goal. This stage includes
            attaching the code that you used and the produced output. You should also link to any similar questions that you consulted on-line.


          4. Finally provide us with example data to run the code and reproduce your problem. These data should not be shown using a screenshot! See further down for help on this step.


          5. Always respond to comments requesting clarification from potential answerers.



          • I have this code in R/Python/SAS/SPSS and need the same in Stata



          Questions asking how to translate code from other languages to Stata's ado or
          mata languages are only valid if and only if there is a specific problem to be
          addressed in the attempted Stata code. Consequently, all the items listed in the previous and next sections are also relevant here.



          • This is the Stata code I used but it does not work





          1. Check for typos both in the script and in the code snippet provided in your question. The Stata interpreter is unforgiving: what might seem a straightforward programming task can thus quickly become an exercise in frustration. Stata volunteers on Stack Overflow are not here to hunt down typos arising from careless typing.



            Example:



            locla mymacro HELLO
            genrate var = 5




            local mymacro HELLO
            generate var = 5



          2. It is best to not abbreviate commands and avoid eliminating all
            white-space. This makes code harder to read and it is more error-prone. Other
            inexperienced users may also find it difficult to recognize even basic commands.



            Example:



            forval i=1/5 {
            loc mymacro`i' HELLO `i'
            g var`i'=`i'
            }




            forvalues i = 1 / 5 {
            local mymacro`i' HELLO `i'
            generate var`i' = `i'
            }



          3. Do not post your entire do file or code segment, but only the problematic part. In addition, make sure you properly format your code using code blocks. If your code snippet is more than five or six lines, break this into sections if necessary. Where the names of the variables you use are not self-explanatory, please provide comments.



            Example:



            sysuse auto

            des

            sum mpg

            gen mempg=r(mean)

            gen smpg=r(sum)

            reg mpg weight length





            /* load data */

            sysuse auto
            describe

            /* descriptive statistics */

            summarize mpg
            generate mean_mpg = r(mean)
            generate sum_education = r(sum)

            /* regression analysis */

            regress mpg weight length







          1. Make sure to check the help file for clues on why your code fails. Problems are often caused by invalid syntax. You can access the help files for commands and functions by typing help command/function name in Stata's command prompt.



            Example:



            list, separate(0)
            option separate() not allowed
            r(198);


            Here, typing help list reveals that this is not legal syntax. Indeed, the name of the option is separator(#) and not separate(#).








          1. Try to debug the code on your own before you ask here. Stata has useful debugging commands such as set trace (see help trace for more details), which shows how the code executes in real-time. Another useful debugging command is pause, which temporarily suspends execution of the code (help pause for more information).



            Example:



            set trace on

            forvalues i = 1 / 2 {
            display `i'
            }

            - forvalues i = 1 / 2 {
            - display `i'
            = display 1
            1
            - }
            - display `i'
            = display 2
            2
            - }


            Use these commands if your problem is not an obvious syntax error and include in your question selected relevant output, which is likely to shed more light on the causes of the problem. In addition, always include the full error code and message that Stata reports.







          1. If you are using a community-contributed command that you have
            downloaded from SSC, the Stata Journal or another source, it is important that you indicate this early on in your question. In this way, people who might answer do not waste time looking for it in external sites and can more quickly identify problems related specifically to this command.




          • Here is a screenshot of my Stata dataset



          Please do not upload screenshots!



          enter image description here



          Screenshots are not as helpful as you hope, primarily because they do not allow people who might answer to copy and paste data into their own Stata and try to reproduce the problem.



          In addition to the edit and input commands outlined in the question part of this thread, there are four other ways you can provide example data for your Stata question.





          1. If your data is confidential, you can demonstrate the problem using the sysuse command to load one of Stata's toy datasets:



            sysuse dir
            auto.dta bplong.dta brand2.dta bsexper3.dta census.dta
            auto2.dta bpwide.dta bsexper1.dta cancer.dta citytemp.dta
            autornd.dta brand1.dta bsexper2.dta cearep.dta citytemp4.dta

            sysuse census, clear

            list state region pop marriage in 1 / 5

            +---------------------------------------------+
            | state region pop marriage |
            |---------------------------------------------|
            1. | Alabama South 3,893,888 49,018 |
            2. | Alaska West 401,851 5,361 |
            3. | Arizona West 2,718,215 30,223 |
            4. | Arkansas South 2,286,435 26,513 |
            5. | California West 23,667,902 210,864 |
            +---------------------------------------------+



          2. Alternatively, you can directly download an online example dataset with the use command:



            clear
            use http://fmwww.bc.edu/ec-p/data/wooldridge/vote1

            list district voteA expendA shareA in 1 / 5

            +-------------------------------------+
            | district voteA expendA shareA |
            |-------------------------------------|
            1. | 7 68 328.3 97.41 |
            2. | 1 62 626.38 60.88 |
            3. | 2 73 99.61 97.01 |
            4. | 3 69 319.69 92.4 |
            5. | 3 75 159.22 72.61 |
            +-------------------------------------+



          3. For examples with your current dataset use the dataex command:



            dataex mpg price foreign in 1 / 5, elsewhere 

            ----------------------- copy starting from the next line -----------------------
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(mpg price) byte foreign
            22 4099 0
            17 4749 0
            22 3799 0
            20 4816 0
            15 7827 0
            end
            label values foreign origin
            label def origin 0 "Domestic", modify
            ------------------ copy up to and including the previous line ------------------


            In this case, the first five observations of variables mpg, price and foreign are requested. Note the option elsewhere, which is explained in the help file for dataex.



            Copy and paste everything between the end lines and use the {} button in
            the Stack Overflow question editor to format the snippet.



            The dataex command is especially needed when:




            • We need to be clear on whether a variable shown as text is really a string variable or a numeric variable with value labels.

            • You have date variables, which otherwise can be very awkward for people who might answer to handle.








          1. You can also use several other Stata commands and functions to simulate data:



            /* generate data in wide form */

            // discard data in memory
            clear

            // set the number of observations in dataset
            set obs 6

            // create a simple identifier
            generate id = _n

            // set the random-number seed for reproducibility
            set seed 12345

            // create a uniformly distributed random variable with values between 0 and 1
            generate var1 = runiform()

            // create a normally-distributed random variable with mean 20 and standard deviation 5
            generate var2 = rnormal(20, 5)

            // create random indicator variable 0/1
            generate var3 = rbinomial(1, 0.5)

            // see the results
            list, separator(0)

            +---------------------------------+
            | id var1 var2 var3 |
            |---------------------------------|
            1. | 1 .3576297 22.72038 0 |
            2. | 2 .4004426 20.00814 1 |
            3. | 3 .6893833 21.7884 1 |
            4. | 4 .5597356 29.39434 0 |
            5. | 5 .5744513 33.77373 0 |
            6. | 6 .2076905 16.93702 1 |
            +---------------------------------+




            // optionally create value labels for numeric variables such as id above

            label define idlabel 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 6 "six"
            label values id idlabel

            list id, separator(0)

            +-------+
            | id |
            |-------|
            1. | one |
            2. | two |
            3. | three |
            4. | four |
            5. | five |
            6. | six |
            +-------+




            // create (random) date variables
            clear
            set obs 6

            // a daily date numeric variable
            display date("25/11/2018", "DMY")
            21513

            generate var1 = 21513 + _n

            // a random date variable within a specified interval
            generate var2 = floor( ( mdy(12,31,2018) - mdy(1,1,2017)+1 ) * ///
            runiform() + mdy(1,1,2017) )

            // a half-yearly date numeric variable
            display yh(2018, 1)
            116

            generate var3 = 116 + _n

            // see the raw results
            list var1 var2 var3, separator(0)

            +----------------------+
            | var1 var2 var3 |
            |----------------------|
            1. | 21514 21004 117 |
            2. | 21515 21351 118 |
            3. | 21516 21529 119 |
            4. | 21517 21532 120 |
            5. | 21518 21104 121 |
            6. | 21519 21523 122 |
            +----------------------+




            // see formatted results

            format %tdDD/NN/CCYY var1
            format %tdDD/NN/CCYY var2
            format %th var3

            list var1 var2 var3, separator(0)

            +----------------------------------+
            | var1 var2 var3 |
            |----------------------------------|
            1. | 26/11/2018 04/07/2017 2018h2 |
            2. | 27/11/2018 16/06/2018 2019h1 |
            3. | 28/11/2018 11/12/2018 2019h2 |
            4. | 29/11/2018 14/12/2018 2020h1 |
            5. | 30/11/2018 12/10/2017 2020h2 |
            6. | 01/12/2018 05/12/2018 2021h1 |
            +----------------------------------+




            /* generate data in long form */

            clear
            set obs 9

            // create an identifier increasing every three observations
            egen id = seq(), block(3)

            // create a year variable within each id
            bysort id: generate year = 2015 + _n

            // create a normally distributed random variable within each id
            bysort id: generate var = rnormal()

            // calculate the sum of var within each id
            bysort id: generate sum_var = sum(var)

            // note here the use of the `bysort` prefix, which sorts data and repeats
            // the command for each group of observations

            // see the results by id
            list, sepby(id)

            +-----------------------------------+
            | id year var sum_var |
            |-----------------------------------|
            1. | 1 2016 .1973079 .1973079 |
            2. | 1 2017 1.610224 1.807532 |
            3. | 1 2018 -.8034225 1.004109 |
            |-----------------------------------|
            4. | 2 2016 1.096012 1.096012 |
            5. | 2 2017 -.4407027 .6553089 |
            6. | 2 2018 -1.011427 -.3561177 |
            |-----------------------------------|
            7. | 3 2016 1.019227 1.019227 |
            8. | 3 2017 1.871976 2.891204 |
            9. | 3 2018 .4235664 3.31477 |
            +-----------------------------------+



          • Stata does not produce the results I want



          You should always include the output from Stata in your question. This should be copied and pasted directly from the Stata console. Then select the pasted output and click {} in the question editor.



          Example:



          enter image description here



          . sysuse auto
          (1978 Automobile Data)

          . regress price mpg i.foreign

          Source | SS df MS Number of obs = 74
          -------------+---------------------------------- F(2, 71) = 14.07
          Model | 180261702 2 90130850.8 Prob > F = 0.0000
          Residual | 454803695 71 6405685.84 R-squared = 0.2838
          -------------+---------------------------------- Adj R-squared = 0.2637
          Total | 635065396 73 8699525.97 Root MSE = 2530.9

          ------------------------------------------------------------------------------
          price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          mpg | -294.1955 55.69172 -5.28 0.000 -405.2417 -183.1494
          |
          foreign |
          Foreign | 1767.292 700.158 2.52 0.014 371.2169 3163.368
          _cons | 11905.42 1158.634 10.28 0.000 9595.164 14215.67
          ------------------------------------------------------------------------------


          If Stata gets back to you with unexpected results, it is most likely because it was not programmed correctly. Stata simply does what the user instructs it to do.



          That said, users on Stack Overflow are not mind-readers. Providing an example of the desired output will greatly increase your chances of getting a helpful response.



          If it is a graph, you can post a picture that illustrates the outcome.
          Otherwise, a table with an adequate amount of expected results is best.
          This can be generated using an online table creator (such as Table Generator or ASCII Table Generator) and pasted in your question appropriately formatted in code blocks.



          Example:



          Country Population Mean_age Sex_Ratio GDP

          United States of America 3999 23 1.01 5000

          Afghanistan 544 19 0.97 457

          China 10000 27 0.96 3400



          +--------------------------+------------+----------+-----------+------+
          | | Population | Mean_Age | Sex_Ratio | GDP |
          | Country | | | | |
          +--------------------------+------------+----------+-----------+------+
          | United States of America | 3999 | 23 | 1.01 | 5000 |
          +--------------------------+------------+----------+-----------+------+
          | Afghanistan | 544 | 19 | 0.97 | 457 |
          +--------------------------+------------+----------+-----------+------+
          | China | 10000 | 27 | 0.96 | 3400 |
          +--------------------------+------------+----------+-----------+------+




          Examples of high quality questions



          The following questions can be considered as good examples of how you should structure your own Stata-related programming question:




          • Change order of text in variable names


          • Axis labels off-center when broken over multiple lines


          • Equivalent of Stata macros in Python


          • Getting coefplot to use estimate titles or outcome labels


          • One-liner for wrapping some text around members of a list of numbers





          Further advice



          It is crucial that you also read the following pages on Stack Overflow:




          1. Site tour

          2. How to ask

          3. Minimal, Complete, and Verifiable example


          Finally, you may also find helpful the information on the Statalist FAQ.





          share















          Congratulations on seeking help for your Stata programming problem!



          Acknowledging that you need the input of others is the first step towards
          resolving it and maturing as a Stata user/programmer. Writing a good question is not a trivial task and requires experience. The latter comes with practice, which in turn requires perseverance.



          The secret in writing a high quality reproducible Stata question is the
          successful creation of a sandboxed example. This should use the shortest
          possible snippet of code and the minimal amount of example data required to
          replicate your problem.



          Stack Overflow's Stata volunteers are always happy to help but they do not spoon-feed. Lack of effort from your part will make it less likely that you get an answer and increase the chance that your Stata question will be closed and ultimately deleted.







          Tips for improving your question




          You can add this thread in your list of favorite questions by clicking the star ⭐ on the left hand side above. This will allow you to easily access these tips again from your profile's favorites tab.






          • I am new in Stata and I do not know its commands yet



          Before posting a question, please make sure that you have read Getting Started
          with Stata
          . You can access these introductory manuals by typing help gs from Stata's command prompt.



          There is simply no replacement for acquainting yourself with the basic concepts
          and syntax of Stata. This is particularly important as effective communication requires you to be able to speak the same language as the other more experienced Stata users on Stack Overflow.



          Do not forget that these users want to answer interesting programming problems,
          rather than act as tutors for teaching the basics. A more general forum such as Statalist or reddit may be more appropriate for problems relating to basic command usage.



          • I want to do [something] in Stata and I need the code



          Stack Overflow is geared towards solutions for specific programming problems. It is thus important that you explain as clearly as possible your situation and show us what you have tried.




          1. Start by clearly stating your question and telling us your Stata version and platform (Windows, Mac, Linux).


          2. Then give some context. This should focus on succinctly describing in words
            both your dataset and what you are trying to do.


          3. Next tell us how you attempted to accomplish your goal. This stage includes
            attaching the code that you used and the produced output. You should also link to any similar questions that you consulted on-line.


          4. Finally provide us with example data to run the code and reproduce your problem. These data should not be shown using a screenshot! See further down for help on this step.


          5. Always respond to comments requesting clarification from potential answerers.



          • I have this code in R/Python/SAS/SPSS and need the same in Stata



          Questions asking how to translate code from other languages to Stata's ado or
          mata languages are only valid if and only if there is a specific problem to be
          addressed in the attempted Stata code. Consequently, all the items listed in the previous and next sections are also relevant here.



          • This is the Stata code I used but it does not work





          1. Check for typos both in the script and in the code snippet provided in your question. The Stata interpreter is unforgiving: what might seem a straightforward programming task can thus quickly become an exercise in frustration. Stata volunteers on Stack Overflow are not here to hunt down typos arising from careless typing.



            Example:



            locla mymacro HELLO
            genrate var = 5




            local mymacro HELLO
            generate var = 5



          2. It is best to not abbreviate commands and avoid eliminating all
            white-space. This makes code harder to read and it is more error-prone. Other
            inexperienced users may also find it difficult to recognize even basic commands.



            Example:



            forval i=1/5 {
            loc mymacro`i' HELLO `i'
            g var`i'=`i'
            }




            forvalues i = 1 / 5 {
            local mymacro`i' HELLO `i'
            generate var`i' = `i'
            }



          3. Do not post your entire do file or code segment, but only the problematic part. In addition, make sure you properly format your code using code blocks. If your code snippet is more than five or six lines, break this into sections if necessary. Where the names of the variables you use are not self-explanatory, please provide comments.



            Example:



            sysuse auto

            des

            sum mpg

            gen mempg=r(mean)

            gen smpg=r(sum)

            reg mpg weight length





            /* load data */

            sysuse auto
            describe

            /* descriptive statistics */

            summarize mpg
            generate mean_mpg = r(mean)
            generate sum_education = r(sum)

            /* regression analysis */

            regress mpg weight length







          1. Make sure to check the help file for clues on why your code fails. Problems are often caused by invalid syntax. You can access the help files for commands and functions by typing help command/function name in Stata's command prompt.



            Example:



            list, separate(0)
            option separate() not allowed
            r(198);


            Here, typing help list reveals that this is not legal syntax. Indeed, the name of the option is separator(#) and not separate(#).








          1. Try to debug the code on your own before you ask here. Stata has useful debugging commands such as set trace (see help trace for more details), which shows how the code executes in real-time. Another useful debugging command is pause, which temporarily suspends execution of the code (help pause for more information).



            Example:



            set trace on

            forvalues i = 1 / 2 {
            display `i'
            }

            - forvalues i = 1 / 2 {
            - display `i'
            = display 1
            1
            - }
            - display `i'
            = display 2
            2
            - }


            Use these commands if your problem is not an obvious syntax error and include in your question selected relevant output, which is likely to shed more light on the causes of the problem. In addition, always include the full error code and message that Stata reports.







          1. If you are using a community-contributed command that you have
            downloaded from SSC, the Stata Journal or another source, it is important that you indicate this early on in your question. In this way, people who might answer do not waste time looking for it in external sites and can more quickly identify problems related specifically to this command.




          • Here is a screenshot of my Stata dataset



          Please do not upload screenshots!



          enter image description here



          Screenshots are not as helpful as you hope, primarily because they do not allow people who might answer to copy and paste data into their own Stata and try to reproduce the problem.



          In addition to the edit and input commands outlined in the question part of this thread, there are four other ways you can provide example data for your Stata question.





          1. If your data is confidential, you can demonstrate the problem using the sysuse command to load one of Stata's toy datasets:



            sysuse dir
            auto.dta bplong.dta brand2.dta bsexper3.dta census.dta
            auto2.dta bpwide.dta bsexper1.dta cancer.dta citytemp.dta
            autornd.dta brand1.dta bsexper2.dta cearep.dta citytemp4.dta

            sysuse census, clear

            list state region pop marriage in 1 / 5

            +---------------------------------------------+
            | state region pop marriage |
            |---------------------------------------------|
            1. | Alabama South 3,893,888 49,018 |
            2. | Alaska West 401,851 5,361 |
            3. | Arizona West 2,718,215 30,223 |
            4. | Arkansas South 2,286,435 26,513 |
            5. | California West 23,667,902 210,864 |
            +---------------------------------------------+



          2. Alternatively, you can directly download an online example dataset with the use command:



            clear
            use http://fmwww.bc.edu/ec-p/data/wooldridge/vote1

            list district voteA expendA shareA in 1 / 5

            +-------------------------------------+
            | district voteA expendA shareA |
            |-------------------------------------|
            1. | 7 68 328.3 97.41 |
            2. | 1 62 626.38 60.88 |
            3. | 2 73 99.61 97.01 |
            4. | 3 69 319.69 92.4 |
            5. | 3 75 159.22 72.61 |
            +-------------------------------------+



          3. For examples with your current dataset use the dataex command:



            dataex mpg price foreign in 1 / 5, elsewhere 

            ----------------------- copy starting from the next line -----------------------
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(mpg price) byte foreign
            22 4099 0
            17 4749 0
            22 3799 0
            20 4816 0
            15 7827 0
            end
            label values foreign origin
            label def origin 0 "Domestic", modify
            ------------------ copy up to and including the previous line ------------------


            In this case, the first five observations of variables mpg, price and foreign are requested. Note the option elsewhere, which is explained in the help file for dataex.



            Copy and paste everything between the end lines and use the {} button in
            the Stack Overflow question editor to format the snippet.



            The dataex command is especially needed when:




            • We need to be clear on whether a variable shown as text is really a string variable or a numeric variable with value labels.

            • You have date variables, which otherwise can be very awkward for people who might answer to handle.








          1. You can also use several other Stata commands and functions to simulate data:



            /* generate data in wide form */

            // discard data in memory
            clear

            // set the number of observations in dataset
            set obs 6

            // create a simple identifier
            generate id = _n

            // set the random-number seed for reproducibility
            set seed 12345

            // create a uniformly distributed random variable with values between 0 and 1
            generate var1 = runiform()

            // create a normally-distributed random variable with mean 20 and standard deviation 5
            generate var2 = rnormal(20, 5)

            // create random indicator variable 0/1
            generate var3 = rbinomial(1, 0.5)

            // see the results
            list, separator(0)

            +---------------------------------+
            | id var1 var2 var3 |
            |---------------------------------|
            1. | 1 .3576297 22.72038 0 |
            2. | 2 .4004426 20.00814 1 |
            3. | 3 .6893833 21.7884 1 |
            4. | 4 .5597356 29.39434 0 |
            5. | 5 .5744513 33.77373 0 |
            6. | 6 .2076905 16.93702 1 |
            +---------------------------------+




            // optionally create value labels for numeric variables such as id above

            label define idlabel 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 6 "six"
            label values id idlabel

            list id, separator(0)

            +-------+
            | id |
            |-------|
            1. | one |
            2. | two |
            3. | three |
            4. | four |
            5. | five |
            6. | six |
            +-------+




            // create (random) date variables
            clear
            set obs 6

            // a daily date numeric variable
            display date("25/11/2018", "DMY")
            21513

            generate var1 = 21513 + _n

            // a random date variable within a specified interval
            generate var2 = floor( ( mdy(12,31,2018) - mdy(1,1,2017)+1 ) * ///
            runiform() + mdy(1,1,2017) )

            // a half-yearly date numeric variable
            display yh(2018, 1)
            116

            generate var3 = 116 + _n

            // see the raw results
            list var1 var2 var3, separator(0)

            +----------------------+
            | var1 var2 var3 |
            |----------------------|
            1. | 21514 21004 117 |
            2. | 21515 21351 118 |
            3. | 21516 21529 119 |
            4. | 21517 21532 120 |
            5. | 21518 21104 121 |
            6. | 21519 21523 122 |
            +----------------------+




            // see formatted results

            format %tdDD/NN/CCYY var1
            format %tdDD/NN/CCYY var2
            format %th var3

            list var1 var2 var3, separator(0)

            +----------------------------------+
            | var1 var2 var3 |
            |----------------------------------|
            1. | 26/11/2018 04/07/2017 2018h2 |
            2. | 27/11/2018 16/06/2018 2019h1 |
            3. | 28/11/2018 11/12/2018 2019h2 |
            4. | 29/11/2018 14/12/2018 2020h1 |
            5. | 30/11/2018 12/10/2017 2020h2 |
            6. | 01/12/2018 05/12/2018 2021h1 |
            +----------------------------------+




            /* generate data in long form */

            clear
            set obs 9

            // create an identifier increasing every three observations
            egen id = seq(), block(3)

            // create a year variable within each id
            bysort id: generate year = 2015 + _n

            // create a normally distributed random variable within each id
            bysort id: generate var = rnormal()

            // calculate the sum of var within each id
            bysort id: generate sum_var = sum(var)

            // note here the use of the `bysort` prefix, which sorts data and repeats
            // the command for each group of observations

            // see the results by id
            list, sepby(id)

            +-----------------------------------+
            | id year var sum_var |
            |-----------------------------------|
            1. | 1 2016 .1973079 .1973079 |
            2. | 1 2017 1.610224 1.807532 |
            3. | 1 2018 -.8034225 1.004109 |
            |-----------------------------------|
            4. | 2 2016 1.096012 1.096012 |
            5. | 2 2017 -.4407027 .6553089 |
            6. | 2 2018 -1.011427 -.3561177 |
            |-----------------------------------|
            7. | 3 2016 1.019227 1.019227 |
            8. | 3 2017 1.871976 2.891204 |
            9. | 3 2018 .4235664 3.31477 |
            +-----------------------------------+



          • Stata does not produce the results I want



          You should always include the output from Stata in your question. This should be copied and pasted directly from the Stata console. Then select the pasted output and click {} in the question editor.



          Example:



          enter image description here



          . sysuse auto
          (1978 Automobile Data)

          . regress price mpg i.foreign

          Source | SS df MS Number of obs = 74
          -------------+---------------------------------- F(2, 71) = 14.07
          Model | 180261702 2 90130850.8 Prob > F = 0.0000
          Residual | 454803695 71 6405685.84 R-squared = 0.2838
          -------------+---------------------------------- Adj R-squared = 0.2637
          Total | 635065396 73 8699525.97 Root MSE = 2530.9

          ------------------------------------------------------------------------------
          price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          mpg | -294.1955 55.69172 -5.28 0.000 -405.2417 -183.1494
          |
          foreign |
          Foreign | 1767.292 700.158 2.52 0.014 371.2169 3163.368
          _cons | 11905.42 1158.634 10.28 0.000 9595.164 14215.67
          ------------------------------------------------------------------------------


          If Stata gets back to you with unexpected results, it is most likely because it was not programmed correctly. Stata simply does what the user instructs it to do.



          That said, users on Stack Overflow are not mind-readers. Providing an example of the desired output will greatly increase your chances of getting a helpful response.



          If it is a graph, you can post a picture that illustrates the outcome.
          Otherwise, a table with an adequate amount of expected results is best.
          This can be generated using an online table creator (such as Table Generator or ASCII Table Generator) and pasted in your question appropriately formatted in code blocks.



          Example:



          Country Population Mean_age Sex_Ratio GDP

          United States of America 3999 23 1.01 5000

          Afghanistan 544 19 0.97 457

          China 10000 27 0.96 3400



          +--------------------------+------------+----------+-----------+------+
          | | Population | Mean_Age | Sex_Ratio | GDP |
          | Country | | | | |
          +--------------------------+------------+----------+-----------+------+
          | United States of America | 3999 | 23 | 1.01 | 5000 |
          +--------------------------+------------+----------+-----------+------+
          | Afghanistan | 544 | 19 | 0.97 | 457 |
          +--------------------------+------------+----------+-----------+------+
          | China | 10000 | 27 | 0.96 | 3400 |
          +--------------------------+------------+----------+-----------+------+




          Examples of high quality questions



          The following questions can be considered as good examples of how you should structure your own Stata-related programming question:




          • Change order of text in variable names


          • Axis labels off-center when broken over multiple lines


          • Equivalent of Stata macros in Python


          • Getting coefplot to use estimate titles or outcome labels


          • One-liner for wrapping some text around members of a list of numbers





          Further advice



          It is crucial that you also read the following pages on Stack Overflow:




          1. Site tour

          2. How to ask

          3. Minimal, Complete, and Verifiable example


          Finally, you may also find helpful the information on the Statalist FAQ.






          share













          share


          share








          edited Dec 3 '18 at 11:54

























          answered Nov 22 '18 at 21:41









          Pearly SpencerPearly Spencer

          10.6k1520




          10.6k1520








          • 1





            Excellent. Some small suggestions edited in.

            – Nick Cox
            Dec 3 '18 at 11:21














          • 1





            Excellent. Some small suggestions edited in.

            – Nick Cox
            Dec 3 '18 at 11:21








          1




          1





          Excellent. Some small suggestions edited in.

          – Nick Cox
          Dec 3 '18 at 11:21





          Excellent. Some small suggestions edited in.

          – Nick Cox
          Dec 3 '18 at 11:21



          Popular posts from this blog

          Create new schema in PostgreSQL using DBeaver

          Deepest pit of an array with Javascript: test on Codility

          Costa Masnaga