Create loop to calculate percentage under conditions and fill in results

up vote
0
down vote

favorite

I have tried to find some information, but did not really find what I was looking for.

This is my dataframe full.data (excerpt)

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                    0

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                    0

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                    0

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                    0

Austria 2015       Total 6.670398e+07   AT                    0

Austria 2015   Regulated 3.800309e+07   AT                    0

Austria 2015 Unregulated 2.870090e+07   AT                    0

Austria 2015         ETS 2.949206e+07   AT                    0

Austria 2016       Total 6.740209e+07   AT                    0

Austria 2016   Regulated 3.765177e+07   AT                    0

Austria 2016 Unregulated 2.975031e+07   AT                    0

Austria 2016         ETS 2.900012e+07   AT                    0

Austria 2017         ETS 3.055523e+07   AT                    0

Belgium 1990       Total 1.204844e+08   BE                    0

Belgium 1990   Regulated 7.861411e+07   BE                    0

Belgium 1990 Unregulated 4.187029e+07   BE                    0

Belgium 1991       Total 1.235447e+08   BE                    0

Belgium 1991   Regulated 7.981152e+07   BE                    0

Belgium 1991 Unregulated 4.373319e+07   BE                    0

Belgium 1992       Total 1.226578e+08   BE                    0

Belgium 1992   Regulated 7.828396e+07   BE                    0

Belgium 1992 Unregulated 4.437385e+07   BE                    0

Belgium 1993       Total 1.215573e+08   BE                    0

Belgium 1993   Regulated 7.675229e+07   BE                    0

Belgium 1993 Unregulated 4.480499e+07   BE                    0

Belgium 1994       Total 1.249382e+08   BE                    0

Belgium 1994   Regulated 8.064799e+07   BE                    0

Belgium 1994 Unregulated 4.429020e+07   BE                    0

....

I'm trying to fill into full.data$PercentageDifference the percentage of emissions where sector=ETS and where sector=Regulated(emissions sector=ETS is xx.y% of sector=Regulated) . This percentage value should be filed into PercentageDifference where sector=ETS. This should happen for every year and country. I would assume I would need a loop here. I have read that dplyr is useful for this, but did not really figure out how to do that. However, if there is a better way than dplyr, that's fine with me.

The result would look like this

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                72.6%

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                71.9%

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                77.3%

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                77.3%

I have not posted what I have done so far, as I did not manage doing a lot....

Thanks for help of any kind.

Nordsee

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

Please always provide a minimal data.frame in your question and show the expected result too. IMHO your question is difficult to understand and typing in the data to find and answer is a waste of time. THX :-)
– R Yoda
Nov 18 at 14:43

Contains "emissions" a percentage or an absolute value?
– R Yoda
Nov 18 at 14:44

1

@RYoda emisisons are absolute values/numeric. I have added an example outcome. Sorry I didn't add this earlier
– Nordsee
Nov 18 at 14:58

add a comment |

up vote
0
down vote

favorite

I have tried to find some information, but did not really find what I was looking for.

This is my dataframe full.data (excerpt)

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                    0

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                    0

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                    0

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                    0

Austria 2015       Total 6.670398e+07   AT                    0

Austria 2015   Regulated 3.800309e+07   AT                    0

Austria 2015 Unregulated 2.870090e+07   AT                    0

Austria 2015         ETS 2.949206e+07   AT                    0

Austria 2016       Total 6.740209e+07   AT                    0

Austria 2016   Regulated 3.765177e+07   AT                    0

Austria 2016 Unregulated 2.975031e+07   AT                    0

Austria 2016         ETS 2.900012e+07   AT                    0

Austria 2017         ETS 3.055523e+07   AT                    0

Belgium 1990       Total 1.204844e+08   BE                    0

Belgium 1990   Regulated 7.861411e+07   BE                    0

Belgium 1990 Unregulated 4.187029e+07   BE                    0

Belgium 1991       Total 1.235447e+08   BE                    0

Belgium 1991   Regulated 7.981152e+07   BE                    0

Belgium 1991 Unregulated 4.373319e+07   BE                    0

Belgium 1992       Total 1.226578e+08   BE                    0

Belgium 1992   Regulated 7.828396e+07   BE                    0

Belgium 1992 Unregulated 4.437385e+07   BE                    0

Belgium 1993       Total 1.215573e+08   BE                    0

Belgium 1993   Regulated 7.675229e+07   BE                    0

Belgium 1993 Unregulated 4.480499e+07   BE                    0

Belgium 1994       Total 1.249382e+08   BE                    0

Belgium 1994   Regulated 8.064799e+07   BE                    0

Belgium 1994 Unregulated 4.429020e+07   BE                    0

....

The result would look like this

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                72.6%

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                71.9%

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                77.3%

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                77.3%

I have not posted what I have done so far, as I did not manage doing a lot....

Thanks for help of any kind.

Nordsee

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

Please always provide a minimal data.frame in your question and show the expected result too. IMHO your question is difficult to understand and typing in the data to find and answer is a waste of time. THX :-)
– R Yoda
Nov 18 at 14:43

Contains "emissions" a percentage or an absolute value?
– R Yoda
Nov 18 at 14:44

1

@RYoda emisisons are absolute values/numeric. I have added an example outcome. Sorry I didn't add this earlier
– Nordsee
Nov 18 at 14:58

add a comment |

up vote
0
down vote

favorite

I have tried to find some information, but did not really find what I was looking for.

This is my dataframe full.data (excerpt)

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                    0

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                    0

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                    0

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                    0

Austria 2015       Total 6.670398e+07   AT                    0

Austria 2015   Regulated 3.800309e+07   AT                    0

Austria 2015 Unregulated 2.870090e+07   AT                    0

Austria 2015         ETS 2.949206e+07   AT                    0

Austria 2016       Total 6.740209e+07   AT                    0

Austria 2016   Regulated 3.765177e+07   AT                    0

Austria 2016 Unregulated 2.975031e+07   AT                    0

Austria 2016         ETS 2.900012e+07   AT                    0

Austria 2017         ETS 3.055523e+07   AT                    0

Belgium 1990       Total 1.204844e+08   BE                    0

Belgium 1990   Regulated 7.861411e+07   BE                    0

Belgium 1990 Unregulated 4.187029e+07   BE                    0

Belgium 1991       Total 1.235447e+08   BE                    0

Belgium 1991   Regulated 7.981152e+07   BE                    0

Belgium 1991 Unregulated 4.373319e+07   BE                    0

Belgium 1992       Total 1.226578e+08   BE                    0

Belgium 1992   Regulated 7.828396e+07   BE                    0

Belgium 1992 Unregulated 4.437385e+07   BE                    0

Belgium 1993       Total 1.215573e+08   BE                    0

Belgium 1993   Regulated 7.675229e+07   BE                    0

Belgium 1993 Unregulated 4.480499e+07   BE                    0

Belgium 1994       Total 1.249382e+08   BE                    0

Belgium 1994   Regulated 8.064799e+07   BE                    0

Belgium 1994 Unregulated 4.429020e+07   BE                    0

....

The result would look like this

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                72.6%

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                71.9%

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                77.3%

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                77.3%

I have not posted what I have done so far, as I did not manage doing a lot....

Thanks for help of any kind.

Nordsee

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

I have tried to find some information, but did not really find what I was looking for.

This is my dataframe full.data (excerpt)

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                    0

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                    0

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                    0

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                    0

Austria 2015       Total 6.670398e+07   AT                    0

Austria 2015   Regulated 3.800309e+07   AT                    0

Austria 2015 Unregulated 2.870090e+07   AT                    0

Austria 2015         ETS 2.949206e+07   AT                    0

Austria 2016       Total 6.740209e+07   AT                    0

Austria 2016   Regulated 3.765177e+07   AT                    0

Austria 2016 Unregulated 2.975031e+07   AT                    0

Austria 2016         ETS 2.900012e+07   AT                    0

Austria 2017         ETS 3.055523e+07   AT                    0

Belgium 1990       Total 1.204844e+08   BE                    0

Belgium 1990   Regulated 7.861411e+07   BE                    0

Belgium 1990 Unregulated 4.187029e+07   BE                    0

Belgium 1991       Total 1.235447e+08   BE                    0

Belgium 1991   Regulated 7.981152e+07   BE                    0

Belgium 1991 Unregulated 4.373319e+07   BE                    0

Belgium 1992       Total 1.226578e+08   BE                    0

Belgium 1992   Regulated 7.828396e+07   BE                    0

Belgium 1992 Unregulated 4.437385e+07   BE                    0

Belgium 1993       Total 1.215573e+08   BE                    0

Belgium 1993   Regulated 7.675229e+07   BE                    0

Belgium 1993 Unregulated 4.480499e+07   BE                    0

Belgium 1994       Total 1.249382e+08   BE                    0

Belgium 1994   Regulated 8.064799e+07   BE                    0

Belgium 1994 Unregulated 4.429020e+07   BE                    0

....

The result would look like this

country year      sector    emissions iso2 PercentageDifference

....

Austria 2011       Total 7.011567e+07   AT                    0

Austria 2011   Regulated 4.214836e+07   AT                    0

Austria 2011 Unregulated 2.796732e+07   AT                    0

Austria 2011         ETS 3.059942e+07   AT                72.6%

Austria 2012       Total 6.766140e+07   AT                    0

Austria 2012   Regulated 3.949445e+07   AT                    0

Austria 2012 Unregulated 2.816695e+07   AT                    0

Austria 2012         ETS 2.838706e+07   AT                71.9%

Austria 2013       Total 6.800123e+07   AT                    0

Austria 2013   Regulated 3.857396e+07   AT                    0

Austria 2013 Unregulated 2.942727e+07   AT                    0

Austria 2013         ETS 2.980441e+07   AT                77.3%

Austria 2014       Total 6.425333e+07   AT                    0

Austria 2014   Regulated 3.631107e+07   AT                    0

Austria 2014 Unregulated 2.794225e+07   AT                    0

Austria 2014         ETS 2.805597e+07   AT                77.3%

I have not posted what I have done so far, as I did not manage doing a lot....

Thanks for help of any kind.

Nordsee

r dplyr percentage

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

edited Nov 18 at 14:57

asked Nov 18 at 13:59

Nordsee

405

asked Nov 18 at 13:59

Nordsee

405

asked Nov 18 at 13:59

Nordsee

405

Please always provide a minimal data.frame in your question and show the expected result too. IMHO your question is difficult to understand and typing in the data to find and answer is a waste of time. THX :-)
– R Yoda
Nov 18 at 14:43

Contains "emissions" a percentage or an absolute value?
– R Yoda
Nov 18 at 14:44

1

@RYoda emisisons are absolute values/numeric. I have added an example outcome. Sorry I didn't add this earlier
– Nordsee
Nov 18 at 14:58

add a comment |

Please always provide a minimal data.frame in your question and show the expected result too. IMHO your question is difficult to understand and typing in the data to find and answer is a waste of time. THX :-)
– R Yoda
Nov 18 at 14:43

Contains "emissions" a percentage or an absolute value?
– R Yoda
Nov 18 at 14:44

1

@RYoda emisisons are absolute values/numeric. I have added an example outcome. Sorry I didn't add this earlier
– Nordsee
Nov 18 at 14:58

Please always provide a minimal data.frame in your question and show the expected result too. IMHO your question is difficult to understand and typing in the data to find and answer is a waste of time. THX :-)
– R Yoda
Nov 18 at 14:43

Contains "emissions" a percentage or an absolute value?
– R Yoda
Nov 18 at 14:44

@RYoda emisisons are absolute values/numeric. I have added an example outcome. Sorry I didn't add this earlier
– Nordsee
Nov 18 at 14:58

add a comment |

2 Answers
2

active

oldest

votes

up vote
1
down vote

accepted

What about:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_),

    PercentageDifference = if_else(!is.na(PercentageDifference), paste0(PercentageDifference, "%"), NA_character_)

  )

First 10 rows:

  country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr> <chr>               

 1 Austria  2011 Total        70115670 AT    NA                  

 2 Austria  2011 Regulated    42148360 AT    NA                  

 3 Austria  2011 Unregulated  27967320 AT    NA                  

 4 Austria  2011 ETS          30599420 AT    72.6%               

 5 Austria  2012 Total        67661400 AT    NA                  

 6 Austria  2012 Regulated    39494450 AT    NA                  

 7 Austria  2012 Unregulated  28166950 AT    NA                  

 8 Austria  2012 ETS          28387060 AT    71.9%               

 9 Austria  2013 Total        68001230 AT    NA                  

10 Austria  2013 Regulated    38573960 AT    NA

It's important to know that the percentage column will be of type character then, as you have specified you would like to see % sign in it.

If you want to keep it numeric, you can just remove the last step in mutate, i.e. you can do:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_)

  )

Output (first 10 rows):

   country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr>                <dbl>

 1 Austria  2011 Total        70115670 AT                    NA  

 2 Austria  2011 Regulated    42148360 AT                    NA  

 3 Austria  2011 Unregulated  27967320 AT                    NA  

 4 Austria  2011 ETS          30599420 AT                    72.6

 5 Austria  2012 Total        67661400 AT                    NA  

 6 Austria  2012 Regulated    39494450 AT                    NA  

 7 Austria  2012 Unregulated  28166950 AT                    NA  

 8 Austria  2012 ETS          28387060 AT                    71.9

 9 Austria  2013 Total        68001230 AT                    NA  

10 Austria  2013 Regulated    38573960 AT                    NA

If you would like to avoid loading the zoo package, you can also use fill from tidyverse in a separate step, but it is much much slower.

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

1

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

Here comes a solution based on tydiverse/dplyr.

Files stack.txt contains pasted text from your example above.

library(tidyverse)

full_data <- read.table("stack.txt", quote=""", comment.char="") 

names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")



full_data <- full_data %>% 

  select(-perc)

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>% 

  mutate(percentage = ETS/Regulated) %>% 

  select(country, year, percentage) %>%

  right_join(full_data) %>%

  select(country, year, sector, emission, iso, percentage) %>% 

  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

Result:

   country year      sector  emission iso percentage

1  Austria 2011       Total  70115670  AT  0.0000000

2  Austria 2011   Regulated  42148360  AT  0.0000000

3  Austria 2011 Unregulated  27967320  AT  0.0000000

4  Austria 2011         ETS  30599420  AT  0.7259931

5  Austria 2012       Total  67661400  AT  0.0000000

6  Austria 2012   Regulated  39494450  AT  0.0000000

7  Austria 2012 Unregulated  28166950  AT  0.0000000

8  Austria 2012         ETS  28387060  AT  0.7187607

9  Austria 2013       Total  68001230  AT  0.0000000

10 Austria 2013   Regulated  38573960  AT  0.0000000

11 Austria 2013 Unregulated  29427270  AT  0.0000000

If you want more explanation, how it works, I'd suggest to break the pipeline and see intermediate results i.e

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>%

  mutate(percentage = ETS/Regulated)



   country year      ETS Regulated     Total Unregulated percentage

1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931

2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607

3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562

4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561

5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437

6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

Regards
Paweł

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53361698%2fcreate-loop-to-calculate-percentage-under-conditions-and-fill-in-results%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
1
down vote

accepted

What about:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_),

    PercentageDifference = if_else(!is.na(PercentageDifference), paste0(PercentageDifference, "%"), NA_character_)

  )

First 10 rows:

  country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr> <chr>               

 1 Austria  2011 Total        70115670 AT    NA                  

 2 Austria  2011 Regulated    42148360 AT    NA                  

 3 Austria  2011 Unregulated  27967320 AT    NA                  

 4 Austria  2011 ETS          30599420 AT    72.6%               

 5 Austria  2012 Total        67661400 AT    NA                  

 6 Austria  2012 Regulated    39494450 AT    NA                  

 7 Austria  2012 Unregulated  28166950 AT    NA                  

 8 Austria  2012 ETS          28387060 AT    71.9%               

 9 Austria  2013 Total        68001230 AT    NA                  

10 Austria  2013 Regulated    38573960 AT    NA

It's important to know that the percentage column will be of type character then, as you have specified you would like to see % sign in it.

If you want to keep it numeric, you can just remove the last step in mutate, i.e. you can do:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_)

  )

Output (first 10 rows):

   country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr>                <dbl>

 1 Austria  2011 Total        70115670 AT                    NA  

 2 Austria  2011 Regulated    42148360 AT                    NA  

 3 Austria  2011 Unregulated  27967320 AT                    NA  

 4 Austria  2011 ETS          30599420 AT                    72.6

 5 Austria  2012 Total        67661400 AT                    NA  

 6 Austria  2012 Regulated    39494450 AT                    NA  

 7 Austria  2012 Unregulated  28166950 AT                    NA  

 8 Austria  2012 ETS          28387060 AT                    71.9

 9 Austria  2013 Total        68001230 AT                    NA  

10 Austria  2013 Regulated    38573960 AT                    NA

If you would like to avoid loading the zoo package, you can also use fill from tidyverse in a separate step, but it is much much slower.

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

1

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

accepted

What about:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_),

    PercentageDifference = if_else(!is.na(PercentageDifference), paste0(PercentageDifference, "%"), NA_character_)

  )

First 10 rows:

  country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr> <chr>               

 1 Austria  2011 Total        70115670 AT    NA                  

 2 Austria  2011 Regulated    42148360 AT    NA                  

 3 Austria  2011 Unregulated  27967320 AT    NA                  

 4 Austria  2011 ETS          30599420 AT    72.6%               

 5 Austria  2012 Total        67661400 AT    NA                  

 6 Austria  2012 Regulated    39494450 AT    NA                  

 7 Austria  2012 Unregulated  28166950 AT    NA                  

 8 Austria  2012 ETS          28387060 AT    71.9%               

 9 Austria  2013 Total        68001230 AT    NA                  

10 Austria  2013 Regulated    38573960 AT    NA

It's important to know that the percentage column will be of type character then, as you have specified you would like to see % sign in it.

If you want to keep it numeric, you can just remove the last step in mutate, i.e. you can do:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_)

  )

Output (first 10 rows):

   country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr>                <dbl>

 1 Austria  2011 Total        70115670 AT                    NA  

 2 Austria  2011 Regulated    42148360 AT                    NA  

 3 Austria  2011 Unregulated  27967320 AT                    NA  

 4 Austria  2011 ETS          30599420 AT                    72.6

 5 Austria  2012 Total        67661400 AT                    NA  

 6 Austria  2012 Regulated    39494450 AT                    NA  

 7 Austria  2012 Unregulated  28166950 AT                    NA  

 8 Austria  2012 ETS          28387060 AT                    71.9

 9 Austria  2013 Total        68001230 AT                    NA  

10 Austria  2013 Regulated    38573960 AT                    NA

If you would like to avoid loading the zoo package, you can also use fill from tidyverse in a separate step, but it is much much slower.

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

1

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

accepted

What about:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_),

    PercentageDifference = if_else(!is.na(PercentageDifference), paste0(PercentageDifference, "%"), NA_character_)

  )

First 10 rows:

  country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr> <chr>               

 1 Austria  2011 Total        70115670 AT    NA                  

 2 Austria  2011 Regulated    42148360 AT    NA                  

 3 Austria  2011 Unregulated  27967320 AT    NA                  

 4 Austria  2011 ETS          30599420 AT    72.6%               

 5 Austria  2012 Total        67661400 AT    NA                  

 6 Austria  2012 Regulated    39494450 AT    NA                  

 7 Austria  2012 Unregulated  28166950 AT    NA                  

 8 Austria  2012 ETS          28387060 AT    71.9%               

 9 Austria  2013 Total        68001230 AT    NA                  

10 Austria  2013 Regulated    38573960 AT    NA

It's important to know that the percentage column will be of type character then, as you have specified you would like to see % sign in it.

If you want to keep it numeric, you can just remove the last step in mutate, i.e. you can do:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_)

  )

Output (first 10 rows):

   country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr>                <dbl>

 1 Austria  2011 Total        70115670 AT                    NA  

 2 Austria  2011 Regulated    42148360 AT                    NA  

 3 Austria  2011 Unregulated  27967320 AT                    NA  

 4 Austria  2011 ETS          30599420 AT                    72.6

 5 Austria  2012 Total        67661400 AT                    NA  

 6 Austria  2012 Regulated    39494450 AT                    NA  

 7 Austria  2012 Unregulated  28166950 AT                    NA  

 8 Austria  2012 ETS          28387060 AT                    71.9

 9 Austria  2013 Total        68001230 AT                    NA  

10 Austria  2013 Regulated    38573960 AT                    NA

If you would like to avoid loading the zoo package, you can also use fill from tidyverse in a separate step, but it is much much slower.

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

What about:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_),

    PercentageDifference = if_else(!is.na(PercentageDifference), paste0(PercentageDifference, "%"), NA_character_)

  )

First 10 rows:

  country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr> <chr>               

 1 Austria  2011 Total        70115670 AT    NA                  

 2 Austria  2011 Regulated    42148360 AT    NA                  

 3 Austria  2011 Unregulated  27967320 AT    NA                  

 4 Austria  2011 ETS          30599420 AT    72.6%               

 5 Austria  2012 Total        67661400 AT    NA                  

 6 Austria  2012 Regulated    39494450 AT    NA                  

 7 Austria  2012 Unregulated  28166950 AT    NA                  

 8 Austria  2012 ETS          28387060 AT    71.9%               

 9 Austria  2013 Total        68001230 AT    NA                  

10 Austria  2013 Regulated    38573960 AT    NA

It's important to know that the percentage column will be of type character then, as you have specified you would like to see % sign in it.

If you want to keep it numeric, you can just remove the last step in mutate, i.e. you can do:

library(tidyverse)

library(zoo)



df %>%

  group_by(country, year) %>%

  mutate(

    PercentageDifference = if_else(sector %in% c("ETS", "Regulated"), emissions, NA_real_),

    PercentageDifference = na.locf(PercentageDifference, na.rm = FALSE),

    PercentageDifference = if_else(sector == "ETS", round((PercentageDifference / lag(PercentageDifference)) * 100,1), NA_real_)

  )

Output (first 10 rows):

   country  year sector      emissions iso2  PercentageDifference

   <chr>   <int> <chr>           <dbl> <chr>                <dbl>

 1 Austria  2011 Total        70115670 AT                    NA  

 2 Austria  2011 Regulated    42148360 AT                    NA  

 3 Austria  2011 Unregulated  27967320 AT                    NA  

 4 Austria  2011 ETS          30599420 AT                    72.6

 5 Austria  2012 Total        67661400 AT                    NA  

 6 Austria  2012 Regulated    39494450 AT                    NA  

 7 Austria  2012 Unregulated  28166950 AT                    NA  

 8 Austria  2012 ETS          28387060 AT                    71.9

 9 Austria  2013 Total        68001230 AT                    NA  

10 Austria  2013 Regulated    38573960 AT                    NA

If you would like to avoid loading the zoo package, you can also use fill from tidyverse in a separate step, but it is much much slower.

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

edited Nov 18 at 16:08

answered Nov 18 at 16:02

arg0naut

1,527312

answered Nov 18 at 16:02

arg0naut

1,527312

answered Nov 18 at 16:02

arg0naut

1,527312

1

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

add a comment |

1

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

Thanks again for helping me again. Also thank you for giving me these two options
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

Here comes a solution based on tydiverse/dplyr.

Files stack.txt contains pasted text from your example above.

library(tidyverse)

full_data <- read.table("stack.txt", quote=""", comment.char="") 

names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")



full_data <- full_data %>% 

  select(-perc)

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>% 

  mutate(percentage = ETS/Regulated) %>% 

  select(country, year, percentage) %>%

  right_join(full_data) %>%

  select(country, year, sector, emission, iso, percentage) %>% 

  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

Result:

   country year      sector  emission iso percentage

1  Austria 2011       Total  70115670  AT  0.0000000

2  Austria 2011   Regulated  42148360  AT  0.0000000

3  Austria 2011 Unregulated  27967320  AT  0.0000000

4  Austria 2011         ETS  30599420  AT  0.7259931

5  Austria 2012       Total  67661400  AT  0.0000000

6  Austria 2012   Regulated  39494450  AT  0.0000000

7  Austria 2012 Unregulated  28166950  AT  0.0000000

8  Austria 2012         ETS  28387060  AT  0.7187607

9  Austria 2013       Total  68001230  AT  0.0000000

10 Austria 2013   Regulated  38573960  AT  0.0000000

11 Austria 2013 Unregulated  29427270  AT  0.0000000

If you want more explanation, how it works, I'd suggest to break the pipeline and see intermediate results i.e

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>%

  mutate(percentage = ETS/Regulated)



   country year      ETS Regulated     Total Unregulated percentage

1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931

2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607

3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562

4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561

5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437

6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

Regards
Paweł

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

Here comes a solution based on tydiverse/dplyr.

Files stack.txt contains pasted text from your example above.

library(tidyverse)

full_data <- read.table("stack.txt", quote=""", comment.char="") 

names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")



full_data <- full_data %>% 

  select(-perc)

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>% 

  mutate(percentage = ETS/Regulated) %>% 

  select(country, year, percentage) %>%

  right_join(full_data) %>%

  select(country, year, sector, emission, iso, percentage) %>% 

  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

Result:

   country year      sector  emission iso percentage

1  Austria 2011       Total  70115670  AT  0.0000000

2  Austria 2011   Regulated  42148360  AT  0.0000000

3  Austria 2011 Unregulated  27967320  AT  0.0000000

4  Austria 2011         ETS  30599420  AT  0.7259931

5  Austria 2012       Total  67661400  AT  0.0000000

6  Austria 2012   Regulated  39494450  AT  0.0000000

7  Austria 2012 Unregulated  28166950  AT  0.0000000

8  Austria 2012         ETS  28387060  AT  0.7187607

9  Austria 2013       Total  68001230  AT  0.0000000

10 Austria 2013   Regulated  38573960  AT  0.0000000

11 Austria 2013 Unregulated  29427270  AT  0.0000000

If you want more explanation, how it works, I'd suggest to break the pipeline and see intermediate results i.e

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>%

  mutate(percentage = ETS/Regulated)



   country year      ETS Regulated     Total Unregulated percentage

1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931

2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607

3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562

4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561

5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437

6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

Regards
Paweł

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

add a comment |

up vote
1
down vote

Here comes a solution based on tydiverse/dplyr.

Files stack.txt contains pasted text from your example above.

library(tidyverse)

full_data <- read.table("stack.txt", quote=""", comment.char="") 

names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")



full_data <- full_data %>% 

  select(-perc)

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>% 

  mutate(percentage = ETS/Regulated) %>% 

  select(country, year, percentage) %>%

  right_join(full_data) %>%

  select(country, year, sector, emission, iso, percentage) %>% 

  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

Result:

   country year      sector  emission iso percentage

1  Austria 2011       Total  70115670  AT  0.0000000

2  Austria 2011   Regulated  42148360  AT  0.0000000

3  Austria 2011 Unregulated  27967320  AT  0.0000000

4  Austria 2011         ETS  30599420  AT  0.7259931

5  Austria 2012       Total  67661400  AT  0.0000000

6  Austria 2012   Regulated  39494450  AT  0.0000000

7  Austria 2012 Unregulated  28166950  AT  0.0000000

8  Austria 2012         ETS  28387060  AT  0.7187607

9  Austria 2013       Total  68001230  AT  0.0000000

10 Austria 2013   Regulated  38573960  AT  0.0000000

11 Austria 2013 Unregulated  29427270  AT  0.0000000

If you want more explanation, how it works, I'd suggest to break the pipeline and see intermediate results i.e

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>%

  mutate(percentage = ETS/Regulated)



   country year      ETS Regulated     Total Unregulated percentage

1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931

2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607

3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562

4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561

5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437

6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

Regards
Paweł

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

Here comes a solution based on tydiverse/dplyr.

Files stack.txt contains pasted text from your example above.

library(tidyverse)

full_data <- read.table("stack.txt", quote=""", comment.char="") 

names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")



full_data <- full_data %>% 

  select(-perc)

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>% 

  mutate(percentage = ETS/Regulated) %>% 

  select(country, year, percentage) %>%

  right_join(full_data) %>%

  select(country, year, sector, emission, iso, percentage) %>% 

  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

Result:

   country year      sector  emission iso percentage

1  Austria 2011       Total  70115670  AT  0.0000000

2  Austria 2011   Regulated  42148360  AT  0.0000000

3  Austria 2011 Unregulated  27967320  AT  0.0000000

4  Austria 2011         ETS  30599420  AT  0.7259931

5  Austria 2012       Total  67661400  AT  0.0000000

6  Austria 2012   Regulated  39494450  AT  0.0000000

7  Austria 2012 Unregulated  28166950  AT  0.0000000

8  Austria 2012         ETS  28387060  AT  0.7187607

9  Austria 2013       Total  68001230  AT  0.0000000

10 Austria 2013   Regulated  38573960  AT  0.0000000

11 Austria 2013 Unregulated  29427270  AT  0.0000000

If you want more explanation, how it works, I'd suggest to break the pipeline and see intermediate results i.e

full_data %>% 

  select(-iso) %>% 

  spread(sector, emission) %>%

  mutate(percentage = ETS/Regulated)



   country year      ETS Regulated     Total Unregulated percentage

1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931

2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607

3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562

4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561

5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437

6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

Regards
Paweł

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

edited Nov 18 at 15:06

answered Nov 18 at 14:55

Pawel Stradowski

13918

answered Nov 18 at 14:55

Pawel Stradowski

13918

answered Nov 18 at 14:55

Pawel Stradowski

13918

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

add a comment |

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

Thank you very much this!!
– Nordsee
Nov 18 at 16:56

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Nsryjdtyk