Pandas assign a value to new row based on index on incoming live data

I am having trouble writing efficient code (without many loops) that assigns a value to a cell in a pandas dataframe that is updated every minute or so (live stream). In the training set I trained my model with one-hot encoded timestamp variables, and it did better than with continuous variables, so that's what I want to use in production. The dataframe looks like this:

datetime               DOW_1   DOW_2   ...   DOW_7   Month1   Month2   Month3
2018-07-01 09:30:00    0       1       ...   0       0        0        1

As you can see, the columns are encoded with 0's and 1's to denote the month and the day of week (I have more columns for day of year, is_holiday, etc.). I easily did this on the training, validation, and test data using pd.get_dummies, but now that a live stream of data is coming in, I cannot find an easy way to assign, say, Month2 = 0 based on df.index.month.

I tried something along the lines of this loop, but it's quite tedious and slow:

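i = 0
while i < len(df):
    ts = df.index[i]  # a DatetimeIndex has no .iloc; index it positionally
    for m in range(1, 13):
        if ts.month == m:
            # use .loc rather than chained indexing so the write reaches df
            df.loc[ts, 'Month' + str(m)] = 1
    i += 1  # advance once per row, not once per month checked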

Any better suggestions?

edited Nov 20 at 2:59

asked Nov 20 at 1:46

Matt Elgazar

449

Is it possible that the line df['Month'+str(m)][i] = i should assign 1 instead of i?
– Julian Peller
Nov 20 at 2:05

Oops sorry typo.. Yep!
– Matt Elgazar
Nov 20 at 2:58

python pandas


1 Answer

accepted

I'm still thinking about a solution that removes the for loop as well, but you can at least avoid the outer while loop over len(df) by using .loc with a boolean mask:

for m in range(1, 13):
    df.loc[df.index.month == m, 'Month'+str(m)] = 1
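For anyone who wants to run this end to end, here is a minimal sketch of the same masked assignment on a made-up two-row frame. The `price` column and the sample timestamps are assumptions for illustration only, and setting each column to 0 first is an addition to the snippet above, made so that rows outside the mask hold 0 rather than NaN.

```python
import pandas as pd

# Hypothetical stand-in for the live frame; only the DatetimeIndex matters.
df = pd.DataFrame(
    {"price": [10.0, 11.0]},
    index=pd.to_datetime(["2018-07-01 09:30:00", "2018-11-20 01:46:00"]),
)

for m in range(1, 13):
    col = "Month" + str(m)
    df[col] = 0                           # create the column, default 0
    df.loc[df.index.month == m, col] = 1  # flag rows whose month equals m
```

The loop runs twelve times regardless of how many rows arrive, so the cost per batch is dominated by the vectorised mask rather than by Python-level iteration over rows.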

edited Nov 20 at 2:17

answered Nov 20 at 2:08

Julian Peller

849511

This is great, pretty much exactly what I was looking for. I don't know if there's a way to do it without any loops, since some of the columns don't exist and have to be created. The only remaining issue is that it leaves the other columns with NaN values (i.e. Month1 has all 1's but Month2 through Month12 have NaN's). Nice work, I'll mark it as correct!
– Matt Elgazar
Nov 20 at 7:57

Thanks. Regarding the NaN, you are right. I can think of 2 options to handle that: casting all NaN to zero, if possible, with df = df.fillna(0), or adding a second line inside the for loop that fills the zeros: df.loc[df.index.month != m, 'Month'+str(m)] = 0 (note the != instead of ==). I couldn't figure out a completely loop-free solution... I don't think it exists.
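On the question of a version with no explicit loop over rows or months: one possible sketch, not from the original thread, compares the month numbers against 1..12 with NumPy broadcasting and builds all twelve columns at once. The helper name `month_one_hot`, the `price` column, and the sample timestamps are hypothetical.

```python
import numpy as np
import pandas as pd

def month_one_hot(index):
    """Return a 0/1 frame with columns Month1..Month12 for a DatetimeIndex."""
    months = np.arange(1, 13)
    # Shape (n_rows, 1) compared with shape (12,) broadcasts to (n_rows, 12).
    hits = np.asarray(index.month)[:, None] == months
    return pd.DataFrame(hits.astype(int), index=index,
                        columns=["Month" + str(m) for m in months])

# Hypothetical batch of live rows.
live = pd.DataFrame(
    {"price": [10.0, 11.0]},
    index=pd.to_datetime(["2018-07-01 09:30:00", "2018-02-15 10:00:00"]),
)
live = pd.concat([live, month_one_hot(live.index)], axis=1)
```

Because every column is produced on each call, a batch that contains only one month still yields all twelve columns with 0 in the unused ones, so no fillna step should be needed under these assumptions.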
– Julian Peller
Nov 20 at 14:43

Nice! yeah the extra line makes more sense since I'm looping over a lot of columns
– Matt Elgazar
Nov 20 at 15:37
