I need to group by and get the rank in python
I have a DataFrame; the code below generates it:
import pandas as pd
df = pd.DataFrame({'customer': [1,2,1,3,1,2,3],
                   "group_code": ['111', '111', '222', '111', '111', '111', '333'],
                   "ind_code": ['A', 'B', 'AA', 'A', 'AAA', 'C', 'BBB'],
                   "amount": [100, 200, 140, 400, 225, 125, 600],
                   "card": ['XXX', 'YYY', 'YYY', 'XXX', 'XXX', 'YYY', 'XXX']})
Suppose I want to group it by card and find, for each card, which group code has the highest amount. I then want to create a new DataFrame containing each card and its group code with the highest amount.
Any help would be appreciated.
python pandas-groupby
asked Nov 21 '18 at 10:38 by Sheriff
1 Answer
You could do:
import pandas as pd
df = pd.DataFrame({'customer': [1,2,1,3,1,2,3],
"group_code": ['111', '111', '222', '111', '111', '111', '333'],
"ind_code": ['A', 'B', 'AA', 'A', 'AAA', 'C', 'BBB'],
"amount": [100, 200, 140, 400, 225, 125, 600],
"card": ['XXX', 'YYY', 'YYY', 'XXX', 'XXX', 'YYY', 'XXX']})
# True for rows whose amount equals the largest single amount on the same card
mask = df.groupby('card')['amount'].transform('max') == df['amount']
result = df.loc[mask, ['card', 'group_code', 'amount']]
print(result)
Output
  card group_code  amount
1  YYY        111     200
6  XXX        333     600
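UPDATE: to compare group codes by their total amount per card, sum the amounts for each card and group code first: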
import pandas as pd
df = pd.DataFrame({'customer': [1,2,1,3,1,2,3],
"group_code": ['111', '111', '222', '111', '111', '111', '333'],
"ind_code": ['A', 'B', 'AA', 'A', 'AAA', 'C', 'BBB'],
"amount": [100, 200, 140, 400, 225, 125, 600],
"card": ['XXX', 'YYY', 'YYY', 'XXX', 'XXX', 'YYY', 'XXX']})
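# Total amount for each (card, group_code) pair
agg = df.groupby(['card', 'group_code']).agg({'amount':'sum'}).reset_index()
# Keep the pair(s) whose total is the largest for their card
mask = agg.groupby('card')['amount'].transform('max') == agg['amount']
result = agg[mask]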
print(result)
Output
  card group_code  amount
0  XXX        111     725
2  YYY        111     325
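A variation on the update, offered as a sketch rather than as part of the original answer: the boolean mask keeps every row that ties for the maximum, so a card can appear more than once. If exactly one row per card is wanted, `idxmax` can be used instead, since it returns the index label of the first maximum in each group. The names `totals` and `best` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 1, 3, 1, 2, 3],
    "group_code": ["111", "111", "222", "111", "111", "111", "333"],
    "ind_code": ["A", "B", "AA", "A", "AAA", "C", "BBB"],
    "amount": [100, 200, 140, 400, 225, 125, 600],
    "card": ["XXX", "YYY", "YYY", "XXX", "XXX", "YYY", "XXX"],
})

# Total amount for each (card, group_code) pair.
totals = df.groupby(["card", "group_code"], as_index=False)["amount"].sum()

# idxmax returns, per card, the index label of the first row with the
# largest total, so ties are resolved to a single row.
best = totals.loc[totals.groupby("card")["amount"].idxmax()].reset_index(drop=True)

print(best)
```

Whether ties should be dropped like this or kept (as the mask does) depends on what the result is used for.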
answered Nov 21 '18 at 10:45 by Daniel Mesejo (edited Nov 21 '18 at 11:14)
Thanks for helping, but I think the result is wrong. In the DataFrame, card XXX has two group codes, 111 and 333. The amounts for 111 sum to 100 + 400 + 225 = 725, and the amount for 333 is 600. So for card XXX the result should be group code 111 with amount 725.
– Sheriff
Nov 21 '18 at 11:01
@Sheriff see the update.
– Daniel Mesejo
Nov 21 '18 at 11:14
Great, thanks. I need a bit more, though. In the larger picture I have a very large data set (14 GB), and instead of only the maximum sum I need the top 3 group codes for a given card, based on the sum of amount. Can you help with that?
– Sheriff
Nov 21 '18 at 12:48
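The thread does not answer this follow-up, so the following is only a sketch of one possible approach, not the answerer's method. It sums the amounts per card and group code, sorts the totals within each card, and keeps the first `n` rows per card. The function names `top_group_codes` and `top_group_codes_chunked` are hypothetical. The chunked variant assumes the data can be read piece by piece (for example with `pd.read_csv(..., chunksize=...)`) and that the per-pair totals fit in memory even when the raw 14 GB does not.

```python
import pandas as pd


def top_group_codes(df, n=3):
    """Return the n group codes with the largest summed amount for each card."""
    totals = df.groupby(["card", "group_code"], as_index=False)["amount"].sum()
    # Largest totals first within each card.
    ranked = totals.sort_values(["card", "amount"], ascending=[True, False])
    # Position of each row within its card after sorting: 1 = largest total.
    ranked["rank"] = ranked.groupby("card").cumcount() + 1
    return ranked[ranked["rank"] <= n].reset_index(drop=True)


def top_group_codes_chunked(chunks, n=3):
    """Same result for an iterable of DataFrames (assumed to share the same columns)."""
    # Sum each chunk separately so only the small per-pair totals are retained.
    partials = [
        chunk.groupby(["card", "group_code"], as_index=False)["amount"].sum()
        for chunk in chunks
    ]
    # Partial sums for the same pair are added together by the second aggregation.
    return top_group_codes(pd.concat(partials, ignore_index=True), n=n)


df = pd.DataFrame({
    "customer": [1, 2, 1, 3, 1, 2, 3],
    "group_code": ["111", "111", "222", "111", "111", "111", "333"],
    "ind_code": ["A", "B", "AA", "A", "AAA", "C", "BBB"],
    "amount": [100, 200, 140, 400, 225, 125, 600],
    "card": ["XXX", "YYY", "YYY", "XXX", "XXX", "YYY", "XXX"],
})

result = top_group_codes(df, n=3)
print(result)
```

With the sample data each card has only two group codes, so `n=3` returns both of them, ranked. Ties are broken by sort order here; if tied group codes should share a rank, `groupby(...)["amount"].rank(method="dense", ascending=False)` is an alternative to `cumcount`.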