Pandas DataFrames: Create new rows with calculations across existing rows





How can I create new rows from an existing DataFrame by grouping by certain fields (in the example "Country" and "Industry") and applying some math to another field (in the example "Field" and "Value")?



Source DataFrame



    import pandas as pd

    df = pd.DataFrame({
        'Country':  ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
        'Industry': ['Finance', 'Finance', 'Retail', 'Retail', 'Energy', 'Energy', 'Retail', 'Retail'],
        'Field':    ['Import', 'Export', 'Import', 'Export', 'Import', 'Export', 'Import', 'Export'],
        'Value':    [100, 50, 80, 10, 20, 5, 30, 10],
    })

      Country Industry   Field  Value
    0     USA  Finance  Import    100
    1     USA  Finance  Export     50
    2     USA   Retail  Import     80
    3     USA   Retail  Export     10
    4     USA   Energy  Import     20
    5     USA   Energy  Export      5
    6  Canada   Retail  Import     30
    7  Canada   Retail  Export     10


Target DataFrame



Net = Import - Export



      Country Industry Field  Value
    0     USA  Finance   Net     50
    1     USA   Retail   Net     70
    2     USA   Energy   Net     15
    3  Canada   Retail   Net     20









Tags: python, pandas, dataframe






Asked 8 hours ago by Lorenz; edited 7 hours ago by Scott Boston.

          5 Answers
There are likely several ways to do this. Here's one using groupby and unstack. The unstack('Field') step turns Import and Export into columns, so eval can subtract them row by row:

    (df.groupby(['Country', 'Industry', 'Field'], sort=False)['Value']
       .sum()
       .unstack('Field')
       .eval('Import - Export')
       .reset_index(name='Value'))

      Country Industry  Value
    0     USA  Finance     50
    1     USA   Retail     70
    2     USA   Energy     15
    3  Canada   Retail     20





Answered 8 hours ago by coldspeed; edited 5 hours ago.

• By far the best answer. The unstack followed by eval is a really nice trick, better than the second groupby and get_group I would have used. – BallpointBen, 8 hours ago
• @BallpointBen eval and query are personal favourites of mine from the API. I've made attempts to popularise their use, but they are still not widely understood. I have a QnA here, if you are interested. – coldspeed, 8 hours ago
• Works like a charm. Thank you very much. One very small comment: there is a closing bracket missing in the last line. – Lorenz, 5 hours ago
• @Lorenz Oops... fixed, thanks! – coldspeed, 5 hours ago
• @coldspeed Actually I think there's a better way… see my answer. unstack is expensive because it reshapes. Using the structure of the first groupby is more efficient, although it takes two lines. – BallpointBen, 3 hours ago
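
The target in the question also carries a Field column set to Net, and the asker mentions wanting the new rows alongside the original ones. Here is a minimal sketch of one way to finish the groupby/unstack result, assuming the sample df from the question. The names `wide`, `net`, and `combined` are illustrative only, and the plain column subtraction stands in for the eval expression:

```python
import pandas as pd

df = pd.DataFrame({
    'Country':  ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail', 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field':    ['Import', 'Export', 'Import', 'Export', 'Import', 'Export', 'Import', 'Export'],
    'Value':    [100, 50, 80, 10, 20, 5, 30, 10],
})

# Pivot Import/Export into columns, one row per (Country, Industry).
wide = (df.groupby(['Country', 'Industry', 'Field'], sort=False)['Value']
          .sum()
          .unstack('Field'))

# Same arithmetic as eval('Import - Export'), written as column subtraction.
net = ((wide['Import'] - wide['Export'])
       .reset_index(name='Value')
       .assign(Field='Net')[['Country', 'Industry', 'Field', 'Value']])

# Append the Net rows below the original rows.
combined = pd.concat([df, net], ignore_index=True)
print(combined)
```

`pd.concat` returns a new DataFrame, so `df` itself is left unchanged; assign the result back to `df` if you want to replace it.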



















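If I understand correctly, you can index by the grouping columns and subtract the Export rows from the Import rows:

    indexed = df.set_index(['Country', 'Industry'])

    # Net = Import - Export, aligned on the (Country, Industry) index
    Newdf = (indexed.loc[indexed.Field == 'Import', 'Value']
             - indexed.loc[indexed.Field == 'Export', 'Value']).reset_index().assign(Field='Net')
    Newdf

      Country Industry  Value Field
    0     USA  Finance     50   Net
    1     USA   Retail     70   Net
    2     USA   Energy     15   Net
    3  Canada   Retail     20   Net

Alternatively, with pivot_table:

    # pivot_table sorts the columns (Export, Import), so diff(axis=1) leaves Import - Export
    (df.pivot_table(index=['Country', 'Industry'], columns='Field', values='Value', aggfunc='sum')
       .diff(axis=1)
       .dropna(axis=1)
       .rename(columns={'Import': 'Value'})
       .reset_index())

    Field Country Industry  Value
    0      Canada   Retail   20.0
    1         USA   Energy   15.0
    2         USA  Finance   50.0
    3         USA   Retail   70.0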





Answered 8 hours ago by Wen-Ben; edited 7 hours ago.

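You can use GroupBy.diff(), then recreate the Field column, and finally drop the leftover rows with DataFrame.dropna. This assumes each group's Import row comes directly before its Export row, and it overwrites df in place:

    # diff(-1) is the current row minus the next row within each group: Import - Export
    df['Value'] = df.groupby(['Country', 'Industry'])['Value'].diff(-1)
    df['Field'] = 'Net'
    df.dropna(inplace=True)  # Export rows have no following row, so they are NaN
    df.reset_index(drop=True, inplace=True)

    print(df)
      Country Industry Field  Value
    0     USA  Finance   Net   50.0
    1     USA   Retail   Net   70.0
    2     USA   Energy   Net   15.0
    3  Canada   Retail   Net   20.0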





Answered 8 hours ago by Erfan.
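
The diff-based approach depends on the order of rows within each group. As a sketch of an alternative that does not (a different technique, not the answerer's method): sign each value by its Field and sum per group. It assumes the sample df from the question and that Field only contains Import and Export; `net_by_group` is a hypothetical helper name:

```python
import pandas as pd

df = pd.DataFrame({
    'Country':  ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail', 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field':    ['Import', 'Export', 'Import', 'Export', 'Import', 'Export', 'Import', 'Export'],
    'Value':    [100, 50, 80, 10, 20, 5, 30, 10],
})


def net_by_group(frame):
    """Return one Net row per (Country, Industry). Hypothetical helper."""
    # +1 for Import, -1 for Export, so a grouped sum yields Import - Export.
    sign = frame['Field'].map({'Import': 1, 'Export': -1})
    return (frame.assign(Value=frame['Value'] * sign)
                 .groupby(['Country', 'Industry'], sort=False, as_index=False)['Value']
                 .sum()
                 .assign(Field='Net')[['Country', 'Industry', 'Field', 'Value']])


# Shuffle the rows so Import no longer reliably precedes Export.
shuffled = df.sample(frac=1, random_state=0)
print(net_by_group(shuffled))
```

Because the sign comes from the Field value itself, the result has the correct sign even when a group's exports exceed its imports.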































You can do it this way to get a new DataFrame that contains the original rows plus the Net rows:

    (df.set_index(['Country', 'Industry', 'Field'])
       .unstack()['Value']
       .eval('Net = Import - Export')
       .stack().rename('Value').reset_index())

Output:

       Country Industry   Field  Value
    0   Canada   Retail  Export     10
    1   Canada   Retail  Import     30
    2   Canada   Retail     Net     20
    3      USA   Energy  Export      5
    4      USA   Energy  Import     20
    5      USA   Energy     Net     15
    6      USA  Finance  Export     50
    7      USA  Finance  Import    100
    8      USA  Finance     Net     50
    9      USA   Retail  Export     10
    10     USA   Retail  Import     80
    11     USA   Retail     Net     70





• Thanks. Actually, I wanted to append it to the original df, so it's a nice trick to do this all in one command. – Lorenz, 5 hours ago
• coldspeed's answer was a slightly better fit for my overall code, but I took from your code how to append the result to the original df. Very tight result, though. Pity that I cannot accept two answers. But thanks again! – Lorenz, 3 hours ago



















This answer takes advantage of the fact that pandas puts the group keys in the MultiIndex of the resulting Series. (If there were only one group key, you could use loc.)

    >>> s = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
    >>> s.xs('Import', axis=0, level='Field') - s.xs('Export', axis=0, level='Field')
    Country  Industry
    Canada   Retail      20
    USA      Energy      15
             Finance     50
             Retail      70
    Name: Value, dtype: int64
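
The xs subtraction returns a Series, and a group that has only an Import or only an Export would come out as NaN. A minimal sketch of turning the result into rows while treating a missing side as 0, assuming the sample df from the question plus one hypothetical extra row (Canada/Energy with no Export) added purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Country':  ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail', 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field':    ['Import', 'Export', 'Import', 'Export', 'Import', 'Export', 'Import', 'Export'],
    'Value':    [100, 50, 80, 10, 20, 5, 30, 10],
})

# Hypothetical extra row: an Import with no matching Export.
extra = pd.DataFrame({'Country': ['Canada'], 'Industry': ['Energy'],
                      'Field': ['Import'], 'Value': [40]})
data = pd.concat([df, extra], ignore_index=True)

s = data.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
imports = s.xs('Import', level='Field')
exports = s.xs('Export', level='Field')

# sub(fill_value=0) treats a missing side as 0 instead of producing NaN.
net = (imports.sub(exports, fill_value=0)
              .rename('Value')
              .reset_index()
              .assign(Field='Net'))
print(net)
```

Whether a missing Export should count as 0 is a modelling decision; dropping such groups with `dropna()` after a plain subtraction is the other obvious choice.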





                                2














                                You can do it this way to add those rows to your original dataframe:



                                df.set_index(['Country','Industry','Field'])
                                .unstack()['Value']
                                .eval('Net = Import - Export')
                                .stack().rename('Value').reset_index()


                                Output:



                                   Country Industry   Field  Value
                                0   Canada   Retail  Export     10
                                1   Canada   Retail  Import     30
                                2   Canada   Retail     Net     20
                                3      USA   Energy  Export      5
                                4      USA   Energy  Import     20
                                5      USA   Energy     Net     15
                                6      USA  Finance  Export     50
                                7      USA  Finance  Import    100
                                8      USA  Finance     Net     50
                                9      USA   Retail  Export     10
                                10     USA   Retail  Import     80
                                11     USA   Retail     Net     70
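If the goal is to append the Net rows to the end of the original frame rather than interleave them, one option is pd.concat. This is a minimal sketch, not the answerer's code. It assumes exactly one Import and one Export row per (Country, Industry) pair; the names wide, net_rows and combined are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export', 'Import', 'Export',
              'Import', 'Export', 'Import', 'Export'],
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

# One row per (Country, Industry), one column per Field value.
wide = df.set_index(['Country', 'Industry', 'Field'])['Value'].unstack('Field')

# Build the Net rows in the same column layout as df.
net_rows = (wide['Import'] - wide['Export']).rename('Value').reset_index()
net_rows['Field'] = 'Net'

# Original rows first, Net rows appended after them.
combined = pd.concat([df, net_rows[df.columns]], ignore_index=True)
print(combined)
```

The first eight rows of combined are the original data in their original order, followed by the four Net rows.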





                                answered 8 hours ago









                                Scott Boston













                                • Thanks. I actually wanted to append it to the original df, so this is a nice trick for doing it all in one command.

                                  – Lorenz
                                  5 hours ago

                                • Coldspeed's answer was a slightly better fit for my overall code. I took from your code how you appended the result to the original df. Very tight result, though. Pity that I cannot accept two answers. But thanks again!

                                  – Lorenz
                                  3 hours ago



















                                1














                                This answer takes advantage of the fact that pandas puts the group keys in the MultiIndex of the groupby result (here a Series), so xs can select one Field value at a time. (If there were only one group key, you could use loc.)



                                >>> s = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
                                >>> s.xs('Import', axis=0, level='Field') - s.xs('Export', axis=0, level='Field')
                                Country  Industry
                                Canada   Retail      20
                                USA      Energy      15
                                         Finance     50
                                         Retail      70
                                Name: Value, dtype: int64
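The result above is a Series indexed by (Country, Industry), not the four-column frame shown in the question. A minimal sketch of one way to reshape it, plus the single-key loc case mentioned in the answer; the names net, out, t and usa_net are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada'],
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export', 'Import', 'Export',
              'Import', 'Export', 'Import', 'Export'],
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

s = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
net = s.xs('Import', level='Field') - s.xs('Export', level='Field')

# Turn the index levels back into columns and re-add the Field column.
out = net.reset_index()
out.insert(2, 'Field', 'Net')
print(out)

# With a single group key the result has a flat index, so loc is enough.
t = df[df['Country'] == 'USA'].groupby('Field')['Value'].sum()
usa_net = t.loc['Import'] - t.loc['Export']
print(usa_net)  # 200 - 65 = 135
```

Because groupby sorts its keys by default, the rows of out follow the sorted (Country, Industry) order rather than the order in the source frame.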





                                    answered 3 hours ago









                                    BallpointBen





























