pandas - Calculating the top 10 products for a given query based on its priority in Python


Suppose I am given a dataframe like this:

                       query  productid  priority
index
0                        3ds    2125233  0.018946
1                        rca    2009324  0.027599
2                       nook    1517163  0.009443
3                        rca    2877125  0.012054
4                        rca    2877134  0.005557
5              flatscreentvs    2416092  0.011961
6                    macbook    3108172  0.010459
7                        3ds    2264036  0.165948
8                        rca    8280834  0.004006
9                 memorycard    2740208  0.013744
10               acpowercord    2584273  0.006865
11                zaggiphone    1230537  0.136073
12            watchthethrone    3168067  0.104679
13     remotecontrolextender    7997055  0.113058
14                 camcorder    2009041  0.017809
15                       3ds    1988047  0.031711
16                       3ds    1686079  0.043783
17        wirelessheadphones    3770439  0.014714
18        wirelessheadphones    2602403  0.008525
19                 samsung40    2126065  0.018066
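To reproduce the examples, the frame above can be rebuilt by hand. This is only a minimal sketch: the values are copied from the printed table, and naming the index `index` is an assumption based on the printout rather than something the post states.

```python
import pandas as pd

# Rows copied from the table above: (query, productid, priority).
rows = [
    ("3ds", 2125233, 0.018946),
    ("rca", 2009324, 0.027599),
    ("nook", 1517163, 0.009443),
    ("rca", 2877125, 0.012054),
    ("rca", 2877134, 0.005557),
    ("flatscreentvs", 2416092, 0.011961),
    ("macbook", 3108172, 0.010459),
    ("3ds", 2264036, 0.165948),
    ("rca", 8280834, 0.004006),
    ("memorycard", 2740208, 0.013744),
    ("acpowercord", 2584273, 0.006865),
    ("zaggiphone", 1230537, 0.136073),
    ("watchthethrone", 3168067, 0.104679),
    ("remotecontrolextender", 7997055, 0.113058),
    ("camcorder", 2009041, 0.017809),
    ("3ds", 1988047, 0.031711),
    ("3ds", 1686079, 0.043783),
    ("wirelessheadphones", 3770439, 0.014714),
    ("wirelessheadphones", 2602403, 0.008525),
    ("samsung40", 2126065, 0.018066),
]

df = pd.DataFrame(rows, columns=["query", "productid", "priority"])
# Assumption: the index is a plain 0..19 range that happens to be named "index".
df.index.name = "index"

print(df)
```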

I want to find the top 2 product IDs on the basis of priority with respect to a given query.

For example, if I have query=3ds, the top 2 products should be:

1. 1988047
2. 1686079

This is equivalent to Oracle's ROW_NUMBER() analytic function:

In [172]: df.assign(rn=df.sort_values('priority', ascending=False).groupby('query').cumcount() + 1).query('rn < 3').sort_values(['query','rn'])
Out[172]:
                       query  productid  priority  rn
index
7                        3ds    2264036  0.165948   1
16                       3ds    1686079  0.043783   2
10               acpowercord    2584273  0.006865   1
14                 camcorder    2009041  0.017809   1
5              flatscreentvs    2416092  0.011961   1
6                    macbook    3108172  0.010459   1
9                 memorycard    2740208  0.013744   1
2                       nook    1517163  0.009443   1
1                        rca    2009324  0.027599   1
3                        rca    2877125  0.012054   2
13     remotecontrolextender    7997055  0.113058   1
19                 samsung40    2126065  0.018066   1
12            watchthethrone    3168067  0.104679   1
17        wirelessheadphones    3770439  0.014714   1
18        wirelessheadphones    2602403  0.008525   2
11                zaggiphone    1230537  0.136073   1
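The same row number can probably also be obtained with `groupby(...).rank(method="first", ascending=False)`, which some readers find closer to the SQL wording than `cumcount()`. The sketch below is an alternative, not the method used in the session above, and it runs on a small subset of the rows (only the `3ds` and `rca` queries) to stay short.

```python
import pandas as pd

# Subset of the original rows, limited to two queries for brevity.
df = pd.DataFrame(
    {
        "query": ["3ds", "rca", "rca", "rca", "3ds", "3ds", "3ds"],
        "productid": [2125233, 2009324, 2877125, 2877134, 2264036, 1988047, 1686079],
        "priority": [0.018946, 0.027599, 0.012054, 0.005557, 0.165948, 0.031711, 0.043783],
    }
)

# method="first" breaks ties by order of appearance, so every row in a group
# gets a distinct rank, much like ROW_NUMBER() OVER (PARTITION BY query
# ORDER BY priority DESC).
df["rn"] = (
    df.groupby("query")["priority"]
      .rank(method="first", ascending=False)
      .astype(int)
)

# Keep the two highest-priority rows per query.
top2 = df[df["rn"] <= 2].sort_values(["query", "rn"])
print(top2)
```

With this subset, `top2` holds products 2264036 and 1686079 for `3ds`, and 2009324 and 2877125 for `rca`, which matches the corresponding rows of the output above.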

To show the productid values for a selected query:

In [180]: (df.assign(rn=df.sort_values('priority', ascending=False).groupby('query').cumcount() + 1)
   .....:    .query('query=="3ds" and rn < 3')['productid']
   .....: )
Out[180]:
index
7     2264036
16    1686079
Name: productid, dtype: int64
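If only one query is needed at a time, ranking the whole frame may be more work than necessary. A minimal sketch of a per-query lookup follows; the helper name `top_products` and its parameters are hypothetical, and it uses `DataFrame.nlargest` instead of the `cumcount()` approach shown above.

```python
import pandas as pd

# Subset of the original rows, limited to two queries for brevity.
df = pd.DataFrame(
    {
        "query": ["3ds", "rca", "rca", "rca", "3ds", "3ds", "3ds"],
        "productid": [2125233, 2009324, 2877125, 2877134, 2264036, 1988047, 1686079],
        "priority": [0.018946, 0.027599, 0.012054, 0.005557, 0.165948, 0.031711, 0.043783],
    }
)


def top_products(frame, query, n=2):
    """Hypothetical helper: productid values of the n highest-priority rows for one query."""
    subset = frame[frame["query"] == query]
    # nlargest returns the rows sorted by priority in descending order.
    return subset.nlargest(n, "priority")["productid"].tolist()


print(top_products(df, "3ds"))  # [2264036, 1686079]
print(top_products(df, "rca"))  # [2009324, 2877125]
```

A query that does not occur in the frame yields an empty list, because the filtered subset has no rows.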
