"group By" Function In Python For Array
I've tried Pandas and Numpy but haven't seen the result I want. I have a simple array that consists of several lines of this: [[customer_number, customer_name, invoice balance],[c
Solution 1:
You can make a dict keyed on the tuple of account number and name, then loop through the rows and accumulate the sums in the dict. Afterward you can convert the dict's items() back to a list:
accounts = {}
for num, name, balance in l:
    accounts[(num, name)] = accounts.get((num, name), 0) + balance
result = [[num, name, balance] for (num, name), balance in accounts.items()]
result will be:
[[Decimal('1111'), 'Customer1', Decimal('522.09')],
[Decimal('1112'), 'Customer2', Decimal('177.15')],
[Decimal('1113'), 'Customer3', Decimal('201.60')]]
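For a self-contained variant of the same idea, here is a minimal sketch using collections.defaultdict. The helper name sum_balances and the shortened sample rows are assumptions made for illustration; they are not taken from the question.

```python
from collections import defaultdict
from decimal import Decimal


def sum_balances(rows):
    """Sum balances per (number, name) key; hypothetical helper name."""
    totals = defaultdict(Decimal)  # Decimal() evaluates to Decimal('0')
    for num, name, balance in rows:
        totals[(num, name)] += balance
    # Dicts keep insertion order (Python 3.7+), so groups come out in first-seen order.
    return [[num, name, total] for (num, name), total in totals.items()]


# Assumed sample rows in the [number, name, balance] layout from the question.
rows = [
    [Decimal('1111'), 'Customer1', Decimal('31.50')],
    [Decimal('1112'), 'Customer2', Decimal('30.88')],
    [Decimal('1111'), 'Customer1', Decimal('90.00')],
    [Decimal('1113'), 'Customer3', Decimal('30.88')],
]

result = sum_balances(rows)
for row in result:
    print(row)
```

Using defaultdict(Decimal) removes the need for .get() with a default value and keeps every running total a Decimal from the start.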
Solution 2:
Just to show that you can also do this with pandas:
In [1]: import pandas as pd
In [2]: from decimal import Decimal
In [3]: data = [[Decimal('1111'), 'Customer1', Decimal('31.50')],
...: [Decimal('1112'), 'Customer2', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('90.00')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1112'), 'Customer2', Decimal('30.88')],
...: [Decimal('1112'), 'Customer2', Decimal('15.00')],
...: [Decimal('1111'), 'Customer1', Decimal('37.93')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1113'), 'Customer3', Decimal('26.60')],
...: [Decimal('1113'), 'Customer3', Decimal('44.22')],
...: [Decimal('1112'), 'Customer2', Decimal('32.93')],
...: [Decimal('1111'), 'Customer1', Decimal('20.00')],
...: [Decimal('1113'), 'Customer3', Decimal('38.14')],
...: [Decimal('1111'), 'Customer1', Decimal('16.60')],
...: [Decimal('1112'), 'Customer2', Decimal('67.46')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('233.42')]]
In [4]: df = pd.DataFrame(data, columns=['customer_id', 'customer_name', 'invoice_balance'])
In [5]: df
Out[5]:
   customer_id customer_name invoice_balance
0         1111     Customer1           31.50
1         1112     Customer2           30.88
2         1111     Customer1           90.00
3         1113     Customer3           30.88
4         1112     Customer2           30.88
5         1112     Customer2           15.00
6         1111     Customer1           37.93
7         1113     Customer3           30.88
8         1111     Customer1           30.88
9         1111     Customer1           30.88
10        1113     Customer3           26.60
11        1113     Customer3           44.22
12        1112     Customer2           32.93
13        1111     Customer1           20.00
14        1113     Customer3           38.14
15        1111     Customer1           16.60
16        1112     Customer2           67.46
17        1111     Customer1           30.88
18        1113     Customer3           30.88
19        1111     Customer1          233.42
Now you can use a SQL-esque declarative approach with pandas:
In [6]: df.groupby(['customer_id', 'customer_name'])['invoice_balance'].sum()
Out[6]:
customer_id  customer_name
1111         Customer1        522.09
1112         Customer2        177.15
1113         Customer3        201.60
Name: invoice_balance, dtype: object
Of course, I probably wouldn't add pandas as a dependency to your project just for this, but it is possible.
Solution 3:
# always use decimal type for money, not float
from decimal import Decimal
# input data
data = [
[ 1, 'Bob', Decimal('1.23') ],
[ 2, 'Alice', Decimal('2.34') ],
[ 1, 'Bob', Decimal('3.45') ],
[ 2, 'Alice', Decimal('4.56') ],
]
# sum balances into buckets by customer number
buckets = {}
for num, name, balance in data:
    buckets.setdefault(num, [num, name, Decimal('0.00')])[2] += balance
# print the result
for bucket in buckets.values():
    print(bucket)
Output:
[1, 'Bob', Decimal('4.68')]
[2, 'Alice', Decimal('6.90')]
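Since the question asks for a "group by" function, the standard library's itertools.groupby is another option worth knowing. The sketch below is an alternative to the dict-based loops above, not part of any of the original answers. It reuses the Bob/Alice sample rows, and the variable names key and grouped are chosen only for illustration. Note that groupby merges only adjacent rows, so the data has to be sorted by the grouping key first.

```python
from decimal import Decimal
from itertools import groupby
from operator import itemgetter

data = [
    [1, 'Bob', Decimal('1.23')],
    [2, 'Alice', Decimal('2.34')],
    [1, 'Bob', Decimal('3.45')],
    [2, 'Alice', Decimal('4.56')],
]

key = itemgetter(0, 1)  # group on the (number, name) pair

# groupby only merges adjacent rows, so sort by the same key first.
grouped = [
    [num, name, sum((row[2] for row in rows), Decimal('0'))]
    for (num, name), rows in groupby(sorted(data, key=key), key=key)
]
print(grouped)
```

The sort makes this O(n log n), whereas the dict-based approaches are a single O(n) pass, so the dict versions are usually the simpler choice unless the data is already sorted.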