"GROUP BY" Function In Python For Array
I've tried pandas and NumPy but haven't gotten the result I want. I have a simple array that consists of several rows like this: [[customer_number, customer_name, invoice balance],[c
Solution 1:
You can build a dict keyed by the (number, name) tuple of each account, then loop through the rows and accumulate the sums in the dict. Afterward, you can convert the dict's items() back into a list (here l is the input list of rows):
accounts = {}
for num, name, balance in l:
    # start each new (num, name) key at 0, then add this row's balance
    accounts[(num, name)] = accounts.get((num, name), 0) + balance
result = [[num, name, balance] for (num, name), balance in accounts.items()]
result will be:
[[Decimal('1111'), 'Customer1', Decimal('522.09')],
[Decimal('1112'), 'Customer2', Decimal('177.15')],
[Decimal('1113'), 'Customer3', Decimal('201.60')]]
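As a self-contained variation on the loop above, here is a minimal sketch that uses collections.defaultdict instead of dict.get. The function name group_sum and the five sample rows (taken from the first rows of the data in Solution 2) are chosen for illustration only; they are not part of the original answer.

```python
from collections import defaultdict
from decimal import Decimal

# Illustrative sample rows: [customer_number, customer_name, invoice_balance]
rows = [
    [Decimal('1111'), 'Customer1', Decimal('31.50')],
    [Decimal('1112'), 'Customer2', Decimal('30.88')],
    [Decimal('1111'), 'Customer1', Decimal('90.00')],
    [Decimal('1113'), 'Customer3', Decimal('30.88')],
    [Decimal('1112'), 'Customer2', Decimal('30.88')],
]


def group_sum(rows):
    """Sum the balance column per (number, name) pair (hypothetical helper)."""
    # Decimal() returns Decimal('0'), so missing keys start at zero
    totals = defaultdict(Decimal)
    for num, name, balance in rows:
        totals[(num, name)] += balance
    # dicts keep insertion order (Python 3.7+), so groups appear in
    # the order each customer was first seen
    return [[num, name, total] for (num, name), total in totals.items()]


result = group_sum(rows)
for row in result:
    print(row)
# [Decimal('1111'), 'Customer1', Decimal('121.50')]
# [Decimal('1112'), 'Customer2', Decimal('61.76')]
# [Decimal('1113'), 'Customer3', Decimal('30.88')]
```

Starting from Decimal() rather than the int 0 keeps every total a Decimal, even for a customer with a single row.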
Solution 2:
Just to show that you can also do this with pandas:
In [1]: import pandas as pd
In [2]: from decimal import Decimal
In [3]: data = [[Decimal('1111'), 'Customer1', Decimal('31.50')],
...: [Decimal('1112'), 'Customer2', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('90.00')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1112'), 'Customer2', Decimal('30.88')],
...: [Decimal('1112'), 'Customer2', Decimal('15.00')],
...: [Decimal('1111'), 'Customer1', Decimal('37.93')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1113'), 'Customer3', Decimal('26.60')],
...: [Decimal('1113'), 'Customer3', Decimal('44.22')],
...: [Decimal('1112'), 'Customer2', Decimal('32.93')],
...: [Decimal('1111'), 'Customer1', Decimal('20.00')],
...: [Decimal('1113'), 'Customer3', Decimal('38.14')],
...: [Decimal('1111'), 'Customer1', Decimal('16.60')],
...: [Decimal('1112'), 'Customer2', Decimal('67.46')],
...: [Decimal('1111'), 'Customer1', Decimal('30.88')],
...: [Decimal('1113'), 'Customer3', Decimal('30.88')],
...: [Decimal('1111'), 'Customer1', Decimal('233.42')]]
In [4]: df = pd.DataFrame(data, columns=['customer_id', 'customer_name', 'invoice_balance'])
In [5]: df
Out[5]:
    customer_id  customer_name  invoice_balance
0          1111      Customer1            31.50
1          1112      Customer2            30.88
2          1111      Customer1            90.00
3          1113      Customer3            30.88
4          1112      Customer2            30.88
5          1112      Customer2            15.00
6          1111      Customer1            37.93
7          1113      Customer3            30.88
8          1111      Customer1            30.88
9          1111      Customer1            30.88
10         1113      Customer3            26.60
11         1113      Customer3            44.22
12         1112      Customer2            32.93
13         1111      Customer1            20.00
14         1113      Customer3            38.14
15         1111      Customer1            16.60
16         1112      Customer2            67.46
17         1111      Customer1            30.88
18         1113      Customer3            30.88
19         1111      Customer1           233.42
Now you can use a SQL-esque declarative approach with pandas:
In [6]: df.groupby(['customer_id', 'customer_name'])['invoice_balance'].sum()
Out[6]:
customer_id  customer_name
1111         Customer1        522.09
1112         Customer2        177.15
1113         Customer3        201.60
Name: invoice_balance, dtype: object
Of course, I probably wouldn't add pandas as a dependency to your project just for this, but it is possible.
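If pandas is too heavy a dependency, the standard library's itertools.groupby gives a similar group-then-aggregate shape. The following is only a sketch under the assumption that the rows can be sorted by the grouping key first (groupby only merges adjacent rows with equal keys); the names key and grouped and the sample rows are illustrative, not from the original answer.

```python
from decimal import Decimal
from itertools import groupby
from operator import itemgetter

# Illustrative sample rows: [customer_id, customer_name, invoice_balance]
data = [
    [Decimal('1111'), 'Customer1', Decimal('31.50')],
    [Decimal('1112'), 'Customer2', Decimal('30.88')],
    [Decimal('1111'), 'Customer1', Decimal('90.00')],
    [Decimal('1113'), 'Customer3', Decimal('30.88')],
    [Decimal('1112'), 'Customer2', Decimal('30.88')],
]

# Group on the first two columns, like GROUP BY customer_id, customer_name
key = itemgetter(0, 1)

grouped = [
    # sum() starts from Decimal('0') so the total stays a Decimal
    [num, name, sum((row[2] for row in group), Decimal('0'))]
    # sorting first is required: groupby only groups consecutive equal keys
    for (num, name), group in groupby(sorted(data, key=key), key=key)
]

for row in grouped:
    print(row)
# [Decimal('1111'), 'Customer1', Decimal('121.50')]
# [Decimal('1112'), 'Customer2', Decimal('61.76')]
# [Decimal('1113'), 'Customer3', Decimal('30.88')]
```

Unlike the dict-based loop, this returns the groups in sorted key order rather than in order of first appearance, and the sort makes it O(n log n).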
Solution 3:
# always use decimal type for money, not float
from decimal import Decimal
# input data
data = [
    [1, 'Bob', Decimal('1.23')],
    [2, 'Alice', Decimal('2.34')],
    [1, 'Bob', Decimal('3.45')],
    [2, 'Alice', Decimal('4.56')],
]
# sum balances into buckets by customer number
buckets = {}
for num, name, balance in data:
    # setdefault creates the [num, name, 0.00] bucket the first time num is seen
    buckets.setdefault(num, [num, name, Decimal('0.00')])[2] += balance
# print the result
for bucket in buckets.values():
    print(bucket)
Output:
[1, 'Bob', Decimal('4.68')]
[2, 'Alice', Decimal('6.90')]
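The first comment in the code above advises using Decimal rather than float for money. A small illustration of why, with variable names chosen here for the example: binary floats cannot represent most decimal fractions exactly, so sums can drift, while Decimal values built from strings add exactly.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 are not stored exactly
float_total = 0.1 + 0.2
print(float_total == 0.3)   # False
print(float_total)          # 0.30000000000000004

# Decimal built from strings keeps exact decimal digits
decimal_total = Decimal('0.10') + Decimal('0.20')
print(decimal_total == Decimal('0.30'))  # True
print(decimal_total)                     # 0.30
```

Note that the Decimal values are constructed from strings; building them from float literals such as Decimal(0.1) would carry the float's rounding error into the Decimal.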