operation | txn_quantity | cumulative_quantity | txn_amount | cost_of_purchase | sell_ratio | NET_COST |
---|---|---|---|---|---|---|
buy | 250 | 250 | 5000 | 5000 | 0 | 0 |
sell | 100 | 150 | 3000 | 0 | 0.4 | 0 |
buy | 150 | 300 | 1500 | 1500 | 0 | 0 |
sell | 225 | 75 | 4000 | 0 | 0.75 | 0 |
Above I have a simulated transaction table. My ultimate goal is to derive the average cost for each row, and the most accurate way to get this is (net cost / cumulative quantity).
In order to calculate net cost, I need to do a conditional running sum:
- If it’s a buy, add cost_of_purchase
- If it’s a sell, minus (previous row net cost * sell ratio)
Below is the expected output of NET_COST, which then can be used to derive AVERAGE_COST
operation | txn_quantity | cumulative_quantity | txn_amount | cost_of_purchase | sell_ratio | NET_COST | AVERAGE_COST |
---|---|---|---|---|---|---|---|
buy | 250 | 250 | 5000 | 5000 | 0 | 5000 | 20 |
sell | 100 | 150 | 3000 | 0 | 0.4 | 3000 | 20 |
buy | 150 | 300 | 1500 | 1500 | 0 | 4500 | 15 |
sell | 225 | 75 | 4000 | 0 | 0.75 | 1125 | 15 |
For further clarity, here’s what’s happening in each cell for NET_COST
NET_COST |
---|
0 + 5000 |
5000 – (5000 * 0.4) |
3000 + 1500 |
4500 – (4500 * 0.75) |
Is this achievable in SQL (Impala / Hive)?
You need to sign in to view this answers
Leave feedback about this