OiO.lk Community platform!

Oio.lk is an excellent forum for developers, providing a wide range of resources, discussions, and support for those in the developer community. Join oio.lk today to connect with like-minded professionals, share insights, and stay updated on the latest trends and technologies in the development field.
  You need to log in or register to access the solved answers to this problem.
  • You have reached the maximum number of guest views allowed
  • Please register below to remove this limitation

How can I compare a value in one column to all values that are BEFORE it in another column to find the number of unique values that are less than?

  • Thread starter Thread starter AmirX
  • Start date Start date
A

AmirX

Guest
This is my DataFrame:

Code:
import pandas as pd
df = pd.DataFrame(
    {
        'a': [100, 100, 105, 106, 106, 107, 108, 109],
        'b': [99, 100, 110, 107, 100, 110, 120, 106],
    }
)

Expected output is creating column x:

Code:
     a    b    x
0  100   99    0
1  100  100    1
2  105  110    2
3  106  107    3
4  106  100    1
5  107  110    4
6  108  120    5
7  109  106    6

Logic:

This is somehow an extension to this answer. I explain the logic by examples and I start from row 1:

For row 1, the b column value is 100. Then in order to get x for row 1 this value should be compared with all UNIQUE values in a that are on the same row or before it to find out how many values in a are less than or equal to it. The only unique value that is on the same row or before it is 100, so 1 is chosen for x.

For row 2, there are two unique values in a that are less than or equal to 110 which are 100, 105.

The logic is the same for the rest of rows.

This is my attempt based on the linked answer but it does not work:

Code:
t = df.a.unique()
m1 = np.arange(len(t))[:,None] >= np.arange(len(t))

h = df['b'].to_numpy()
m2 = t <= h[:, None]
<p>This is my DataFrame:</p>
<pre><code>import pandas as pd
df = pd.DataFrame(
{
'a': [100, 100, 105, 106, 106, 107, 108, 109],
'b': [99, 100, 110, 107, 100, 110, 120, 106],
}
)
</code></pre>
<p>Expected output is creating column <code>x</code>:</p>
<pre><code> a b x
0 100 99 0
1 100 100 1
2 105 110 2
3 106 107 3
4 106 100 1
5 107 110 4
6 108 120 5
7 109 106 6
</code></pre>
<p>Logic:</p>
<p>This is somehow an extension to this <a href="https://stackoverflow.com/a/78515922/10200497">answer</a>. I explain the logic by examples and I start from row <code>1</code>:</p>
<p>For row <code>1</code>, the <code>b</code> column value is 100. Then in order to get <code>x</code> for row <code>1</code> this value should be compared with all UNIQUE values in <code>a</code> that are on the same row or before it to find out how many values in <code>a</code> are less than or equal to it. The only unique value that is on the same row or before it is 100, so 1 is chosen for <code>x</code>.</p>
<p>For row <code>2</code>, there are two unique values in <code>a</code> that are less than or equal to 110 which are 100, 105.</p>
<p>The logic is the same for the rest of rows.</p>
<p>This is my attempt based on the linked answer but it does not work:</p>
<pre><code>t = df.a.unique()
m1 = np.arange(len(t))[:,None] >= np.arange(len(t))

h = df['b'].to_numpy()
m2 = t <= h[:, None]
</code></pre>
 

Latest posts

Top