Polars selectors alias with when/then/otherwise

  • Thread starter: levant pied

Say I have this:

Code:
import numpy
import polars

# Three integer columns of 10 random values each, drawn from [10, 99)
df = polars.DataFrame(dict(
  j=numpy.random.randint(10, 99, 10),
  k=numpy.random.randint(10, 99, 10),
  l=numpy.random.randint(10, 99, 10),
  ))

print(df)

 j (i64)  k (i64)  l (i64)
 32       82       34
 67       40       53
 11       81       86
 10       13       36
 70       80       62
 91       31       90
 18       59       51
 98       67       92
 23       13       25
 57       78       74
shape: (10, 3)

and I want to apply the same when/then/otherwise condition on multiple columns:

Code:
dfj = (df
  .select(
    polars
      .when(polars.selectors.numeric() < 50)
      .then(polars.lit(1))
      .otherwise(polars.lit(2))
    )
  )

This fails with:

Code:
polars.exceptions.DuplicateError: the name: 'literal' is duplicate

How do I make this use the currently selected column as the alias? I.e. I want the equivalent of this:

Code:
dfj = (df
  .select(
    polars
      .when(polars.col(c) < 50)
      .then(polars.lit(1))
      .otherwise(polars.lit(2))
      .alias(c)
    for c in df.columns
    )
  )

print(dfj)

 j (i32)  k (i32)  l (i32)
 1        2        1
 2        1        2
 1        2        2
 1        1        1
 2        2        2
 2        1        2
 1        2        2
 2        2        2
 1        1        1
 2        2        2
shape: (10, 3)