Pandas: Group by a field that is a list [duplicate]

  • Thread starter: phibel (Guest)

I have a class for movies:

Code:
class Movie:
    def __init__(self,
                 title: str,
                 director: str,
                 actors: list[str]):
        self.title = title
        self.director = director
        self.actors: list[str] = actors

And I've created a list with 3 example movies:

Code:
movies = [Movie('Barton Fink', 'Joel Coen', ['John Turturro', 'John Goodman', 'Judy Davis']),
          Movie('The Big Lebowski', 'Joel Coen', ['Jeff Bridges', 'John Goodman', 'Steve Buscemi', 'John Turturro']),
          Movie('The Big Easy', 'Jim McBride', ['Dennis Quaid', 'Ellen Barkin', 'Ned Beatty']),
         ]

Now I want to use Pandas to get the number of occurrences of all actors. Something like this:

Code:
John Goodman: 2
John Turturro: 2
Judy Davis: 1
...

For the directors it works this way:

Code:
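from pandas import DataFrame

# One row per movie; vars(m) turns each Movie into a dict of its attributes
df = DataFrame([vars(m) for m in movies])
grouped = df.groupby(['director']).size().sort_values(ascending=False)
print(grouped)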

But it does not work for the actors:

Code:
df = DataFrame([vars(m) for m in movies])
# Each cell in 'actors' is a list, so the groupby below raises the TypeError
grouped = df.groupby(['actors']).size().sort_values(ascending=False)
print(grouped)

Error: (<class 'TypeError'>, TypeError("unhashable type: 'list'"), <traceback object at 0x00000274C91E4340>)

How can I achieve grouping by actors?
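
One possible approach, offered as a sketch and not as the thread's accepted answer: groupby needs hashable keys, and a Python list is not hashable, which is what the TypeError reports. DataFrame.explode gives each list element its own row, after which the same groupby pattern used for the directors should work. The rows list below is a hypothetical stand-in for [vars(m) for m in movies], so the snippet runs without the Movie class.

```python
import pandas as pd

# Hypothetical stand-in for [vars(m) for m in movies] from the question.
rows = [
    {"title": "Barton Fink", "director": "Joel Coen",
     "actors": ["John Turturro", "John Goodman", "Judy Davis"]},
    {"title": "The Big Lebowski", "director": "Joel Coen",
     "actors": ["Jeff Bridges", "John Goodman", "Steve Buscemi", "John Turturro"]},
    {"title": "The Big Easy", "director": "Jim McBride",
     "actors": ["Dennis Quaid", "Ellen Barkin", "Ned Beatty"]},
]
df = pd.DataFrame(rows)

# explode() puts every list element on its own row, so each 'actors'
# cell holds a single hashable string instead of a list.
exploded = df.explode("actors")

# Same pattern as for the directors: count rows per actor, most frequent first.
actor_counts = exploded.groupby("actors").size().sort_values(ascending=False)
print(actor_counts)
```

With the three example movies, John Goodman and John Turturro each come out at 2 and every other actor at 1. If only the counts are needed, exploded["actors"].value_counts() should give the same numbers in one call.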