python - Combine elements in list of Tuples?
I'm working on a program that takes in an IMDB text file and outputs the top actors (by movie appearances) based on a user input N.
However, I'm running into an issue where slots are being taken by actors who are in the same number of movies, which I need to avoid. Rather, if 2 actors are in 5 movies, for example, the number 5 should appear, and the actors' names should be combined and separated by a semicolon.
I've tried multiple workarounds and nothing has worked yet. Any suggestions?
    if __name__ == "__main__":
        imdb_file = raw_input("enter name of imdb file ==> ").strip()
        print imdb_file
        n = input('enter number of top individuals ==> ')
        print n

        # Map each actor to the set of movies they appear in
        actors_to_movies = {}
        for line in open(imdb_file):
            words = line.strip().split('|')
            actor = words[0].strip()
            movie = words[1].strip()
            if not actor in actors_to_movies:
                actors_to_movies[actor] = set()
            actors_to_movies[actor].add(movie)
            movie_list = sorted(list(actors_to_movies[actor]))

        # Arranges the dictionary into a list of tuples
        d = [(x, actors_to_movies[x]) for x in actors_to_movies]
        descending = sorted(d, key=lambda x: len(x[1]), reverse=True)

        # Prints the tuples in descending order n number of times (user input)
        for i in range(n):
            print str(len(descending[i][1])) + ':', descending[i][0]
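There is a useful function for this: itertools.groupby.
It lets you split a list into groups by a key; note that it only groups adjacent items, so the list has to be sorted by that key first. Using it, you can write a function that prints the top actors: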
    import itertools

    def print_top_actors(actor_info_list, top=5):
        """
        :param: actor_info_list should contain tuples of (actor_name, movie_count)
        """
        # Sort by movie count, highest first, so equal counts end up adjacent
        actor_info_list.sort(key=lambda x: x[1], reverse=True)
        # Group by movie count; without a key, groupby would compare whole tuples
        for i, (movie_count, actor_iter) in enumerate(
                itertools.groupby(actor_info_list, key=lambda x: x[1])):
            if i >= top:
                break
            print movie_count, ';'.join(actor for actor, _ in actor_iter)
And an example of usage:
    >>> print_top_actors(
    ...     [
    ...         ("dicaprio", 100500),
    ...         ("pitt", 100500),
    ...         ("foo", 10),
    ...         ("bar", 10),
    ...         ("baz", 10),
    ...         ("qux", 3),
    ...         ("lol", 1)
    ...     ], top = 3)
    100500 dicaprio;pitt
    10 foo;bar;baz
    3 qux
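To connect this to the actors_to_movies dictionary built in the question, something like the following might work. It is only a sketch, written in Python 3 syntax (the code in this thread is Python 2), and it returns the groups instead of printing them. The function name top_actor_groups and the sample data are made up for illustration; the secondary sort by actor name is an added assumption so that tied actors come out in a predictable order.

```python
import itertools


def top_actor_groups(actors_to_movies, top=5):
    """Return up to `top` (movie_count, [actor names]) pairs, highest count first.

    `actors_to_movies` maps an actor name to a set of movie titles,
    as in the question. This helper is hypothetical, not part of any library.
    """
    # Reduce each actor to an (actor, movie_count) pair.
    counts = [(actor, len(movies)) for actor, movies in actors_to_movies.items()]
    # Sort by count descending, then by name (an assumption, for stable ties).
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    # groupby only merges adjacent items, which is why the sort comes first.
    grouped = itertools.groupby(counts, key=lambda pair: pair[1])
    # islice stops after `top` groups, so ties never use up extra slots.
    return [
        (movie_count, [actor for actor, _ in group])
        for movie_count, group in itertools.islice(grouped, top)
    ]


# Made-up sample data, only for illustration.
sample = {
    "actor a": {"m1", "m2"},
    "actor b": {"m3", "m4"},
    "actor c": {"m1"},
}

for movie_count, actors in top_actor_groups(sample, top=2):
    print("{}: {}".format(movie_count, "; ".join(actors)))
```

With the sample data this prints "2: actor a; actor b" followed by "1: actor c", so the two actors tied at 2 movies share a single slot.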