python - Collecting tweets using screen names and saving them using Tweepy
I have a list of Twitter screen names and want to collect 3200 tweets per screen name. Below is the code I have adapted from https://gist.github.com/yanofsky/5436496:

    #initialize a list to hold all the tweepy tweets
    alltweets = []

    #screen names
    r=['user_a', 'user_b', 'user_c']

    #saving tweets
    writefile=open("tweets.csv", "wb")
    w=csv.writer(writefile)

    for i in r:
        #make initial request for most recent tweets (200 is the maximum allowed count)
        new_tweets = api.user_timeline(screen_name = i, count=200)

        #save most recent tweets
        alltweets.extend(new_tweets)

        #save the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1

        #keep grabbing tweets until there are no tweets left to grab
        while len(new_tweets) > 0:
            print "getting tweets before %s" % (oldest)

            #all subsequent requests use the max_id param to prevent duplicates
            new_tweets = api.user_timeline(screen_name = i[0],count=200,max_id=oldest)

            #save most recent tweets
            alltweets.extend(new_tweets)

            #update the id of the oldest tweet less one
            oldest = alltweets[-1].id - 1

            print "...%s tweets downloaded so far" % (len(alltweets))

        #write the csv
        for tweet in alltweets:
            w.writerow([i, tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")])

    writefile.close()

At the end, the final CSV file contains 3200 tweets for user_a, 6400 tweets for user_b, and 9600 tweets for user_c. Something is not correct in the above code: there should be 3200 tweets for each user. Can anyone point out what is wrong in the code? Thanks.
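Because you are using .extend() to add to alltweets, and the list is created only once before the loop, every iteration of the for loop adds the next user's tweets on top of the previous users' tweets. You want to clear alltweets at the start of each for loop iteration: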

    for i in r:
        alltweets = []
        ...
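
One way to make the reset hard to forget is to build the list inside a function, so each user starts from an empty list. The following is only a minimal sketch in Python 3 (the question's code is Python 2). The names `collect_user_tweets`, `write_all` and `fake_fetch_page` are hypothetical, and `fake_fetch_page` is a stand-in that pretends every user has 450 tweets so the example runs without network access. In real use, `fetch_page` would presumably wrap `api.user_timeline(screen_name=..., count=..., max_id=...)`.

```python
import csv
import io


def collect_user_tweets(fetch_page, screen_name, page_size=200):
    """Page backwards through one user's timeline and return all tweets found."""
    tweets = []      # local list: starts empty for every user
    max_id = None    # None means "start from the newest tweet"
    while True:
        page = fetch_page(screen_name, page_size, max_id)
        if not page:
            break
        tweets.extend(page)
        # ask only for tweets older than the oldest one seen so far
        max_id = page[-1]["id"] - 1
    return tweets


def fake_fetch_page(screen_name, count, max_id):
    """Hypothetical stand-in for the API: each user has tweets with ids 450..1."""
    newest = 450 if max_id is None else min(max_id, 450)
    oldest_exclusive = max(newest - count, 0)
    return [
        {"id": i, "text": "%s tweet %d" % (screen_name, i)}
        for i in range(newest, oldest_exclusive, -1)
    ]


def write_all(screen_names, fetch_page, out):
    """Write one CSV row per tweet and return the per-user tweet counts."""
    writer = csv.writer(out)
    counts = {}
    for name in screen_names:
        tweets = collect_user_tweets(fetch_page, name)
        counts[name] = len(tweets)
        for tweet in tweets:
            writer.writerow([name, tweet["id"], tweet["text"]])
    return counts


buffer = io.StringIO()
counts = write_all(["user_a", "user_b", "user_c"], fake_fetch_page, buffer)
print(counts)
```

With the stand-in data, every user ends up with the same count, rather than a count that grows with each user as in the question.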