python - Collecting tweets using screen names and saving them using Tweepy
I have a list of Twitter screen names and want to collect 3200 tweets per screen name. Below is the code I have adapted from https://gist.github.com/yanofsky/5436496:
    #initialize a list to hold all the tweepy tweets
    alltweets = []

    #screen names
    r=['user_a', 'user_b', 'user_c']

    #saving tweets
    writefile=open("tweets.csv", "wb")
    w=csv.writer(writefile)

    for i in r:
        #make initial request for the most recent tweets (200 is the maximum allowed count)
        new_tweets = api.user_timeline(screen_name = i, count=200)

        #save the most recent tweets
        alltweets.extend(new_tweets)

        #save the id of the oldest tweet less 1
        oldest = alltweets[-1].id - 1

        #keep grabbing tweets until there are no tweets left to grab
        while len(new_tweets) > 0:
            print "getting tweets before %s" % (oldest)

            #all subsequent requests use the max_id param to prevent duplicates
            new_tweets = api.user_timeline(screen_name = i[0],count=200,max_id=oldest)

            #save the most recent tweets
            alltweets.extend(new_tweets)

            #update the id of the oldest tweet less 1
            oldest = alltweets[-1].id - 1

            print "...%s tweets downloaded so far" % (len(alltweets))

        #write to the csv
        for tweet in alltweets:
            w.writerow([i, tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")])

    writefile.close()
At the end, the final CSV file contains 3200 tweets for user_a, 6400 tweets for user_b, and 9600 tweets for user_c. Something is not correct in the code above: there should be 3200 tweets for each user. Can anyone point out what is wrong in the code? Thanks.
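Because you are using .extend() to add to alltweets, every iteration of the for loop adds the next user's tweets on top of the previous users' tweets. The list is created once, before the loop, so it is never emptied between users. You want to clear alltweets at the start of each for loop iteration: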
    for i in r:
        alltweets = []
        ...
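To see the effect of the reset without Twitter credentials, here is a minimal sketch of the same paging loop. It is an illustration only: FakeTweet, fake_user_timeline, and collect_all are hypothetical names that do not appear in the question, and the stub simply pretends every user has 450 tweets. The loop is also rearranged slightly so that an empty first response does not hit alltweets[-1].

```python
from collections import namedtuple

# Hypothetical stand-in for a Tweepy status object; only .id is used here.
FakeTweet = namedtuple("FakeTweet", ["id"])


def fake_user_timeline(screen_name, count=200, max_id=None):
    """Hypothetical stub: every user has 450 tweets with ids 450 down to 1."""
    newest = 450 if max_id is None else min(max_id, 450)
    floor = max(newest - count, 0)
    return [FakeTweet(id=n) for n in range(newest, floor, -1)]


def collect_all(fetch, screen_names):
    """Return {screen_name: tweets}, starting from an empty list for each user."""
    per_user = {}
    for name in screen_names:
        # Reset here, so the previous users' tweets are not carried over.
        alltweets = []
        new_tweets = fetch(screen_name=name, count=200)
        while len(new_tweets) > 0:
            alltweets.extend(new_tweets)
            # Page backwards: ask only for tweets older than the oldest one seen.
            oldest = alltweets[-1].id - 1
            # Pass the full screen name on every request (not name[0]).
            new_tweets = fetch(screen_name=name, count=200, max_id=oldest)
        per_user[name] = alltweets
    return per_user


result = collect_all(fake_user_timeline, ["user_a", "user_b", "user_c"])
counts = {name: len(tweets) for name, tweets in result.items()}
```

Each entry in counts comes out the same size, instead of growing from one user to the next. With Tweepy, the api.user_timeline call from the question could presumably be passed as fetch in place of the stub, since it is called there with the same screen_name, count, and max_id keywords.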