How big can a Spark Streaming window be?
I have a data flow that needs to be computed, and I'm thinking of using a Spark Streaming job. There is one thing I'm not sure about, and it worries me.
My requirements:
Data arrives as CSV files every 5 minutes, and I need reports over the data of the most recent 5 minutes, 1 hour, and 1 day. If I set this up as a Spark Streaming calculation, I need a batch interval of 5 minutes and two windows of 1 hour and 1 day.
Every 5 minutes about 1 GB of data arrives, so the 1-hour window has to process 12 GB (60/5) of data and the 1-day window 288 GB (24*60/5).
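The sizing above can be sketched as a back-of-the-envelope calculation in plain Python. The 1 GB-per-batch figure comes from my measurements; this is just arithmetic, not anything Spark-specific:

```python
# Rough sizing for the sliding windows described above.
# Assumes a steady 1 GB of CSV data per 5-minute batch.
BATCH_MINUTES = 5
GB_PER_BATCH = 1

def window_volume_gb(window_minutes: int) -> int:
    """Total data volume a sliding window must cover, in GB."""
    batches = window_minutes // BATCH_MINUTES
    return batches * GB_PER_BATCH

print(window_volume_gb(60))       # 1-hour window -> 12 GB
print(window_volume_gb(24 * 60))  # 1-day window  -> 288 GB
```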
I have no experience with Spark, which is what worries me. My questions:
Can Spark handle such a big window?
How much RAM is needed to compute over 288 GB of data? More than 288 GB of RAM? (I know it may depend on disk I/O, CPU, and the calculation pattern, but I'd like a rough estimate based on experience.)
If computing over 1 day / 1 hour of data is too expensive in a stream, is there a better approach?