How do I write a correct micro-benchmark in Java?
How do you write (and run) a correct micro-benchmark in Java?
I'm looking for some code samples and comments illustrating the various things to think about.
Example: should the benchmark measure time/iteration or iterations/time, and why?
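To make the two reporting conventions concrete, here is a minimal sketch (the kernel doWork() is purely illustrative, not from the original question) showing that a single raw measurement can be reported either as time/iteration or as iterations/time; the two figures are reciprocals of the same data.

// Sketch only: doWork() is a hypothetical test kernel.
public class ReportingSketch {
    static long doWork() {
        long acc = 0;
        for (int i = 0; i < 1_000; i++) acc += i;
        return acc;
    }

    public static void main(String[] args) {
        final int iterations = 1_000_000;
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += doWork();
        long elapsedNanos = System.nanoTime() - start;

        double nanosPerOp = (double) elapsedNanos / iterations;   // time/iteration
        double opsPerSecond = iterations / (elapsedNanos / 1e9);  // iterations/time
        System.out.println("ns/op = " + nanosPerOp);
        System.out.println("ops/s = " + opsPerSecond);
        System.out.println("sink  = " + sink);  // keep the result live so the JIT cannot discard the work
    }
}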
Tips on writing micro-benchmarks, from the creators of Java HotSpot:
Rule 0: Read a reputable paper on JVMs and micro-benchmarking. A good one is Brian Goetz, 2005. Do not expect too much from micro-benchmarks; they measure only a limited range of JVM performance characteristics.
Rule 1: Always include a warmup phase which runs your test kernel all the way through, enough to trigger all initializations and compilations before the timing phase(s). (Fewer iterations is OK in the warmup phase. The rule of thumb is several tens of thousands of inner-loop iterations.)
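A minimal warmup-then-time sketch of this structure, assuming a hypothetical kernel testKernel():

// Sketch only: testKernel() stands in for the code under test.
public class WarmupSketch {
    static int testKernel(int x) {
        return Integer.bitCount(x * 31 + 7);
    }

    public static void main(String[] args) {
        int sink = 0;

        // Warmup phase: tens of thousands of inner-loop iterations, enough to
        // trigger class initialization and JIT compilation of testKernel.
        for (int i = 0; i < 50_000; i++) sink += testKernel(i);

        // Timing phase: same kernel, now measured.
        final int iterations = 10_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += testKernel(i);
        long elapsed = System.nanoTime() - start;

        System.out.println("ns/op = " + (double) elapsed / iterations);
        System.out.println("sink  = " + sink);  // keep the result live
    }
}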
Rule 2: Always run with -XX:+PrintCompilation, -verbose:gc, etc., so you can verify that the compiler and other parts of the JVM are not doing unexpected work during your timing phase.
Rule 2.1: Print messages at the beginning and end of the timing and warmup phases, so you can verify that there is no output from Rule 2 during the timing phase.
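A sketch of Rules 2 and 2.1 combined (the workload is illustrative): phase-marker lines make it easy to see whether any compilation or GC output from the diagnostic flags lands inside the timing window.

// Run with, for example:  java -XX:+PrintCompilation -verbose:gc PhaseMarkers
public class PhaseMarkers {
    public static void main(String[] args) {
        long sink = 0;

        System.out.println("=== warmup begin ===");
        for (int i = 0; i < 50_000; i++) sink += i * 31L;
        System.out.println("=== warmup end ===");

        System.out.println("=== timing begin ===");
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) sink += i * 31L;
        long elapsed = System.nanoTime() - start;
        System.out.println("=== timing end ===");

        System.out.println("elapsed ns = " + elapsed + " (sink=" + sink + ")");
    }
}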
Rule 3: Be aware of the difference between -client and -server, and between OSR and regular compilations. The -XX:+PrintCompilation flag reports OSR compilations with an at-sign denoting the non-initial entry point, for example: Trouble$1::run @ 2 (41 bytes). Prefer server to client, and regular to OSR, if you are after the best performance.
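One common way to favor a regular (method-entry) compilation over an OSR compilation is to keep the hot loop inside a method that is itself called many times, rather than running one very long loop in main. A sketch under that assumption (names are illustrative, not from the original):

// Run with -server and -XX:+PrintCompilation to see how kernel() is compiled.
public class AvoidOsrSketch {
    // Invoked many times, so it becomes eligible for a regular compilation
    // instead of only an on-stack-replacement (OSR) compilation of one long loop.
    static long kernel(int reps) {
        long acc = 0;
        for (int i = 0; i < reps; i++) acc += i * 31L;
        return acc;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int call = 0; call < 20_000; call++) {  // many short calls, not one huge loop
            sink += kernel(1_000);
        }
        System.out.println(sink);  // keep the result live
    }
}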
Rule 4: Be aware of initialization effects. Do not print for the first time during your timing phase, since printing loads and initializes classes. Do not load new classes outside of the warmup phase (or the final reporting phase), unless you are testing class loading specifically (and in that case load only the test classes). Rule 2 is your first line of defense against such effects.
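A small sketch of Rule 4 (the workload and the pre-loaded class are illustrative): force class loading and the first print to happen in the warmup and reporting phases, never inside the timing window.

public class InitEffectsSketch {
    public static void main(String[] args) throws Exception {
        // Warmup/setup: touch everything the timing phase will need.
        System.out.println("warmup: first print happens here, not in the timing phase");
        Class.forName("java.util.ArrayList");  // illustrative: pre-load classes used later

        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) sink += i;  // no printing, no new classes here
        long elapsed = System.nanoTime() - start;

        // Reporting phase: printing is fine again.
        System.out.println("elapsed ns = " + elapsed + " (sink=" + sink + ")");
    }
}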
Rule 5: Be aware of deoptimization and recompilation effects. Do not take any code path for the first time in the timing phase, because the compiler may junk and recompile the code based on an earlier optimistic assumption that the path was not going to be used at all. Rule 2 is your first line of defense against such effects.
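A sketch of Rule 5 (names and workload are illustrative): exercise every code path during warmup so the JIT does not compile on an optimistic assumption and then deoptimize and recompile when a "rare" path is first taken inside the timing phase.

public class DeoptSketch {
    static long kernel(boolean rare) {
        return rare ? 31L : 17L;  // two paths; both must be seen during warmup
    }

    public static void main(String[] args) {
        long sink = 0;

        // Warmup: hit both branches.
        for (int i = 0; i < 50_000; i++) sink += kernel((i & 1) == 0);

        // Timing: no path is taken here for the first time.
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) sink += kernel((i & 1) == 0);
        long elapsed = System.nanoTime() - start;
        System.out.println("elapsed ns = " + elapsed + " (sink=" + sink + ")");
    }
}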
Rule 6: Use appropriate tools to read the compiler's mind, and expect to be surprised by the code it produces. Inspect the code yourself before forming theories about what makes something faster or slower.
Rule 7: Reduce noise in your measurements. Run your benchmark on a quiet machine, and run it several times, discarding outliers. Use -Xbatch to serialize the compiler with the application, and consider setting -XX:CICompilerCount=1 to prevent the compiler from running in parallel with itself.
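A sketch of the "run several times, discard outliers" part of Rule 7 (the workload and run count are illustrative): repeat the timed run and report the median, which ignores the extreme samples.

// Run on a quiet machine, e.g.:  java -Xbatch -XX:CICompilerCount=1 NoiseSketch
import java.util.Arrays;

public class NoiseSketch {
    static long timedRun() {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) sink += i * 31L;
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink);  // keep the work live
        return elapsed;
    }

    public static void main(String[] args) {
        int runs = 9;
        long[] samples = new long[runs];
        for (int r = 0; r < runs; r++) samples[r] = timedRun();
        Arrays.sort(samples);
        System.out.println("median ns = " + samples[runs / 2]);  // middle value discards outliers
    }
}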
Rule 8: Use a benchmarking library; it is probably more efficient and was already debugged for this sole purpose. Examples are JMH, Caliper, or Bill and Paul's Excellent UCSD Benchmarks for Java.
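For reference, a minimal JMH sketch (the benchmark body is illustrative; the annotations are from org.openjdk.jmh, and projects are usually set up via the JMH Maven archetype). JMH takes care of the warmup phase, forking, and dead-code elimination concerns discussed above.

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)        // JMH runs the warmup phase (Rule 1) for you
@Measurement(iterations = 5)
@Fork(1)
@State(Scope.Thread)
public class MyBenchmark {
    int x = 31;

    @Benchmark
    public int bitCount() {
        // Returning the result hands it to JMH, preventing dead-code elimination.
        return Integer.bitCount(x);
    }
}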