profiling - How to organize data (writing your own profiler)


I was thinking about using reflection to generate a profiler. Let's say I am generating the code without a problem; how do I measure or organize the results? I'm mostly concerned about CPU time, but memory suggestions are welcome.

There are lots of bad ways to write profilers.

I wrote what I thought was a pretty good one over 20 years ago. That is, it made a decent demo, but when it came down to serious performance tuning I concluded there really is nothing that works better, and gives better results, than the dumb old manual method, and here's why.

Anyway, if you're writing a profiler, here's what I think it should do:

  • It should sample the stack at unpredictable times, and each stack sample should contain line number information, not just functions, in the code being tuned. It's not so important to have that in system functions that you can't edit.

  • It should be able to sample during blocked time such as I/O, sleeps, and locking, because those are just as likely to result in slowness as CPU operations.

  • It should have a hot-key that the user can use to enable sampling during the times they actually care about (that is, not while the program is waiting for the user to do something).

  • Do not assume that measurement precision is necessary, necessitating a large number of frequent samples. This is incredibly basic, and it is a major reversal of common wisdom. The reason is simple - it doesn't do any good to measure problems if the price you pay is failure to actually find them. That's what happens with profilers - speedups hide from them, so the user is content with finding maybe one or two small speedups while the giant ones get away. Giant speedups are the ones that take a large percentage of time, and the number of stack samples it takes to find them is inversely proportional to the time they take. If the program spends 30% of its time doing something avoidable, it takes (on average) 2/0.3 = 6.67 samples before it is seen twice, and that's enough to pinpoint it.

    To answer your question, if the number of samples is small, it really doesn't matter how you store them. Print them to a file if you like - whatever. It doesn't have to be fast, because you don't sample while you're saving a sample.

  • What does allow the speedups to be found is when the user actually looks at and understands individual samples. Profilers have all kinds of UI - hot spots, call counts, hot paths, call graphs, call trees, flame graphs, phony 3-digit "statistics", blah, blah. Even if it's well done, that's only timing information. It doesn't tell you why the time is spent, and that's what you need to know. Make eye candy if you want, but let the user see the actual samples.
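The 2/0.3 = 6.67 figure in the list above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes each sample independently lands on the avoidable activity with probability equal to the fraction of time that activity takes.

```python
import random


def expected_samples_until_seen(fraction, times=2):
    """Mean number of random samples needed before an activity that is on the
    stack for `fraction` of the time has been seen `times` times."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return times / fraction


def simulate(fraction, times=2, trials=20000, seed=0):
    """Estimate the same mean by drawing samples until `times` hits occur."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = samples = 0
        while seen < times:
            samples += 1
            if rng.random() < fraction:  # this sample landed on the activity
                seen += 1
        total += samples
    return total / trials


if __name__ == "__main__":
    print(round(expected_samples_until_seen(0.3), 2))  # 6.67
    print(round(simulate(0.3), 2))  # should come out close to the value above
```

The bigger the problem, the fewer samples it takes to see it twice, which is why a small number of samples is enough for the giant speedups.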

... and good luck.

Added: a sample looks something like this:

main:27, myfunc:16, otherfunc:9, ..., somefunc:132

That means main is at line 27, calling myfunc. myfunc is at line 16, calling otherfunc, and so on. At the end, it's in somefunc at line 132, not calling anything (or calling something you can't identify). There's no need for line ranges. (If you're tempted to worry about recursion - don't. If the same function shows up more than once in a sample, that's recursion. It doesn't affect anything.)
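As a rough illustration of that format, here is a minimal Python sketch that renders the current call stack as one such line. Python is just the language chosen for the example; `format_sample` is a hypothetical helper name, and the demo functions are named after the ones in the sample above.

```python
import sys
import traceback


def format_sample(frame):
    """Render one stack sample as 'outermost:line, ..., innermost:line'."""
    summary = traceback.extract_stack(frame)  # outermost call comes first
    return ", ".join(f"{entry.name}:{entry.lineno}" for entry in summary)


def somefunc():
    # Capture the stack of the current thread at this instant.
    return format_sample(sys._getframe())


def myfunc():
    return somefunc()


if __name__ == "__main__":
    # Module-level code shows up as '<module>' rather than 'main'.
    print(myfunc())
```

Each entry is a function name plus the line that was executing in it, so recursion needs no special handling: a recursive function simply appears more than once in the line.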

You don't need a lot of samples. When I did it, sampling was not automatic at all. I would just have the user press both shift keys simultaneously, and that would trigger a sample. So the user would grab 10 or 20 samples, but it is crucial that the user take the samples during the phase of the program's execution that annoys the user with its slowness, such as between the time a button is clicked and the time the UI responds. Another way is to have a hot-key that runs the sampling on a timer while it is pressed. If the program is just a command-line app with no user input, it can simply sample all the time while it executes. The frequency of sampling does not have to be fast. The goal is to get a moderate number of samples during the program phase that is subjectively slow. If you take too many samples to look at, then when you do look at them you need to select some at random.
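For the command-line case, a background thread that wakes at random intervals and records the main thread's stack could look roughly like the sketch below. It is one possible arrangement, not a finished profiler: `StackSampler`, `mean_interval`, `max_samples`, and `slow_phase` are names invented for the example, and the `start()`/`stop()` pair stands in for pressing and releasing a hot-key.

```python
import random
import sys
import threading
import time
import traceback


class StackSampler:
    """Collects a modest number of stack samples from one thread."""

    def __init__(self, thread_id, mean_interval=0.05, max_samples=20):
        self.thread_id = thread_id
        self.mean_interval = mean_interval
        self.max_samples = max_samples
        self.samples = []
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while len(self.samples) < self.max_samples:
            # Exponentially distributed waits keep the sample times unpredictable.
            delay = random.expovariate(1.0 / self.mean_interval)
            if self._stop.wait(delay):
                break  # stop() was called while waiting
            frame = sys._current_frames().get(self.thread_id)
            if frame is None:
                break  # the sampled thread has finished
            stack = traceback.extract_stack(frame)
            self.samples.append(", ".join(f"{f.name}:{f.lineno}" for f in stack))

    def start(self):
        self._worker.start()

    def stop(self):
        self._stop.set()
        self._worker.join()
        return self.samples


def slow_phase(seconds=0.5):
    """Stand-in for the subjectively slow part of a program."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        sum(i * i for i in range(1000))  # CPU work
        time.sleep(0.001)                # blocked time gets sampled as well


if __name__ == "__main__":
    sampler = StackSampler(threading.get_ident())
    sampler.start()
    slow_phase()
    for line in sampler.stop():
        print(line)
```

Storage is just a list that gets printed at the end, which is all a couple of dozen samples need. The sampler stops on its own once it has `max_samples`, since the aim is a readable handful rather than precise timing.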

The thing to do when examining a sample is to look at each line of code in the sample, so you can fully understand why the program was spending that instant of time. If it is doing something that might be avoided, and if you see a similar thing on another sample, you've found a speedup. How much of a speedup? This much (the math is here):

[image]

For example, if you look at 3 samples, and on 2 of them you see avoidable code, fixing it will give you a speedup - maybe less, maybe more, but on average 4x. (That's what I mean by a giant speedup. The way to get it is by studying individual samples, not by measuring anything.)

There's a video here.
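One way to arrive at that 4x average is sketched below. It assumes a uniform prior on the fraction of time x taken by the avoidable code, so that after seeing it on `hits` of `samples` stack samples x follows a Beta(hits + 1, samples - hits + 1) distribution, and removing the code speeds the program up by 1 / (1 - x). That model is an assumption made for this example; the function names are hypothetical.

```python
from fractions import Fraction


def expected_speedup(samples, hits):
    """Average speedup factor from removing an activity seen on `hits` of
    `samples` stack samples, under a uniform prior on its time fraction."""
    if not 0 <= hits < samples:
        raise ValueError("need 0 <= hits < samples for a finite average")
    # With x ~ Beta(hits + 1, samples - hits + 1), the mean of 1 / (1 - x)
    # works out to (samples + 1) / (samples - hits).
    return Fraction(samples + 1, samples - hits)


def expected_speedup_numeric(samples, hits, steps=100_000):
    """Midpoint-rule check of the same average."""
    a, b = hits, samples - hits
    weight = total = 0.0
    for i in range(steps):
        x = (i + 0.5) / steps
        density = x ** a * (1 - x) ** b  # unnormalised Beta density
        weight += density
        total += density / (1 - x)       # speedup 1 / (1 - x), weighted
    return total / weight


if __name__ == "__main__":
    print(expected_speedup(3, 2))  # 4
    print(round(expected_speedup_numeric(3, 2), 3))
```

Under the same assumption, seeing the problem on 2 of 10 samples would give an average of 11/8, so the payoff falls off quickly as the problem gets smaller.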

