MLton has a profiling facility, mlprof, that is similar in usage to gprof. Here is an example run from within the examples/profiling directory showing how to generate profiling information.
% mlton -p -keep g profiling.sml % profiling % mlprof profiling mlmon.out 0.49 seconds of CPU time tak_0 87.76% IntInf_smallMul (C) 12.24%In summary, generating profiling information takes three steps.
In translating from SML to CPS, MLton attempts to preserve source names, but due to anonymous functions, inlining, monomorphisation, closure conversion, and other optimizations, there is often quite a difference between the source and the optimized CPS program. The source names are always mangled by adding a suffix of _n for some n (e.g. tak_0). Names in the source may disappear (e.g. f is nowhere to be found because it has been turned into a loop within main) or new names may appear from the SML basis library code that is prefixed to your program.
You can display profiling information at the CPS basic block level with mlprof -d 1.
% mlprof -d 1 profiling mlmon.out 0.49 seconds of CPU time tak_0 87.76% tak_0 27.91% L_58,L_71 18.60% L_53,L_60 13.95% L_56,L_72 11.63% L_54,L_73 6.98% L_60 6.98% L_58 4.65% L_56 4.65% L_52 4.65% IntInf_smallMul (C) 12.24%Here, the indented tak_0 refers to the basic block that is then entry of the CPS tak_0 function. The label L_58,L_71 basic block in profiling.cps, within the tak_0 function called L_58. This basic block also appears as label L_71 in profiling.0.S.
You can display profiling information at the assembly basic block level with mlprof -d 2. Other mlprof options are -s, which will print information about static C functions, and -t, which will limit mlprof to only print information about functions (or blocks) whose percentage of time is above a certain threshold.
You may find it useful to use the -keep dot switch when compiling -p, since this saves the the CPS function call graph (.call-graph.dot) for the whole program. Also, for each CPS function f, it saves the intraprocedural control-flow graph (.f.cfg.dot) and dominator tree (.f.dom.dot).
The profiler works by catching the SIGPROF signal 100 times per second and
recording where in the executable the program counter is. Thus, if you compile
-p, your program should not use the following, which will interfere with
the profiler.
MLton.Itimer.set {which = MLton.Itimer.Prof, ...} |
MLton.Signal.Handler.set (MLton.Signal.prof, ...) |