Call graphs to analyze code dependencies, or just because.
Here's one of the intermediate results on the path to extending the rcov code coverage tool for Ruby: recording call graph information, so we can analyze the dependencies between our modules. The following figure is quite expressive:
The diagram tells at once what's happening below:
class Bar def foo 1+1 end def bar(x) 10.times{|i| x.bar(i) } end end class Test def initialize @bar = Bar.new end def test a = 1 b = 2 @bar.foo end def test2 @bar.bar(self) end def bar(x) if x > 5 @bar.foo else test end end end t = Test.new t.test t.test2
Recording the call graph
This is fairly easy to implement: set_trace_func can be used to record the appropriate events (call, c-call, return and c-return). We just need to keep track of the caller when we return from a method. The basic structure would look like
deps = Hash.new{|h,k| h[k] = Hash.new{|h2,k2| h2[k2] = [] } } call_stack = [] set_trace_func lambda { |event, file, line, id, binding, classname| case event when 'call', 'c-call' call_stack << [classname.to_s, id] when 'return', 'c-return' curr_class, curr_meth = call_stack.pop prev_class, prev_meth = call_stack.last || ["TOP", "TOP"] # slow, don't care deps[prev_class][prev_meth] |= [[curr_class, curr_meth]] end }
Refinement
There are several problems in the naïve code shown above:
- set_trace_func is slow and we can only have one per process
- the call graph is huge because it records lots of uninteresting calls like Fixnum#+
- (related to the latter) the call graph can be confusing because we'll have lots of calls originating in methods taking a block, like Enumerable#each. Also, it'd be better to skip methods like Class.new so we don't lose control flow information when different paths coalesce.
There's only one way to address the first issue: using the event_hook subsystem of the (C) API, the same way rcov's rcovrt does to achieve the 200X speedup compared to other code coverage tools. I'll end up doing that once I decide how to add this all to rcov.
As for the two others, they can be solved in pure Ruby: method calls from uninteresting classes can be ignored if you are careful enough to ensure the stack doesn't underflow.
Once the code we're interested in has been executed, the call graph information will be stored in a structure resembling
{"Test"=>
{:bar=>[["Test", :test], ["Bar", :foo]],
:test2=>[["Bar", :bar]],
:test=>[["Bar", :foo]]},
"Bar"=>{:bar=>[["Test", :bar]]},
"TOP"=>{""=>[["Test", :initialize], ["Test", :test], ["Test", :test2]]}}
The interpretation is quite simple: Test#bar calls Test#test and Bar#foo, Test#test2 calls Bar#bar, etc...
It should be rather easy to process that data further to merge methods across a functional unit or extract interaction patterns, but for the time being a mere graphical representation is OK.
Code
You can use the following snippet as follows:
ruby -record-call-graph whatever.rb
will create an OUTPUT.dot file that can be processed with Graphviz; that information will also be dumped in textual form to stderr.
# Copyright (c) 2006 Mauricio Fernandez <mfp@acm.org> # Use and modification subject to Ruby's license. deps = Hash.new{|h,k| h[k] = Hash.new{|h2,k2| h2[k2] = [] } } call_stack = [["TOP", "", 2**30]] IGNORE = %w[Array Bignum Comparable Config Class Dir Enumerable ENV Fixnum Float Hash IO Integer Kernel OptionParser Module Object Range Regexp String Time].map{|x| Regexp.new(x)} + [/REXML::/] set_trace_func lambda { |event, file, line, id, binding, classname| case event when 'call', 'c-call' if IGNORE.any?{|re| re =~ classname.to_s } call_stack.last[2] += 1 else call_stack << [classname.to_s, id, 1] end when 'return', 'c-return' curr_class, curr_meth, counter = call_stack.last if counter == 1 call_stack.pop prev_class, prev_meth, = call_stack.last # slow, don't care deps[prev_class][prev_meth] |= [[curr_class, curr_meth]] else call_stack.last[2] -= 1 end end } label_escape = lambda do |x| s = x.to_s pairs = {/\|/, '\|', /\</, '\<', /\>/, '\>'} pairs.each{|pre, post| s.gsub!(pre, post)} s end END{ set_trace_func(nil) File.open("OUTPUT.dot", "w") do |f| f.puts <<EOF digraph call_graph { ratio = fill; edge [weight = 3]; EOF all_classes = Hash.new{|h,k| h[k] = {} } deps.each do |klass, src_meths| src_meths.each do |src_meth, dest_meths| all_classes[klass][src_meth] = true dest_meths.each do |dst_class, dst_meth| all_classes[dst_class][dst_meth] = true end end end all_classes.each do |klass, meths| f.puts %!subgraph "cluster_#{klass}" { fontcolor = red; color = blue; label = "#{label_escape[klass]}";! meths.keys.each do |m| label = label_escape[m] f.puts %!"#{klass}##{m}"; ! # [label="#{label}"]; ! end f.puts "}" end deps.each do |src_class, meth_map| meth_map.each do |src_meth, dest_map| dest_map.each do |dest_class, dest_meth| f.puts %!"#{src_class}##{src_meth}" -> "#{dest_class}##{dest_meth}"; ! end end end f.puts "}" require 'pp' PP.pp(deps, $stderr) end }
I have included a number of classes to be ignored in the IGNORE array, but the list is far from exhaustive: anyway, some more work will be needed to ensure the call graph is reasonable (coalescing, dynamic tracing activation, etc.).
- 45 http://www.artima.com/forums/flat.jsp?forum=123&thread=152032
- 22 http://planetruby.0x42.net
- 16 http://abstractplain.net/blog/?p=952
- 11 http://chneukirchen.org/anarchaia
- 10 http://www.artima.com/buzz/community.jsp?forum=123
- 8 http://anarchaia.org
- 8 http://chneukirchen.org/anarchaia/archive/2006/03/15.html
- 7 http://abstractplain.net/blog/?cat=48
- 6 http://www.anarchaia.org
- 3 http://search.msn.com/results.aspx?q=meth graphs&mkt=en-US&form=QBRE
Keyword(s):[blog] [ruby] [call] [graph] [rcov] [set_trace_func]
References: