CPAN vs. RAA: costs
I tried to improve my estimate of RAA's cost by running the script shown below against 11% of the archive (by project count); that subset would cost around $20M (600000 lines of code), leaving the total cost of the RAA under $191 million. I then compared it to a revision of the cost of CPAN computed in 2004 which lowers the original estimate substantially.
The final figure is somewhat biased because I didn't pick the projects randomly (so the remainder should be smaller on average), but it still serves as an upper bound.
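The extrapolation behind that upper bound is a simple proportion. A rough sketch, assuming the subset cost reported further down in this post and the rounded 11% share (the exact project counts are not given here, so the result only approximates the quoted $191M ceiling):

```ruby
# Figures taken from this post; the fraction is the rounded 11%,
# so the result is only approximate.
subset_cost_usd = 20_511_561   # cost of the analyzed subset
analyzed_share  = 0.11         # share of RAA projects analyzed (rounded)

# Scale the subset cost up to the whole archive, assuming the remaining
# projects cost as much on average as the analyzed ones.
extrapolated_usd = subset_cost_usd / analyzed_share

puts "extrapolated RAA cost: $%.0fM" % (extrapolated_usd / 1e6)
```

Since the analyzed projects were not picked at random and are likely larger than average, this proportion overstates the total.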
Comparison with CPAN
The cost of CPAN was estimated to be under $677 million in 2004.
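That analysis was faulty because it treated all of CPAN as a single
project with 15.5 million lines of code, which inflates the numbers
because the effort estimate equation is nonlinear:

\[ E = 2.4 \cdot \mathrm{KLOC}^{1.05} \]

where E is the effort in man-months (these are the basic COCOMO
organic-mode coefficients, which reproduce the man-month column of the
per-project table below). The error introduced will be smaller than a
factor of

\[ \frac{(P L)^{1.05}}{P \cdot L^{1.05}} = P^{0.05} \]

where P is the number of projects and L the average project size.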
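To see the nonlinearity at work, here is a minimal Ruby sketch. It assumes the basic COCOMO organic-mode coefficients (2.4 and 1.05), which are inferred from the man-month figures in the table further down rather than stated in the original analysis. The two 10 KLOC projects are made up for illustration.

```ruby
# Basic COCOMO, organic mode. The coefficients are an assumption:
# they match the man-month column of the per-project table in this post.
COCOMO_A = 2.4
COCOMO_B = 1.05

# Effort in man-months for a project with the given number of lines of code.
def cocomo_effort(loc)
  COCOMO_A * (loc / 1000.0) ** COCOMO_B
end

# Two hypothetical projects of 10 KLOC each, estimated separately...
separate = 2 * cocomo_effort(10_000)
# ...versus the same code counted as a single 20 KLOC project.
lumped = cocomo_effort(20_000)

puts "separate: %.1f man-months" % separate
puts "lumped:   %.1f man-months" % lumped
puts "inflation factor: %.3f" % (lumped / separate)
```

The inflation factor here is 2^0.05, which is the P^0.05 bound with P = 2; it grows slowly with the number of projects.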
Unfortunately, I couldn't find any size statistics for CPAN, so I just
took 5000 as a very conservative estimate of the number of modules in
2004 (knowing that it's close to 10000 now); the smaller the number of
projects, the less important the bias introduced in the original
analysis. Retaining that number, the 2004 result was bloated by at most
a factor of \(5000^{0.05} \approx 1.53\), leaving CPAN's cost in 2004
between $442M and $677M, depending on the size distribution of CPAN's
modules.
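The range above can be rechecked in a few lines. This sketch assumes the basic COCOMO exponent of 1.05, so that lumping P equally sized projects into one inflates the estimate by P^0.05; the 5000 and $677M figures come from the text.

```ruby
# Assumption: basic COCOMO exponent 1.05, so lumping P equal-sized projects
# into one inflates the estimate by (P*L)**1.05 / (P * L**1.05) = P**0.05.
EXPONENT = 1.05
module_count        = 5000   # conservative module count used in the text
single_project_cost = 677.0  # 2004 estimate, in millions of dollars

max_inflation = module_count ** (EXPONENT - 1)
lower_bound   = single_project_cost / max_inflation

puts "max inflation: %.2f" % max_inflation
puts "CPAN cost between $%.0fM and $%.0fM" % [lower_bound, single_project_cost]
```

The lower bound corresponds to all modules having the same size; any spread in module sizes pushes the true figure up toward the single-project estimate.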
RAA's cost in 2006 dollars is under $191M; let's make it $100 million, assuming that the 89% I didn't analyze is smaller on average than the 11% I did consider. Inflation is well under the error margin for CPAN's cost, so there's no need to convert it into 2006 dollars. So the final, quotable result is:
CPAN would cost around 5 times more than RAA according to the COCOMO basic model.
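For the record, the "around 5 times" figure sits inside the range given by the two CPAN bounds. A quick check, using only the numbers quoted above (in millions of dollars):

```ruby
# All figures in millions of dollars, taken from the text above.
cpan_low, cpan_high = 442.0, 677.0
raa_estimate = 100.0

ratio_low  = cpan_low / raa_estimate
ratio_high = cpan_high / raa_estimate

puts "CPAN/RAA cost ratio: %.1f to %.1f" % [ratio_low, ratio_high]
```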
The cost of the interpreters
Surprisingly, ruby (including its standard lib) costs more than the corresponding perl distribution: it's $20M vs. $15M, due to Ruby's richer standard library. By the way, since perl is hosted in CPAN and ruby isn't in RAA, those $20M could be added to the $100M used in the above analysis...
Counting lines of code
The original CPAN estimate was done with SLOCCount. I also used it for perl and ruby (the interpreters plus stdlibs themselves), but wrote a small script for the RAA subset:
```ruby
require 'find'

# Count the source files and raw lines (blank lines and comments included)
# under dname.
def stats_for_dir(dname)
  nfiles = lines = 0
  Find.find(dname) do |fname|
    next unless File.file? fname
    # Skip version-control metadata and installer scripts (substring match on the path).
    next if %w[svn darcs setup.rb install.rb].any?{|x| Regexp.new(Regexp.escape(x)) =~ fname }
    # Count Ruby and C sources, plus any file whose shebang line mentions ruby.
    if fname =~ /\.(rb|c|h)$/ or File.open(fname){|f| f.gets =~ /^#!.*ruby/}
      $stderr.puts(" " * 10 + fname)
      nfiles += 1
      File.open(fname){|f| f.each{ lines += 1 } }
    end
  end
  puts "%-35s %3d %5d " % [dname, nfiles, lines]
  $stderr.puts "%-35s %3d %5d " % [dname, nfiles, lines]
  [nfiles, lines]
end

i = 0
all_stats = {}
# One project directory per input line.
ARGF.each do |dname|
  dir = dname.chomp
  #next if (i += 1) > 4
  all_stats[dir] = stats_for_dir(dir)
end

total_files = total_locs = 0
files_sq = locs_sq = 0
all_stats.each do |name, (files, locs)|
  total_files += files
  total_locs += locs
  files_sq += files ** 2
  locs_sq += locs ** 2
end

avg_files, avg_locs = 1.0 * total_files / all_stats.size, 1.0 * total_locs / all_stats.size
# Population standard deviation: sqrt(E[x^2] - E[x]^2).
stddev_files = Math.sqrt(1.0 * files_sq / all_stats.size - avg_files ** 2)
stddev_locs = Math.sqrt(1.0 * locs_sq / all_stats.size - avg_locs ** 2)

$stderr.puts <<EOF % [total_files, avg_files, stddev_files, total_locs, avg_locs, stddev_locs]
#{all_stats.size} libs/apps analyzed
Total Avg stddev
Files: %-6d %-6.1f %-5f
LoCs: %-6d %-6d %-5f
EOF
```
My stats
| | Total | Avg | stddev |
|---|---|---|---|
| Files | 4052 | 26.5 | 67.588345 |
| LoCs | 607552 | 3970 | 8983.547211 |
Cost estimate:
- man months 1650.3
- cost $20511561
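These totals are sums of per-project estimates, which is what avoids the single-project inflation discussed earlier. As a rough illustration, the sketch below compares the summed effort with what the subset would "cost" if all of it were lumped into one project. It assumes the basic COCOMO organic-mode coefficients (2.4 and 1.05); the cost per man-month is simply back-calculated from the two totals, since the rate used is not stated in the text.

```ruby
# Figures copied from the stats and cost estimate above.
total_loc      = 607_552
summed_effort  = 1650.3       # man-months, summed project by project
total_cost_usd = 20_511_561

# Assumption: basic COCOMO organic-mode coefficients (2.4, 1.05).
lumped_effort = 2.4 * (total_loc / 1000.0) ** 1.05

# Cost per man-month implied by the two totals (a back-calculation only).
implied_rate = total_cost_usd / summed_effort

puts "summed effort: %.1f man-months" % summed_effort
puts "lumped effort: %.0f man-months" % lumped_effort
puts "overestimate when lumped: %.0f%%" % (100 * (lumped_effort / summed_effort - 1))
puts "implied rate: $%.0f per man-month" % implied_rate
```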
Rails is the largest project, and hence the most expensive one, at $2.7M. rb-gsl is (very unexpectedly) quite a close second ($2.2M)...
| Name | Files | LOCs | Cost ($) | Man-months |
|---|---|---|---|---|
| BlueCloth-1.0.0 | 9 | 3958 | 126471 | 10.2 |
| FXRuby-1.0.29 | 449 | 48168 | 1743970 | 140.3 |
| Getopt | 19 | 3474 | 110284 | 8.9 |
| Linguistics-1.02 | 15 | 6781 | 222587 | 17.9 |
| PluginFactory-1.0.0 | 6 | 449 | 12867 | 1.0 |
| PrettyException-0.9.3 | 2 | 1092 | 32717 | 2.6 |
| RHDL-0.4.3 | 23 | 2274 | 70676 | 5.7 |
| RedCloth-3.0.0 | 4 | 1403 | 42565 | 3.4 |
| Ruby-HashSlice-1.03 | 2 | 184 | 5043 | 0.4 |
| RubyInline-3.1.0 | 7 | 1215 | 36597 | 2.9 |
| SpeedReader-0.5 | 16 | 1597 | 48765 | 3.9 |
| Test-Unit-Mock-0.03 | 4 | 1264 | 38148 | 3.1 |
| aeditor-1.9 | 24 | 10900 | 366387 | 29.5 |
| aes | 4 | 931 | 27671 | 2.2 |
| amrita-1.0.2 | 82 | 11175 | 376099 | 30.3 |
| ansicolor-0.0.3 | 3 | 177 | 4841 | 0.4 |
| archive-tar-minitar-0.5.1 | 5 | 2456 | 76627 | 6.2 |
| arrayfields-3.4.0 | 3 | 692 | 20265 | 1.6 |
| aspectr-0-3-5 | 4 | 644 | 18791 | 1.5 |
| bdb-0.5.4 | 52 | 17930 | 617876 | 49.7 |
| bdbxml-0.5.2 | 30 | 2018 | 62346 | 5.0 |
| bitset-0.6.2 | 5 | 1746 | 53553 | 4.3 |
| bloom | 3 | 233 | 6461 | 0.5 |
| borges-1.1.0 | 157 | 9953 | 333038 | 26.8 |
| breakpoint | 6 | 687 | 20111 | 1.6 |
| builder-1.2.2 | 9 | 1017 | 30361 | 2.4 |
| bz2-0.2.2 | 7 | 2563 | 80136 | 6.4 |
| cache-0.1.0 | 1 | 362 | 10263 | 0.8 |
| captcha-0.1.2 | 4 | 567 | 16440 | 1.3 |
| cast_256 | 3 | 549 | 15892 | 1.3 |
| cgikit-1.2.1 | 61 | 11405 | 384231 | 30.9 |
| chun | 1 | 577 | 16744 | 1.3 |
| copland-1.0.0 | 117 | 11339 | 381896 | 30.7 |
| copland-lib-0.1.0 | 21 | 1479 | 44989 | 3.6 |
| copland-remote-0.1.0 | 18 | 1676 | 51301 | 4.1 |
| copland-webrick-0.1.0 | 16 | 1433 | 43521 | 3.5 |
| criteria-1.1a | 11 | 1101 | 33000 | 2.7 |
| crosscase | 8 | 1238 | 37324 | 3.0 |
| crypt-fog-0.1.0 | 3 | 121 | 3247 | 0.3 |
| crypt-isaac_0.9 | 1 | 165 | 4497 | 0.4 |
| cstemplate-0.5.1 | 2 | 904 | 26829 | 2.2 |
| dbdbd-0.2.2 | 5 | 569 | 16501 | 1.3 |
| dbus-0.1.10 | 25 | 4062 | 129962 | 10.5 |
| dev-utils-1.0.1 | 10 | 1181 | 35522 | 2.9 |
| diff-0.4 | 7 | 525 | 15163 | 1.2 |
| diff-lcs-1.1.2 | 11 | 2950 | 92887 | 7.5 |
| directorywatcher | 1 | 245 | 6811 | 0.5 |
| djb-netstrings-ruby-0.1.0 | 2 | 110 | 2938 | 0.2 |
| dpklib-1.0.6 | 133 | 8441 | 280127 | 22.5 |
| drbfire-0-1-0 | 4 | 505 | 14557 | 1.2 |
| entryCache-1.1 | 5 | 344 | 9728 | 0.8 |
| extensions-0.6.0 | 35 | 3491 | 110851 | 8.9 |
| extmath-2.3 | 2 | 1425 | 43266 | 3.5 |
| flattenx-0.1.0 | 3 | 158 | 4297 | 0.3 |
| flexmock-0.0.3 | 4 | 260 | 7250 | 0.6 |
| formvalidator-0.1.3 | 9 | 1595 | 48701 | 3.9 |
| fsdb-0.4 | 29 | 3430 | 108818 | 8.8 |
| gemfinder-1.9.6 | 16 | 1364 | 41323 | 3.3 |
| gurgitate-mail-1.4.1 | 7 | 621 | 18087 | 1.5 |
| hobix-0.3 | 24 | 3571 | 113519 | 9.1 |
| html-parser-19990912p2 | 4 | 1098 | 32905 | 2.6 |
| htmltokenizer | 1 | 259 | 7221 | 0.6 |
| ikko-0.1 | 1 | 273 | 7631 | 0.6 |
| instiki-0.9.1 | 33 | 3746 | 119368 | 9.6 |
| interface-0.1.0 | 6 | 211 | 5822 | 0.5 |
| iowa_0.9.2 | 49 | 5336 | 173068 | 13.9 |
| iterator-0.8 | 16 | 2532 | 79118 | 6.4 |
| jabber4r-0.6.0 | 12 | 2827 | 88824 | 7.1 |
| kansas_0.2 | 16 | 2231 | 69273 | 5.6 |
| keyedlist | 2 | 311 | 8750 | 0.7 |
| kirbybase-1.6 | 3 | 1215 | 36597 | 2.9 |
| lafcadio-0.4.0 | 132 | 6972 | 229175 | 18.4 |
| libgnucap-ruby-0.1 | 3 | 277 | 7749 | 0.6 |
| libxml-0.3.4 | 56 | 6493 | 212671 | 17.1 |
| lingua-0.5 | 5 | 443 | 12687 | 1.0 |
| log4r-1.0.5 | 46 | 2924 | 92027 | 7.4 |
| madeleine-0.6.1 | 18 | 3068 | 96792 | 7.8 |
| mahoro-0.1 | 5 | 425 | 12146 | 1.0 |
| math-const-1.0.1 | 2 | 275 | 7690 | 0.6 |
| metatags-1.0 | 8 | 484 | 13922 | 1.1 |
| midilib-0.8.3 | 20 | 2794 | 87736 | 7.1 |
| mime-types-1.13.1 | 3 | 1635 | 49984 | 4.0 |
| mw-template-0.9.1 | 13 | 2269 | 70512 | 5.7 |
| narray-0.5.7p4 | 52 | 12743 | 431695 | 34.7 |
| needle-1.2.0 | 65 | 6072 | 198216 | 15.9 |
| needle-extras-1.0.0 | 10 | 621 | 18087 | 1.5 |
| net-sftp-0.5.0 | 73 | 5200 | 168440 | 13.6 |
| net-ssh-0.6.0 | 121 | 14267 | 486062 | 39.1 |
| nora-0.0.20041021 | 46 | 5842 | 190340 | 15.3 |
| objectgraph-1.0.1 | 2 | 232 | 6432 | 0.5 |
| objectpool-0.2.0 | 4 | 306 | 8603 | 0.7 |
| patch | 0 | 0 | 0 | 0.0 |
| permutation | 3 | 730 | 21435 | 1.7 |
| pqa-1.3 | 2 | 814 | 24032 | 1.9 |
| proclib | 2 | 262 | 7309 | 0.6 |
| purple-0.5.1 | 61 | 32342 | 1147883 | 92.4 |
| racc | 16 | 5051 | 163376 | 13.1 |
| raggle-0.3.2 | 7 | 5705 | 185656 | 14.9 |
| rails-1.0.0 | 581 | 72560 | 2681480 | 215.7 |
| rake-0.4.15 | 31 | 3874 | 123654 | 9.9 |
| rb-gsl-1.5.2 | 332 | 58838 | 2151707 | 173.1 |
| rb2html-1.1 | 7 | 707 | 20726 | 1.7 |
| rbmhshow-0.4.1 | 16 | 2389 | 74433 | 6.0 |
| rbprof | 2 | 578 | 16775 | 1.3 |
| rbtree-0.1.2 | 5 | 3865 | 123352 | 9.9 |
| rcov-0.2.0 | 3 | 1805 | 55455 | 4.5 |
| regexp-engine-0.12 | 30 | 8125 | 269126 | 21.7 |
| rgl-0.2.2 | 25 | 3089 | 97488 | 7.8 |
| rice-0.0.0.2 | 18 | 2147 | 66537 | 5.4 |
| rlimit-1.0 | 3 | 117 | 3135 | 0.3 |
| rubilicious-0.1.0 | 4 | 618 | 17996 | 1.4 |
| ruby-aes-1.8.0 | 7 | 1064 | 31836 | 2.6 |
| ruby-bsearch-1.5 | 3 | 202 | 5562 | 0.4 |
| ruby-crypt-random-1.3 | 4 | 423 | 12086 | 1.0 |
| ruby-dict-0.9.2 | 2 | 870 | 25771 | 2.1 |
| ruby-gettext-package-0.8.0 | 34 | 2197 | 68165 | 5.5 |
| ruby-goto | 2 | 68 | 1773 | 0.1 |
| ruby-htmltools | 15 | 2458 | 76692 | 6.2 |
| ruby-libneural | 6 | 517 | 14921 | 1.2 |
| ruby-progressbar-0.8 | 2 | 267 | 7455 | 0.6 |
| ruby-romkan-0.4 | 2 | 364 | 10322 | 0.8 |
| ruby-termios-0.9.4 | 8 | 1216 | 36628 | 2.9 |
| rubymail-0.17 | 24 | 7828 | 258806 | 20.8 |
| rubypants-0.2.0 | 2 | 652 | 19037 | 1.5 |
| rubywebdialogs | 15 | 7557 | 249407 | 20.1 |
| rubyzip-0.5.5 | 18 | 7029 | 231142 | 18.6 |
| runt-0.2.0 | 12 | 1571 | 47932 | 3.9 |
| ruvi-0.4.12 | 34 | 10680 | 358626 | 28.9 |
| ruwiki-0.9.0 | 44 | 8290 | 274868 | 22.1 |
| sds-0.3 | 15 | 3572 | 113553 | 9.1 |
| session-2.1.9 | 9 | 1654 | 50594 | 4.1 |
| simplemail-0.3 | 4 | 610 | 17751 | 1.4 |
| snmp-0.3.0 | 18 | 2806 | 88132 | 7.1 |
| sqlite-ruby-2.2.2 | 27 | 6298 | 205970 | 16.6 |
| statistics-020920 | 2 | 292 | 8190 | 0.7 |
| stream-0.5 | 7 | 967 | 28796 | 2.3 |
| sympop-0.9.1 | 1 | 78 | 2048 | 0.2 |
| sys-host-0.5.0 | 9 | 804 | 23722 | 1.9 |
| sys-proctable-0.6.4 | 27 | 3293 | 104259 | 8.4 |
| sys-uptime-0.4.0 | 7 | 482 | 13862 | 1.1 |
| test-report-0.3.0 | 8 | 1111 | 33315 | 2.7 |
| tex-hyphen-0.2 | 3 | 4974 | 160762 | 12.9 |
| text-format-0.64 | 6 | 3080 | 97189 | 7.8 |
| tldlib | 5 | 553 | 16014 | 1.3 |
| tmail-0.10.8 | 43 | 10307 | 345486 | 27.8 |
| types | 2 | 1988 | 61373 | 4.9 |
| webfetcher-0.5.5 | 2 | 1375 | 41673 | 3.4 |
| webgen-0.2.0 | 29 | 3037 | 95765 | 7.7 |
| webunit | 61 | 6566 | 215183 | 17.3 |
| xhtmldiff-1.2.1 | 2 | 208 | 5735 | 0.5 |
| xmlresume2x-0.2.1 | 5 | 495 | 14255 | 1.1 |