I finally bit the bullet and rewrote eigenclass using the Ocsigen web server + framework for OCaml. It is simpler, faster, more reliable, and easier to extend than the customized wiki implementation in Ruby (Hiki) I'd been using. It is also easier to deploy because it's self-contained: a single (native code) executable contains both the Ocsigen web server and the application code, so I don't have to use any special Apache modules, FastCGI or any sort of adapter. (The ability to create standalone, native-code executables was added recently to Ocsigen and is thus available on the devel branch, soon to be released as Ocsigen 1.2.)

I'd read somewhere that the Ocsigen server hadn't received much (any?) optimization work, so I benchmarked it against Lighttpd, Apache and mongrel, both at static file serving and dynamic contents (a minimal "hello world" service), to see if that could represent a problem. It turns out it doesn't: the OCaml+Ocsigen combo is very fast. It serves minimal dynamic requests an order of magnitude faster than Rails with a pack of mongrels behind nginx, and uses 40 times less memory. More surprisingly, it handles more requests per core than lighttpd with a minimal FastCGI server written in C! (lighttpd wasn't able to handle ab's load with max_procs = 1, and generated way too many 5xx errors, so I had to use several FastCGI processes). It also serves static files at rates exceeding Apache's (per core).

The following figures were obtained using ApacheBench (ab) locally, on a 3GHz, dual-core Athlon 64 X2.

Dynamic contents

Setup                                              Reqs/sec   Mem usage (resident memory, RSS)
Rails with mongrel, 1 process                      260        49MB
Rails with mongrel via nginx (rev proxy), 1 proc   220        ~51MB
Rails with mongrel, 4 processes via nginx          430        ~200MB
Ocsigen (1 process)                                5800       4.5MB
lighttpd with FastCGI app in C, 20 procs           9300       4.5MB

Obviously, these figures represent only upper bounds, since the "dynamic" content was but "hello world", and few sites (certainly not eigenclass.org) need to handle thousands of requests per second. The interesting thing is that, if anything, the difference is going to become even more favorable for Ocsigen+OCaml if the page involves any significant amount of computation, as OCaml is typically 100 times faster than interpreted languages like Ruby. For instance, the OCaml code that processes the markdown-like markup used for this very page is fast enough to sustain over 2000 requests per second without caching the generated HTML. A quick test shows that Ruby's bluecloth library is around 200 times slower, so I would be getting maybe 20 reqs/sec (using both cores) on the AMD64 box (much faster than the one running eigenclass.org) with Rails + Mongrel + nginx. Of course, caching would solve this; it is not a panacea, though, as it introduces other problems (expiration, invalidation, resource limitation, etc.) and is not always applicable.
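
The 20 reqs/sec estimate is plain arithmetic on the figures quoted above. Here is a rough sketch of it, assuming that the 2000 reqs/sec figure is for a single process on one core and that throughput scales linearly with the number of cores (both are assumptions, as are the variable names):

```ocaml
(* Back-of-the-envelope estimate; the inputs are the figures quoted in the text. *)
let ocaml_reqs_per_sec = 2000.0  (* uncached markup rendering, assumed to be one core *)
let slowdown = 200.0             (* bluecloth vs. the OCaml code, from the quick test *)
let cores = 2                    (* dual-core test box *)

(* Assumption: throughput scales linearly with the number of cores. *)
let ruby_per_core = ocaml_reqs_per_sec /. slowdown
let ruby_total = ruby_per_core *. float_of_int cores

let () =
  Printf.printf "per core: %.0f reqs/sec, both cores: %.0f reqs/sec\n"
    ruby_per_core ruby_total
```

This prints 10 reqs/sec per core and 20 reqs/sec for both cores.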

At the end of the day, this means that OCaml + Ocsigen allow me to write code that can be deployed trivially (I can even link the executable statically so that it doesn't depend on libs like SQLite or libssl), and is more than fast enough with a single process (no load balancing needed) and no caching (no memcached or whatever).

File serving

(Small 13-byte file.)

Server                        Reqs/sec
mongrel                       ~1000
Ocsigen (1 process)           ~4500
Apache2 (multiple workers)    ~8500
lighttpd                      ~12000

This shows that even though Ocsigen could use some optimization on the static serving front (dynamic contents are served faster), it's still quite reasonable. As I said, I've heard Ocsigen has undergone few if any optimizations, so the outlook is quite positive.


Comments

  1. test

    mfp, 07 January 2009 at 20:40#
  2. That is really awesome, like the new layout as well.

    kig, 07 January 2009 at 20:52#
  3. Can you please publish the eigenclass source? I'd love to see how Ocsigen works...

    Bob, 09 January 2009 at 12:59#
  4. I got intrigued and ran ab against a tntnet app I am writing. Here are some results for comparison. I didn't capture RAM usage completely for Apache, but it was 1-2MB resident for tntnet. Apache served the same file, but statically.

    Test config: concurrency 5, number of requests 100,000, using ab, against localhost, document size 1023 bytes
    OS: arch linux 64 bit, 2.6.27.8
    machine: athlon X2, dual core, 1.9GHz, cpu freq. throttled
    
    tntnet 5 threads, dynamic content: 2620.44Kbytes/sec, 2260.60#/sec
    apache 2.2, static file: 2028.20Kbytes/sec, 1562.69#/sec
    

    Regards Shridhar

    Shridhar Daithankar, 09 January 2009 at 15:12#
  5. Bob, I'm releasing it once I document it a bit (gotta add the configuration file and copyright notices to the git repos).

    Fortunately, there's a nice tutorial for Eliom (Ocsigen's web framework) at http://ocsigen.org/tutorialdev1 that illustrates Ocsigen/Eliom's features better than my code does (it doesn't use Ocsigen's more advanced functionality).

    Essentially, you can declare services, specify their parameters (and their types, allowing Eliom to perform static type checking and to "overload" services for the same base URL but different parameters), attach them to a URL and give their handler at once.

    Here's a very small example taken from my code, for the service that handles /R2/writings/xxx, which is declared as follows (I'm changing it a bit for clarity of exposition):

    let rec page_service =
      register_new_service
        ~path:["writings"] ~get_params:(suffix (string "page")) serve_page
    

    This indicates that the service attached at writings takes a string parameter given as a suffix (otherwise, it could only be given with ?page=xxxx; using suffix allows both styles). serve_page is a function taking the request info and a string parameter (and no POST parameters) which returns the response. Eliom/OCaml will check statically that serve_page has got the right type. Moreover, Eliom can ensure statically that all links are valid (i.e. that you're linking to an existent service and that you're giving it the right parameters).

    Now, you can use OCaml to its full power to structure the handlers, using higher-order functions, partial application, functors...

    Another feature of Ocsigen is that it includes a typed HTML/XHTML module that ensures (again, statically) that the generated markup is valid. The code will look a bit like the XML builders you often find in dynamic languages, the key difference being that it's typed and wrong markup just doesn't compile.

    This is a function that generates an HTML page (with some extras) containing the supplied title and body:

    let rec page_with_title sp thetitle thebody =
      html
        (head (title (pcdata thetitle)) [css_link css_uri (); ctype_meta; rss2_link sp])
        (body (thebody @ analytics))
    

    The compiler knows which elements are valid inside body and any other tags, and will complain if you try to generate invalid XHTML/HTML 4.01 (here I show the error you get in the toplevel aka. REPL; the compiler will behave similarly):

    # page_with_title () "the title" [ol (pcdata "foo") []];;
                                         --------------
    This expression has type [> `PCDATA ] XHTML.M.elt but is here used with type
      [< `Li ] XHTML.M.elt
    The second variant type does not allow tag(s) `PCDATA
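
    To make the idea concrete, here is a toy, self-contained sketch of how such a check can be encoded with polymorphic variants and a phantom type parameter. It is not XHTML.M's actual code: the Html module, its three combinators and the escape helper are made up for illustration, and only a tiny slice of the content model is covered.

    ```ocaml
    (* Toy typed markup; all names here are hypothetical, not Ocsigen's API. *)
    module Html : sig
      type +'a elt   (* 'a only records which tag this is (a phantom type) *)
      val pcdata : string -> [> `PCDATA ] elt
      val li : [< `PCDATA | `Ol ] elt list -> [> `Li ] elt
      (* an ol needs at least one child, and every child must be an li *)
      val ol : [< `Li ] elt -> [< `Li ] elt list -> [> `Ol ] elt
      val to_string : 'a elt -> string
    end = struct
      type 'a elt = string   (* the rendered markup *)

      (* Escape the characters that are special in character data. *)
      let escape s =
        let b = Buffer.create (String.length s) in
        String.iter
          (function
            | '<' -> Buffer.add_string b "&lt;"
            | '>' -> Buffer.add_string b "&gt;"
            | '&' -> Buffer.add_string b "&amp;"
            | c -> Buffer.add_char b c)
          s;
        Buffer.contents b

      let pcdata = escape
      let li children = "<li>" ^ String.concat "" children ^ "</li>"
      let ol first rest = "<ol>" ^ String.concat "" (first :: rest) ^ "</ol>"
      let to_string s = s
    end

    (* Accepted: the only child of ol is an li. *)
    let rendered = Html.(to_string (ol (li [pcdata "foo"]) []))

    (* Rejected at compile time, in the same way as the error shown above:
       Html.(ol (pcdata "foo") [])   (a `PCDATA element where only `Li is allowed) *)

    let () = print_endline rendered
    ```

    Because elt is abstract outside the module, the only way to build markup is through the combinators, so the constraints in their types cannot be bypassed.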
    

    The request handler looks like this:

    and serve_page sp page () = match Pages.get_entry pages page with
        (*         ^       ^                                       *)
        (*         |    this () represents (empty) post parameters *)
        (*    this holds additional request info                   *)
        None -> not_found ()
      | Some node ->
          let thetitle = Node.title node in
          let toplink = a ~service:toplevel_service ~sp [pcdata !toplevel_link] ()
          in page_with_title sp thetitle
               [div_with_id "article_body"
                  (div_with_id "header"
                     [h1 [a ~service:page_service ~sp [pcdata thetitle] page];
                      with_class p "date" [pcdata (format_date (Node.date node))];
                      p [toplink]] ::
                   (node_body_with_comments ~sp node @ [footer]))]
    

    That a ~service:page_service ~sp [pcdata thetitle] page is a "permalink". Eliom knows that the service takes only one parameter of type string, and the compiler would complain if I tried to give it no parameters, too many, or parameters of the wrong type --- it would also tell me if I were trying to link to a non-existent service (page_service would not be defined).
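
    For the curious, this is a toy model of how a service can carry the type of its parameters so that links are checked at compile time. It is only a sketch of the idea, not how Eliom implements it: the params and service types and the encode and link functions are invented for this example, it builds ?page=xxxx style links rather than suffix ones, and it skips percent-encoding.

    ```ocaml
    (* Toy model of typed service parameters; not Eliom's implementation. *)
    type _ params =
      | Unit : unit params
      | String : string -> string params   (* the argument is the parameter name *)
      | Int : string -> int params
      | Pair : 'a params * 'b params -> ('a * 'b) params

    type 'a service = { path : string list; params : 'a params }

    (* Turn a typed value into query-string pairs, guided by the specification. *)
    let rec encode : type a. a params -> a -> (string * string) list =
      fun spec v ->
        match spec with
        | Unit -> []
        | String name -> [ (name, v) ]
        | Int name -> [ (name, string_of_int v) ]
        | Pair (pa, pb) ->
            let (a, b) = v in
            encode pa a @ encode pb b

    (* Build a link to a service. Values are NOT percent-encoded in this sketch. *)
    let link svc v =
      let base = "/" ^ String.concat "/" svc.path in
      match encode svc.params v with
      | [] -> base
      | kvs ->
          base ^ "?" ^ String.concat "&" (List.map (fun (k, x) -> k ^ "=" ^ x) kvs)

    let page_service = { path = [ "writings" ]; params = String "page" }

    let url = link page_service "ocsigen"

    (* Both of these are rejected by the type checker:
       link page_service 42           (an int where a string is expected)
       link page_service ("a", "b")   (too many parameters) *)

    let () = print_endline url
    ```

    The type of page_service is string service, so link page_service only accepts a string; linking to a service that was never defined is simply an unbound identifier.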

    mfp, 09 January 2009 at 15:36#
  6. I think something is wrong here, all of these numbers are REALLY low. Apache and lighttpd should be more or less identical for small static files (with lighty pulling ahead for large static files served off nfs or via proxy), as both will be limited by bandwidth, and both should be way higher unless you are on a 1mbit hub.

    Brian McCallister, 09 January 2009 at 18:03#
  7. I think something is wrong here, all of these numbers are REALLY low.

    Does this also apply to the "dynamic" test, i.e., should I be getting thousands of reqs/s out of nginx + mongrel + RoR? If so, how? Some quick googling yields a few results that seem in line with mine (mongrel + RoR serving at most a few hundred requests per core on comparable machines).

    Apache and lighttpd should be more or less identical for small static files (with lighty pulling ahead for large static files served off nfs or via proxy), as both will be limited by bandwidth, and both should be way higher unless you are on a 1mbit hub.

    How many requests per second should I be getting, what would be a normal ballpark figure (Linux 2.6.26-1-amd64, 3GHz dual core)? I'm running the tests locally, not over the wire, and the servers are far from being limited by bandwidth, they're CPU-bound.

    I have to confess I'm not that interested in static file serving performance; dynamic contents are normally the limiting factor.

    mfp, 09 January 2009 at 20:43#
  8. hey, i like that you are trying to get performance. however, the above code in the comments is quite noisy. It is easy to outperform rails on a web performance basis. It is hard to compete with it on a 'get things done quickly and nicely' basis. I cannot state 'programmer performance', because some of the people who work on rails code should not be called programmers.

    Dru Nelson, 09 January 2009 at 21:08#
  9. How's it compare against, say, Erlyweb?

    http://erlyweb.org/

    Daniel Berger, 09 January 2009 at 21:45#
  10. Dan, I didn't know I was signing up for a web framework shootout :)

    I read that Erlyweb is 4 to 6 times faster than Rails. If that's true, it'd be in the ~2000-2500 reqs/sec range on my box, which sounds very good! If the page involves substantial amounts of computation, Erlang's performance shortcomings might become a problem, though. String manipulation is known to be one of Erlang's major weaknesses (e.g., by default strings use 16 bytes per character(!)), which could also prove challenging.

    Where implementations based on Erlang should beat everything else is in handling massive numbers of simultaneous connections, e.g. for streaming. (Ocsigen also uses lightweight threads which can be spawned by the millions, but still employs select to monitor connections --- I reckon it'd take a few dozen lines of code to switch to something like epoll; in fact I have that half-written).

    Dru, I'm sure Rails is almost unbeatable regarding devel speed for many (most?) CRUD applications thanks to its convention over configuration philosophy. It seems to me that the head start disappears quickly once you deviate from the default behavior, though.

    The code you're seeing corresponds mostly to HTML generation using Rails' Builder. In Ocsigen, this is called XHTML.M, and has got two advantages over Rails' counterpart:

    1. it is typed and guarantees statically that the generated markup is valid (invalid markup does not compile)

    2. it is much faster

    For instance,

    let rec page_with_title sp thetitle thebody =
      html
        (head (title (pcdata thetitle)) [css_link css_uri (); ctype_meta; rss2_link sp])
        (body (thebody @ analytics))
    

    corresponds roughly to this in Ruby/Rails:

    def page_with_title(title, buffer = "", &body)
      xm = Builder::XmlMarkup.new(buffer)
      xm.html {
        xm.head {
          xm.title title
          xm.link("href" => @css_uri, "type" => "text/css", "rel" => "stylesheet")
          xm.meta("content" => "text/html; charset=UTF-8",
                  "http-equiv" => "Content-Type")
          rss2_link(xm)
        }
        xm.body {
          body.call(xm)
          analytics(xm)
        }
      }
      xm.target!
    end
    

    Could this be done using something equivalent to erb? Certainly, Ocsigen allows you to use whatever templating engine you choose, and there are some similar in spirit to erb (only much faster :).

    Continuing with the analogies, a ~service:page_service ~sp [pcdata thetitle] page is equivalent to link_to thetitle, { :controller => "pages", :action => "show", :page => page }.

    Granted,

    register_new_service
        ~path:["writings"] ~get_params:(suffix (string "page")) serve_page
    

    is "noise" compared to an implicit Rails route, but not so much if you consider it corresponds to something like (I had to google for this so it might be a bit off)

    map.connect 'writings/:page',
      :controller => 'pages', :action => 'serve_page',
      :requirements => { :page => /\S+/ }
    

    At the end of the day, you have to map URLs to actions and generate HTML, so the code isn't that different if you're deviating from the defaults.

    mfp, 10 January 2009 at 01:51#
    • Rails was designed for developer productivity. Hardware is (relatively) cheap

    • The real bottlenecks on high-volume apps tend to be bandwidth to/from the site and database / disk access -- i.e. io bound, not cpu bound.

    Any examples of high-volume sites running OCaml + Ocsigen in the real world? "hello world" is not a very compelling demonstration.

    Jeremy, 10 January 2009 at 06:56#
  11. Jeremy (comment #11):

    There was a time when Ruby on Rails was not used on any high volume sites too (and that was only about 3 years ago).

    The trouble with the "hardware is (relatively) cheap" argument is that it doesn't apply when you need to scale your website up to the million user range. At that point, memory usage in particular kills you because you cannot put enough memory in each physical server to utilize the CPUs. I have real experience of this when running a relatively high volume Perl-based website. The interesting thing about this benchmark to my mind is not the order of magnitude improvement in speed (that bit is just really nice), but the order of magnitude reduction in memory usage, which means I can run 10-20 times as many clients on each server.

    Rich, 10 January 2009 at 11:28#
  12. Have you tried returning a binary file with OCaml, among other things? In Nurpawiki, which I use, I see that image files take a moment to appear. It would be better if you showed the differences between a simple hello world and images (or other binary files).

    And what about Phusion Passenger and Apache2 for a rails application ? It could be interesting to see apache2 + phusion passenger results !

    Blankoworld, 10 January 2009 at 13:07#
  13. Very cool. Although this really is awesome and I'll have to play with it, sometimes seeing these web frameworks in e.g., Smalltalk and OCaml reminds me of

    http://www.coboloncogs.org/HOME.HTM

    :P Looking forward to learning more.

    DanF, 10 January 2009 at 18:39#
  14. Rich: Rails serves high-volume web sites, check http://rubyonrails.org/applications.

    Xavier Noria, 11 January 2009 at 02:27#
  15. Jeremy:

    I'm not trying to rebut any of that. Yes,

    1. "Rails was designed for developer productivity."

    2. Most sites are ultimately limited by the DB.

    You're fixating on the trivial "hello world" figures, when the thing that matters the most here is "trivial deployment". I'm sure you'll agree on the importance of easy deployment after all the hoops RoR has gone through --- first it was FastCGI, then Mongrels, then Phusion (and I'm missing some)...

    I must say, however, that it seems there's a false dichotomy hiding behind your first assertion: it makes it look as if one had to choose between "developer productivity" and the qualities Rails lacks such as performance or trivial deployment without external servers + modules. I think it's possible to have both while adding better maintainability and decreasing the development effort in new ways such as preventing many bugs altogether (e.g. broken links, invalid markup, bad page parameters, broken SQL queries...). Ocsigen, HAppS, or Ur/Web represent interesting steps in that direction.

    Any examples of high-volume sites running OCaml + Ocsigen in the real world?

    Uses of OCaml are not advertised the way Rails is, so I only know about the companies which gave themselves away by releasing OCaml software under open-source licenses :). I know skydeck uses OCaml (seemingly for its frontend too). The OCaml-based search technology from wink.com will soon power reunion.com, which sustains some ~20M visits (est. >100M pageviews) a month; I believe the frontend used to be OCaml too (unsure).

    I don't know if anybody is using Ocsigen on a high-volume site. I find this question funny, though: it sounds a bit like the "anybody using Ruby on Rails on a site with heavy traffic?" questions you heard back in 2005, with the key difference that, if anything, OCaml/Ocsigen is better equipped for higher loads thanks to the better performance, lower memory usage, and higher reliability (no memory leaks like those that have plagued RoR for years, more robust runtime, fewer problems with segfaulting C extensions, etc.).

    Blankoworld:

    I did a quick test with binary files and didn't notice any difference.

    And what about Phusion Passenger and Apache2 for a rails application ? It could be interesting to see apache2 + phusion passenger results !

    As I've said repeatedly, the "hello world" figures matter very little, but if you insist, what about this: you set up Phusion Passenger + Apache2 + Rails on an EC2 image of your choice (so the HW is known and the results can be readily compared to other measurements) and I give you a single-file, standalone server + app using Ocsigen for you to benchmark. Deal? :-)

    Alternatively, you can take the results from the comparisons between Mongrel and Phusion Passenger setups which indicate that the differences are minimal. Memory usage, in particular, changes very little.

    mfp, 12 January 2009 at 19:45#
  16. Rails serves high-volume web sites, check http://rubyonrails.org/applications.

    Imagine the energy savings if all those sites were powered by Ocsigen. Ocaml = Green!

    wh, 12 January 2009 at 20:04#
  17. Could somebody make the same benchmark, but with JRuby and YARV?

    Fixxer, 13 January 2009 at 06:35#
  18. I am really glad to hear that you are releasing the Eigenclass source. I have started writing an app with Eliom+Ocsigen and the one thing I notice is a lack of real-world source code that can be used as an example.

    The tutorial is pretty good, but tutorials only get you so far. Even the snippet you posted in the comments answered one question I had (how to use path elements as arguments).

    Thank you.

    Alan Falloon, 19 January 2009 at 16:14#
  19. Excellent benchmark. Don't mind the entrenched rubyists, they may just be afraid that their edge is disappearing as others catch up. Rails isn't on the way out, but it is going the way of PHP and that both frightens and intrigues me. More and more developers are going to other languages to try to gain an edge now that rails is becoming mainstream.

    So please keep up the research and love of OCaml. I'll ditch RoR when I see a clear advantage by some other language but I prefer to let the smart people do the pioneering for a little while. Though I do want to play with Haskell and see how it compares considering the incredible benchmarks I've been seeing out of the Haskell camp. :)

    Regards!

    Chuck Vose, 19 January 2009 at 17:25#