Fast image enlargement through seam insertion in OCaml
I have implemented the last major feature missing in my content-aware image resizer based on seam carving/insertion, image enlargement via seam insertion. It took a whole 50 lines of OCaml code which you can find below. My program now does everything the Gimp LiquidRescale plug-in does, so a direct comparison can be done.
Update [2007-11-19] Carlo Baldassi, LiquidRescale's author, explains his design choice. The data structure could support real-time rescaling (multi-size images), at the cost of worse performance. See his comment.
Looking at the rendering part only, I have needed under 500 lines of OCaml code *1. LiquidRescale's render.{c,h} comprise over 2100 lines of code. On my box, seam carving and insertion in OCaml are both over 5 times faster than LiquidRescale's (remember I'm not comparing my code to a Python script; LiquidRescale is written in C).
Here's a tiny sample of what seam insertion is about. Given this image
if you just rescale it, it will look like this
The image can be enlarged while preserving the aspect ratios of the salient objects using seam insertion:
Content-aware image enlargement is not as flashy as object removal with seam carving, but it's still somewhat useful.
How seam insertion works
Seam insertion is very easy once you have seam carving, since it can be trivially built atop it as reflected also by the code, where seam insertion is performed by a functor which abstracts over the seam carving module:
module Make(Carving: Seamcarving.S) = struct ... end
This means that I automatically get biased seam insertion, as I already had biased seam carving, which is used to remove objects from an image (see my first post for an explanation of the seam carving algorithm itself).
There are basically two steps to image enlargement through seam insertion:
- pretend we're carving seams from the image as if we were shrinking it, and record all the seams (connected paths) that were removed
- on the original image, follow those paths and insert new pixels whose colors are the average of that of their neighbors
It's conceptually easy, and also simple in practice. I wrote that code two weeks ago, as well as a 5-line cheap command-line tool to invoke it. Unfortunately, it didn't work on the very first run after it typed. I left it there at that time because it was 3am or so (my code gets buggier past ~1-2am). As I revisited it today, I found the bug within a few seconds. I was doing
let insert_seams t n =
let rec loop t i acc =
if i < n then begin
let t, path = Carving.seam_carve_h' t in
loop t (i - 1) (path :: acc)
end else Array.of_list (List.rev acc) in
...
Obviously, that (i - 1) should have been (i + 1), but I prefer to change the test to i > 0. I'll stick to decreasing counters in tail-recursive functions from now on.
Other than that, as far as I can tell everything works fine. As for the command-line, I had just copy-pasted the code used for seam carving, but that code duplication felt so dirty I ended up creating a functor that takes a RESIZER (which can either enlarge or shrink the image) and instantiates a full-blown command line tool. I did it in auto mode, without much thought, and this time everything just worked once it typed, even though I was refactoring blindly.
(*
* seamcarve, content-aware image resizing using seam carving
* Copyright (C) 2007 Mauricio Fernandez <mfp@acm.org>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*)
open Seamcarving
module Make(Carving: Seamcarving.S) =
struct
type t = {
carver : Carving.t;
image : image;
}
let make ecomputation img =
let seam_carver = Carving.make ecomputation (Seamcarving.copy_image img) in
{ carver = seam_carver; image = img }
let insert_seams t n =
let rec loop t i acc =
if i > 0 then begin
let t, path = Carving.seam_carve_h' t in
loop t (i - 1) (path :: acc)
end else Array.of_list (List.rev acc) in
let paths = loop t.carver n [] in
let h = t.image.height in
let w = Array.length paths in
let m = Array.make_matrix h w 0 in
(* path matrix, each path is a column *)
Array.iteri (fun i p ->
for j = 0 to h - 1 do
m.(j).(i) <- p.(i)
done)
paths;
(* adjust paths depending on previous ones *)
for j = 0 to h - 1 do
let row = m.(j) in
for i = 0 to w - 2 do
let rx = row.(i) in
for k = i + 1 to w - 1 do
let v = row.(k) in
if v >= rx then row.(k) <- v + 2
done
done
done;
let insert_seam row pos imgwidth =
Array.blit row pos row (pos+1) (imgwidth - pos);
if pos > 0 then row.(pos) <- average_color row.(pos-1) row.(pos) in
let imgwidth = t.image.width in
let img2 = Seamcarving.enlarge_image t.image n in
(* insert seams *)
for j = 0 to h - 1 do
let srow = m.(j) in
let row = img2.rgb.(j) in
for i = 0 to w - 1 do
insert_seam row srow.(i) (imgwidth + i)
done
done;
{ img2 with width = imgwidth + n }
end
using your code - gilles (2007-10-24 (Wed) 08:07:52)
How can I use your code ? Can it be used from within GIMP or only from the command line ?
mfp 2007-10-24 (Wed) 08:54:36
The build instructions are in README.txt. Command-line only. If I ever use this seriously it will be for batch resizing of thousands of images; I don't personally care about speed under GIMP. It should be possible to hack LiquidScale to delegate the work to my code, but I'm not very interested in that.
$ carve Content-aware rescaling using seam carving. Usage: carve (-vert <n>| -horiz <n>) [options] <files> -o Set output file. -vert Change height by given amount of pixels. -horiz Change width by given amount of pixels. -bias Use energy bias from given file. -verbose Verbose mode. -help Display this list of options --help Display this list of options
Jessica 2008-01-28 (Mon) 11:48:47
aewdsa saf wefrasf adsf sdaf
... vs CoreImageTool - freno (2007-10-25 (Don) 04:24:23)
Just wondering if this could be (done &) compared with CoreImageTool, http://www.entropy.ch/software/macosx/coreimagetool/ .
Anyway, very cool!
mfp 2007-10-25 (Thr) 05:02:07
This is a command line front end to Apple’s Core Image framework introduced in Mac OS X 10.4. It makes core image filter processing available to scripting languages like bash or Perl/Python/Ruby/PHP used for command-line or web applications.
I couldn't see any filter based on seam carving/insertion in the Core Image filter reference, so I cannot compare my implementation to Apple's nonexistent one :-)
Coreimage? - Ahmed.S (2007-10-31 (Wed) 13:05:45)
Isn't Core Image a Mac only framework? Unless you're developing and deploying on the Mac, for the mac folk..besides as mfp stated, Core Image doesn't have a reference to a similar filter. Although, I'm not sure if a 3rd party filter implementation like the one in the aforementioned thread can be used with Coreimage. In any case, I would prefer keeping it as non-proprietary as possible.
LiquidRescale and performance issues - Carlo (10-11-2007 (Sab) 21:00:59)
Hi,
I'm the author of the GIMP LiquidRescale plug-in. I was quite impressed by the difference in performance of the two algorithms, so I started hacking on my version. I have a number of comments, also referred to your previous posts on this topic, which I think you'll be interested into, since I see you do care about performance issues:
- First, let me say that the version of the plugin you've been using is an older one, the latest one (0.3.0 currently) uses an array of pointers to the data points, which is the one which is actually carved. This results in a much faster code, which was then slowed down by a sloppy implementation of a new feature I have added (rigidity of the seams), but, even without that, it is still at least 3 times slower than yours.
- The reason to use cursors reading the image and skipping the invisible point is the possibility of real-time scaling. This is not used in the GIMP plugin, but, since the very beginning, I was planning to split the code into a GIMP UI + a carving library. It could also be useful for preview purposes. (This idea of creating a multi-size image is due to the algorithm inventors and can be seen in action in their famous video.)
- The implementation was done in that 'silly' way simply because I started off writing in C++ and thinking in terms of objects. Anyway, I have removed the all-in-one data structure by now. In fact, the code is simpler like this, but not faster, see below.
- After deep testing, I think that your analisys of why LiquidRescale is slower is not correct (unfortunately!). It's not a caching issue. In fact, on large enough images, where speed is really relevant, the time of the computation is totally dominated by the update function for the cost of the seams + the carving operation, which are O(N^2), N being the linear size of the image. Everything else (e.g. energy update) is just irrelevant. In your code, the update function is extremely simple and fully optimized. As soon as you add some features (e.g. the possibility to skip some pixels to get rid of the 45 degrees limit, which has the simple consequence of using a for loop instead of the int_min3 function plus a couple of comparisons) everything slows down. Consider that, since in your code there are already so few basic operations per pixel, any additional one (even just adding a conditional check!) produces a relatively big increase in computation time. I have modified your seamcarving.ml file, have a look here (using a for loop nearly doubles execution time on a 2000x2000 image) and here (using a recursive function is even worse).
- That said, I was obviously unable to reach the speed of your algorithm without giving up the idea of adding any extra features. Instead, I optimized the update cost function so that it only updates the values which actually changed. This is still O(N^2), but with a smaller factor. It can be used in two modes, a slower exact one and a much faster approximate one. The speed of the approximate version is comparable to that of your program.
- You can have a look at the code, it's here (it's not released yet but it works with GIMP 2.4). If you start GIMP from the command line, you will also see the timings.
- Thanks for making me consider the optimization issue more deeply. Challenge helps progress!
mfp 2007-11-13 (Tue) 15:14:40
Thank you for the detailed explanation and for clearing up the mystery behind your data structure; it all makes sense now.
It's a pity I was wrong about cache efficiency :); it seems we're farther from the memory bandwidth barrier than I thought. How well does the algorithm work in practice when you remove the 8-connectivity constraint (mainly regarding visual artifacts)? Clearly, an excessively high dx distorts the image (as shown in the original paper for dx=image width); I wonder if, say, dx=3 will work worse than the dx=1 default. I'll have to try your code on a few images. If only small values of dx are usable, the code could be specialized for them, getting rid of the loop overhead and approaching the performance achieved by the simpler int_min3-based update.
As for real-time scaling, maybe it's best (simpler and faster), even if less elegant, to create the multi-size image representation after seam carving, similarly to how we do seam insertion. What do you think?
Carlo 20-11-2007 (Mar) 12:21:45
Hi,
About dx > 1 : I have now a working implementation of this (in a git repo, here). In fact, it does tend to introduce artifacts, but I have added an additional energy cost to pixel skipping, and it can help in some cases. Currently, this is controlled by a single parameter, called rigidity (but I have to devise a better interface). An idea I wanted to test is also the use of a rigidity map, to use different values for different parts of the image.
About extending the int_min3 scheme : I think it would work, but in my case the introduction of the additional skip/rigidity cost is forcing me to keep track of the seams preferred direction at every pixel, so I cannot just make a comparison, I have to take some action. Maybe I will split the code and do an optimized version for the standard case.
About adding the multi-size representation afterwards, it's possible of course using your code. In this case the advatage of having only one array of pointers to carve is the fact that you can add other maps to the image (as the one about seams rigidity I mentioned above) at no computational cost (recall the carving is O(N^2)). I don't know if this will turn out to be useful in practice, though.
*1 120 lines for the sobel filter, 180 for the seam carving algorithm, 50 for seam insertion, 50 for energy biasing and some 60 lines for miscellaneous image manipulation functions (rotation, loading, saving, etc.)
Keyword(s):[seam] [carving] [insertion] [ocaml] [liquidrescale] [blog] [frontpage]
References:[Fast content-aware image resizing]