HREF Considered Harmful
http://www.avibryant.com/
Avi Bryanten-US2008-09-02T12:50:07-07:00Chrome, V8 and Strongtalk
http://www.avibryant.com/2008/09/chrome-v8-and-s.html
There's lots to like about Google's new web browser, Chrome, which was released today. When I read the awesome comic strip introduction yesterday, however, the thing that stood out most for me was in very small type: the name Lars...<p>There's lots to like about Google's new web browser, <a href="http://www.google.com/chrome">Chrome</a>, which was released today. When I read the awesome <a href="http://www.google.com/googlebooks/chrome/">comic strip introduction</a> yesterday, however, the thing that stood out most for me was in very small type: the name Lars Bak attached to the V8 JavaScript engine. I know of Lars from his work on Self, Strongtalk, HotSpot and OOVM, and his involvement in V8 says a lot about the kind of language implementation it will be. David Griswold has posted some <a href="http://groups.google.com/group/strongtalk-general/browse_thread/thread/40eb8f405fbd3041">more information</a> on the Strongtalk list:
</p><blockquote><p>
The V8 development team has multiple members of the original
Animorphic team; it is headed by Lars Bak, who was the technical lead
for both Strongtalk and the HotSpot Java VM (as well as a huge
contributor to the original Self VM). I think that you will find
that V8 has a lot of the creamy goodness of the Strongtalk and Self
VMs, with many big architectural improvements
</p></blockquote><p>
I'll post more on this later, but things are getting interesting...</p>
<p>Update: the V8 code is already <a href="http://code.google.com/apis/v8/">available</a>, and builds and runs fine on Mac OS X. From the <a href="http://code.google.com/apis/v8/design.html">design docs</a>, it's pretty clear that this is indeed what I was hoping for: a mainstream, open source dynamic language implementation that has learned and applies the lessons from Smalltalk, Self and Strongtalk. Most telling is that the only two papers cited in that document are titled "An Efficient Implementation of Self" and "An Efficient Implementation of the Smalltalk-80 System".</p>
<p>The "classes as nodes in a state machine" trick for expando properties is especially neat.</p>
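<p>As an illustration of that trick, here is a minimal Ruby sketch (not V8's actual C++). It assumes each object points at a hidden class that maps property names to slot indexes, and that adding a property follows a cached transition to another hidden class. The names <code>HiddenClass</code> and <code>DynamicObject</code> are made up for this example.</p>

```ruby
# Hypothetical sketch of "classes as nodes in a state machine".
# HiddenClass and DynamicObject are illustrative names, not V8's.
class HiddenClass
  attr_reader :offsets

  def initialize(offsets = {})
    @offsets = offsets      # property name => slot index
    @transitions = {}       # property name => next HiddenClass node
  end

  # Adding a property follows (or creates) an edge to another node.
  def transition(name)
    @transitions[name] ||= HiddenClass.new(@offsets.merge({ name => @offsets.size }))
  end
end

class DynamicObject
  ROOT = HiddenClass.new

  attr_reader :hidden_class

  def initialize
    @hidden_class = ROOT
    @slots = []
  end

  def set(name, value)
    # Only a brand new property moves the object to another node.
    @hidden_class = @hidden_class.transition(name) unless @hidden_class.offsets.key?(name)
    @slots[@hidden_class.offsets[name]] = value
  end

  def get(name)
    index = @hidden_class.offsets[name]
    index && @slots[index]
  end
end

a = DynamicObject.new
a.set(:x, 1)
a.set(:y, 2)

b = DynamicObject.new
b.set(:x, 10)
b.set(:y, 20)

puts a.hidden_class.equal?(b.hidden_class)
```

<p>Because transitions are cached, two objects that gain the same properties in the same order land on the same node, which is what lets a VM treat them as instances of one class.</p>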
<p>The bad news: V8 is over 100,000 lines of C++.</p>
<p><img src="http://www.avibryant.com/files/picture_19.png" /></p>Avi2008-09-02T12:50:07-07:00MagLev recap
http://www.avibryant.com/2008/06/maglev-recap.html
There has been a huge response to the MagLev demo I gave on Friday, most of it enthusiastic, though not without the inevitable skepticism that comes with any announcement. For those who weren't at RailsConf, here's a quick summary of...<p>There has been a huge response to the MagLev <a href="http://www.avibryant.com/2008/05/maglev.html">demo</a> I gave on Friday, <a href="http://blog.obiefernandez.com/content/2008/05/maglev-is-gemst.html">most</a> <a href="http://www.cincomsmalltalk.com/userblogs/antony/blogView?showComments=true&printTitle=MagLev,_game_over_..._very_possibly.&entry=3389637476">of</a> <a href="http://dotneverland.blogspot.com/2008/04/maglev-ruby-vm.html">it</a> <a href="http://antoniocangiano.com/2008/05/31/maglev-rocks/">enthusiastic</a>, though not without the inevitable <a href="http://headius.blogspot.com/2008/06/maglev.html">skepticism</a> that comes with any announcement.</p>
<p>For those who weren't at RailsConf, here's a quick summary of how the demo went.</p>
<p>I started off by describing MagLev as a "full stack Ruby implementation", in the same way that Rails is a full stack web framework. To understand what I mean by that, see my <a href="http://www.avibryant.com/2008/03/ive-had-a-numbe.html">earlier post</a> on the Gemstone architecture: not only does MagLev provide a new (and fast) VM for Ruby, but it also provides an integrated shared memory object cache, and integrated transparent persistence. This fully replaces the typical Rails stack of many mongrel instances + several memcached instances + MySQL.</p>
<p>As a first demo, I showed a "magic trick" with two MagLev instances running an irb-like shell in side-by-side terminal windows. A $hat global was defined in each, which just wraps an array and lets you put things in it. In the left window, I put a Rabbit into the $hat. I then looked at the $hat on the right and showed that the Rabbit had magically been transported there.</p>
<pre>
>> $hat
=> #<Hat:0x0c184bfd01 @contents=[
() ()
( '.' )
(")_(")
]>
</pre>
<p>How is this possible? Because they're the same hat. The integrated VMs, cache, and storage conspire to create an illusion that global state is shared across all instances: no matter how many VMs you add, over however many machines, they all see and work with the same set of Ruby objects.</p>
<p>There's no limit to what kinds of objects can be shared this way: procs and classes work just as well as arrays and strings. This isn't RPC - the objects are copied into a shared cache when they're created or modified, and if (but only if) another VM needs the object, it will pull it out of the cache and work on the local copy. All of these copies are kept in sync, and any changes are also written to disk by the storage engine so that the entire model is persistent.</p>
<p>This only applies to globally reachable objects - local variables, method arguments and so on aren't generally shared.</p>
<p>Obviously, with this kind of synchronization there has to be some concern for concurrency. MagLev handles this with transactions. Each VM has its own transaction state. When a VM enters a transaction, all of its changes are only locally visible until it is asked to commit. At that point, all of its changes get recorded to the cache and to disk and are available to every other VM.</p>
<p>A transaction can be aborted, in which case *everything* that has happened in that VM since the last commit (object modifications, creation, method or class definition, etc) will get rolled back. A transaction commit can also fail if it conflicts with concurrent changes elsewhere (for example, two VMs modifying the same instance variable of the same object at once).</p>
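<p>The commit, abort and conflict behaviour described above can be sketched in plain Ruby. This is not MagLev's API: <code>SharedStore</code>, <code>Session</code> and <code>CommitConflict</code> are hypothetical names, the store is an in-memory hash standing in for the cache and disk, and conflict detection is a simple per-key version check. Unlike the real system, reads here are not isolated from other sessions' commits.</p>

```ruby
# Hypothetical sketch of optimistic transactions over a shared store.
class CommitConflict < StandardError; end

class SharedStore
  def initialize
    @values = {}
    @versions = Hash.new(0)   # key => number of commits that wrote it
  end

  def read(key)
    [@values[key], @versions[key]]
  end

  # Apply the writes only if none of the written keys changed since
  # the committing session first looked at them.
  def commit(seen_versions, writes)
    stale = writes.keys.reject { |key| @versions[key] == seen_versions[key] }
    raise CommitConflict, "conflicting change to #{stale.inspect}" unless stale.empty?

    writes.each do |key, value|
      @values[key] = value
      @versions[key] += 1
    end
  end
end

class Session
  def initialize(store)
    @store = store
    begin_transaction
  end

  def [](key)
    value = remember_version(key)
    @writes.key?(key) ? @writes[key] : value
  end

  def []=(key, value)
    remember_version(key)
    @writes[key] = value      # visible only to this session until commit
  end

  def commit
    @store.commit(@seen, @writes)
    begin_transaction
  end

  def abort_transaction
    begin_transaction         # drop every uncommitted local change
  end

  private

  def begin_transaction
    @seen = {}
    @writes = {}
  end

  def remember_version(key)
    value, version = @store.read(key)
    @seen[key] = version unless @seen.key?(key)
    value
  end
end

store = SharedStore.new
left = Session.new(store)
right = Session.new(store)

left[:hat] = ["Rabbit"]
before_commit = right[:hat]   # nil: left has not committed yet
left.commit
right.abort_transaction       # start a fresh transaction on the right

left[:hat] = ["Rabbit", "Dove"]
right[:hat] = ["Rabbit", "Scarf"]
left.commit

conflicted = false
begin
  right.commit
rescue CommitConflict
  conflicted = true
  right.abort_transaction
end
```

<p>The second commit fails because both sessions wrote the same key from the same starting version, which mirrors the "two VMs modifying the same instance variable" case.</p>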
<p>Because these shared objects are stored on disk, and are lazily loaded into the VMs only when needed, it means you can work with datasets that have many, many more objects than would fit into available RAM. I showed a dataset that I had loaded in which contained 100 million movie reviews, and took up somewhere around 10GB. I could instantly pull in a single movie, modify it, and commit that change, without needing to load the other couple hundred million objects into RAM.</p>
<p>As a final demo, I showed how far MagLev has currently gotten with compatibility by running a simple WEBrick servlet.</p>
<p>At this point, Bob Walker took over. He gave some company background on Gemstone (they've been working on multi-user persistent dynamic language VMs since 1982), and some technical details on MagLev (the VM is a modified version of their Gemstone/S Smalltalk VM, with some Ruby-specific bytecodes; the bytecode is JITted to native code before execution). Then he showed some micro-benchmarks: for what it's worth, MagLev is anywhere from 6 times to (in the extreme case) 111 times faster than the standard 1.8.6 Ruby interpreter on things like fibonacci, block execution, method dispatch, and so on.</p>
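<p>For readers who have not seen this kind of micro-benchmark, a rough sketch follows. It is not the benchmark code Bob ran; it only assumes the usual naive recursive Fibonacci and a block-calling loop, timed with Ruby's standard <code>Benchmark</code> module. The argument sizes are arbitrary.</p>

```ruby
require "benchmark"

# Naive recursive Fibonacci: stresses method dispatch and integer math.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Calls a block many times: stresses block execution.
def call_block(times)
  total = 0
  times.times { |i| total += yield(i) }
  total
end

fib_seconds = Benchmark.realtime { fib(20) }
block_seconds = Benchmark.realtime { call_block(100_000) { |i| i * 2 } }

puts format("fib(20): %.4f s", fib_seconds)
puts format("100,000 block calls: %.4f s", block_seconds)
```

<p>Numbers from a sketch like this say little about real applications, which is why the "for what it's worth" caveat applies.</p>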
<p>Bob then talked about scale. Gemstone has many customers running things like commodities exchanges, derivatives trading, container shipping, and so on that operate at very large scale on top of the same underlying technology as MagLev. Here are a couple of recent unsolicited quotes from a <a href="http://discuss.joelonsoftware.com/default.asp?biz.5.594244.20">thread</a> on the Joel on Software forums:</p>
<blockquote>
<p>"I work for a major shipping company. We have a massive OODB and
Smalltalk Application (500 gig range) with 3 million lines of code.
We have 2000 plus daily users. We can do 700 transactions a second
before slowing down. We also have a Java + SQL +EMS system. On a
good day they can do 70 transactions a second, with three times the
hardware." --Timo (Saturday, February 16, 2008)</p>
<p>"Along side with the major shipping company, we are a major
commodities exchange using GS and ST and while our operational DB is
small (about 5 GB at the start of the trading day to less than 75 GB
and the end) we are probably one of the fastest. We easily handle
transaction rates approaching 6000/sec with about 8000+ daily
users. Our average data center round trip times are in the 2-3 ms
range." --GemStone Weenie (Monday, February 18, 2008)</p>
</blockquote>
<p>It's worth noting that that's 6000 writes per second, sustained, and that this application peaks at about 3x that. By comparison, Twitter was once reported as having 600 requests/s (read and write).</p>
<p>Bob then moved on to the vision for MagLev going forward. A few important points:</p>
<ul>
<li>It doesn't run Rails yet, but it will.</li>
<li>It will be RubySpec compliant.</li>
<li>The Ruby source will be released. The C source code for the VM most likely will remain closed (but anything is possible).</li>
<li>There will be a free version which will work for most uses, and a paid version for large-scale deployment.</li>
<li>Look for another announcement/demo at RailsConf Europe in September.</li>
</ul>
<p>After that we retired to the DoubleTree for a keg of Ruby ale.</p>Avi2008-06-01T11:36:26-07:00MagLev
http://www.avibryant.com/2008/05/maglev.html
People have by now probably gotten a little sick of me saying that we really need to take Smalltalk technology and apply it to Ruby - see, for example, pretty much anything I wrote in November 2006, or my RailsConf...<p>People have by now probably gotten a little sick of me saying that we really need to take Smalltalk technology and apply it to Ruby - see, for example, pretty much anything I wrote in <a href="http://www.avibryant.com/2006/11/index.html">November 2006</a>, or my <a href="http://railsconf.blip.tv/file/568689/">RailsConf keynote</a> from last year.</p>
<p>This year, I'm going back to RailsConf, but with some concrete good news: we've finally <a href="http://ruby.gemstone.com/">done it</a>, and it rocks.</p>
<p>If you want to see the first real demo* of Ruby running on Gemstone's very cool originally-for-Smalltalk platform, drop by on <a href="http://en.oreilly.com/rails2008/public/schedule/detail/4351">Friday afternoon</a>. I'm excited to see what everyone thinks.</p>
<p>(* I did do a teaser at <a href="http://www.meshconference.com/meshu/avi-bryant.php">MeshU</a>)</p>Avi2008-05-28T21:47:20-07:00Those who misremember history...
http://www.avibryant.com/2008/05/those-who-misre.html
In Dynamic Languages Strike Back, Steve Yegge says StrongTalk was really interesting. They added a static type system, an optional static type system on top of Smalltalk that sped it up like 20x, or maybe it was 12x. Why do...<p>In <a href="http://steve-yegge.blogspot.com/2008/05/dynamic-languages-strike-back.html">Dynamic Languages Strike Back</a>, Steve Yegge says<br />
<blockquote><br />
StrongTalk was really interesting. They added a static type system, an optional static type system on top of Smalltalk that sped it up like 20x, or maybe it was 12x.<br />
</blockquote></p>
<p>Why do people make this stuff up? The following two statements are true:</p>
<ol>
<li>Strongtalk has an optional static type system.
<li>Strongtalk is 15-20x faster than most other Smalltalk systems.
</ol>
<p>What's false is the causal link Steve is claiming between them. They are entirely independent. Strongtalk was that much faster whether you used the <i>optional</i> static type system or not. Strongtalk's optimizing compiler completely ignored the types, and it made your program run not one iota faster to add them.</p>
<p>Update: see also Dave Griswold on <a href="http://www.strongtalk.org/history.html">Strongtalk's history</a>:<br />
<blockquote><br />
... we had a type system and a compilation technology, which together were perfectly suited for a great production Smalltalk system, since they were independent of each other. This independence was critical, since the system would need to accept untyped as well as typed code, so that people could use the type system as much or as little as they wanted to, without impacting performance.<br />
</blockquote></p>Avi2008-05-12T16:50:25-07:00Ruby and other gems
http://www.avibryant.com/2008/03/ive-had-a-numbe.html
I've had a number of conversations recently about Gemstone Smalltalk, largely in the wake of their announcement of support for my web framework, Seaside. It's complicated to explain Gemstone to people. It's not just an object database (though it is...<p>I've had a number of conversations recently about Gemstone Smalltalk, largely in the wake of their announcement of support for my web framework, Seaside. It's complicated to explain Gemstone to people. It's not just an object database (though it is that), and it's not just a Smalltalk implementation (though it's that, too). The best thing I can compare it to is a Ruby on Rails deployment: not the framework, but the entire cluster of servers and software that goes into a large scale Rails app. Which is to say, perhaps, that Gemstone is best understood not as a piece of software but as an architecture.</p>
<p>At a high level, a typical Rails deployment looks like this: a cluster of servers supports one storage engine, several memory caches, and many worker processes. In Rails, the storage engine is always a relational database (usually MySQL), and sits on an especially hefty server by itself. Any number of other smaller, identical servers are each configured to run one memory cache (memcached) and 8-12 or so worker processes (Ruby interpreters running Rails and the Mongrel web server, generally just referred to as "mongrels").</p>
<p>The mongrels accept the web requests and run the actual application code. The objects inside these worker processes are live objects: they're sending and receiving messages, executing methods, changing state, and so on. They exist only inside the memory of a particular mongrel, for the duration of a single request that the mongrel is processing.</p>
<p>Many objects need to be persisted for longer than that, and these get written to and read from the storage engine - in Rails, using ActiveRecord. The storage engine is centralized (though it may be replicated to protect against failure), so that all of the worker processes see a consistent view of the data: if one of the mongrels modifies an object and commits that change to MySQL, the others will see that change the next time they need to load that object. The objects inside the storage engine are dead - they don't do anything until they're loaded into a worker process - but they're well preserved: they're kept on disk, not memory, so they'll survive a server reboot or other catastrophe.</p>
<p>Loading from and saving to the storage engine is relatively slow, and keeping objects there eats disk space, so the memory cache is an important third player in this game. A mongrel that's gone to the work of retrieving an object from MySQL might stash a copy in memcached for the other mongrels to retrieve, more quickly, if and when they need the same one. An object that's expensive to build - like a piece of complex HTML - but not important enough to save to disk might also be placed there for the convenience of the other workers on the same server. In Rails, the cache has to be managed carefully, so that you don't get out of sync with the consistent view of data maintained by the storage engine, but the work pays off with lower loads and faster response times. Objects in the cache are dead - usually marshalled into a meaningless string - and also transient, since the cache is purely in memory.</p>
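<p>The cache handling described here is often called cache-aside. A minimal sketch, assuming in-memory stand-ins for MySQL and memcached (<code>FakeDatabase</code>, <code>FakeCache</code> and the <code>Movie</code> struct are hypothetical, not real client libraries), shows both the marshalling step and the manual invalidation the application has to remember:</p>

```ruby
# Hypothetical stand-ins for the storage engine and the memory cache.
class FakeDatabase
  attr_reader :loads

  def initialize(rows)
    @rows = rows
    @loads = 0
  end

  def find(id)
    @loads += 1               # count the slow trips to storage
    @rows.fetch(id).dup
  end

  def update(id, row)
    @rows[id] = row.dup
  end
end

class FakeCache
  def initialize
    @entries = {}
  end

  def get(key)
    @entries[key]
  end

  def set(key, bytes)
    @entries[key] = bytes
  end

  def delete(key)
    @entries.delete(key)
  end
end

Movie = Struct.new(:id, :title)

def load_movie(id, cache, database)
  key = "movie:#{id}"
  bytes = cache.get(key)
  return Marshal.load(bytes) if bytes     # dead string back to live object

  movie = database.find(id)
  cache.set(key, Marshal.dump(movie))     # live object to dead string
  movie
end

def save_movie(movie, cache, database)
  database.update(movie.id, movie)
  cache.delete("movie:#{movie.id}")       # forget this and the cache goes stale
end

database = FakeDatabase.new(1 => Movie.new(1, "Alien"))
cache = FakeCache.new

first = load_movie(1, cache, database)    # miss: goes to the database
second = load_movie(1, cache, database)   # hit: served from the cache
save_movie(Movie.new(1, "Aliens"), cache, database)
third = load_movie(1, cache, database)    # miss again after invalidation
```

<p>Every step between worker, cache and storage is explicit application code, which is the bookkeeping the rest of this post contrasts with Gemstone.</p>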
<p>What about Gemstone? As it happens, the architecture is exactly the same: there's a single storage engine (called a "stone"), a memory cache on each server (the "shared page cache"), and any number of Smalltalk VM worker processes ("gems"). The gems handle the requests and run the code, and they stash objects in the page cache for speed and in the stone for persistence. The difference is, in Gemstone, these have all been designed from the ground up to work together as quickly and seamlessly as possible. In particular, this means two things:</p>
<p>1. Each part of the architecture uses exactly the same format to store the objects: whether it's a live object running in a gem, a cached object in the page cache, or a stored object on disk, the sequence of bytes is exactly the same. Unlike in Rails, where you have to be mapping and marshalling at every step, in Gemstone copying objects from storage to cache to worker process is pretty much just that - a simple byte copy. This makes it <b>fast</b>.</p>
<p>2. Objects are automatically kept in sync between each part of the system. The worker processes always load objects from the memory cache, because they can trust it to grab a recent copy from storage if needed. They also always save to the cache, because it will write the same change through to the storage without being asked. The gems also keep track of which objects have changed so that you don't have to, and will update the cache - and get updates from other gems back - automatically and transparently. The effect is as if all of your worker processes were running their objects inside a single, consistent and impossibly large chunk of persistent memory. This makes it <b>easy</b>.</p>
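<p>To make the contrast concrete, here is a rough sketch of the read-through, write-through behaviour described in point 2. It is only an analogy in Ruby: <code>WriteThroughCache</code> is a hypothetical class, a plain Hash stands in for the stone, and nothing here models Gemstone's shared byte format or its automatic change tracking.</p>

```ruby
# Hypothetical sketch: workers talk only to the cache, which keeps
# storage in step without being asked.
class WriteThroughCache
  def initialize(storage)
    @storage = storage        # Hash-like stand-in for the on-disk store
    @pages = {}
  end

  # Read-through: a miss is filled from storage transparently.
  def fetch(key)
    @pages.fetch(key) { @pages[key] = @storage[key] }
  end

  # Write-through: every save reaches storage as well as the cache.
  def store(key, object)
    @pages[key] = object
    @storage[key] = object
  end
end

disk = { movie_1: "Alien" }
cache = WriteThroughCache.new(disk)

loaded = cache.fetch(:movie_1)
cache.store(:movie_2, "Blade Runner")
```

<p>The worker code never mentions the storage engine, so there is no separate invalidation step to forget.</p>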
<p>To be extra clear, here's the mapping I'm trying to describe:</p>
<table border="1">
<tr><th></th><th colspan="2">Rails</th><th colspan="2">Gemstone</th></tr>
<tr><th></th><th>Provided By</th><th>Stores</th><th>Provided By</th><th>Stores</th></tr>
<tr><th>Storage Engine</th><td>MySQL</td><td>objects mapped to relational tables</td><td>"Stone" object store</td><td>Smalltalk objects</td></tr>
<tr><th>Memory Cache</th><td>memcached</td><td>objects marshalled to strings</td><td>Shared page cache</td><td>Smalltalk objects</td></tr>
<tr><th>Worker Process</th><td>MRI/Mongrel</td><td>Ruby objects</td><td>"Gem" Smalltalk VM</td><td>Smalltalk objects</td></tr>
</table>
<p>So there you have it: Gemstone, it's like Rails, but faster and easier. If only it ran Ruby...</p>Avi2008-03-08T01:02:38-08:00Don't Panic
http://www.avibryant.com/2008/01/dont-panic.html
So what happened was, I was at my house on Galiano Island with my shiny new iPhone and without, at the time, either high speed internet or EDGE coverage, and I thought "gee, wouldn't it be nice if...". And I...<p>So <a href="http://www.tbray.org/ongoing/When/200x/2003/04/06/WhatHappenedWas">what happened was</a>, I was at my house on <a href="http://en.wikipedia.org/wiki/Galiano_Island">Galiano Island</a> with my shiny new iPhone and without, at the time, either high speed internet or EDGE coverage, and I thought "gee, wouldn't it be nice if...". And I did some hacking, and then I mentioned it to <a href="http://collison.ie">Patrick Collison</a> who was sharing office space with us and he ignored my hacking and did a ton of his own, and even though I now have DSL and EDGE out there it *is* nice: all of Wikipedia, stored and searchable in 2GB of your iPhone's flash drive. Get it <a href="http://collison.ie/wikipedia-iphone/">here</a>.</p>
<p>It's not perfect yet - there's no images, just text, and the parser is pretty basic and doesn't know about tables and stuff, and clicking on links can be flaky and slow, and if you do happen to have a network around it's probably a better experience to just go to wikipedia.org, but: there's really nothing quite like holding the sum of human knowledge in the palm of your hand. Patrick, I owe you many drams of whiskey whenever you're back in town.</p>Avi2008-01-21T00:43:00-08:00DNA as Code
http://www.avibryant.com/2008/01/dna-as-code.html
Over the holidays I was chatting with my brother the biophysicist about his research. Roughly speaking, he is trying to create DNA sequences that encode molecular motors. I was trying to understand what it meant to hack DNA from a...<p>Over the holidays I was chatting with my brother the biophysicist about his research. Roughly speaking, he is trying to create DNA sequences that encode molecular motors. I was trying to understand what it meant to hack DNA from a programmer's perspective. Today I read <a href="http://ds9a.nl/amazing-dna/index.html">this</a>, which is in a very similar spirit. Two interesting data points from our conversation: one, the code my brother is "writing" is a few kilobases long, and could be represented in well under one kB of binary data. Two, his edit/compile/run cycle is about three weeks long, although he can do a dozen or so in parallel.</p>
<p>I thought these numbers were impressively small, especially that you could produce a working motor from a few hundred bytes of information (try that in AutoCAD...). He thought of them as huge, because they made it infeasible to brute-force the design by generating all the random variations and seeing which ones worked.</p>
<p>I'm certainly glad it doesn't take me three weeks to do a new build...</p>Avi2008-01-02T18:55:23-08:00Code as Screenplay
http://www.avibryant.com/2007/10/code-as-screenp.html
Giles Bowkett writes Debugger support is like nail-biting support, or farting-in-public support. Its absence is a feature. You want to avoid supporting bad habits. If programmers have to break their bad habits, that's a good thing. I have a confession...<p>Giles Bowkett <a href="http://gilesbowkett.blogspot.com/2007/10/debugger-support-considered-harmful.html">writes</a><br />
<blockquote><br />
Debugger support is like nail-biting support, or farting-in-public support. Its absence is a feature. You want to avoid supporting bad habits. If programmers have to break their bad habits, that's a good thing.<br />
</blockquote></p>
<p>I have a confession to make: I bite my nails. That's a bad habit, and I readily admit it. I also use a debugger. That's not.</p>
<p>Let me explain. Giles' argument seems to rest on this point:<br />
<blockquote><br />
Debuggers are based on the idea that the code base has enough places bugs could happen that the work of locating the bug is involved enough to justify machine assistance. This is not true of well-tested code. It is not true of code you understand, either.<br />
</blockquote></p>
<p>What Giles glosses over is how you come to understand the code in the first place. <b>Nothing</b> helps you understand code - whether you wrote it or someone else did - better than stepping through it in a debugger. Since Giles is a sometime <a href="http://gilesgoatboy.blogspot.com/2007/03/wrote-screenplay.html">screenwriter</a>, maybe this analogy is appropriate: reading the code is like reading a screenplay. Writing tests is maybe like drawing storyboards (they help you visualize the final product). Using a debugger is like <b>actually watching the damn movie</b>. With a jog wheel so you can slow it down. And no matter how good a screenwriter you are, no matter how good your director's storyboards are, when it comes time to cut the film you're going to find out that you didn't understand the movie as well as you thought you did, and you're going to need to watch the footage, sometimes frame by frame, and modify the movie accordingly.</p>
<p>Programs are the same way. Writing tests and reading code show you your program the way you want it to be, but only a debugger shows you the way your program <i>is</i>. Maybe screenwriters sit around in bars in LA and talk about how <i>real</i> filmmakers just read scripts, and the movies themselves are a crutch - me, I guess I like crutches.</p>
<p>See also: <a href="http://collison.ie/blog/?p=25">Patrick Collison</a>, <a href="http://programming.reddit.com/info/5yle2/comments/c029w0a">Ben Matasar</a>, and <a href="http://programming.reddit.com/info/5yle2/comments/c029w7w">Slava Pestov</a>.</p>Avi2007-10-18T14:23:24-07:00Code generation in Smalltalk and Ruby
http://www.avibryant.com/2007/09/code-generation.html
Neal Ford had a recent post about the difference between code-generation (he calls it "meta-programming", but that's an overloaded and ambiguous term) in Ruby and Smalltalk. The core of his point is this: in Ruby, code generation is done at...<p>Neal Ford had a recent <a href="http://memeagora.blogspot.com/2007/09/ruby-matters-meta-programming-synthesis.html">post</a> about the difference between code-generation (he calls it "meta-programming", but that's an overloaded and ambiguous term) in Ruby and Smalltalk. The core of his point is this: in Ruby, code generation is done at runtime, which means that what gets checked into your source code repository is a high level statement like "has_many :foo", which then generates the code when it is executed. In Smalltalk, code generation is done at development time (triggered by some custom wizard-like extension to the IDE), and so the generated code itself is checked in and the intent, according to Neal, is lost (as a trade-off for other benefits, like the ability to take the generated code into consideration when doing refactorings and so on, whereas in Ruby that code is invisible to any static analysis).</p>
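<p>For readers less familiar with the Ruby side of this, a stripped-down sketch of runtime code generation follows. It is not ActiveRecord's <code>has_many</code>; the <code>Associations</code> module and the generated method names are assumptions for illustration. The point is only that the source file contains the one-line statement of intent, and the methods appear when that line runs.</p>

```ruby
# Hypothetical, much-simplified stand-in for a has_many style macro.
module Associations
  def has_many(name)
    singular = name.to_s.chomp("s")

    # Generated reader: lazily creates the backing collection.
    define_method(name) do
      instance_variable_get("@#{name}") || instance_variable_set("@#{name}", [])
    end

    # Generated writer: add_<singular>(item)
    define_method("add_#{singular}") do |item|
      public_send(name) << item
      self
    end
  end
end

class Author
  extend Associations
  has_many :posts       # the only thing checked into source control
end

author = Author.new
author.add_post("Chrome, V8 and Strongtalk")
puts author.posts.inspect
```

<p>Nothing named <code>posts</code> or <code>add_post</code> appears in the source text of <code>Author</code>, which is why static tools cannot see those methods.</p>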
<p>This is a straw man: Smalltalkers understand the need to capture (and later modify) the intent as well as anyone else does. The solution is to make the generated code round-trippable. If you look at any real Smalltalk tools that generate code based on a custom tool (the <a href="http://www.refactory.com/Software/SmaCC/">SmaCC</a> parser generator is a good example), it will preserve the settings from that tool, for example in a class comment, and the tool will let you inspect the intent, modify the intent, and regenerate the code.</p>
<p>To be concrete: any self-respecting Smalltalk tool that let you generate all the code associated with a "has_many" expression would annotate those methods with the "has_many" intent, in a way that the tools could understand, present to the user, and modify.</p>
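<p>The same idea can be mimicked in Ruby to show what "round-trippable" means. In this sketch the generator writes real source text and records the intent in a marker comment, loosely analogous to a Smalltalk tool storing its settings in a class comment. The marker format and the helper names <code>generate</code> and <code>recover_intent</code> are invented for the example.</p>

```ruby
# Hypothetical marker that ties generated code back to its intent.
INTENT_MARKER = "# generated-by: "

# Development-time generation: emit source text plus the intent.
def generate(intent)
  kind, name = intent.split
  raise ArgumentError, "unsupported intent: #{intent}" unless kind == "has_many"

  <<~RUBY
    #{INTENT_MARKER}#{intent}
    def #{name}
      @#{name} ||= []
    end
  RUBY
end

# Tooling can read the intent back out of the generated code.
def recover_intent(source)
  line = source.lines.find { |l| l.start_with?(INTENT_MARKER) }
  line && line.delete_prefix(INTENT_MARKER).strip
end

source = generate("has_many posts")
intent = recover_intent(source)

# Modify the intent and regenerate, instead of hand-editing the output.
regenerated = generate(intent.sub("posts", "comments"))
puts regenerated
```

<p>The generated method is ordinary checked-in code that static tools can see, and the intent survives alongside it.</p>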
<p>(James Robertson <a href="http://www.cincomsmalltalk.com/blog/blogView?showComments=true&entry=3366525548">points out</a> that ORM tools in Smalltalk tend not to use code generation anyway, but I don't think that really answers Neal's point.)</p>Avi2007-09-06T13:46:17-07:00Moving
http://www.avibryant.com/2007/07/moving.html
Just a quick note that I've moved this blog to a new platform (TypePad) and a new URL (www.avibryant.com). If you were subscribed to the old one, you shouldn't have to do anything, because the feeds are redirected. However, although...<p>Just a quick note that I've moved this blog to a new platform (TypePad) and a new URL (www.avibryant.com). If you were subscribed to the old one, you shouldn't have to do anything, because the feeds are redirected. However, although all of the old posts are imported, the old permalinks are currently broken. When I find the time over the next week I'll set up the mapping for them, but for now, if you came here from a link to a specific post, I apologize.</p>Avi2007-07-04T23:15:25-07:00