Kristina Chodorow's Blog
With a name like Mongo, it has to be good
I just got back from MongoSF, which was awesome. Over 200 Mongo geeks, three tracks, and language-specific workshops all day.
The highlight, for me, was Eliot Horowitz’s talk on sharding. He set up a MongoDB cluster of 25 large EC2 instances and started hammering them.
He pulled up an incredibly snazzy sharding GUI (okay, I wrote it, it’s not actually that snazzy) that displayed a row of stats for each shard. Each shard had about the same level of operations per second, so you could see that Mongo was doing pretty well distributing the data across the shards.
Eliot scrolled down to the bottom of the stats table where it showed the total number of operations per second across all the shards. The cluster was doing 8 million operations per second.
8,000,000 operations per second.
The whole audience burst into applause.
That’s about 320,000 operations per second per box, which is about what would be expected for Mongo on a powerful server (like a large EC2 instance).
Cool things about this:
- If your app needs more than 8 million ops/sec, you can just add more machines and Mongo will take care of redistributing the load.
- Your app doesn’t need to know if it’s talking to 1, 25, or 7,000 servers. You can focus on writing your app and let Mongo focus on scaling it.
If you missed out on MongoSF, there are a bunch of other Mongo conferences coming up: MongoNY at the end of May and MongoUK and MongoFR in June.
| This entry was posted by kristina on May 5, 2010 at 2:58 pm, and is filed under MongoDB. Follow any responses to this post through RSS 2.0. You can leave a response or trackback from your own site. |
No trackbacks yet.
Comments are closed.






about 4 months ago
The sharding talk sounds interesting. Will any videos be posted?
about 4 months ago
Nice! was this achieved using the experimental auto-sharding functionality or is this readily available in the stable version?
about 4 months ago
@Chris there was video taken, I’m not sure if/when/where it’ll be up, but I’ll link to it if it is. There are slides at http://www.slideshare.net/mongosf, but these don’t include the demo, of course.
@Slimo this was done with the latest 1.5 branch build, which is what I’d recommend using for anyone trying sharding.
about 3 months ago
Looks like videos are up at http://www.10gen.com/event_mongosf_10apr30
about 3 months ago
Any idea on what was the setup of the cluster and what kind of workload it was?
320k ops seems awesome but if the dataset was small (like fewer than 1 millions documents) and all in memory with only reads, the results are not that surprising.
about 3 months ago
IIRC, it was ~20,000 each of writes, deletes, updates, inserts, commands, & get mores and ~200,000 reads. I think that it was a quite large data set, but I’ll check with Eliot tomorrow.
about 3 months ago
I agree – an absolutely awesome demo!
@Nicolas, I think it was five shards of five replica-set servers each. I might misremember.
about 3 months ago
was at MongoSF.. the sharding demo was definitely cool, a lot of the other talks/demos were very good as well. overall a great conference, esp. for the price. thanks again
about 3 months ago
@Noah unfortunately, replica sets aren’t ready yet. So, it was just 25 shards with no replication, which was actually kind of a problem because Amazon thought that we were DDOSing ourselves, so kept shutting down our instances, thus breaking the demo.
About replica sets: I am totally in love with them, so if you’re interested subscribe to my RSS feed because I’ll be doing a blog post on how to use them ~3 seconds after they’re available in the nightly build.
about 3 months ago
8 million ops/s is 320000 ops/s per instance.
As you wrote this is far above a local Mongo, Redis or OrientDB instance.
And this on EC2 with more network letency??
Sounsd impossible. What’s behind this magic?
Regards
Stefan Edlich
about 3 months ago
As I wrote, ~320,000 is reasonable for a powerful server. There were lots of clients sending requests and MongoDB can do reads in parallel, which is what there were the most of.
about 3 months ago
Kristina,
Those numbers sound amazing, but there are a lot of missing details
.
1. what was the concurrency level?
2. where there concurrent CRUD operations? (basically I expect things to behave quite differently when sets of READS are (almost completely) separated from sets of WRITES)
3. what was the average data size/op?
You know such benchmarks can be “fabricated” and missing information tends to encourage that kind of thought.
about 3 months ago
1. I have no idea.
2. Yes.
3. I have no idea.
I think people are taking this post a bit too seriously… I just thought it was a really cool demo of what Mongo’s going to be able to do: scale horizontally to practically infinite capacity. It’s certainly not meant to be a benchmark.
I think we’re working on a more interesting application (one has actual functionality other than hammering the database) that we’ll open source and use as a demo in the future. We’ll also do some real benchmarks against sharding before releasing 1.6, obviously.
about 3 months ago
i think it's blog class ho, meet girls on webcams for free, %O,
about 3 months ago
Do you have any more details on the “incredibly snazzy sharding GUI” that you mentioned in the post? I'm interested in doing some load testing to see how MongoDB will perform for my system and it sounds like that GUI was giving some pretty valuable data — anyway to get access to that program? Thanks!
about 3 months ago
Unfortunately, it's not publicly available yet, but we'll probably release it when 1.6 comes out. It doesn't give any information you can't get from the shell (it basically runs the serverStatus command once per second for each machine), it just presents it in a pretty way. I'm hoping to add a chunk viewer and stuff, too, but it's very, very, very alpha at the moment.