Kristina Chodorow's Blog
MongoDB
MongoDB PHP Driver 1.0.3 Release
Jan 7th
Version 1.0.3 was released today. Everyone should upgrade: 1.0.2 had some weird bugs caused by a half-complete feature that has since been removed. Unfortunately, because I’ve had to move the release date up, the big feature that was scheduled for 1.0.3, asynchronous queries, has been pushed to 1.0.4. Sorry guys. However, I’m working hard on the asynchronous stuff and I’ll get 1.0.4 out the door ASAP.
The only API change in this release is the addition of client side cursor timeouts. For example, to create a cursor that will wait 2.5 seconds for queries to complete:
$cursor = $collection->find()->timeout(2500);
Time is specified in milliseconds. If the query takes longer than the specified timeout, a MongoCursorTimeoutException will be thrown. Timeouts do not affect MongoDB itself: your query will still be running on the server. The timeout is merely a client-side convenience.
Also, array serialisation is significantly faster in this version (only “normal” array serialisation, not associative array serialisation).
Upcoming Talks
Jan 4th
Want to learn more about MongoDB? Here are the places I’ll be speaking in the next month or so:
- January 13th – DC PHP Meeting (http://www.dcphp.net/)
- January 19th – New York Perl Mongers Seminar (http://tech.groups.yahoo.com/group/perlsemny/)
- January 25th – Long Island PHP User Group (http://www.liphp.org/)
- February 7th – FOSDEM – Free and Open Source Software Developers’ European Meeting (http://nosqldevroom.pbworks.com/NoSQL-Devroom-Schedule, http://www.fosdem.org/2010/)
If your event desperately needs a NoSQL talk, feel free to contact me at kristina at mongodb dot org.
(Woohoo! I’m going to Belgium! …Not that Long Island isn’t exciting, but…)
Mongo Just Pawn in Game of Life
Dec 24th
This is in response to this nifty blog post on storing a chess board in MySQL and this snarky Tweet about NoSQL DBs (because I’m never snarky).
On the one hand, I can’t believe I’m doing this. What database can’t store a chessboard? On the other hand, it’s fun, and once I thought of the title, I really had to write the post. Let the pointless data storage begin!
Okay, so first we need a representation for a chess piece. I’m tempted to just use the UTF-8 symbol and position, but it would be nice to have a human-readable string to query on. So, we’ll use something like:
{ "name" : "black king", "symbol" : "♚", "pos" : [8, "e"] }
Ha! Can your relational database query for a subfield of a subfield of type integer or string? (Actually, I have no idea, maybe it can.) Anyway, moving right along…
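For what it's worth, MongoDB matches a scalar against an array field if any element equals it, so a query like `{pos : 8}` or `{pos : "e"}` should find the king either way. Here is a rough in-memory imitation of that matching rule in plain JavaScript. The `matchesPos` and `namesAt` helpers and the sample list are made up for illustration, and this is not how the server implements it:

```javascript
// Sample documents in the shape shown above (made up for illustration).
var pieces = [
    {name: "black king", symbol: "♚", pos: [8, "e"]},
    {name: "white king", symbol: "♔", pos: [1, "e"]},
    {name: "black queen", symbol: "♛", pos: [8, "d"]}
];

// Hypothetical helper: true if the pos array contains value. Uses strict
// equality, so the number 8 and the string "8" are different things.
function matchesPos(piece, value) {
    return piece.pos.indexOf(value) !== -1;
}

// Names of all pieces whose position contains the given row or column.
function namesAt(value) {
    return pieces.filter(function (p) { return matchesPos(p, value); })
                 .map(function (p) { return p.name; });
}

console.log(namesAt(8));   // pieces on row 8
console.log(namesAt("e")); // pieces on column e
```

The same array holds an integer row and a string column, and each can be matched by its own type.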
So, MongoDB can just run JavaScript, so I’ll write a JavaScript file that does everything we need. Here’s the code to create the basic chess board. “db” is a global variable that is the database you’re connected to. It defaults to “test”, so we’ll start by switching it to the “chess” database. If it doesn’t exist yet, it’ll be created when we put something in it. Then we’ll actually populate it:
    // use the "chess" database, creates it if it doesn't exist
    db = db.getSisterDB("chess");

    // make sure the db is empty (in case we run this multiple times)
    db.dropDatabase();

    // map indexes to chess board locations
    column_map = {0 : "a", 1 : "b", 2 : "c", 3 : "d", 4 : "e", 5 : "f", 6 : "g", 7 : "h"};

    // starting at 1a
    color_char = {"black" : "█", "white" : " "};
    color = "black";

    for (i=1; i<=8; i++) {
        for(j=0; j<8; j++) {
            db.board.insert({x : i, y : column_map[j], color : color_char[color]})

            /*
             * switch the color of the square... it's always the opposite
             * of the previous one, unless we're at the end of a row
             */
            if (j != 7) {
                color = color == "white" ? "black" : "white";
            }
        }
    }
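The flip-unless-end-of-row logic works out to a parity rule: with rows numbered 1 to 8 and columns indexed 0 to 7, a square is black when row plus column index is odd (1a is 1 + 0). Here is a minimal sketch in plain JavaScript, no database involved, that replays the toggle and compares it against the parity rule. `squareColor` and `toggleColors` are hypothetical helpers, not part of chess.js:

```javascript
// Hypothetical helper: the same coloring as the loop above, computed from parity.
// row is 1-8, columnIndex is 0-7 (0 = "a", 7 = "h").
function squareColor(row, columnIndex) {
    return (row + columnIndex) % 2 === 1 ? "black" : "white";
}

// Replay the original toggle logic and record the color of each square in order.
function toggleColors() {
    var colors = [];
    var color = "black"; // starting at 1a
    for (var i = 1; i <= 8; i++) {
        for (var j = 0; j < 8; j++) {
            colors.push(color);
            if (j != 7) {
                color = color == "white" ? "black" : "white";
            }
        }
    }
    return colors;
}

// Count the squares where the two approaches disagree.
var toggled = toggleColors();
var mismatches = 0;
for (var i = 1; i <= 8; i++) {
    for (var j = 0; j < 8; j++) {
        if (toggled[(i - 1) * 8 + j] !== squareColor(i, j)) {
            mismatches++;
        }
    }
}
console.log("mismatches: " + mismatches); // prints "mismatches: 0"
```

Either version would do for the insert loop; the toggle just avoids doing arithmetic on every square.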
Okay, now let’s iterate through the pieces, create their objects, and add them to the board:
    // create unique ids from symbols
    function get_name(symbol, column) {
        switch (symbol) {
        case '♖':
        case '♜':
            return " rook " + (column < 4 ? "left" : "right");
        case '♘':
        case '♞':
            return " knight " + (column < 4 ? "left" : "right");
        case '♗':
        case '♝':
            return " bishop " + (column < 4 ? "left" : "right");
        case '♕':
        case '♛':
            return " queen";
        case '♔':
        case '♚':
            return " king";
        case '♙':
        case '♟':
            return " pawn " + column;
        }
    }

    // go through the 2D array of pieces, create the objs, and insert them
    function add_pieces(color, color_str) {
        for (row=0; row<color.length; row++) {
            chess_row = row + (color_str == "white" ? 1 : 7);

            for (column=0; column < color[row].length; column++) {
                chess_column = column_map[column];

                db.board.update({x : chess_row, y : chess_column},
                                {$set : {piece : {name : color_str+get_name(color[row][column], column),
                                                  symbol : color[row][column],
                                                  pos : [chess_row, chess_column]}}});
            }
        }
    }

    add_pieces([['♖','♘','♗','♕','♔','♗','♘','♖'],
                ['♙','♙','♙','♙','♙','♙','♙','♙']], "white");
    add_pieces([['♟','♟','♟','♟','♟','♟','♟','♟'],
                ['♜','♞','♝','♛','♚','♝','♞','♜']], "black");
Phew! The hard part is done. Let’s print out this sucker!
    // sort by x from 8-1 and y from a-h
    cursor = db.board.find().sort({x:-1, y:1});

    count = 0;
    board = "";
    while(cursor.hasNext()) {
        square = cursor.next();
        if (square.piece) {
            board += square.piece.symbol;
        }
        else {
            board += square.color;
        }

        count++;
        if (count % 8 == 0) {
            board += "\n";
        }
    }

    print(board);
And we get:
    ♜♞♝♛♚♝♞♜
    ♟♟♟♟♟♟♟♟
     █ █ █ █
    █ █ █ █ 
     █ █ █ █
    █ █ █ █ 
    ♙♙♙♙♙♙♙♙
    ♖♘♗♕♔♗♘♖
Very snazzy. Now we can query by symbol, human readable name, or board position. Also, it’ll only take two updates to move a piece. (I attached chess.js, if you don’t want to copy/paste it yourself.)
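To make the "two updates" concrete: moving a piece means putting `piece` on the destination square and removing it from the source square. The sketch below runs against a plain in-memory array standing in for the `board` collection, so it works without a server. `findSquare` and `movePiece` are hypothetical helpers. In the shell, the two steps would presumably be a `$set` update on the destination document and an `$unset` update on the source document.

```javascript
// In-memory stand-in for two squares of the board collection (assumption:
// same document shape as the script above creates).
var board = [
    {x: 2, y: "e", color: " ", piece: {name: "white pawn 4", symbol: "♙", pos: [2, "e"]}},
    {x: 4, y: "e", color: " "}
];

// Hypothetical helper: look up a square by row and column.
function findSquare(board, x, y) {
    for (var i = 0; i < board.length; i++) {
        if (board[i].x === x && board[i].y === y) {
            return board[i];
        }
    }
    return null;
}

// Hypothetical helper: move the piece at from = [row, column] to to = [row, column].
function movePiece(board, from, to) {
    var source = findSquare(board, from[0], from[1]);
    var target = findSquare(board, to[0], to[1]);
    if (!source || !target || !source.piece) {
        throw new Error("nothing to move");
    }
    // Update 1: put the piece on the destination square (like $set),
    // replacing whatever piece was already there.
    target.piece = {
        name: source.piece.name,
        symbol: source.piece.symbol,
        pos: [to[0], to[1]]
    };
    // Update 2: clear the source square (like $unset).
    delete source.piece;
}

movePiece(board, [2, "e"], [4, "e"]);
console.log(JSON.stringify(findSquare(board, 4, "e").piece.pos));
```

Note that the piece's own `pos` has to be rewritten as part of the first update, since the position is stored both on the square and inside the piece.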
CouchDB vs. MongoDB Benchmark
Jun 29th
Edit (9/1/10): this benchmark is old, silly, and should probably be ignored in favor of more recent and representative ones. I don’t want to take it down for historical purposes, but seriously people, it was never a good benchmark, it’s over a year old at this point, and both databases have changed a lot.
Edit (12/6/09): this is the #1 Google result for “mongodb benchmark”, so I figure I’ll do some community service: if you’re interested in benchmarks, you might want to look at the 3rd party ones listed on the mongodb.org website.
Felix Geisendörfer did a benchmark in PHP that was super-easy for me to port into MongoDB. You can see his post on his blog.
And now… comparing his results for CouchDB with mine for MongoDB (I did the graph in Open Office, which is why the quality sucks):

As you can see, MongoDB does, uh, slightly better. Here are the numbers:
| # of Inserts | Couch Total Time (sec) | Couch Time/Doc (ms) | Mongo Total Time (sec) | Mongo Time/Doc (ms) |
|---|---|---|---|---|
| 1 | .0015 | 1.46 | .0005 | .5 |
| 2 | .0015 | .75 | .0004 | .2096 |
| 3 | .0017 | .56 | .0005 | .1604 |
| 4 | .0017 | .44 | .0005 | .1190 |
| 5 | .0018 | .36 | .0005 | .1060 |
| 6 | .0019 | .32 | .0006 | .0931 |
| 7 | .0021 | .3 | .0006 | .0847 |
| 8 | .0022 | .27 | .0007 | .0789 |
| 9 | .0023 | .25 | .0007 | .0734 |
| 10 | .0025 | .25 | .0007 | .0721 |
| 50 | .0072 | .14 | .0024 | .0476 |
| 100 | .0136 | .14 | .0044 | .0442 |
| 500 | .0687 | .14 | .0253 | .0505 |
| 1000 | .1361 | .14 | .0372 | .0372 |
| 2500 | .4686 | .19 | .0278 | .0111 |
| 5000 | .9165 | .18 | .0488 | .0098 |
| 7500 | 1.5116 | .2 | .0835 | .0111 |
| 10000 | 2.3111 | .23 | .1065 | .0107 |
| 25000 | 6.8684 | .27 | .2711 | .0108 |
| 50000 | 15.8227 | .32 | .5430 | .0109 |
| 100000 | 35.3071 | .35 | 1.7697 | .0177 |
| 250000 | 104.0009 | .42 | 6.4533 | .0258 |
| 500000 | 230.6021 | .46 | 11.7684 | .0235 |
| 750000 | 352.7959 | .47 | 17.0473 | .0227 |
| 1000000 | 487.3284 | .49 | 18.4376 | .0184 |
Please let me know if I made any mistakes; all the values were hand-copied.
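One way to sanity-check hand-copied values like these is to recompute the per-document time from the total: time per document in ms = total seconds / number of inserts × 1000. Here is a small sketch over a few MongoDB rows from the table. The `perDocMs` helper and the tolerance are my own assumptions, and small differences are expected because the totals are rounded:

```javascript
// A few MongoDB rows copied from the table: [inserts, total seconds, ms per doc].
var rows = [
    [1000, 0.0372, 0.0372],
    [10000, 0.1065, 0.0107],
    [50000, 0.5430, 0.0109],
    [1000000, 18.4376, 0.0184]
];

// Hypothetical helper: per-document time in milliseconds.
function perDocMs(inserts, totalSeconds) {
    return totalSeconds / inserts * 1000;
}

rows.forEach(function (row) {
    var computed = perDocMs(row[0], row[1]);
    var ok = Math.abs(computed - row[2]) < 0.0001; // tolerance for rounding
    console.log(row[0] + " inserts: " + computed.toFixed(4) + " ms/doc " +
                (ok ? "(matches)" : "(check this row)"));
});
```

A row that fails the check would point at a copying slip in either the total or the per-document column.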
I ran these tests using the PHP driver on Ubuntu 9.04 on my MacBook Pro. You can see the test script I forked on Github.
A little analysis: Both DBs start with some overhead, but by 1000 inserts CouchDB seems to be chugging along nicely. MongoDB takes slightly longer to hit its groove, hitting its peak around 10000. They both slow a little near the end, as MongoDB starts spending most of its time allocating files and, although I know almost nothing about CouchDB’s structure, I’d guess it’s doing something similar.