Archive for the 'Uncategorized' Category

Debugging a running Node.js process

This is the combination of a little-documented feature of V8 mixed with an undocumented feature of the node.js debugger client. Say you have an already running node process that you want to debug. The most common scenario is a process that's running wild on one of your testing environments and you want to see what's wrong with it. So a backtrace would be helpful. And full debugging even more. The following should help:


# start the remote debugger in the already running node process
kill -s USR1 pid
# attach to the debugger using the node.js client
node debug host:5858
# see where you are
debug> pause
debug> bt

From there you can poke around. You can also continue and pause again to see if you seem to consistently end up in the same code area. Or write a script that regularly connects, dumps the backtrace and disconnects.

Happy debugging.

No Comments »

Matthieu Riou on August 22nd 2012 in Uncategorized

Tech @ the Zoo

I started working on MommaZoo with Catheryne in June and I’ve been meaning to describe the technology stack that powers it for some time. Before I do, knowing about MommaZoo may be helpful. And I can’t really put it better than our website (given that Catheryne wrote it):

MommaZoo helps moms and dads with the craziness around parenting school-aged kids. We remind you when your kid’s homework is due. We tell you the morning you need to bring a car seat for a field trip. We help you instantly contact other parents who can cover 15 minutes if you are running late. We connect you with other parents who live close to you, or whose kids are in similar activities, to share a ride or a walk-to-school.

We have a messaging part, a simple task management thingy (optimized for volunteering because schools do a lot of that) and class rosters. We're also working on using location and activity matching to help parents who live close to each other find one another. We want to encourage carpooling, walk-to-school and "let's all help each other" connections. We're at the pilot phase at the moment and are testing and tuning it with one school. We're still pretty small so don't expect some hardcore traffic numbers here. But I'll give an update when we reach a million users (*cough*).

One major challenge was that we wanted to optimize for mobile delivery but also wanted a website. We're bootstrapping (yes, we like challenges) and wanted to be out quickly (for the start of the school year, so only 2.5 months). So I decided to try a single website that would work for all devices (the "HTML5" thing). That has worked much better than I initially expected. Of course there's been some pain but all things considered, it's been rather minimal given the time saved.

So without further ado, going from bottom to top, here is our stack:

  • Amazon EC2. Right now, a micro instance. Yup, micro. We don't have that many concurrent users yet and we're pretty lean and mean. And I love EC2.
  • Ubuntu server. I’m just very familiar with Ubuntu.
  • MongoDB. As already mentioned, development time is rather important. Developing on top of MongoDB is really pleasant. It's not really the exotic NoSQL storage it's often made out to be, it's much more conventional, but it's also pretty frictionless. So I like it. We'll go fancier when we have much more data.
  • Amazon SES. Takes a POST, sends an email. Simple as in the name.
  • CoffeeScript. That's the programming language. And we use it everywhere, top to bottom, left to right, client to server. Okay, I haven't tried to get the Coffee compiler in the MongoDB shell yet but someday… Anyway if you haven't checked CoffeeScript out yet, you definitely should, you're doing yourself a disservice not to. And even more so if you're using node.js…
  • Node.js. You can bitch all you want about callback-oriented programming but with a few good habits (CoffeeScript), it's just fine. I thank Ryan Dahl every single time I don't need to use a thread pool, have locks or use synchronization. And Javascript is just Scheme after a long acid trip so it can't really be bad.
  • Mongoose. To interface with MongoDB. Honestly, I’m not totally seduced. I don’t necessarily like all the choices, there’s a little bit too much magic and it has some kinks. I’d rather have something thinner. I mean to check out Poutine when I have some time.
  • Underscore.js. Simple library, with a few good utilities. Used both in the browser and on the server.
  • Express.js. It's thin, doesn't get too much in the way, handles your resource routing and integrates connect.js (for cookies, request logging, etc.). Overall I'm not fully convinced but it does the job well. So I'll keep it for now.
  • Amazon ELB. Just for failover and SSL termination at the moment.
  • Skeleton. We've now crossed the network and are in the browser. Skeleton has been working really well in handling all the layout differences between devices and screen sizes. It's basically a CSS boilerplate with a lot of media queries. Very nice.
  • jQuery. Just because. With a few plugins.
  • jQTouch. Originally chosen for its good iOS support but it's probably the piece I'm regretting the most right now. It seems a few best practices and tricks with a CSS animation library would serve us better. It's not very well maintained and the code isn't that clean.
  • Backbone.js. The third piece of software by Jeremy Ashkenas we use. We definitely owe him his weight in beers. The good part of Backbone is that it's simple and lightweight (notice a trend?). The less good part is that it tends to orient you toward fat views. Of course YMMV.

Total bill? $31 a month right now. And we have 2 copies of everything, one for a test environment and one for production. Overall I’m very happy with our node.js+ec2 stack. It’s been stable, very tractable and will support our growth in the coming years.

So if what we’re doing sounds exciting, if you’re interested as a parent or as a technologist, drop me an email (matthieu at mommazoo dot com). We’re not hiring yet but I’m always up for some interesting email exchanges. Or a beer if you live in San Francisco or are passing by.

4 Comments »

Matthieu Riou on December 7th 2011 in Uncategorized

Multi-layered Perceptron Trained With Backpropagation (in Scala)

My next field of study is neural networks; I've been reading about them for some time. Interestingly, it's not really easy to find resources that aren't full of not-so-trivial math, so I had to get a calculus refresher as well. By the way, if you're interested in learning algorithms strictly from a programming standpoint, I can't recommend Programming Collective Intelligence enough, a very good practical book.

Before we get into the details, neural networks are used in supervised machine learning to solve classification problems and, more generally, to build non-linear models out of raw data. One of their main strengths is that they're good at generalizing (responding sensibly to data that hasn't been seen before). They were very popular in the 70s; many thought they were the early stage of a new kind of intelligence. That didn't really happen, so they fell into disgrace, only to make a comeback in the mid-80s when back-propagation was invented. Then new machine learning techniques came about more recently, like Support Vector Machines, showing somewhat better results (with somewhat complex math techniques). But neural nets came back with a vengeance! With techniques still evolving, deep belief nets or LSTMs are now showing really good results on some categories of problems (on that last subject, a couple of Google tech talks from Hinton are worth the time).

So the simplest form of neural network is Rosenblatt’s Perceptron, which really isn’t much of a network given that it’s usually represented as a single neuron. Behold the Wikipedia illustration ripoff:

[Image: Rosenblatt's Perceptron, a single neuron with weighted inputs (illustration from Wikipedia)]

Basically, you have a series of input signals that stimulate the neuron into producing an output signal. Each input has a weight and the output is the sum of all inputs times their respective weights, passed through some normalization function. The role of the function is simply to normalize the sum.

output = f(w1·x1 + w2·x2 + … + wn·xn)

The f function can be as simple as a step function (i.e. 0 for x < 0 and 1 for x > 0) although most problems are better handled with more smoothness (like a hyperbolic tangent). At this point you may start to wonder: but where is the learning? It's all in the weights, brother. For example, say you want to predict the chances of someone having a cancer using a set of basic physiological measurements, before recommending a more thorough and expensive examination. Each of these measurements (blood pressure, quantity of white cells, …) is going to be one of the x inputs in the above picture, and the expected output would ideally be a number between 0 and 1 telling you how sure the network is. To get there you need to adjust the weights of each connection by training the network. Enter supervised learning.

In supervised learning, the network is fed a data set for which the answers are already known. The difference between the expected output and the one produced by the network is the error and can be used directly to adjust the weights:

wi ← wi + η · error · xi   (with η the learning factor and error the difference between expected and actual output)

In plain English, the new weight for a given input is the old weight added to the product of a learning factor (usually in the 0.1 to 0.01 ballpark, depending on how fast or precise you want the learning to happen), the error and the input. Pretty easy, no?
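
Just to make the rule concrete, here is what it looks like in a few lines of Scala (the language used later in this post). This is only a sketch with names of my own choosing, using tanh as the f function and leaving the usual bias input out:

[scala]
// Illustrative only: a single neuron with tanh as f and the update rule above.
class Neuron(val weights: Array[Double], learningRate: Double = 0.1) {
  // output = f(sum of each input times its weight)
  def output(inputs: Array[Double]): Double =
    math.tanh(weights.zip(inputs).map { case (w, x) => w * x }.sum)

  // supervised update: new weight = old weight + learning factor * error * input
  def train(inputs: Array[Double], expected: Double): Unit = {
    val error = expected - output(inputs)
    for (i <- weights.indices)
      weights(i) += learningRate * error * inputs(i)
  }
}
[/scala]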

But there's a catch: with Rosenblatt's Perceptron you can only model a linear problem (one where the solutions can neatly be separated by a straight line). Sometimes that's enough but often, like in my cancer detector example, it's not. So what can we do about it? Just stack the neurons on top of each other.

[Image: a multi-layered perceptron with an input layer, one hidden layer and an output layer]

Here we have 3 layers; neurons on each layer are connected to all the other neurons of the neighboring layers (the illustration above is slightly wrong, missing connections between the hidden and the output layers). Middle layers (there's only one in the illustration) are called hidden layers, as they're invisible to the outside world. The network can be as large as needed, with many hidden layers composed of many neurons. The thing is, depending on the problem, a network that's too large can actually perform poorly (over-fitting) and will take longer to train. That's where Perceptrons can be annoying: in theory, they can model any function; in practice, several educated guesses need to be made as to the size and structure of the network to get good results. That being said, people have been using them quite successfully for some time.

Multi-layer perceptrons were first described in 1969 but it took a while to figure out how to train them. 17 years, actually. The trick is to propagate the error calculated at the output (just like for Rosenblatt's perceptron) back through the hidden layers, all the way to the input layer. The algorithm is fairly well explained on Wikipedia so I won't go into the details. Plus you're going to get some code anyway.

Multi-layer perceptrons are still fairly basic as far as neural networks go nowadays but it's necessary to understand them well to understand other networks. They also perform decently well and are easy to implement, so why not do that, huh? I've implemented one in Scala, which gave me a good occasion to try the language a little more. At the beginning I wanted the implementation to be purely functional but somewhere along the way it felt more natural to start mutating the objects. I'm not sure whether that's because of Scala, because of object-orientation or because of me. My Scala is probably still a little rough, so let me know if there are more idiomatic ways.

To make it easier to try, instead of using a data set that you’d have to download, I’m training the network on the XOR function. It’s non-linear, so it’s a good illustration of the capabilities of the network, even though the result doesn’t have as much wow factor.
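
The full implementation isn't inlined here, so below is a small, self-contained Scala sketch of the same idea: one hidden layer, tanh activations, inputs and targets encoded as -1/+1, and the backpropagation rule described above, trained on XOR. All the names, sizes and constants are mine; it's an illustration, not the implementation this post refers to.

[scala]
import scala.util.Random

object XorMlp {
  val nIn = 2; val nHidden = 4; val lr = 0.1
  val rnd = new Random()

  // wHidden(j)(i): weight from input i to hidden neuron j; the extra column holds the bias
  val wHidden = Array.fill(nHidden, nIn + 1)(rnd.nextDouble() - 0.5)
  val wOut    = Array.fill(nHidden + 1)(rnd.nextDouble() - 0.5)

  // forward pass: returns the hidden activations and the network output
  def forward(x: Array[Double]): (Array[Double], Double) = {
    val hidden = Array.tabulate(nHidden) { j =>
      math.tanh((0 until nIn).map(i => wHidden(j)(i) * x(i)).sum + wHidden(j)(nIn))
    }
    val out = math.tanh((0 until nHidden).map(j => wOut(j) * hidden(j)).sum + wOut(nHidden))
    (hidden, out)
  }

  // one backpropagation step for a single example
  def train(x: Array[Double], target: Double): Unit = {
    val (hidden, out) = forward(x)
    // output delta: error times the derivative of tanh (1 - tanh^2)
    val deltaOut = (target - out) * (1 - out * out)
    // hidden deltas: the output delta propagated back through the output weights
    val deltaHidden = Array.tabulate(nHidden)(j => (1 - hidden(j) * hidden(j)) * deltaOut * wOut(j))
    // same update rule as the single perceptron: rate * delta * upstream activation
    for (j <- 0 until nHidden) wOut(j) += lr * deltaOut * hidden(j)
    wOut(nHidden) += lr * deltaOut
    for (j <- 0 until nHidden; i <- 0 until nIn) wHidden(j)(i) += lr * deltaHidden(j) * x(i)
    for (j <- 0 until nHidden) wHidden(j)(nIn) += lr * deltaHidden(j)
  }

  def main(args: Array[String]): Unit = {
    val samples = Seq(
      (Array(-1.0, -1.0), -1.0), (Array(-1.0, 1.0), 1.0),
      (Array(1.0, -1.0), 1.0), (Array(1.0, 1.0), -1.0))
    for (_ <- 1 to 5000; (x, t) <- samples) train(x, t)
    // after training, the outputs should settle near the XOR targets
    samples.foreach { case (x, t) => println(s"${x.mkString(",")} -> ${forward(x)._2} (expected $t)") }
  }
}
[/scala]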

2 Comments »

Matthieu Riou on June 27th 2010 in Programming, Uncategorized

Distributed Hash Tables (part 2)

This is the second (and last) installment of a small personal study of distributed hash tables. The first one was focused on consistent hashing and the Chord DHT, check it out for more context.

Kademlia

Kademlia scales very well, mostly because its design taps into the lessons learned from existing peer-to-peer networks. That also makes it more complex and a little less elegant than Chord, feeling more like a set of heuristics than a single nice algorithm. But that's often the price to pay to behave well under real-world circumstances.

Probably the most important observation Kademlia relies on is that a node that has been in the network for a while is very likely to stay a while longer. Conversely, a node that recently appeared is likely to leave soon. Accordingly, Kademlia's routing table practically never removes an "old" node.

Getting a bit more into the details, like most DHTs Kademlia relies on consistent hashing to locate nodes and key / value pairs. But it shows some originality by using a slightly different notion of distance. Where Chord simply uses the traditional definition of how far two points are from each other on the circle (d(a,b) = abs(b-a)), Kademlia relies on an XOR metric (d(a,b)=a xor b). It’s a valid metric as it follows the necessary rules:

  • d(x,x) = 0
  • d(x,y) > 0 if x != y
  • d(x,y) = d(y,x)
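
If you want to see the metric in action, it's a one-liner in Scala (ids as arbitrary-precision integers here, purely for illustration):

[scala]
val a = BigInt("1100", 2)      // 12
val b = BigInt("1010", 2)      // 10
val d = a ^ b                  // 0110 = 6: a long shared prefix means a small distance
assert((a ^ a) == 0)           // d(x,x) = 0
assert((a ^ b) == (b ^ a))     // d(x,y) = d(y,x)
[/scala]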

So why use this crazy XOR metric instead of the classic distance, you ask? Because of the structure of the routing table that Kademlia puts in place. Once again a picture is worth a thousand words, so I'll give you one (coming from the paper) plus a few hundred words to boot.

[Image: the Kademlia routing table, a binary tree of k-buckets (from the Kademlia paper)]

Ah, I hear you think, a cool binary tree. Yes it is. The binary tree represents all the numbers in the hash space, so if your node, when expressed in binary, is 1100 (12 in decimal), it will be placed under the leaf found by going twice through the left branch and then twice through the right branch. Simple, isn't it? Except that the rules to build that tree aren't that simple, but I'll try to explain anyway.

Each leaf in the tree contains what's called a k-bucket, basically a list of k elements (and in practice, k=20 seems to work well). When a new node enters the Kademlia network it doesn't have much of a tree, it just has a single empty k-bucket. When it starts learning about the existence of other nodes, it inserts them into that bucket until it reaches its maximum length of k. When the (k+1)th node needs to be inserted, the bucket gets split. All the nodes whose hash's binary representation starts with 1 get assigned to the new k-bucket on the left of the root node, the ones starting with 0 go to the one on the right. Here is our one-level tree. Then our node starts learning about more nodes and, say, the right k-bucket (with node hashes starting with 0) gets full again. The bucket is split: all nodes starting with 00 end up in the right-right position (from the top of the tree) and all the ones starting with 01 are re-attributed to the right-left position.

Now if we were to continue this process on and on, we could end up with a routing table having the same width as the network: n. But we already saw that an overly large routing table is expensive to maintain, which is why Kademlia's routing table only includes O(log n) elements. How? Simply by not storing too many of the nodes that are "far away" according to the XOR metric. A node never splits k-buckets that aren't on the same side of the tree as its own hash.
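
Made concrete in a few lines of Scala (ranges and names are mine, just to pin down the rule):

[scala]
// A bucket covers a contiguous range of the hash space, [lo, hi).
case class Bucket(lo: BigInt, hi: BigInt) {
  def covers(id: BigInt): Boolean = id >= lo && id < hi
  def split: (Bucket, Bucket) = {
    val mid = (lo + hi) / 2
    (Bucket(lo, mid), Bucket(mid, hi))   // one more bit of prefix on each side
  }
}

// The rule: only the bucket covering our own id ever gets split.
def canSplit(bucket: Bucket, selfId: BigInt): Boolean = bucket.covers(selfId)
[/scala]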

In the above picture, the current node hash starts with 0011. Therefore it will only have one k-bucket on the left side of the tree for all nodes starting with 1 (represented by the big gray circle). That bucket will never be split. It will also have one single bucket under the right-left path (represented by the medium gray circle), which similarly will never get split. Neat, isn't it? This guarantees that our routing table only maintains O(log n) contacts and also that a node knows its surroundings well without knowing too much about what's far. Heard that already? Yep, Chord does the same.

But Kademlia's scheme uses k-buckets, with k nodes each. And that's where its main advantage lies: even on a full k-bucket that can't be split, it can still reorder nodes. The buckets being lists, the oldest nodes are kept toward the end. And those will never be dropped unless they stop responding. As older nodes are also the most persistent in the routing table, the network organizes itself around the most reliable points.
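
Here is a small Scala sketch of that update policy, with made-up names and a ping function standing in for the real "are you still there?" RPC; it illustrates the rule described above rather than the paper's exact pseudo-code:

[scala]
import scala.collection.mutable.ArrayBuffer

class KBucket(k: Int = 20) {
  private val contacts = ArrayBuffer.empty[BigInt]   // most recently seen first, oldest last

  def seen(node: BigInt, ping: BigInt => Boolean): Unit =
    if (contacts.contains(node)) {
      contacts -= node                               // already known: just refresh its position
      contacts.prepend(node)
    } else if (contacts.size < k) {
      contacts.prepend(node)                         // room left: welcome the newcomer
    } else if (!ping(contacts.last)) {
      contacts.remove(contacts.size - 1)             // the oldest node stopped responding: replace it
      contacts.prepend(node)
    }                                                // otherwise the reliable old node stays and the newcomer is dropped
}
[/scala]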


I won't detail the lookup algorithm too much, it almost naturally comes as a result of the routing table's existence. In a few words, you ask the closest nodes you know about in your routing table which nodes they know about that would be close. And you continue doing that until you don't learn about any new closer nodes anymore. Another very important property of Kademlia lies right here: the lookup always considers k nodes but only uses a subset of those. The freedom in selecting that subset (usually of 3) out of k nodes leaves a lot of room for latency optimization.
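
A purely local Scala sketch of that iterative process could look like this, with neighbours(n) standing in for "ask node n about the closest contacts it knows to the target" (no networking, all names are mine):

[scala]
def lookup(target: BigInt, start: Seq[BigInt], neighbours: BigInt => Seq[BigInt],
           k: Int = 20, alpha: Int = 3): Seq[BigInt] = {
  def dist(id: BigInt) = id ^ target                          // the XOR metric again
  var closest = start.distinct.sortBy(dist).take(k)
  var queried = Set.empty[BigInt]
  // keep asking the alpha closest un-queried contacts until the k closest have all answered
  while (closest.exists(n => !queried(n))) {
    val batch = closest.filterNot(queried).take(alpha)
    queried ++= batch
    val learned = batch.flatMap(neighbours)
    closest = (closest ++ learned).distinct.sortBy(dist).take(k)
  }
  closest
}
[/scala]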

That's about it! There are a few additional refinements, especially around maintaining the network by having nodes refresh each other's information, but the most important part is there. And that's the most deployed DHT algorithm on the Internet.

Improvements and Shortcomings

I've mentioned that having a choice in which nodes to contact is a big advantage of Kademlia. Why? Because if you can find your way around in more than one way, you can start optimizing for the "best" way. On a network that means the fastest way, which is also often the cheapest. That's important for you, because you don't like to wait, and that's important for ISPs, because they have strong interests in optimizing the routing. An interesting direction for those optimizations is network coordinates, like Vivaldi or Pharos, which map network distance onto real coordinates. Using such a system, a given node can easily analyze the surrounding topology.

It's also been observed in the wild, on Kademlia implementations, that strangely the distribution of nodes over the hash space isn't even. When nodes are given ids in the hash space, you would normally expect them to be more or less evenly distributed, especially if the id attribution is random. However, that's not what has been observed. How come? Because it's a relatively easy way to game the system. Suppose there's a given key / value pair stored on the network that you would like to censor or swap for another. You could just create a lot of "fake" nodes which advertise their ids as being very close to the chosen key / value. Anybody on the network would then need to ask one of your nodes and you can just return bogus values.

The problem is actually not as bad as it sounds. While it's relatively easy to "hide" a specific key / value pair, given the number of nodes and values present in the network, large-scale censorship is almost impossible to achieve. And it's also not trivial to come up with a scheme that would protect the network against that sort of attack. The naive way would be to have the neighbors of a node assign it an id based on its IP address. Unfortunately, it's possible for many nodes to share the same public IP and there are privacy issues associated with using your IP directly.

Conclusion

I hope you enjoyed this journey in DHT land, it was certainly enjoyable for me. Looking at the bigger picture, it seems that peer-to-peer networks are still fairly underused; I'm sure there are new and unexpected applications. The web is actually a very centralized model and that has worked really, really well. But it might not always be this way and there may be a day when another model emerges. I'm not saying P2P will be it, but knowing the alternative network layouts, knowing how they work, their advantages and shortcomings, will surely be useful.

Image by jef_safi.

1 Comment »

Matthieu Riou on August 10th 2009 in Uncategorized

Messing With Javascript

I've had this long unfinished rant about Javascript in my drafts for almost a year now. But somehow it felt unfinished, like a useless rant. And it dawned on me recently: "if you're unhappy about the state of Javascript, just fix it, dumb sucker". That little inner voice had a point. So this is a lightweight rant with patches.

See, I actually like Javascript. It's minimal, functional, now has good implementations (getting better and better), and it's flexible and extensible without magic. Of course it's not as full-featured as languages with very large standard libraries but it's not meant to be one anyway and it's a pretty damn good language to have in your toolbox. And I'm talking about Javascript the language, not the browser DOM, which is a library (and which, non-arguably, sucks).

The problem though, is that mostly for historical reasons, it has some very annoying quirks, many more than most other mainstream languages (one huge quirk being its name). And for some reason, nobody involved in the specs has ever bothered fixing those, even when they're pretty obviously broken "features". I guess the main reason is the Holy Backward Compatibility, but there are ways to address that.

For most pain points below I have patches for Rhino, the Javascript implementation I know best. They're all published on Github, applied on the current Rhino trunk, so fork away!

Function is 8 characters, that’s looooong

As I already mentioned, Javascript is functional and that's one of the sweetest things about it. It allows you to do things like:

[js light="true"]
var total = [0, 1, 2, 3].reduce(function(a, b) { return a + b; });
var flattened = [[0,1],[2,3],[4,5]].reduce(function(a,b) { return a.concat(b); },[]);
[/js]

Sweet no? Well, kind of. There’s actually quite a bit of visual pollution in here, making code longer to write and harder to read, even for very simple examples like these. More complex programs lose elegance.

Javascript 1.8 has improved things a bit with "expression closures" (a very bad name as it's not necessarily a closure), which are a shorter lambda notation. The first anonymous function above could be rewritten "function(a,b) a+b". It's better, but the function keyword alone still takes more visual space than the actual implementation.

So I've introduced two new symbols. The first one is -> and it's strictly identical to the function keyword: it can be used anywhere function can be used. So you get:

[js light="true"]
var flattened = [[0,1],[2,3],[4,5]].reduce(->(a,b) a.concat(b), []);
[/js]

Sweet no? And then there's one more candy for you: unary lambdas. The ~> symbol is used for a function accepting a single parameter (or even zero, because in Javascript if you can do one you can do zero) that will be bound to the implicit underscore variable. Functions with a single argument are pretty common, so I thought it was worth sugaring them a bit. With ~> you can do this:

[js light="true"]
var bigger = [0, 1, 2, 3].map(~> _ + 5);
[/js]

Which will produce a new array with 5 added to each element.

The patch adding -> is here (and very simple), the one adding ~> here. Feedback welcome.

Zero and the empty string are falsy

Truthiness is a pretty useful thing, it makes most conditional testing or assertions shorter to write and more compact. Just in case you don’t know, truthiness for some values is the fact that they can be coerced (or converted) to booleans automatically, even if they’re not. Like this:

[js light="true"]
var m = "a string";
if (m) print("it's true!");
[/js]

The problem is that Javascript makes undefined, null, false (of course), 0 and the empty string falsy, which is sort of reminiscent of much older languages. Now I can't recall any program where I had to test values and considered null, false, 0 and the empty string all as failure cases. Null and false, yes (mostly null actually). Null and the empty string, sometimes, but I usually have a utility function to test that as it doesn't occur that often. Null, false, 0 and the empty string, never. And when you get an array passed to a function, it's not like you can easily know in advance whether there are going to be zeros or empty strings in it.

It’s pretty annoying because it’s easy to miss unintended bugs like this:

[js light="true"]
var a = [-1,0,1,2,"","a"];
for (var i = 0, item; item = a[i]; i++) { print(item); }
[/js]

This is a fairly useful Javascript idiom but it won't print the whole array: the loop will stop at the 0 element. And it's not like I'm the only one complaining. One could argue that functions like map now shield you against this, but you still have the occasional if where you just forgot, and for loops still have some usefulness.

So I’ve changed that, making 0 and the empty string truthy and only null and false falsy. The patch is here, feel free to review and tweak.

Undefined considered harmful

In theory, you might think having undefined is a good idea: you get a difference between uninitialized variables (whose value is undefined) and null values (which are initialized with a null value). In practice though, it's more of a headache because you now have to worry about two different kinds of things not-really-having-a-real-value and test for both in your programs. Additionally:

[js light="true"]
js> foo != undefined
js: "<stdin>", line 11: uncaught JavaScript runtime exception: ReferenceError: "foo" is not defined.
at <stdin>:11
js> typeof foo == "undefined"
true
js> null == undefined
true
js> null === undefined
false
js> typeof null
object
js> typeof undefined
undefined
[/js]

Kind of messed up. Practically speaking it's a lot of unnecessary accidental complexity. So my next patch will be to remove undefined. I don't have that completely ready yet but stay tuned, I'll update this post when it's fully cooked; in the meantime you can watch the Github repository.

Objects, Types and Equality

There's a duality between primitive types and object types in Javascript that's sort of reminiscent of Java's brokenness in that area (which has been somewhat alleviated by boxing/unboxing). The thing is, Javascript is dynamically typed so it bites even more: when you implement a function receiving a string, are you going to get an "abc" or a new String("abc")? Here is a Javascript session summing up the problem:

[js light="true"]
js> new String("abc") == new String("abc")
false
js> new String("abc") == "abc"
true
js> typeof "abc"
string
js> typeof new String("abc")
object
js> [] instanceof Array
true
js> new Array() instanceof Array
true
js> var str = "abc"
js> str.foo = true
true
js> str.foo
js> str = new String("abc")
abc
js> str.foo = true
true
js> str.foo
true
[/js]

There are ways to work around the discrepancy in behavior between object-wrapped types and primitive types but again, that’s a lot of accidental complexity. I haven’t completely thought out how to fix that yet, it’s a more ambitious project than the others. So keep watching and help is definitely welcome.

Conclusion

My aim by publishing these patches is not to break compatibility or even fork the language or implementations. It’s more to foster discussion and experimentation. Again, I like Javascript but some of its quirks need fixing. So which one would you fix first?

[youtube]www.youtube.com/watch?v=ZbbxA8a_M_s[/youtube]

4 Comments »

Matthieu Riou on August 5th 2009 in Uncategorized

Distributed Hash Tables (part 1)

Recently I've been unlucky enough to contract wrist tendonitis in both arms. I guess it was just a matter of time, always working on laptops in awkward positions or on the couch. I'm getting much better now but the point is that I've had to avoid keyboards as much as possible outside of work for a while. So I started to read a few papers instead and somehow ran into distributed hash tables (usually abbreviated DHTs). Recovery was slow so I ended up reading quite a few of them, including algorithms, reports, studies, simulations and related optimizations. Got used to reading PDFs somehow. The Wikipedia page is kind of light, staying a bit too abstract, so I thought I'd try to write a more concrete summary. It will help me remember some details too.

So DHTs basically provide the same interface as a normal hash table (put, get) but over a distributed network including a potentially very large number of nodes (think millions) and a very, very large number of key / value pairs. Distribution provides reliability (through redundancy) and the ability to store a quantity of data far above anything else. The most common use is peer-to-peer file sharing, with implementations like Kad (in eDonkey) or Azureus, but there are a few others like distributed file systems, content distribution or overlays for other applications (like chat). DHTs are very interesting in the sense that they mix distribution, resilience (to churn, mostly) and reliability. A nice mix.

Most "classic" DHT algorithms I will talk about here are also optimized for high levels of churn (more on that later). There's also another class of DHTs, which actually aren't stricto sensu DHTs, that are optimized for lookups. Dynamo, the Amazon storage, is one of them. There, the node keys aren't distributed: a node knows about all the others. Only the values are distributed. It's a different tradeoff: on the plus side, you reduce the number of hops as all nodes are known; on the minus side, you can't handle as many nodes or you'll need much more network traffic. That last sentence might not be completely clear yet but hopefully it will make more sense later. Just keep in mind that I'm only going to consider "traditional" DHTs here.

Consistent Hashing

The common ground between all DHTs is the reliance on consistent hashing. Let's say I come up with a naive DHT implementation based on modulo. We have 8 nodes and 100 keys; to assign a key to a node I just take the modulo of the key number by the node count. So key 18 ends up on node 2 and key 63 on node 7. Simple and effective so far. Until a node leaves or a new node arrives. When the node count changes, the assignment of almost all the keys changes. All the stored key / value pairs have to be moved around, leading to a fine and inefficient mess. Ugly pipe waste. Consistent hashing solves that problem.
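
To see how bad the waste is, here is a quick Scala back-of-the-envelope check (the 8-to-9 node transition is just an example):

[scala]
val keys = 0 until 100
val before = keys.map(k => k -> k % 8).toMap        // 8 nodes
val after  = keys.map(k => k -> k % 9).toMap        // a 9th node joins
val moved  = keys.count(k => before(k) != after(k))
println(s"$moved of the 100 keys changed node")     // the vast majority have to move
[/scala]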

The only drawback of consistent hashing is that you really need a drawing to explain it and I suck at drawing, even on a computer. So I’ve borrowed the following image from this Japanese blog post, which actually looks very interesting.

[Image: a consistent hashing ring, with keys and nodes hashed onto the same circle (borrowed from a Japanese blog post)]

At the start there's a hash space, which usually corresponds to the whole space of SHA-1 keys (all the way to 2^160). Keys are placed in this space by calculating their SHA-1 hash (see the cute little HASH! explosion above?). And the space is usually represented as a circle that you follow clockwise, so when you get past the maximum, you get back to zero. The secret of consistent hashing is to also give nodes a hash in that same space and to say that a key / value pair gets attributed to the node closest to the key, going clockwise again. We'll see that the idea of "closeness" can change but this is the easiest to understand.

Say we have two nodes, one at position 50 and another at position 200. If we want to store a key / value pair in the DHT and the key hash is 100, we’ll give it to node 200. Another key hash of 30 would go to the node 50. Makes sense so far? Good. Otherwise there’s a slightly more extensive explanation here.

It's easy to see that consistent hashing is actually much better than my moronic modulo-based attribution. If a node leaves the network, only its key / value pairs need to be re-attributed. If a node joins the network, only a fraction of its successor's key / value pairs need to be moved. The rest of the hash space stays untouched.
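
Here is a minimal Scala sketch of the ring, using SHA-1 and a sorted map to find the first node clockwise from a key; the node names and helper names are made up for the example:

[scala]
import java.security.MessageDigest
import scala.collection.immutable.SortedMap

def sha1(s: String): BigInt =
  BigInt(1, MessageDigest.getInstance("SHA-1").digest(s.getBytes("UTF-8")))

// node hash -> node name, kept sorted so we can walk the circle
def ring(nodes: Seq[String]): SortedMap[BigInt, String] =
  SortedMap(nodes.map(n => sha1(n) -> n): _*)

// first node at or after the key's hash, wrapping around to the lowest hash
def nodeFor(r: SortedMap[BigInt, String], key: String): String = {
  val candidates = r.keysIteratorFrom(sha1(key))
  r(if (candidates.hasNext) candidates.next() else r.firstKey)
}

// val r = ring(Seq("node-a", "node-b", "node-c"))
// nodeFor(r, "some key")   // adding a "node-d" only re-homes the keys landing just before it
[/scala]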

Chord

With that out of the way, let's start actually talking about DHTs. The canonical modern DHT algorithm (introduced along with CAN, Pastry and Tapestry in 2001) is Chord, and it definitely contributed to popularizing consistent hashing. It's the one you'll see most often mentioned, as it's probably the simplest, in an elegant way. Which makes it interesting to study.

[Image: a Chord ring where each node knows its successor and a lookup hops around the circle]

The basic idea is that once you have your nodes on the hash ring, the only thing a node needs to know is its successor. In the picture above, a key lookup starting at node (a), assuming it doesn't have the key, would just forward the lookup to (b). If (b) doesn't know about the key either, it forwards it to (c) and so on. Sooner or later, you'll hit a node that knows about your key. The good thing about this method is that it only requires a node to know about a single other node. The bad thing is that, in practice, you usually have enough memory to store more routing information and enough network bandwidth to maintain more contacts, and that the lookup time with this method is linear in the number n of nodes (so O(n)). It's also not very resilient against node disappearance.

To improve the lookup, Chord extends that idea by introducing a routing table containing O(log n) elements (at most 140, and that's with a shitload of nodes). The first node in this table is the closest successor of our node's position (say p) plus 2. The second element is the closest successor of key (p + 4). The third, of key (p + 8). I think you get my drift. Generally speaking, the element at index i in this table is located at successor(p + 2^i), starting at i = 1. Obviously some nodes will be duplicated in this table (it's fairly unlikely you'll have a node between p + 2 and p + 4) so the routing information to maintain stays decently low. That's important because maintaining routing information requires some constant traffic to make sure known nodes are still alive (otherwise a new node needs to be looked up to replace a missing one). With this scheme, the area "close" to a node is known with much finer granularity than the area far away, which keeps the routing table small while still making lookups fast.
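
Keeping with the sketches, the finger table described above can be computed like this in Scala (a toy 16-bit ring with names of my own; real Chord would use the full SHA-1 space):

[scala]
import scala.collection.immutable.SortedSet

val m = 16                                  // toy id size; with SHA-1 you would have 160 bits
val ringSize = BigInt(1) << m

// first node at or after x on the circle, wrapping around
def successor(nodes: SortedSet[BigInt], x: BigInt): BigInt = {
  val it = nodes.iteratorFrom(x.mod(ringSize))
  if (it.hasNext) it.next() else nodes.head
}

// element i of the routing table is successor(p + 2^i), duplicates collapsed
def fingerTable(nodes: SortedSet[BigInt], p: BigInt): Seq[BigInt] =
  (1 until m).map(i => successor(nodes, p + (BigInt(1) << i))).distinct

// val nodes = SortedSet(BigInt(50), BigInt(200), BigInt(1200), BigInt(30000))
// fingerTable(nodes, BigInt(50))           // only a handful of distinct contacts
[/scala]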
