Some quick testing seems to show that the cost of joining table A to table B (on preexisting keys) is proportional to size(A)*log2(size(B)), that is, each element of table A is looked up (using binary search) in table B. This is very efficient when size(A) << size(B), but when they're of similar size, a linear merge can be done in time size(A)+size(B), which can be several times faster. Binary merge has a smooth transition from one regime to the other, but a simple cross-over is pretty good already.<div>
<br></div><div>I would put together a nice script to show this explicitly, but that will take some time, so I wanted to check whether this is a known issue or if I need to demonstrate this issue in more detail. (Or if I've made some stupid blunder.)</div>
<div><br></div><div>Thanks!</div><div><br></div><div> -s</div>