Saturday, February 5, 2011


I recently came across C.A.R. Hoare's 1962 paper on Quicksort. The paper is divided into two parts: theory and implementation. Most of the theory portion will be familiar to anyone that has taken an algorithms course. There are a few differences in terminology, the most noticeable to me was that Hoare uses the term bound for what is usually called the pivot in modern presentations. I also learned a new phrase, mutatis mutandis, that according to wikipedia means:
Mutatis mutandis is a Latin phrase meaning "by changing those things which need to be changed" or more simply "the necessary changes having been made".
What is really interesting is the implementation details and what they reveal about the state of computers at the time. The first example that comes up is the use of an alternative sort when a partition gets small enough. This is still a common quicksort optimization, but what is interesting is the size at which he recommends this optimization:
Furthermore, if a segment consists of less than three or four items (depending on the characteristics of the computer), then it will be advantageous to sort it by the use of a program specially written for sorting a particular small number of items.
I remember having to compare insertion sort with quick sort in college to find an appropriate cutoff point and as I recall it was around 50 items. Of course, this will vary with the computer, nuances of the sort implementations, and the data set being tested. I just would have expected this number to be at least 10 or so. Another example, is the data provided in table 1:

Number of ItemsMerge SortQuicksort
5002 min 8 sec1 min 21 sec
1,0004 min 48 sec3 min 8 sec
1,5008 min 15 sec*5 min 6 sec
2,00011 min 0 sec*6 min 47 sec

The asterisk by the results for the last two rows of merge sort indicate that those figures had to be estimated because they could not be computed. The reason provided is the limited store size available on the computer:
These figures were computed by formula, since they cannot be achieved on the 405 owing to limited store size.
The description of the machine and its capabilities:
The National-Elliott 405 computer has a delay-line working store of 512 locations, and a magnetic-disc backing store of 16,384 words. The average access time for the working store is 0.8 msec and the average access time for a block of 64 words in the backing store is 32 msec. There are 19 words of immediate-access storage, which are used to contain instructions and working space of the inner loops: the time taken by such loops is about 0.15 msec per instruction.
To get a better idea of the kind of computer he is talking about, see this video from the Yorkshire film archive showing the installation of a 405 computer at Dansom Lane. It won't help for understanding the memory limitations, but it is still interesting to get an idea of what the physical machine they were testing on looked like. The paper is not too clear on exactly what the data set looked like other than the single sentence:
The figures relate to six-word items with a single-word key.
Based on this description it is clear that the problem with merge sort is that it does not sort in-place. Duplicating the data set would mean that merge sort on the 405 could support a maximum data set size of 16,384 words / 2 copies / 6 words per item = 1,365 items. Unfortunately, there is no description of the formula used to calculate the last two times for merge sort. I was curious what the times would be for something like insertion sort. Insertion sort is in-place so it should be able to handle the larger data sets, but what would the expected running time be? An extremely rough approximation would be to assume exactly N2 instructions. If we use the number from the description in the paper of 0.15 msec per instruction and assume that for each instruction we would need to have at least one read from the working store at a cost of 0.8 msec, then we can use a cost of 0.95 msec per operation. This is almost certainly an underestimate of the actual time it would take. If nothing else I'm completely ignoring access to the magnetic disc. Crunch the numbers and look what we wind up with:

Number of ItemsN lg(N)N2
5000 min 4 sec3 min 57 sec
1,0000 min 9 sec15 min 50 sec
1,5000 min 15 sec35 min 37 sec
2,0000 min 20 sec1 hour 3 min 20 sec

Over an hour to sort 2000 items compared to less than 7 minutes for quicksort.

No comments:

Post a Comment