Third in the line of posts on sorting algorithms (first was Bubble
Sort and the
second was Insertion
Sort) is
the Shell Sort. This is
basically a variant of the insertion sort, with the difference that
you're sorting the array multiple times. Another way of looking at this
is that the shell sort is really a second level sort, wrapped around the
actual sorting algorithm. The inner algorithm here is the insertion sort
but it could technically be anything, owing to what the shell sort
actually does: running a number of smaller sorts, gradually sorting the
array, before finally going over the whole array.
What makes the shell sort work is that you first divide the array into n
smaller arrays and sort them (technically, you don't divide the array
into smaller pieces, you just operate on sub-parts of it). After this,
you divide this, now half-sorted, array into n/2 arrays, sort each of
these in turn. Iterate till you're working on the whole array. The trick
is that each time you sort one of the sub-parts, this also sorts the
array. Because using insertion sort on an already sorted array
approaches O(n), the final sort of the entire array in shell sort is
much faster than the equivalent insertion sort on an unsorted array.
The Code
<?php
class ShellSort extends BaseSort
{
public function sortFunction()
{
$count = count($this->store);
$array = $this->store;
$gap = floor($count / 3);
do
{
for ($i = 0; $i < $gap; $i++)
{
$temp = array($array[$i]);
for ($ii = ($i + $gap), $a = 1; $ii < $count; $ii += $gap)
{
$value = $array[$ii];
$b = $a;
while ($b > 0 && $temp[$b - 1] > $value)
{
$temp[$b] = $temp[$b-1];
$b--;
}
$temp[$b] = $value;
$a++;
}
for ($ii = $i, $a = 0; $ii < $count; $ii += $gap)
{
$array[$ii] = $temp[$a];
$a++;
}
}
}
while ($gap = floor($gap / 2));
$this->store = $array;
}
}
The difference to the insertion sort should be readily seen: the outer
loop dividing the main array into smaller arrays wraps around an
insertion sort.
A note on the algorithm: the sub-parts are taken from the original array
by picking every n element. This means that when you move an element in
the array, you move it n steps at a time. Compare this to the bubble
sort, where an element is only moved one step at a time, and you can
probably see why this would be a lot faster - even if the inner loop is
the bubble sort.
Some notes on the code: it could obviously do with some proper variable
naming - it's the result of a couple of iterations trying to improve the
efficiency. At first I had opted to run the inner insertion sort inline
on the array but after getting some fairly bad results I opted for using
a temporary array before reinserting the elements into the array to
sort.
Results
First, the 1.000 elements:
1000 elements to sort.
Sanity check. PHP native sort() algorithm took: 0.003565 seconds. Memory used: 64464 bytes
Running ShellSort sort. Algorithm took: 0.044342 seconds. Memory used: 64376 bytes
At 1.000 elements, the shell sort is about nine times faster than
insertion sort. The next test is 10.000 elements.
10000 elements to sort.
Sanity check. PHP native sort() algorithm took: 0.044411 seconds. Memory used: 665912 bytes
Running ShellSort sort. Algorithm took: 0.633066 seconds. Memory used: 665816 bytes
Still running slower than the built in PHP sort but nothing remotely
close to what could be seen with insertion sort. At 10.000 elements,
shell sort is running 50 times faster than insertion sort and only 15
times slower than the run at 1.000 elements. The insertion and bubble
sorts were so slow that there was no point trying to run them for
100.000 elements but here it makes sense.
100000 elements to sort.
Sanity check. PHP native sort() algorithm took: 0.600952 seconds. Memory used: 6524656 bytes
Running ShellSort sort. Algorithm took: 8.780088 seconds. Memory used: 6524588 bytes
Again, adding 10 times the elements ups the execution time by 15 times -
nothing like the O(n2) of the insertion sort or bubble sort.
With shell sort, we've reached something that would actually work in a
normal environment. There's of course no reason whatsoever to use it if
the PHP sort function does the job - but if it doesn't here's a
plausible algorithm to use. Not, as I expect to show later, as efficient
as other algorithms, but definitely worth a look.