miércoles, 24 de marzo de 2010

Fastest sorting algorithm



Introduction


This document presents the Rhino Direct Double Link sort algorithm (RDDL), which offers a practical, simple and efficient approach based on a linear complexity and structures for easy implementation with optimal performance.



Preliminary concepts


1) n is the number of elements to sort.

2) Doubly linked array is an array whose elements contain data and also the pointer to the next and previous element. This structure allows, in addition to the characteristics of the doubly linked list, direct access to the initial array locations. A doubly linked array is implemented as an array of n+l length, where n is the number of initial elements and l the maximum number of elements may be added.



Characteristics of the algorithm


• Computational complexity: O(n)
• Additional memory: O(n+l)
• Stable:YES
• Applies to integer and real numbers.



Advantages

• High performance in terms of execution speed.
• Simple concept and easy to implement.


Disadvantages

• Additional memory usage



Algorithm description


First, creates a doubly linked array of n length with capacity to add l elements (usually l=n).

Find the maximum value, minimum and range of the elements to order.

Dividing n by the range we get a factor which multiplied by each element gives its position in the doubly linked array.

The element is inserted directly into the doubly linked array or, if necessary, goes in the right direction through the links to find the exact location.

Once all the elements are introduced, going the array via their links from the first element to n we obtain the sorted list.


Example


Given the following list of elements, where n = l = 5:

-5, 7, 0, -4, -1

First, we obtain the maximum value, minimum, the range and factor.

Minimum: -5
Maximum: 7
Range: (7-(-5)) = 12
Factor: 5/12

The doubly linked array is created for 5 elements:



We search the list.

1) First element = -5

Position in the array = -5-(-5) = 0 * Factor = 0.
Insert the element 5 in the 0 position.



2) Second element = 7

Position in the array = 7-(-5) = 5 * Factor = 5.
Insert the element 7 in the 5 position.



3) Third element = 0

Position in the array = 0-(-5) = 5 * Factor = 2.
Insert the element 0 in the 2 position.



4) Fourth element = -4

Position in the array = -4-(-5) = 1 * Factor = 0.
Because the location of index 0 is occupied by a smaller number, we look for a location in the right through the links. Finally, there is a free location in the following position on the right and inserts.



5) Fifth element = -1

Position in the array = -1-(-5) = 4 * Factor = 1.
Because the location of index 1 is occupied by a smaller number, we look for a location in the right through the links. Because the next element in the right is an large number, we insert the element between them.



Note that the created node is the last array element.

Finally, going the array via their links, from the first element to n, we obtain the sorted list.



Computational complexity


The computational complexity is O(n) in the best case and O(n2) in the worse.
The lowest performance is when large groups and jumps appears in the values of the elements.

Example:

If all the elements are sorted numbers between 0 and 1, but one is equal or greater n, the algorithm will try to locate all elements (except the one equal or greater n) from the first location.

Fortunately, it is unlikely that unfavorable distributions appear at random for a large number of elements.

For the average case, the algorithm demonstrates empirically yields much higher than Quicksort, as shown in the comparison chart in Figure 1.



Stability


Although the fastest implementation is not stable, stability is easily achieved at the cost of a virtually nil sacrifice of performance for most cases.







Figure 1: Performance runtime comparison Quicksort / RDDL. The values indicate the number of times faster than RDDL sorts versus Quicksort. For the test we used the recursive optimized Quicksort algorithm and the RDDL implementation of this document.



Algorithm optimization


The characteristics of the algorithm allows the optimization for the specific cases of hardware, programming language and type of elements to sort.

For integers, the calculation of the factor and the subsequent multiplication may be more efficient by using fixed point instead of floating, and adjusting the values of l depending on n and the range (if known). Moreover, the use of fixed point reduces the range of elements that can be sorted.



Efficient implementation of the doubly linked array


For maximum efficiency, avoid using external classes in the implementation, so that the doubly linked array consists of 3 standard arrays: one for the values and the others as pointers to the next and previous element.

When you need to insert a new element in the doubly linked array, is taken from the end of the arrays. In case you need l insertions, you need l new elements in addition to the n elements that are directly referenced, so the total length of the auxiliary arrays will be n + l.

The following table shows the resulting memory consumption by the number of elements that are sorted:



Figure 2: Additional memory consumption depending on the number of elements for l = n. They are expected to require 4 bytes to store integers and pointers.



Java source code


The following source code implements efficiently RDDL for l = n. Sorts arrays of type sObject through a key of type double.

/**
* Title: RDDL Sort
* Description: Sort an array from lowest to highest
* Copyright: Copyright (c) 2008
* @author Alejandro del Campo Gómez
* @version 1.0
*/

public class rddlSort {

// auxiliary doubly linked arrays
sObject[] value;
int[] prev;
int[] next;
sObject aux_min;
sObject aux_max;

public rddlSort(int n) {
int total = n + n;
prev = new int[total];
next = new int[total];
value = new sObject[total];
aux_min = new sObject();
aux_max = new sObject();
aux_min.key = Double.NEGATIVE_INFINITY;
aux_max.key = Double.POSITIVE_INFINITY;
}

// sort

public void sort(sObject[] array) {
int i, n, p, pos;
double v, min, max, factor, range;

n = array.length; // number of elements to sort
pos = n + n; // pointer to new elements

// search for the maximum and minimum values and initializes the auxiliary
// doubly linked arrays
next[0] = 1;
max = min = array[0].key;
for (i = 1; i < n; i++) {
v = array[i].key;
min = min < v ? min : v;
max = max > v ? max : v;
prev[i] = i - 1;
next[i] = i + 1;
value[i] = null;
}
value[0] = aux_min;
value[n-1] = aux_max;

// calculates the factor
range = max - min;
factor = 0.0;
if (range > factor) {
factor = (n - 1) / range;
}

// introduces the elements in the doubly linked array in a sorted way
loop:for (i = n-1; i >= 0; i--) {

v = array[i].key;
p = (int) ( (v - min) * factor); // initial position

// free?
if (value[p] == null) {

// inserts it directly
value[p] = array[i];

} else if (v > value[p].key) { // greater?

// look where insert it in the right
p = next[p];
while (value[p] != null) { // while not free:

// free place?
if (v <= value[p].key) {

// inserts it
value[--pos] = array[i];
prev[pos] = prev[p];
next[pos] = p;
next[prev[p]] = pos;
prev[p] = pos;
continue loop;
}
p = next[p]; // continue searching
}
// inserts it in a free place
value[p] = array[i];

} else {

// look where insert it in the left
p = prev[p];
while (value[p] != null) { // while not free

// free place?
if (v > value[p].key) {

// inserts it
value[--pos] = array[i];
prev[pos] = p;
next[pos] = next[p];
prev[next[p]] = pos;
next[p] = pos;
continue loop;
}
p = prev[p]; // continue searching
}
// inserts it in a free place
value[p] = array[i];
}
}

// copy the sorted elements from the doubly linked array to the
// original array
for (i = 0, p = 0; i < n; i++) {
do {
p = next[p];
} while (value[p] == null);
array[i] = value[p];
}
}
}


Notes


Often, algorithms of computational complexity O(n) require conditions which, in the practice, make its use not practical. Large memory requirements, or a slow implementation, can make them generally less efficient than algorithms of complexity O(n log n) as Quicksort.

An optimal implementation of the algorithms is critical in measuring the final performance. Avoiding method calls in the inner loops or the use of indirections, object creation, and so on. An important part of the good performance of Quicksort is due this, its implementation is highly optimized.

RDDL allows implementing concrete solutions to improve the average performance. It is the responsibility of the programmer to implement it in the most adequate way.



Commonalities and problems of Bucket sort


Clearly there are similarities in the process of assigning elements to locations between RDDL and Bucket sort. But RDDL introduces a decisive improvement regarding Bucket sort in the use of the doubly linked array as structure where the elements are sorted.

In Bucket sort there are k buckets where values are placed a certain range. The location of the elements in the buckets can be done in an sorted way or to be sorted later, using any sorting algorithm.
Because each bucket can contain n elements, a fast implementation should use linked lists, or an array of n elements for each bucket. As an array of n elements for each bucket requires huge amounts of memory, it is usually implemented using linked lists.
This presents a problem of efficiency and must choose between entering the elements sorted in the buckets or sorting the linked lists later. In practice, both ways slow down the process, so that Bucket sort is usually slower than RDDL.

The RDDL structure solves the problems of Bucket sort and is presented as an efficient alternative to Quicksort for many cases.




(C) Copyright 2008

File number: SE-00272-2008
Registration number: 200899900152460