@z This file was created by Aleksandar Donev as part of the Network Optimization project. Feel free to use any portion of it and contact me at donev@pa.msu.edu @x \Title{Ranking and Sorting} \author{Aleksandar Donev} \date{January 2000} @*0 Module |Ranking_Sorting|. This module provides routines for ranking numerical arrays according to one or several {\em numeric} keys in increasing order. The routines all have the same interface at the topmost level and are all generic in that they accept either an integer or real rank-1 (single key) array or possibly an integer rank-2 (multiple keys as different rows, with the first row being the most significant key) for comparison-based sorts. The routines are based on sorting algorithms, but really perform ranking in that they do not modify the input array and only return a permutation which sorts the input array via |array=array(permutation)|. From the comparison-based sorts, {\bf Insertion}, {\bf Shell} and {\bf Quick} are provided, with an ORDERPACK 1.0 Merge-sort based ranking routine used as the base for timing. These can work with multiple keys as well, in which case the comparison itself is a short loop through the keys of the keys being compared. Also provided are very efficient (for large arrays) {\bf Hash} and {\bf Radix} sort based ranking routines. These are usually faster than the comparison based sorts (which are bound by an $O(n)$ lower complexity bound), but more care must be taken when using them. After I run through the code quickly, I will give a conclusion with guidelines on when to use which sort. @a@%% MODULE Sorting_Ranking @; USE Precision @; // Kind parameters @%% USE Standard_Types, ONLY: BIT_SIZE @; // Extended to |REAL| and other data types USE Error_Handling @; USE System_Monitors @; USE Error_Function, ONLY : Erf, InvErf USE Random_Numbers @; IMPLICIT NONE @; PUBLIC :: TestPermutation, TestRanking, DisorderPermutation, ShellInsertionRank, & RadixRank, HashRank, QuickRank, MergeRank @; PRIVATE @; LOGICAL, SAVE, PUBLIC :: debug=.FALSE. @; // Whether to print helpful information INTEGER(KIND=i_dp), SAVE, PUBLIC :: n_swaps=0 @; // I will keep track of the number of swaps only! @@; @@; CONTAINS @; @@; @@; @@; @@; @@; @@; @@; @@; END MODULE Sorting_Ranking @; @%% @*1 Auxilliary Macros. We will of course need many macros to perform some of the common and simple tasks. Here is a macro to perform a swap of two elements (to avoid procedure calling overheads) using a single temporary: @f index _index @m _Swap(a, b, swap_temp) @;@%% swap_temp = a @; a = b @; b = swap_temp @; @%% @ This macro finds the median of three values |a|, |b| and |c| using two temporaries. It is not really used in my |QuickRank| because I implemented other pivot strategies, but here it is for completeness: @m _MedianOf3(a, b, c, median, temp_min, temp_max) @;@%% IF( a >= b ) THEN @; // Choose the larger of |a| and |b| temp_max = a @; temp_min = b @; ELSE @; temp_max = b @; temp_min = a @; END IF @; IF( c > temp_max) THEN @; median = temp_max @; ELSE IF( c < temp_min) THEN @; median = temp_min @; ELSE @; median = c @; END IF @; @%% @ Aside from swapping, the other fundamental operation in comparison-based sorting algorithms is the actual comparison. If we have single keys, the intrinsic comparison operators do the job. The following macro will tell whether the two arguments are in correct ranking order (i.e. {\em non-decreasing} order going from lower to higher index): @m _CompareSingleKey(comparison_test,key_array,first,second,...) @;@%% // |comparison_test| is LOGICAL comparison_test = key_array[first] <= key_array[second] @; // Equal is OK @ When we have multiple keys things get somewhat more entangled. I know of two possible simple comparison tests. The first tests the keys in order until one of them tests negative. Since we are only interested in numerical keys, if one knows the ranges of the keys it is possible to convert a sequence of keys into a single key and comparison_test that. Here I decide to stick with the first version for simplicity. To prevent the compiler from possibly unrolling the short comparison loop when there are only two keys (a common case) I make an automaticaly inlined version: @m _CompareMultipleKeys(comparison_test,key_array,first,second,ID,...) @;@%% SELECT CASE(SIZE(key_array,1)) @; // How many keys are there CASE(1) @; comparison_test = (key_array[1,first] <= key_array[1,second]) @; CASE(2) @; // Inlined version for only two keys--very common temp_compare = (key_array[1,first] - key_array[1,second]) @; // Test most significant key IF(temp_compare==0) THEN @; // In case they are equal--really only for integers comparison_test = (key_array[2,first] <= key_array[2,second]) @; // Test less significant key ELSE @; comparison_test = (temp_compare < 0) @; END IF @; CASE DEFAULT @; // More then 2 keys--use a loop comparison_test=.TRUE. @; // If all keys are equal CompareKey@e@&ID: DO key=1,SIZE(key_array,1) @; temp_compare = (key_array[key,first] - key_array[key,second]) @; IF(temp_compare==0) THEN @; // If equal CYCLE CompareKey@e@&ID @; // Go to next key ELSE @; comparison_test = (temp_compare < 0) @; EXIT CompareKey@e@&ID @; END IF @; END DO CompareKey@e@&ID @; ENDSELECT @; @%% @*1 Testing Functions |TestPermutation| and |TestRanking|. I give two routines which check the validity of the ranking permutations returned by the routines in this module, one that verifies that a given array is indeed a permutation, the other which verifies that a ranking is indeed a non-decreasing ordering. These are most suitable for debugging and I leave them here: @= FUNCTION TestPermutation(permutation) RESULT(permutation_test) @; // Is this a valid permutation? IMPLICIT NONE @; INTEGER(KIND=i_wp), DIMENSION(:), INTENT(IN) :: permutation @; LOGICAL :: permutation_test @; // |.TRUE.| if |permutation| is valid LOGICAL(KIND=l_short), DIMENSION(SIZE(permutation)) :: count_array @; // An auxilliary array INTEGER(KIND=i_wp) :: index, element @; // A counter count_array=.FALSE. @; permutation_test=.TRUE. @; TestDuplicates: DO index = 1, SIZE(permutation) @; element=permutation[index] @; IF( (1<=element) .AND. (element<=SIZE(permutation)) ) THEN @; IF(.NOT.count_array[element]) THEN @; count_array[element] = .TRUE. @; // Visit this location ELSE @; IF(debug) @~ WRITE(message_print_unit,*) "Duplicate found in permutation" @; permutation_test=.FALSE. @; // There was a duplicate in the permutation array EXIT TestDuplicates @; END IF @; ELSE @; IF(debug) @~ WRITE(message_print_unit,*) "Out of range index found in permutation" @; permutation_test=.FALSE. @; // An index was out of the allowed range EXIT TestDuplicates @; END IF END DO TestDuplicates @; RETURN @; END FUNCTION TestPermutation @; @ @m _TestRanking(_type,_kind,_rank,_Compare) @; // Is this ranking non-decreasing? FUNCTION TestRanking_@e@&_rank@e@&_@e@&_kind(array,permutation) RESULT(ranking_test) @; _type(KIND=_kind), DIMENSION(_FullExtent(_rank)), INTENT(IN) :: array @; INTEGER(KIND=i_wp), DIMENSION(:), INTENT(IN) :: permutation @; LOGICAL :: ranking_test @; // |.TRUE.| if ranking is OK LOGICAL :: comparison_test @; INTEGER(KIND=i_wp) :: index, key @; // A counter _type(KIND=_kind) :: temp_compare @; ranking_test=.TRUE. @; TestMisplaced: DO index = 1, SIZE(permutation)-1 @; @e#!_Compare(comparison_test, array, permutation[index], permutation[index+1], _TR) @; IF( .NOT.comparison_test) THEN @; // Ooops! IF(debug) @~ WRITE(message_print_unit,*) "Increase in key found in ranking" @; ranking_test=.FALSE. @; EXIT TestMisplaced @; END IF @; END DO TestMisplaced @; RETURN @; END FUNCTION @; // |TestRanking| @%% @*1 Partially Disordered Permutations. This routine is very-very useful for benchmarking the routines. It is based on a routine from ORDERPACK. What it does is it returns a permutation array whose "distance" from the identity permutation |permutation=(/(i,i=1,N)/)| is controlled by a distance parameter |disorder|. If one chooses |disorder=0.0|, the permutation will be the identity one, while |distribution=1.0| will return an almost perfectly random permutation. The parameter |disorder_distribution| should be either |"Uniform"| (in this case one is almost guaranteed that indexes in the permutation will not be further away from their position in the identity permutation by |disorder*INT(SIZE(array), KIND=i_wp)|, or |"Normal"| (in this case some entries may be moved a lot). Therefore, if |array| is already sorted, |CALL DisorderPermutation(.10, permutation)| and then |array=array(permutation)| will produce an array that is almost sorted, and this can then be used in benchmarks. @%% @f range _range @= SUBROUTINE DisorderPermutation(disorder,permutation,disorder_distribution) @; IMPLICIT NONE @; INTEGER(KIND=i_wp), DIMENSION(:), INTENT(OUT) :: permutation @; // Disordered permutation REAL, INTENT(IN) :: disorder @; // In $[0,1]$ @%% LOGICAL, INTENT(IN), OPTIONAL :: partially_ranked @; CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: disorder_distribution // Starting with |"U"| or |"N"| (|"G"|) or lowercase versions REAL, DIMENSION(SIZE(permutation)) :: temp REAL :: p @; // The disorder INTEGER(KIND=i_wp) :: size_permutation, index @; // A counter @%% LOGICAL :: use_permutation @; CHARACTER(LEN=1) :: distribution @; distribution='U' @; IF(PRESENT( disorder_distribution)) @~ distribution = disorder_distribution[1:1] @; size_permutation=SIZE(permutation) @; p=MIN(1.0,MAX(disorder,0.0)) @; @%% _InitializePermutation(size_permutation) @; SELECT CASE(distribution) @; // This uses my |Random_Numbers| module CASE('N','n','G','g') @; // Normal (gaussian) distribution CALL RandomNormal(temp, mean_std=(/0.0, 0.25 * p * @E REAL(size_permutation)/)) @; // I determined the number 0.25 via experimentation CASE DEFAULT @; CALL RandomUniform(temp, range=(/0.0, p * @E REAL(size_permutation)/)) @; ENDSELECT @; DO index=1,size_permutation @; // Misplace the entries in the identity permutation temp[index] = temp[index] + (1.0-p)*@E REAL(index) @; END DO @; DO index=1,size_permutation @; permutation[index]=index @; // Default order END DO @; CALL QuickRank(array=temp,permutation=permutation,partially_ranked=.TRUE.) @; // Rank disordered array to get a disordered permutation @%% CALL MergeRank(temp,permutation) @; // Now rank this array using a fast routine END SUBROUTINE DisorderPermutation @; @ As usual, I give versions for several different kinds of integers and/or reals. @%% For some odd reasons (overflow in FWEB's macro buffers!!!), I had to eliminate |i_dp| integers for multiple keys. @= _TestRanking(@E INTEGER, i_sp, 1, _CompareSingleKey) @; _TestRanking(@E INTEGER, i_dp, 1, _CompareSingleKey) @; _TestRanking(@E REAL, r_sp, 1, _CompareSingleKey) @; _TestRanking(@E REAL, r_dp, 1, _CompareSingleKey) @; _TestRanking(@E INTEGER, i_sp, 2, _CompareMultipleKeys) @; _TestRanking(@E INTEGER, i_dp, 2, _CompareMultipleKeys) @; @%% @ I overload these with a generic interface: @= INTERFACE TestRanking MODULE PROCEDURE TestRanking_1_i_sp @; MODULE PROCEDURE TestRanking_1_i_dp @; MODULE PROCEDURE TestRanking_1_r_sp @; MODULE PROCEDURE TestRanking_1_r_dp @; MODULE PROCEDURE TestRanking_2_i_sp @; MODULE PROCEDURE TestRanking_2_i_dp @; END INTERFACE @; @%% @*1 Ranking Function Structure. Here I give the basic structure of the ranking functions (for single or multiple keys) that is used through-out this web file. Since most routines will share the same basic input/output arguments and some local variables, these are put in a macro. The rest is an argument to these macros. Most of the routines in this library define |permutation| as an |INTENT(IN OUT)| argument. The boolean argument |partially_ranked| tells whether the content of |permutation| should be used as an initial ranking or whether we should simply initialize |permutation| to the identity one and start ranking from scratch: @%% @m _InitializePermutation(size_array) @;@%% IF(PRESENT(partially_ranked)) THEN @; use_permutation = partially_ranked @; ELSE use_permutation = .FALSE. @; // Start from scratch by default END IF @; IF(.NOT.use_permutation) THEN @; // Initialize the permutation with the default ordering DO index=1,size_array @; permutation[index]=index @; // Default order END DO @; END IF @; @ This macro will produce a routine for ranking a one-dimensional (single-key) array: @%% @m _RankSingleKey(_type, _kind, SortName, _Sort, @| _AdditionalArguments,_AdditionalDeclarations, _LocalVariables, _Compare, _Equal, ...) @;@%% SUBROUTINE SortName@e@&RankSingle_@e@&_kind(array, permutation, partially_ranked, _AdditionalArguments) @; IMPLICIT NONE @; _type(KIND=_kind), DIMENSION(:), INTENT(IN) :: array @; // Keys to be sorted INTEGER(KIND=i_wp), DIMENSION(:), INTENT(INOUT) :: permutation @; // The permutation that sorts |array| LOGICAL, INTENT(IN), OPTIONAL :: partially_ranked @; // Initial permutation provided ? @e#!_AdditionalDeclarations(_type,_kind, #.) @; @e#!_LocalVariables(_type,_kind, #.) @; LOGICAL :: comparison_test, use_permutation @; // Needed by most sorts INTEGER(KIND=i_wp) :: index, element, temp_index, temp_element @; // Almost all sorts need these INTEGER :: alloc_status @; // A temporary for the allocation status _type(KIND=_kind) :: temp_compare @; // A temporary _InitializePermutation(INT(SIZE(array), KIND=i_wp)) @; @e#!_Sort(array,permutation,INT(SIZE(array), KIND=i_wp), @e#!_Compare, @e#!_Equal, #.) @; // Actual body END SUBROUTINE @; // |RankSingle| @ This macro will produce a routine for ranking a two-dimensional (multiple-key) array using a comparison-based algorithm. The keys are assumed to be in a column of the key array |array| where the first key (first row) is the most significant key. @%% @m _RankMultipleKeys(_type,_kind,SortName,_Sort, @| _AdditionalArguments,_AdditionalDeclarations, _LocalVariables, _Compare, _Equal, ...) @;@%% SUBROUTINE SortName@e@&RankMultiple_@e@&_kind(array, permutation, partially_ranked, _AdditionalArguments) @; IMPLICIT NONE @; _type(KIND=_kind), DIMENSION(:,:), INTENT(IN) :: array @; // Keys to be sorted INTEGER(KIND=i_wp), DIMENSION(:), INTENT(INOUT) :: permutation @; // The permutation that sorts |array| LOGICAL, INTENT(IN), OPTIONAL :: partially_ranked @; // Initial permutation provided ? @e#!_AdditionalDeclarations(_type,_kind,#.) @; @e#!_LocalVariables(_type,_kind,#.) @; LOGICAL :: comparison_test, use_permutation @; INTEGER(KIND=i_wp) :: index, element, temp_index, temp_element, key @; INTEGER :: alloc_status @; // A temporary for the allocation status _type(KIND=_kind) :: temp_compare @; // A temporary _InitializePermutation(INT(SIZE(array,DIM=2), KIND=i_wp)) @; @e#!_Sort(array,permutation,INT(SIZE(array,DIM=2), KIND=i_wp), @e#!_Compare, @e#!_Equal, #.) @; END SUBROUTINE @; @%% @*1 Shell and Insertion Sort. Shell sort is a more efficient $O(N^{1.226})$ sorting algorithm based on insertion sort, which applies insertion (ranking) sort to non-contigous subsections of the key array (with spacing |gap| between the elements). It does not access the array sequentially in the early stages, though, so plain insertion sort may be better in some cases, especially if the array is already very presorted. The basis of both algorithms is an insertion sort with a given gap (|gap=1| for plain insertion), which is the following macro: @%% @m _InsertionSort(ID,array,permutation,size_array,_Compare,gap,...) @;@%% DO index = 1, size_array-gap @; @e#!_Compare(comparison_test, array, permutation[index], permutation[index+gap], ID@e@&_I1, #.) @; // Invoke comparison macro IF (.NOT.comparison_test) THEN @; // Peck to find correct position temp_element = permutation(index+gap) @; // Store Peck@e@&ID: DO test_index = index, 1, -gap // Find the insertion point @e#!_Compare(comparison_test, array, permutation[test_index], temp_element, ID@e@&_I2, #.) @; IF(comparison_test) @~ EXIT Peck@e@&ID @; // Found correct position permutation[test_index+gap] = permutation[test_index] @; // Move to right n_swaps=n_swaps+1 @; END DO Peck@e@&ID @; permutation[test_index+gap] = temp_element @; // Insert in correct location n_swaps=n_swaps+1 @; END IF @; END DO @; @ My shell sort tries to be more efficient by not starting from a very large gap if the array is already presorted (indicated by the parameter |disorder| as in |DisorderPermutation|), and also allows one to do partial ranking by finishing with |gap>=1|, controlled by the input argument |last_gap| (so that all elements will be at most |last_gap| positions away from their true rank position). Practice has shown that these are helpful, but not too much. First, here are the additional arguments and local variables needed by |ShellRank|: @%% @m _ShellArguments method, last_gap, disorder @;@%% @m _ShellDeclarations(...) @;@%% CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: method @; // Either |"Insertion"| or |"Shell"| INTEGER(KIND=i_wp), INTENT(IN), OPTIONAL :: last_gap @; // About |final_disorder*INT(SIZE(array), KIND=i_wp)| REAL, INTENT(IN), OPTIONAL :: disorder @; // Initial disorder in the array @m _ShellLocalVars(...) @;@%% INTEGER(KIND=i_wp) :: gap, min_gap, max_gap, test_index @; LOGICAL :: shell_sort @; @%% REAL :: misplaced @; @ And the actual computational body: @m _ShellSort(ID,array,permutation,size_array,_Compare,...) @;@%% IF(PRESENT(last_gap)) THEN @; @%% misplaced=MAX(1E-3,MIN(1.0,disorder)) @; min_gap=MAX(1_i_wp,last_gap) @; ELSE @; @%% misplaced=1.0 @; min_gap=1_i_wp @; // Sort all the way by default END IF @; @%% IF(debug) @~ WRITE(message_print_unit,*) "Truncating shell sort at gap:", min_gap IF(PRESENT(disorder)) THEN @; // No need to start with the largest possible gap max_gap=MAX(1_i_wp,MIN(size_array,INT(disorder* @E REAL(size_array), i_wp))) @; ELSE @; max_gap=size_array @; // Completely disordered array END IF @; @%% IF(method[1:1]=='S') THEN @; @%% gap = max_gap/2 @; // Start with large gap @%% IF(debug) @~ WRITE(message_print_unit,*) "Starting with gap:", gap @; @%% ELSE @; gap = INT(0.9102392227*LOG(2*REAL(max_gap)+1), i_wp) @; // Find nearest integer that is of the form $(3^p-1)/2$--as in Meissner gap = 3^gap/2 @; IF(debug) @~ WRITE(message_print_unit,*) "Starting with gap:", gap @; @%% END IF @; SortShell@e@&ID: DO @; // Rank shells via straight insertion IF (gap < min_gap) @~ EXIT SortShell@e@&ID @; // Done sorting @%% IF (gap < min_gap) @~ gap=1 @; // Switch to insertion sort _InsertionSort(ID@e@&_S, array, permutation, size_array, @e#!_Compare, gap) @; @%% IF(method[1:1]=='S') THEN @; @%% gap = gap/2 @; // Decrease gap @%% IF (MODULO(gap, 2) == 0) @~ gap = gap - 1 @; // Make sure gap is odd @%% ELSE @; gap=gap/3 @; // From Meissner @%% IF(debug) @~ WRITE(message_print_unit,*) gap @; @%% END IF @; END DO SortShell@e@&ID @; @ I decided to bundle both straight insertion and shell ranking in one routine, with the argument |method| controlling which one to actually use: @m _ShellInsertionSort(array,permutation,size_array,_Compare,_Equal,...) @;@%% shell_sort=.TRUE. @; // Use shell sort by default IF(PRESENT(method)) THEN @; IF(method[1:1]=='I' .OR. method[1:1]=='i') THEN @; shell_sort=.FALSE. @; END IF @; END IF @; IF(shell_sort) THEN @; // Do shell sort _ShellSort(_S,array,permutation,size_array, @e#!_Compare) @; ELSE @; // Do plain insertion sort _InsertionSort(_I, array, permutation, size_array, @e#!_Compare, 1) @; // |gap==1| END IF @; @%% @ And now I provide routines for several different types and kinds and overload them generically: @%% @m _ShellRank(_type,_kind,_keys) @;@%% _Rank@e##_keys(_type, _kind, Shell, _ShellInsertionSort, _ShellArguments,_ShellDeclarations, _ShellLocalVars,_Compare@e##_keys,_Dummy, gap) @; @%% @=@%% _ShellRank(@E INTEGER, i_byte, SingleKey) @; _ShellRank(@E INTEGER, i_sp, SingleKey) @; _ShellRank(@E INTEGER, i_dp, SingleKey) @; _ShellRank(@E REAL, r_sp, SingleKey) @; _ShellRank(@E REAL, r_dp, SingleKey) @; _ShellRank(@E INTEGER, i_sp, MultipleKeys) @; _ShellRank(@E INTEGER, i_dp, MultipleKeys) @; @%% _ShellRank(@E REAL, r_sp, MultipleKeys) @; @%% _ShellRank(@E REAL, r_dp, MultipleKeys) @; @ @= INTERFACE ShellInsertionRank @; MODULE PROCEDURE ShellRankSingle_i_byte @; MODULE PROCEDURE ShellRankSingle_i_sp @; MODULE PROCEDURE ShellRankSingle_i_dp @; MODULE PROCEDURE ShellRankSingle_r_sp @; MODULE PROCEDURE ShellRankSingle_r_dp @; MODULE PROCEDURE ShellRankMultiple_i_sp @; MODULE PROCEDURE ShellRankMultiple_i_dp @; @%% MODULE PROCEDURE ShellRankMultiple_r_sp @; @%% MODULE PROCEDURE ShellRankMultiple_r_dp @; END INTERFACE @; @%% @*1 Radix Sort. Radix sort is a very efficient linear (i.e. $O(n)$) algorithm for sorting or ranking numbers with a fixed-size hardware representation. In this implementation, I only work with {\em positive 32- or 64-bit IEEE integer and floating point numbers}. Groups of |n_radix_bits| are extracted from the keys and used as radices for successive bucket sorts, starting at the least significant bits. Speed can be gained by extracting groups of 8 bits using casting-like mechanisms, however, I had enough trouble with standard-conformance with this version as it is so this final modification is left for someone else to do. It is also possible to rank negative values as well, but I will postpone coding that until later on. Please note that the section dealing with |r_dp| real numbers is not standard conforming, but I am using it none-the-less to avoid extra headaches. In principle, |TRANSFER| should be used instead of |EQUIVALENCE|, however, this has great performance penalties with LF95...As usual, we start from declarations and such: @%% @f radix _radix @m _RadixArguments n_radix_bits, n_ignored_bits @;@%% @m _RadixDeclarations(...) @;@%% INTEGER, INTENT(IN), OPTIONAL :: n_radix_bits, n_ignored_bits @; // For now this is not working @m _RadixLocalVars(_type, _kind, N_BITS, ...) @;@%% INTEGER(KIND=i_wp), DIMENSION(:), ALLOCATABLE :: bucket_array @; INTEGER(KIND=i_wp), DIMENSION(:), ALLOCATABLE :: offsets @; INTEGER :: n_bits, n_radices, n_ignored @; INTEGER :: radix, bit_offset, bit @; $IFELSE(_type, REAL, _AuxilliaryEquivalence(N_BITS), _Dummy) @; // For |REAL| numbers @m _AuxilliaryEquivalence(N_BITS) @; INTEGER(KIND=i_@e@&N_BITS) :: temp_integer_@e@&N_BITS @; REAL(KIND=r_@e@&N_BITS) :: temp_real_@e@&N_BITS @; EQUIVALENCE(temp_integer_@e@&N_BITS, temp_real_@e@&N_BITS) @; @ Two important macros for speed are the macros that extract the bits from the keys. Here I simply use Fortran intrinsics to do this, with some extra headaches for |REAL| numbers. The biggest cost on modern architectures will still be from memory trafic though, I believe: @%% @m _ExtractRadix_INTEGER(number, radix, offset, n_bits,...) @; radix=IBITS(number,offset,n_bits) @; @m _ExtractRadix_REAL(number, radix, offset, n_bits, N_BITS) @; temp_real_@e@&N_BITS = number @; radix=IBITS(temp_integer_@e@&N_BITS,offset,n_bits) @; @%% TRANSFER(number,1_i_@e@&N_BITS) @ And here is the actual computational routine. Here I again try to be helpful and allow for the possibility that one wants to ignore the trailing (least-significant |n_ignored_bits|) to permit partial ranking. Use this with care though. The routine will also skip over bucket sorts in case all radices were equal for all elements (such as leading zeros for small integers or same exponents for floating-point numbers): @%% @m _RadixSort(array,permutation,size_array,_Compare,_Equal, N_BITS, _type, _kind, ...) @;@%% IF(PRESENT(n_radix_bits)) THEN @; n_bits = MAX(1,MIN(BIT_SIZE(1)-1,n_radix_bits)) @; // How many bits to take in radix ELSE @; n_bits = 8 @; // Use bytes by default END IF @; n_radices=2^n_bits-1 @; @%% IF(debug) @~ WRITE(message_print_unit,*) "n_radices=", n_radices @; IF(PRESENT(n_ignored_bits)) THEN @; n_ignored = MAX(0,n_ignored_bits) @; // How many bits to take in radix ELSE @; n_ignored = 0 @; // Use bytes by default END IF @; ALLOCATE(offsets(0:n_radices), STAT=alloc_status) @; // Offset table CALL RecordAllocation(n_elements=n_radices+1, mold=1_i_wp, & caller="RadixRank", alloc_status=alloc_status) @; ALLOCATE(bucket_array(size_array), STAT=alloc_status) @; // For bucket sort CALL RecordAllocation(n_elements=size_array, mold=1_i_wp, & caller="RadixRank", alloc_status=alloc_status) @; BucketSort: DO bit_offset=n_ignored, N_BITS-1, n_bits @; // Perform bucket sort on group of |n_bits| bits from |bit_offset|, where |N_BITS| is the total number of bits in key IF( (bit_offset+n_bits) > N_BITS) n_bits=N_BITS-bit_offset @; // F90 standard requires this precaution offsets=0 @; // Initialize counters DO element=1, size_array @; // Count the number of collisions _ExtractRadix_@e##_type(array[element],radix,bit_offset,n_bits,N_BITS,#.) @; offsets[radix]=offsets[radix]+1 @; END DO @; CheckEqual: DO radix=0,n_radices @; // Check if there is any need to proceed IF( (offsets[radix]0) ) THEN @; EXIT CheckEqual @; // Bucket sort is required ELSE IF(offsets[radix]==size_array) THEN @; CYCLE BucketSort @; // All radices were equal to |radix| ELSE @; CYCLE CheckEqual @; // Don't know yet... END IF @; END DO CheckEqual @; IF(debug) @~ WRITE(message_print_unit,*) "Now sorting on bits:", bit_offset, bit_offset+n_bits @; DO radix=1, n_radices @; // Finish making the offset table offsets[radix]=offsets[radix] + offsets[radix-1] @; @%% WRITE(message_print_unit,*) offsets[radix] END DO @; DO index=size_array,1,-1 @; // Perform the actual bucket sort--stable! element=permutation[index] @; _ExtractRadix_@e##_type(array[element],radix,bit_offset,n_bits,N_BITS,#.) @; bucket_array(offsets[radix])=element @; // Put in bucket offsets[radix]=offsets[radix]-1 @; END DO @; permutation=bucket_array @; // Copy the sorted ranking and start again END DO BucketSort @; DEALLOCATE(offsets) @; CALL RecordAllocation(n_elements=-n_radices-1, mold=1_i_wp) @; DEALLOCATE(bucket_array) @; CALL RecordAllocation(n_elements=-size_array, mold=1_i_wp) @; @ And here are the actual instances for different kinds of arguments. Since this is not a comparison-based sort, multiple keys are not directly supported. One can in principle use successive Radix rankings starting with the least-significant key, since it is a stable sort, but in practice for a small number of keys using radix sort on the first key and straight insertion or shell-sort for all other keys is the best way to go: @%% @m _RadixRank(_type,_kind,_keys, N_BITS) @; _Rank@e##_keys(_type, _kind, Radix, _RadixSort, _RadixArguments,_RadixDeclarations, _RadixLocalVars, _Dummy, _Dummy, N_BITS, _type, _kind) @; @=@%% _RadixRank(@E INTEGER, i_32, SingleKey, 32) @; _RadixRank(@E INTEGER, i_64, SingleKey, 64) @; _RadixRank(@E REAL, r_32, SingleKey, 32 ) @; _RadixRank(@E REAL, r_64, SingleKey, 64) @; @ @= INTERFACE RadixRank @; MODULE PROCEDURE RadixRankSingle_i_32 @; MODULE PROCEDURE RadixRankSingle_i_64 @; MODULE PROCEDURE RadixRankSingle_r_32 @; MODULE PROCEDURE RadixRankSingle_r_64 @; END INTERFACE @; @%% @*1 Hash Sort. This code is modelled after R. Vowels's. Hash sort is an interesting algorithm and it is the {\em most efficient algorithm for ranking or sorting randomly-generated numbers}, i.e. numbers whose distribution is known up-front. This function is very complicated and not yet fully untangled, so I will not attempt to fully document it. A simple call with all optional arguments ommited will usually be fastest, although by default a lot of additional memory (twice as many integers as there are keys to rank) is allocated. Hash sort is a very efficient and simple sorting algorithm which is {\em very efficient for ranking randomly generated numbers from a known smooth distribution}. When sorting |size_array| numbers drawn from a random distribution with probability density $p(x)$, a hashing array of size |size_hash=memory_factor*size_array| is allocated and the position of the sorted elements in this array predicted using the {\em hashing factor (function)} $h(x)=\int_{t=-\infty}^x p(t) dt$, i.e., the predicted location is |hash_value=h(x)*size_hash|. After all values are hashed, they are compressed to the front. This routine thus produces a {\em partially ranked array}, which is then strictly ranked with an insertion sort by default, but this can be avoided with the input flag |full_ranking|, and it is then the user's responsibility to run a final pass of insertion or shell sort to fully sort the array if this is needed. The default value of |memory_factor| is 2, though anything in the range 1.5-2.5 should be OK for most purposes (depending on available space). A pointer array may be used to resolve hashing conflicts (for integers with many repetitions per key, for example) if demanded via the argument |use_pointers|, although this is not usually needed. The function |HashRank|, and this is carried on to |QuickRank| can work with either numbers drawn from a uniform or normal distribution, determined by the input argument |distribution| (normal is the default). If the user does not provide the mean and standard deviation of the distribution, the routine calculates these for the given key array |array|. If the user knows that the real distribution is rather distant from either one of these two, than he should supply his own hashing function $h(x)$ via the input argument |hash_function|. This routine actually implements a novel complicated version of Hash Sort which saves space by not hashing all the values at once and instead performing several passes through the key array. This is {\em slower} than a single pass and you should not use it unless you know what you are doing, or there is not enough space to allocate the hash array. There are other input arguments, such as |uniformity_factor|, which should almost always be left at the default values. I {\em emphasize} that |HashRank| {\em is the best sort for truly randomly generated keys from a known smooth distribution. But is is not to be used for small arrays or most of all for numbers that do not follow the distribution used to derive the hashing function $h(x)$ closely!}. @ The hashing factor should usually be a single-precision real number, but in some cases higher precision may be demanded. Therefore I leave this as a choice in the kind parameter |r_hv|. @m R_HV r_sp @;@%% @=@%% INTEGER, PARAMETER, PUBLIC :: r_hv=R_HV @; @ The list of input arguments is long and complicated: @m _HashArguments mean_value, standard_deviation, number_passes, memory_factor, uniformity, hash_function, distribution, store_hash_values, use_hash_pointers, full_ranking @;@%% @m _HashDeclarations(_type,_kind,...) @;@%% _type(KIND=_kind), INTENT(IN), OPTIONAL :: mean_value,standard_deviation @; // Mean and STD of the distribution from which |array| is derived INTEGER, INTENT(IN), OPTIONAL :: number_passes @; // Number of hashing passes through the array REAL, INTENT(IN), OPTIONAL :: memory_factor @; // |memory_factor>1| REAL, INTENT(IN), OPTIONAL :: uniformity @; // |uniformity<1| OPTIONAL :: hash_function @; // User-supplied hashing-function INTERFACE @; FUNCTION hash_function(value) RESULT(hash_factor) @; // $h(x)$ USE Precision @; _type(KIND=_kind), INTENT(IN) :: value @; // $x$ REAL(KIND=R_HV) :: hash_factor @; // In the interval $[0,1]$ END FUNCTION hash_function @; END INTERFACE @; CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: distribution @; // |"Uniform"| or |"Gaussian"| LOGICAL, INTENT(IN), OPTIONAL :: store_hash_values, use_hash_pointers, full_ranking @; // To avoid recalculating hash values @%% @m _HashLocalVars(_type,_kind) @;@%% @%% _type(KIND=_kind), DIMENSION(:), ALLOCATABLE :: hash_array @; // If sorting INTEGER(KIND=i_wp), DIMENSION(:), ALLOCATABLE :: hash_array,hash_pointers,hash_values @; // These are extra space to be allocated INTEGER(KIND=i_wp) :: size_array, size_hash @; // Size of the allocated arrays INTEGER(KIND=i_wp) :: hash_value, hash_guess, hash_range(2) @; INTEGER(KIND=i_wp) :: hash_index, n_placed, n_hashed, n_hashed_max, n_front, n_back @; // Counters INTEGER :: n_passes, pass @; // Indices REAL(KIND=r_hv) :: mem_factor, extra_space, hash_factor @; REAL(KIND=r_hv) :: mean, std, inv_width, total @; // For distribution LOGICAL :: uniform_distribution, store_hash, one_pass, use_pointers, finish_ranking @; // Choices @%% @ Here is the macro that deals with the calculation of the hash values. For a uniform distribution, this is a relatively fast operation, but for a normal distribution the error function is needed. Here I use the |Erf| function modelled after A. Miller's code. It is more expensive, but not too much: @%% @m _CalculateHashValue(value) @; IF(PRESENT(hash_function)) THEN @; hash_factor = hash_function (value) @; // User supplied ELSE @; IF(uniform_distribution) THEN @; // Uniform distribution hash_factor=0.5_r_hv*((@E REAL(value, KIND=r_hv)-mean)*inv_width + 1.0_r_hv) @; ELSE @; // Normal distribution by default hash_factor=0.5_r_hv*(Erf((@E REAL(value, KIND=r_hv)-mean)*inv_width) + 1.0_r_hv) @; END IF @; END IF @; hash_value=INT (extra_space*hash_factor * @E REAL(n_passes*size_hash,KIND=r_hv), i_wp)+1 @; // Location to hash to @ The routine accepts many optional arguments and has to make some choices: @m _HashChoices @;@%% IF(PRESENT(number_passes).AND.(number_passes>0)) THEN @; n_passes=number_passes @; ELSE @; n_passes=1 @; // Default is one pass END IF @; IF(n_passes==1) @~ one_pass=.TRUE. @; // There is only one pass--more optimizations possible IF(PRESENT(memory_factor)) THEN @; // Space in hash array mem_factor=@E REAL(MAX(1.0,memory_factor), KIND=r_hv) @; // Must be greater than 1 ELSE @; mem_factor=2.0_r_hv @; // Twice more space is the default END IF @; IF(PRESENT(uniformity)) THEN @; // Extra space in case of distorted distribution extra_space=@E REAL( MAX(EPSILON(0.0),MIN(1.0,uniformity)), KIND=r_hv) @; // Must be in $(0,1]$ ELSE @; extra_space=0.9_r_hv @; // Variations of 10 percent or so are common END IF @; uniform_distribution=.FALSE. @; // Default is gaussian-based hashing IF(PRESENT(distribution)) THEN @; IF(distribution[1:1]=='U' .OR. distribution[1:1]=='u') & uniform_distribution=.TRUE. @; // Use uniform hashing END IF @; store_hash=.FALSE. @; // Recalculate hash values by default IF(PRESENT(store_hash_values)) THEN @; IF(store_hash_values .AND. (n_passes>1)) store_hash=.TRUE. @; END IF @; use_pointers=.FALSE. @; // Conserve memory by default IF(PRESENT(use_hash_pointers)) @~ use_pointers=use_hash_pointers @; @%% IF(debug) @~ WRITE(message_print_unit,*) "use_pointers=",use_pointers," one_pass=",one_pass @; finish_ranking=.TRUE. @; // Finish with insertion pass by default IF(PRESENT(full_ranking)) @~ finish_ranking=full_ranking @; @ This piece of code actually initializes the routine before hashing begins. It also calculates the parameters of the random distribution needed by the hashing functions above: @m _HashInitialization @;@%% size_array=INT(INT(SIZE(array), KIND=i_wp), KIND=i_wp) @; // {\bf To be added to |SIZE| in Fortran 2K} size_hash=INT(mem_factor * @E REAL(size_array/n_passes, KIND=r_hv) / extra_space, i_wp) + 1 @; // Total hashing space n_hashed_max=INT(@E REAL(size_array/n_passes, KIND=r_hv)/extra_space, i_wp) @; // Maximum number of entries to hash in one pass ALLOCATE(hash_array(size_hash), STAT=alloc_status) @; // Allocate hash space! CALL RecordAllocation(n_elements=size_hash, mold=1_i_wp, & caller="HashRank", alloc_status=alloc_status) @; IF(use_pointers .OR. (.NOT.one_pass) ) THEN @; ALLOCATE(hash_pointers(size_hash), STAT=alloc_status) @; // We need a pointer array CALL RecordAllocation(n_elements=size_hash, mold=1_i_wp, & caller="HashRank", alloc_status=alloc_status) @; END IF @; /* For a uniform distribution $p(x)=\frac{1}{2 a}$ in $[x_0-a,x_0+a]$, the STD is $a/\sqrt{3}$ and the hashing function is $h(x)=\frac{x-(x_0-a)}{2 a}$. For normally distributed numbers, $p(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp{-\frac{x^2}{2 \sigma^2}}$, the hashing function is $h(x)=\frac{1}{2} \left[ 1 + erf(\frac{x}{\sqrt{2} \sigma}) \right]$: */ IF(.NOT.PRESENT(hash_function)) THEN @; _CalculateMeanAndStd(array, size_array) @; IF(uniform_distribution) THEN @; inv_width=1.0_r_hv / SQRT(3.0_r_hv) / std @; ELSE @; inv_width=1.0_r_hv / SQRT(2.0_r_hv) / std @; END IF @; END IF @; IF(store_hash) THEN @; // Precalculate the hash values ALLOCATE(hash_values(size_array), STAT=alloc_status) @; CALL RecordAllocation(n_elements=size_hash, mold=1_i_wp, & caller="HashRank", alloc_status=alloc_status) @; DO element=1,size_array @; _CalculateHashValue(array[element]) @; hash_values[element]=hash_value @; END DO @; END IF @; @%% @ If the user does not supply them, we need to calculate the mean and STD ourselves: @m _CalculateMeanAndStd(array, size_array) @; IF(PRESENT(mean_value)) THEN @; mean=@E REAL(mean_value, KIND=r_hv) @; ELSE @; // Calculate mean total=0.0_r_hv @; DO element=1,size_array @; total = total + @E REAL(array[element],KIND=r_hv) @; // Sum of all elements END DO @; mean = total / @E REAL(size_array,KIND=r_hv) @; END IF @; IF(PRESENT(standard_deviation)) THEN @; std=@E REAL(standard_deviation,KIND=r_hv) @; ELSE @; // Calculate STD total=0.0_r_hv @; DO element=1,size_array @; total=total+(@E REAL(array[element],KIND=r_hv)-mean)^2 @; // Sum of squared deviations END DO @; std = SQRT (total / @E REAL(size_array,KIND=r_hv)) @; END IF @; @ And this is finally the actual body of |HashRank|. It is not too heavily commented, and it is rather entangled in parts of it: @m _HashComputation @; n_placed=0 @; // No values have been placed in proper hash order yet! MakePass: DO pass=1,n_passes @; // One or several hashing passes IF(debug) THEN @; WRITE(message_print_unit,"(A,I2)") " Now making pass #", pass @; END IF @; @%% IF( (.NOT.use_pointers) .OR. one_pass .OR. debug) @~ hash_array=0 @; // Not needed when there are pointers IF(use_pointers) @~ hash_pointers=0 @; // This is needed /* Now take all the elements that hashed to locations in the interval |(/(pass-1),pass/) size_hash| and try to compress them in the hash array in approximate hash-order. Those elements left unhashed will be left behind, and if there are too few elements in this range, than we simply won't fill the hash array. */ hash_range=size_hash*(/(pass-1), pass/) @; hash_range=INT(extra_space*@E REAL(hash_range,KIND=r_hv), i_wp) @; n_hashed=0 @; // Counter of hashed values TraverseArray: DO index=n_placed+1,size_array @; // Pass through unhashed elements in |array| element=permutation[index] @; IF(store_hash) THEN @; // Calculate the hash value hash_value=hash_values[element] @; // Retrieve stored value ELSE @; _CalculateHashValue(array[element]) @; END IF @; HashEntry: IF (hash_value < hash_range[2]) THEN @; // Try to place this value in the hash array hash_value=MIN(size_hash,MAX(1_i_wp,hash_value-hash_range[1])) @; // Must be in $[1,size\_hash]$ IF(n_hashed>=n_hashed_max) @~ EXIT TraverseArray @; // Enough hashed entries n_hashed=n_hashed+1 @; hash_guess = hash_value @; // Save at least this location UsePointers: IF(.NOT.use_pointers) THEN @; // Use simple step-to-the-right conflict resolution SearchForFreeLocation: DO @; IF(hash_array[hash_value]==0) EXIT SearchForFreeLocation @; // Found an empty place IF(hash_value == size_hash) @~ hash_value = 1 @; // Hashing is circular hash_value=hash_value+1@; // Step to the right END DO SearchForFreeLocation @; ELSE @; // Use hash pointers-to-the-right to resolve conflicts FindFreeLocation: DO @; IF(hash_pointers[hash_value]==0) EXIT FindFreeLocation @; // Found an empty place hash_value=hash_pointers[hash_value] @; // Step to next pointed location END DO FindFreeLocation @; IF(hash_value != size_hash) THEN @; // Point to the next location to the right hash_pointers[hash_value]=hash_value+1 @; hash_pointers[hash_guess]=hash_value+1 @; ELSE @; // Hashing is circular--we go to the front of the array when we overflow hash_pointers[hash_value]=1 @; hash_pointers[hash_guess]=1 @; END IF @; END IF UsePointers @; OnePass: IF(one_pass) THEN @; // When there is a single pass simplify our life hash_array[hash_value] = permutation[index] @; // Occupy the location by storing the {\em element} ELSE @; hash_array[hash_value] = index @; // Occupy the location by storing the {\em index} END IF OnePass @; END IF HashEntry @; END DO TraverseArray @; IF(debug) THEN @; // Useful info WRITE(message_print_unit,*) "Successfully hashed ", n_hashed, " elements, out of ", size_array/n_passes @; _DisplayArray("Hash array:",hash_array) @; IF(use_pointers) THEN @; _DisplayArray("Hash pointers:",hash_pointers) @; END IF @; END IF @; IF(one_pass) THEN @; // Compress the hashed elements to the front--we are done! CompressToFront: DO hash_index = 1, size_hash @; // Compress to the front IF(hash_array[hash_index]/=0) THEN @; // Compress this entry n_placed=n_placed+1 @; permutation[n_placed]=hash_array[hash_index] @; // Copy stored values to front END IF @; END DO CompressToFront @; ELSE @; // Now things get complicated... n_front=0 @; MoveToBack: DO hash_index = 1, size_hash @; // Free the front for new hashed entries IF(hash_array[hash_index]/=0) THEN @; // Swap this entry n_front=n_front+1 @; index=hash_array[hash_index] @; hash_array[n_front]=permutation[index] @; // Store and compress to the front hash_pointers[n_front]=index @; // This is a free position FindStoreLocation: DO @; // Find a free location to store this value at IF( index >= n_placed+n_front) THEN @; EXIT FindStoreLocation @; ELSE @; index=hash_pointers[index-n_placed] @; END IF @; END DO FindStoreLocation @; permutation[index]=permutation[n_placed+n_front] @; // Move entry to the back END IF @; END DO MoveToBack @; MoveToFront: DO hash_index = 1, n_hashed @; // Now place the hashed values to the front n_placed=n_placed+1 @; permutation[n_placed]=hash_array[hash_index] @; // Copy stored values to front END DO MoveToFront @; END IF @; IF(debug) THEN @; WRITE(message_print_unit,*) "Successfully placed ", n_placed, " out of ", (pass*size_array)/n_passes @; _DisplayArray("Permutation array:",permutation) @; END IF @; END DO MakePass @; DEALLOCATE(hash_array) @; CALL RecordAllocation(n_elements=-size_hash, mold=1_i_wp) @; IF(ALLOCATED(hash_pointers)) THEN @; DEALLOCATE(hash_pointers) @; CALL RecordAllocation(n_elements=-size_hash, mold=1_i_wp) @; END IF @; IF(ALLOCATED(hash_values)) THEN @; DEALLOCATE(hash_values) @; CALL RecordAllocation(n_elements=-size_array, mold=1_i_wp) @; END IF @; @%% @ And here are the actual subroutines for both integers and real numbers (only single keys of course): @m _HashSort(array,permutation,...) @;@%% _HashChoices @; _HashInitialization @; _HashComputation @; IF(finish_ranking) @~ CALL ShellInsertionRank(array=array, permutation=permutation, & partially_ranked=.TRUE., method="Insertion") @; // Complete ranking with straight insertion @%% @m _HashRank(_type,_kind) @;@%% _RankSingleKey(@E _type, _kind, Hash, _HashSort, _HashArguments, _HashDeclarations, _HashLocalVars,_Dummy,_Dummy) @; @ @=@%% _HashRank(INTEGER,i_sp) @; _HashRank(INTEGER,i_dp) @; _HashRank(@E REAL,r_sp) @; _HashRank(@E REAL,r_dp) @; @ @= INTERFACE HashRank @; MODULE PROCEDURE HashRankSingle_i_sp @; MODULE PROCEDURE HashRankSingle_i_dp @; MODULE PROCEDURE HashRankSingle_r_sp @; MODULE PROCEDURE HashRankSingle_r_dp @; END INTERFACE @; @%% @*1 Quick Sort. Quick sort is implemented here via Fortran 90 recursion. Although the rouitine implemented here are fast, they are not optimized in the inner loops and you may want to do that if you'd like. This is because I tested an alternative strategy for choosing the pivot key. Namely, if we know that our numbers are distributed from a distribution with integrated probability density $P(x)$ and that the elements in our array are in the interval $[a,b]$, then a pivot that would equalize the two partitions would be the $\xi$ for which $P(\xi)=\frac{P(a)+P(b)}{2}$. As in |HashRank|, either uniform or normal distributions are handled in this routine, given via the argument |pivot_selection|, although the additional input argument |pivot_function|, can be used to handle other cases as well by providing $\xi$ given $a$ and $b$. The only other strategy implemented at present is choosing a random key as the pivot. Median of three or similar tricks are not of interest to me, but they are very easy to add in this web-file. The routine also allows one to stop the recursive quick ranking when the subarrays reach a certain small size |cutoff_size| (25 by default), after which straight insertion is applied (if the user specified his own cutoff, it's his responsibility to call insertion sort if he wants the array fully ranked). @ Here is the recursive subroutine that si the body of quick ranking. This will be an internal procedure in the actual callable procedure, as recommended by Meissner. The inner loop performs the partial pivot sorting by nibbling the subarray |array(starting:ending)| starting from its two ends and going toward the middle, swapping elements which are in the wrong ``half''. In some cases, i.e. especially if it is known that the pivot key is in the range $(\min{S},\max{S})$ it is possible to ommit the in-bounds check in the nibbling loops. But in other cases these are necessary to avoid crashes in pathological situations. Since we'll use alternative pivot choices, I choose to include the check |low==high-1| inside the loops. Please note that {\em this may slow down the routines several times}. @%% @m _QuickSortRecursive(_type, _kind, _SelectPivotKey, _CompareWithPivot, _EqualWithPivot, @| _AdditionalArguments, _AdditionalDeclarations, _RecursiveCall, ID, _keys, ...) @; RECURSIVE SUBROUTINE QuickSortRecursive@e@&ID@e@&_@e@&_kind(starting, ending,_AdditionalArguments) @; IMPLICIT NONE @; INTEGER(KIND=i_wp), INTENT(IN) :: starting, ending @; // Rank the array section |S=array[starting:ending]|, where |array| is host-associated @e#!_AdditionalDeclarations(_type, _kind,#.) @; /* There shouldn't be too many local variables here to be put on the stack: */ INTEGER(KIND=i_wp) :: low, high @; // Nibbling ends for low and high partitions @e$IFELSE(_keys, SingleKey, _type(KIND=_kind) :: pivot_key, _type(KIND=_kind) :: pivot_key(INT(SIZE(array,1), KIND=i_wp)) ) @; // The pivot key @%% IF(debug) @~ WRITE(message_print_unit,*) "QuickSortRecursive: Starting=",starting, ", ending=", ending @; recursion_depth=recursion_depth+1 @; // For debugging purposes IF( (ending-starting) < cutoff) THEN @; // End the recursion for less than two keys remaining _ReturnRecursiveQS @; END IF @; low = starting @; high = ending @; @e#!_SelectPivotKey(pivot_key, array, permutation, starting, ending, _AdditionalArguments, ...) @; // Choose the pivot key NibbleArray@e@&ID: DO @; // Major divide-and-conquer loop NibbleLeft@e@&ID: DO @; // Inner loop---need's more optimization IF(low > high) @~ EXIT NibbleArray@e@&ID @; // May be discarded @e#!_CompareWithPivot(comparison_test, array, permutation[low], pivot_key, ID@e@&_NL, #.) @; IF(.NOT.comparison_test) @~ EXIT NibbleLeft@e@&ID @; low = low + 1 @; // Nibble left end END DO NibbleLeft@e@&ID @; NibbleRight@e@&ID: DO @; // Same as above for the other end IF(high < low) @~ EXIT NibbleArray@e@&ID @; @e#!_CompareWithPivot(comparison_test, array, permutation[high], pivot_key, ID@e@&_NR, #.) @; IF( comparison_test ) @~ EXIT NibbleRight@e@&ID @; // This will pass over equals high = high - 1 @; // Nibble right end END DO NibbleRight@e@&ID @; n_swaps=n_swaps+1 @; _Swap(permutation[low], permutation[high], temp_element) @; // Put misplaced elements in their proper ``half'' low = low + 1 @; // Move on high = high - 1 @; // Move on END DO NibbleArray@e@&ID @; /* Based on R. Vowel's code, we skip over equal elements to the left of the pivot break-point. This really speeds things up if there are many repeated entries for integers, and is ignored in other cases */ SkipEqual@e@&ID: DO @; // Skip over equal values IF(low==starting) @~ EXIT SkipEqual@e@&ID @; // Maybe all equal keys @e#!_EqualWithPivot(comparison_test, array, permutation[low-1], pivot_key, ID@e@&_SE, #.) @; IF(.NOT.comparison_test) @~ EXIT SkipEqual@e@&ID @; low=low-1 @; END DO SkipEqual@e@&ID @; @%% IF(debug .AND. (INT(INT(SIZE(array), KIND=i_wp), KIND=i_wp)<=20_i_wp) ) THEN @; @%% WRITE(message_print_unit,*) "Pivoted array with pivot key=", pivot_key," low=",low," high=", high @; @%% WRITE(message_print_unit,"(20G5.2)") array(permutation) @; @%% END IF @; @%% @e#!_RecursiveCall(_type, _kind, _AdditionalArguments, #.) @; // Call this routine recursively for two halves _ReturnRecursiveQS @; END SUBROUTINE @; // |QuickSortRecursive| @%% @m _ReturnRecursiveQS @; max_recursion_depth=MAX(max_recursion_depth,recursion_depth) @; recursion_depth=recursion_depth-1 @; RETURN @; @ Here is the macro for the pivot selection in the case when the numbers are from a known distribution (|RD| stands for random distribution). There is ample room for improvement here. Especially for normally distributed numbers, the inverse error function is used, which is rather expensive for small partititions. In particular, later I will try to compensate for misbalanced pivot choices that say leave the pivot completely out of the range of the values in the sub-array... @%% @m _SelectPivotSingleKey_RD(pivot_key, array, permutation, starting, ending, range, distribution, ...) @; min_val=range[1] @~ ; max_val=range[2] @; // Implicit conversions IF(min_val==max_val) THEN @; _ReturnRecursiveQS @; // No need to proceed ELSE @; IF(.NOT.PRESENT(pivot_function)) THEN @; IF(distribution=='U') THEN @; // Uniform distribution pivot_key = (range[1]+range[2]) / 2 @; // Implicit conversion ELSE @; // Normal distribution pivot_key=SQRT(2.0_r_hv)*std * InvErf ((Erf( (range[1]-mean)/SQRT(2.0_r_hv)/std) + Erf( (range[2]-mean)/SQRT(2.0_r_hv)/std)) / 2.0_r_hv) + mean @; IF(pivot_key<=range[1] .OR. pivot_key>=range[2] ) THEN @; pivot_key = (range[1]+range[2]) / 2 @; // Numerical issues with above formula END IF @; END IF @; ELSE @; pivot_key=pivot_function(range[1],range[2]) @; // User supplied END IF @; END IF @; @ If the pivot was out of range try to compensate by extra search, else call the proper routine on the two halves. If the size of the subarray is already less than 25, just switch to random pivot selection (here |PE| stands for pivot element, which means that one of the elements of the array is used as a pivot): @%% @m _RecursiveCall_RD(_type, _kind, range, distribution, ... ) @; IF( (low>ending) .OR. (high= MAX(25_i_wp,cutoff) ) THEN @; // Recalculate the range IF(debug) @~ WRITE(message_print_unit,*) "Recalculating range for section of size: ", ending-starting+1 @; min_val=HUGE(array) @~ ; max_val=-HUGE(array) @; DO index = starting, ending @; temp_compare = array(permutation[index]) @; IF(temp_comparemax_val) @~ max_val=temp_compare @; END DO @; CALL QuickSortRecursive_RD_@e@&_kind(starting, ending, & range= @E REAL((/min_val,max_val/), KIND=r_hv), distribution=distribution) @; ELSE @; // Switch to random pivot @%% _ReturnRecursiveQS @; @%% IF(debug) WRITE(message_print_unit,*) "Switching to random pivot for section of size: ", ending-starting+1 @; CALL QuickSortRecursive_PE_@e@&_kind(starting, ending, (starting+ending)/2 ) @; END IF @; ELSE @; // Recursive calls on both halves: IF( low - starting >= cutoff ) CALL QuickSortRecursive_RD_@e@&_kind(starting, low-1, & range=(/range[1], REAL(pivot_key, KIND=r_hv) /), distribution=distribution) @; IF( ending - high >= cutoff ) CALL QuickSortRecursive_RD_@e@&_kind(high+1, ending, & range=(/ REAL(pivot_key, KIND=r_hv), range[2]/), distribution=distribution) @; END IF @; @%% @ When the pivot key is an actual element of the key array, things are simpler (but bigger depths of recursion will be reached). Here I only implement a random choice of the pivot element (|RPE|), though other strategies, like median-of-three, can easily be added here: @%% @m _SelectPivotSingleKey_PE(pivot_key, array, permutation, starting, ending, pivot_index, ...) @; pivot_key = array(permutation[pivot_index]) @; @m _RecursiveCall_RPE(_type, _kind, pivot_index,... ) @; IF ( low - starting >= cutoff ) THEN @; CALL RandomUniform(new_pivot_index, range=(/starting,low-1/) ) @; CALL QuickSortRecursive_PE_@e@&_kind(starting, low-1, new_pivot_index ) @; END IF @; IF ( ending - high >= cutoff ) THEN @; CALL RandomUniform(new_pivot_index, range=(/high+1, ending/) ) @; CALL QuickSortRecursive_PE_@e@&_kind(high+1, ending, new_pivot_index ) @; END IF @; @%% @ From here on we proceeed to use the above recursive rouine to make a real |QuickRank| routine. The additional arguments needed by the recursive routine are different for the two major pivoting strategies, |RD| and |PE|: @%% @m _AdditionalArgs_RD range, distribution @;@%% @m _AdditionalDecl_RD(_type, _kind, ...) @; REAL(KIND=r_hv), DIMENSION(2), INTENT(IN) :: range @; // Upper and lower bounds of keys in |array[starting:ending]| CHARACTER(LEN=1), INTENT(IN) :: distribution @; // |'U'| or |'N'| @m _AdditionalArgs_PE pivot_index @;@%% @m _AdditionalDecl_PE(_type, _kind, ...) @; INTEGER(KIND=i_wp), INTENT(IN) :: pivot_index @; // Which element of |array| is the pivot INTEGER(KIND=i_wp) :: new_pivot_index // A local variable--needed below @%% @ For multiple keys, I only implement the random-pivot strategy: @m _SelectPivotMultipleKeys_PE(pivot_key, array, permutation,starting,ending,pivot_index,...) @; pivot_key = array(:,permutation[pivot_index]) @; @%% @ Unfortunately, I needed to rewrite the comparison macros when a pivot key (instead of an element of the key array) is one of the keys being compared. Oh, well... @%% @m _CompareSingleKeyWithPivot(comparison_test,key_array,element,pivot_key,...) @;@%% comparison_test = (key_array[element] <= pivot_key) @; // Comparison operator @%% @m _EqualSingleKeyWithPivot(comparison_test,key_array,element,pivot_key,...) @;@%% comparison_test = (key_array[element] == pivot_key) @; // Equal to pivot key? @%% @m _CompareMultipleKeysWithPivot(comparison_test,key_array,element,pivot_key,ID,...) @;@%% SELECT CASE(SIZE(key_array,1)) @; // How many keys are there CASE(1) @; comparison_test = (key_array[1,element] <= pivot_key[1]) @; CASE(2) @; // Inlined version for only two keys--very common temp_compare = (key_array[1,element] - pivot_key[1]) @; // Test most significant key IF(temp_compare==0) THEN @; // In case they are equal--really only for integers comparison_test = (key_array[2,element] <= pivot_key[2]) @; // Test less significant key ELSE @; comparison_test = (temp_compare < 0) @; END IF @; CASE DEFAULT @; // More then 2 keys--use a loop comparison_test=.TRUE. @; // If all keys are equal CompareKey@e@&ID: DO key=1,SIZE(key_array,1) @; temp_compare = (key_array[key,element] - pivot_key[key]) @; IF(temp_compare==0) THEN @; // If equal CYCLE CompareKey@e@&ID @; // Go to next key ELSE @; comparison_test = (temp_compare < 0) @; EXIT CompareKey@e@&ID @; END IF @; END DO CompareKey@e@&ID @; ENDSELECT @; @%% @m _EqualMultipleKeysWithPivot(comparison_test,key_array,element,pivot_key,...) @;@%% comparison_test = ALL(key_array[:,element] == pivot_key) @; // Equal to pivot key? @%% @ And now I construct the actual |QuickRank| routine, with all its arguments, choices, the call to the recursive routine, etc.: @%% @m _QuickArgumentsSingleKey mean_value, standard_deviation, pivot_selection, cutoff_size, pivot_function @; @m _QuickArgumentsMultipleKeys pivot_selection, cutoff_size @; @m _QuickDeclarationsSingleKey(_type,_kind,...) @; _type(KIND=_kind), INTENT(IN), OPTIONAL :: mean_value,standard_deviation @; CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: pivot_selection @; // |"Uniform"|, |"Normal"| or |"Random"| INTEGER(KIND=i_wp), INTENT(IN), OPTIONAL :: cutoff_size @; // When to stop expensive recursion OPTIONAL :: pivot_function @; // User-supplied hashing-function INTERFACE @; FUNCTION pivot_function(lower_bound, upper_bound) RESULT(pivot_value) @; USE Precision @; REAL(KIND=R_HV), INTENT(IN) :: lower_bound, upper_bound @; _type(KIND=_kind) :: pivot_value @; // In the interval $[0,1]$ END FUNCTION pivot_function @; END INTERFACE @; @%% CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: distribution @; // |"Uniform"| or |"Gaussian"| @m _QuickDeclarationsMultipleKeys(_type,_kind,...) @; CHARACTER(LEN=1), INTENT(IN), OPTIONAL :: pivot_selection @; INTEGER(KIND=i_wp), INTENT(IN), OPTIONAL :: cutoff_size @; @m _QuickLocalVarsSingleKey(_type,_kind,...) @; INTEGER(KIND=i_wp) :: cutoff @; INTEGER :: recursion_depth, max_recursion_depth @; CHARACTER(LEN=1) :: pivot_method @; // |'U'|, |'N'| or |'R'| REAL(KIND=r_hv) :: mean, std, total @; // Just like in |HashSort| _type(KIND=_kind) :: min_val, max_val @; LOGICAL :: uniform_distribution @; @m _QuickLocalVarsMultipleKeys(_type,_kind,...) @; INTEGER(KIND=i_wp) :: cutoff @; INTEGER :: recursion_depth, max_recursion_depth @; CHARACTER(LEN=1) :: pivot_method @; // |'U'|, |'N'|, |'M'| or |'R'| @%% @ Here are the bodies of the non-recursive wrapper subroutine, which invoke the contained recursive versions and set some choices and such: @m _NonRecursiveCallSingleKey(_type,_kind,size_array,...) @; SELECT CASE(pivot_method) @; CASE('U','u','G','g','N','n') @; // Use pivot based on random distribution $P(x)$ IF(.NOT.PRESENT(pivot_function)) THEN @; _CalculateMeanAndStd(array,INT(INT(SIZE(array), KIND=i_wp), KIND=i_wp)) @; END IF @; IF(pivot_method=='U' .OR. pivot_method=='u') THEN @; CALL QuickSortRecursive_RD_@e@&_kind(1_i_wp,size_array, & range=(/mean-std*SQRT(3.0_r_hv), mean+std*SQRT(3.0_r_hv)/),distribution='U') @; ELSE @; CALL QuickSortRecursive_RD_@e@&_kind(1_i_wp,size_array, & range=(/mean-std*10.0_r_hv, mean+std*10.0/),distribution='N') @; END IF @; CASE DEFAULT @; CALL QuickSortRecursive_PE_@e@&_kind(1_i_wp,size_array, (size_array+1)/2 ) @; ENDSELECT @; @%% @m _NonRecursiveCallMultipleKeys(_type,_kind,size_array,...) @; IF((pivot_method!='R') .AND. (pivot_method!='r')) THEN @; IF(debug) @~ WRITE(message_print_unit,*) "Requested pivot selection method not available for multiple keys! Switching to random pivot selection..." @; @%% pivot_method='R' @; END IF @; CALL QuickSortRecursive_PE_@e@&_kind(1_i_wp,size_array, (size_array+1)/2 ) @; @ @m _QuickSort(array,permutation,size_array,_Compare,_Equal,_type,_kind,_keys,...) @; pivot_method='R' @; // Use random pivot selection by default IF(PRESENT(pivot_selection)) @~ pivot_method=pivot_selection @; IF(PRESENT(cutoff_size)) THEN @; cutoff=MAX(1_i_wp,cutoff_size) @; ELSE @; cutoff=25 @; // By default we use insertion after 25 END IF @; recursion_depth=0 @~ ; max_recursion_depth=0 @; _NonRecursiveCall@e##_keys(_type,_kind,size_array,#.) @; // First call to the recursive quick rank IF(debug) @~ WRITE(message_print_unit,*) "Recursion depth=",max_recursion_depth @; IF(.NOT.PRESENT(cutoff_size)) THEN @; // Finish sorting with straight insertion CALL ShellInsertionRank(array=array, permutation=permutation, & partially_ranked=.TRUE., method="Insertion") @; END IF @; CONTAINS @; // The recursive routines are internal to this one $IFELSE(_keys,SingleKey, _QuickSortRecursive(_type, _kind, _SelectPivot@e##_keys@e##_RD, @e#!_Compare, @e#!_Equal, _AdditionalArgs_RD, _AdditionalDecl_RD, _RecursiveCall_RD, _RD, _keys),_Dummy) @; _QuickSortRecursive(_type, _kind, _SelectPivot@e##_keys@e##_PE, @e#!_Compare, @e#!_Equal, _AdditionalArgs_PE, _AdditionalDecl_PE, _RecursiveCall_RPE, _PE, _keys) @; @ And (indeed finally!!!) here are the actual versions for several different kinds of arguments: @%% @m _QuickRank(_type,_kind,_keys) @; _Rank@e##_keys( _type, _kind, Quick, _QuickSort, _QuickArguments@e##_keys, _QuickDeclarations@e##_keys, _QuickLocalVars@e##_keys, _Compare@e##_keys@e##WithPivot, _Equal@e##_keys@e##WithPivot, _type, _kind, _keys) @; @ @= _QuickRank(@E INTEGER, i_byte, SingleKey) @; _QuickRank(@E INTEGER, i_sp, SingleKey) @; _QuickRank(@E INTEGER, i_dp, SingleKey) @; _QuickRank(@E INTEGER, i_sp, MultipleKeys) @; _QuickRank(@E REAL, r_sp, SingleKey) @; _QuickRank(@E REAL, r_dp, SingleKey) @; @ @= INTERFACE QuickRank @; MODULE PROCEDURE QuickRankSingle_i_byte @; MODULE PROCEDURE QuickRankSingle_i_sp @; MODULE PROCEDURE QuickRankSingle_i_dp @; MODULE PROCEDURE QuickRankMultiple_i_sp @; MODULE PROCEDURE QuickRankSingle_r_sp @; MODULE PROCEDURE QuickRankSingle_r_dp @; END INTERFACE @; @*1 ORDERPACK's Hand-optimized Merge Sort. I needed a base comparison ranking subroutine to see how well I am doing. This one is taken as is from ORDERPACK and it only works on default real numbers. It is hand optimized and is rather long in length and very difficult to read, but here it is none-the-less: \newpage @= // Taken from Michael Metcalf's ORDERPACK 1.0--Merge Ranking Subroutine MergeRank (XVALT, IRNGT) @%% __________________________________________________________ @%% MRGRNK = Merge-sort ranking of an array @%% For performance reasons, the first 2 passes are taken @%% out of the standard loop, and use dedicated coding. @%% __________________________________________________________ Real, Dimension (:), Intent (In) :: XVALT Integer, Dimension (:), Intent (Out) :: IRNGT @%% __________________________________________________________ Integer, Dimension (SIZE(IRNGT)) :: JWRKT Integer :: LMTNA, LMTNC, IRNG1, IRNG2 Integer :: NVAL, IIND, IWRKD, IWRK, IWRKF, JINDA, IINDA, IINDB Real (Kind(XVALT)) :: XVALA, XVALB @%% NVAL = Min (SIZE(XVALT), SIZE(IRNGT)) Select Case (NVAL) Case (:0) Return Case (1) IRNGT (1) = 1 Return Case Default Continue End Select @%% @%% Fill-in the index array, creating ordered couples @%% Do IIND = 2, NVAL, 2 If (XVALT(IIND-1) <= XVALT(IIND)) Then IRNGT (IIND-1) = IIND - 1 IRNGT (IIND) = IIND Else IRNGT (IIND-1) = IIND IRNGT (IIND) = IIND - 1 End If End Do If (Mod(NVAL, 2) /= 0) Then IRNGT (NVAL) = NVAL End If @%% @%% We will now have ordered subsets A - B - A - B - ... @%% and merge A and B couples into C - C - ... @%% LMTNA = 2 LMTNC = 4 @%% @%% First iteration. The length of the ordered subsets goes from 2 to 4 @%% Do If (NVAL <= 2) Exit @%% @%% Loop on merges of A and B into C @%% Do IWRKD = 0, NVAL - 1, 4 If ((IWRKD+4) > NVAL) Then If ((IWRKD+2) >= NVAL) Exit @%% @%% 1 2 3 @%% If (XVALT(IRNGT(IWRKD+2)) <= XVALT(IRNGT(IWRKD+3))) Exit @%% @%% 1 3 2 @%% If (XVALT(IRNGT(IWRKD+1)) <= XVALT(IRNGT(IWRKD+3))) Then IRNG2 = IRNGT (IWRKD+2) IRNGT (IWRKD+2) = IRNGT (IWRKD+3) IRNGT (IWRKD+3) = IRNG2 @%% @%% 3 1 2 @%% Else IRNG1 = IRNGT (IWRKD+1) IRNGT (IWRKD+1) = IRNGT (IWRKD+3) IRNGT (IWRKD+3) = IRNGT (IWRKD+2) IRNGT (IWRKD+2) = IRNG1 End If Exit End If @%% @%% 1 2 3 4 @%% If (XVALT(IRNGT(IWRKD+2)) <= XVALT(IRNGT(IWRKD+3))) Cycle @%% @%% 1 3 x x @%% If (XVALT(IRNGT(IWRKD+1)) <= XVALT(IRNGT(IWRKD+3))) Then IRNG2 = IRNGT (IWRKD+2) IRNGT (IWRKD+2) = IRNGT (IWRKD+3) If (XVALT(IRNG2) <= XVALT(IRNGT(IWRKD+4))) Then @%% 1 3 2 4 IRNGT (IWRKD+3) = IRNG2 Else @%% 1 3 4 2 IRNGT (IWRKD+3) = IRNGT (IWRKD+4) IRNGT (IWRKD+4) = IRNG2 End If @%% @%% 3 x x x @%% Else IRNG1 = IRNGT (IWRKD+1) IRNG2 = IRNGT (IWRKD+2) IRNGT (IWRKD+1) = IRNGT (IWRKD+3) If (XVALT(IRNG1) <= XVALT(IRNGT(IWRKD+4))) Then IRNGT (IWRKD+2) = IRNG1 If (XVALT(IRNG2) <= XVALT(IRNGT(IWRKD+4))) Then @%% 3 1 2 4 IRNGT (IWRKD+3) = IRNG2 Else @%% 3 1 4 2 IRNGT (IWRKD+3) = IRNGT (IWRKD+4) IRNGT (IWRKD+4) = IRNG2 End If Else @%% 3 4 1 2 IRNGT (IWRKD+2) = IRNGT (IWRKD+4) IRNGT (IWRKD+3) = IRNG1 IRNGT (IWRKD+4) = IRNG2 End If End If End Do @%% @%% The Cs become As and Bs @%% LMTNA = 4 Exit End Do @%% @%% Iteration loop. Each time, the length of the ordered subsets @%% is doubled. @%% Do If (LMTNA >= NVAL) Exit IWRKF = 0 LMTNC = 2 * LMTNC @%% @%% Loop on merges of A and B into C @%% Do IWRK = IWRKF IWRKD = IWRKF + 1 JINDA = IWRKF + LMTNA IWRKF = IWRKF + LMTNC If (IWRKF >= NVAL) Then If (JINDA >= NVAL) Exit IWRKF = NVAL End If IINDA = 1 IINDB = JINDA + 1 @%% @%% Shortcut for the case when the max of A is smaller @%% than the min of B. This line may be activated when the @%% initial set is already close to sorted. @%% @%% IF (XVALT(IRNGT(JINDA)) <= XVALT(IRNGT(IINDB))) CYCLE @%% @%% One steps in the C subset, that we build in the final rank array @%% @%% Make a copy of the rank array for the merge iteration @%% JWRKT (1:LMTNA) = IRNGT (IWRKD:JINDA) @%% XVALA = XVALT (JWRKT(IINDA)) XVALB = XVALT (IRNGT(IINDB)) @%% Do IWRK = IWRK + 1 @%% @%% We still have unprocessed values in both A and B @%% If (XVALA > XVALB) Then IRNGT (IWRK) = IRNGT (IINDB) IINDB = IINDB + 1 If (IINDB > IWRKF) Then @%% Only A still with unprocessed values IRNGT (IWRK+1:IWRKF) = JWRKT (IINDA:LMTNA) Exit End If XVALB = XVALT (IRNGT(IINDB)) Else IRNGT (IWRK) = JWRKT (IINDA) IINDA = IINDA + 1 If (IINDA > LMTNA) Exit XVALA = XVALT (JWRKT(IINDA)) End If @%% End Do End Do @%% @%% The Cs become As and Bs @%% LMTNA = 2 * LMTNA End Do @%% Return End Subroutine MergeRank @%% @I HPF2Formatting.hweb @%%