Commit b14d80f

Create flash_sort.py
This pull request adds an implementation of the Flash Sort algorithm in `sorts/flash_sort.py`.

**Algorithm overview:**
Flash Sort is a distribution-based sorting algorithm that is especially efficient for large datasets whose elements are uniformly distributed. It classifies elements into buckets (classes) using a linear transformation, groups the elements of each class together (in the classic algorithm, in place via a cycle leader permutation), and finally applies insertion sort for local ordering within each class.

**Implementation details:**
- The number of classes (buckets) is empirically set to `int(0.43 * n)`, with a minimum of 2 (where `n` is the length of the array), following recommendations from the original paper and Wikipedia. This balance helps avoid both overly sparse and overcrowded buckets.
- The implementation includes detailed comments and uses descriptive variable names for clarity.
- The function returns a new sorted list and does not modify the input array. Elements are distributed into a new list rather than permuted in place.

**Reference:**
- [Wikipedia: Flashsort](https://en.wikipedia.org/wiki/Flashsort)

**Use cases:**
Most efficient when data is numeric and uniformly distributed. For other distributions, performance may degrade.

Closes #13203
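To make the note about non-uniform data concrete, the sketch below counts how many values land in each class under the same linear mapping and the same `int(0.43 * n)` class-count rule. The helper name `class_sizes`, the sample size, the seed, and the `x ** 8` skew are all illustrative assumptions, not part of the commit.

```python
import random


def class_sizes(values):
    """Count how many values fall into each class under the linear mapping."""
    n = len(values)
    low, high = min(values), max(values)
    if low == high:
        return [n]  # every value is identical, so there is a single class
    number_of_classes = max(int(0.43 * n), 2)  # same rule as the commit
    coefficient = (number_of_classes - 1) / (high - low)
    counts = [0] * number_of_classes
    for value in values:
        counts[int(coefficient * (value - low))] += 1
    return counts


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
uniform = [rng.random() for _ in range(1000)]
skewed = [x ** 8 for x in uniform]  # most values crowd near zero

uniform_counts = class_sizes(uniform)
skewed_counts = class_sizes(skewed)
print("classes:", len(uniform_counts))
print("largest class (uniform):", max(uniform_counts))
print("largest class (skewed):", max(skewed_counts))
```

With uniform input the largest class stays small, so the final insertion sort does little work. With the skewed input a large share of the values falls into the first class, and the insertion sort on that class approaches quadratic cost.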
1 parent a71618f commit b14d80f

1 file changed

Lines changed: 79 additions & 0 deletions

sorts/flash_sort.py

@@ -0,0 +1,79 @@
# Flash Sort Algorithm
#
# Flash Sort is a distribution sorting algorithm designed for large arrays with elements
# that are relatively uniformly distributed. The algorithm can achieve close to O(n) time
# complexity under favorable conditions.
#
# Main steps:
# 1. Find the minimum and maximum values in the array.
# 2. Choose the number of classes ("buckets") m. The typical choice is m = int(0.43 * n),
#    where n is the array length. The constant 0.43 is an empirical value shown by the
#    original paper and Wikipedia to provide good performance in practice. The goal is
#    to have enough classes to distribute elements evenly, but not so many that classes
#    become sparse.
# 3. Classify each element into one of the m classes using a linear mapping from the
#    value range to class indices.
# 4. Compute prefix sums of the class counts to determine the class boundaries.
# 5. Rearrange (permute) elements so that all elements belonging to the same class are
#    grouped together. Classic Flash Sort does this in place with a cycle leader
#    algorithm; this implementation instead distributes elements into a new list.
# 6. Perform a final sorting step (usually insertion sort), because elements within a
#    class are not guaranteed to be sorted.
#
# Reference:
# https://en.wikipedia.org/wiki/Flashsort

def flash_sort(array):
    """
    Flash Sort algorithm.

    Flash Sort is a distribution sorting algorithm that runs in close to O(n) time
    on uniformly distributed data sets. The classic algorithm permutes in place;
    this implementation uses O(n) additional memory and leaves the input unmodified.
    See: https://en.wikipedia.org/wiki/Flashsort

    Args:
        array (list): List of numeric values to be sorted.

    Returns:
        list: New sorted list.
    """
    n = len(array)
    if n == 0:
        return array.copy()

    # Step 1: Find the minimum and maximum values
    min_value = min(array)
    max_value = max(array)
    if min_value == max_value:
        # All elements are equal, so the list is already sorted
        return array.copy()

    # Step 2: Choose the number of classes (buckets)
    # Empirically, 0.43 * n gives good performance; see Wikipedia and original papers.
    number_of_classes = max(int(0.43 * n), 2)
    class_boundaries = [0] * number_of_classes

    # Step 3: Classify elements into classes (buckets) by counting class sizes
    class_coefficient = (number_of_classes - 1) / (max_value - min_value)
    for value in array:
        class_index = int(class_coefficient * (value - min_value))
        class_boundaries[class_index] += 1

    # Step 4: Compute prefix sums so each entry is the end position of its class
    for i in range(1, number_of_classes):
        class_boundaries[i] += class_boundaries[i - 1]

    # Step 5: Distribute elements into their classes. Unlike the in-place cycle
    # leader permutation of classic Flash Sort, this writes into a new list.
    sorted_array = [0] * n
    for value in reversed(array):
        class_index = int(class_coefficient * (value - min_value))
        class_boundaries[class_index] -= 1
        sorted_array[class_boundaries[class_index]] = value

    # Step 6: Final insertion sort over the whole list. The classes are already in
    # order, so each element only moves within its own class.
    for i in range(1, n):
        key = sorted_array[i]
        j = i - 1
        while j >= 0 and sorted_array[j] > key:
            sorted_array[j + 1] = sorted_array[j]
            j -= 1
        sorted_array[j + 1] = key

    return sorted_array
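
The committed `flash_sort` distributes elements into a new list. For comparison, here is a minimal sketch of how the grouping step could be done in place by following displacement cycles with swaps. It is a simplified variant written for illustration, not the exact cycle leader procedure from the original paper and not part of this commit; the names `flash_sort_in_place`, `class_of`, `next_slot`, and `class_end` are hypothetical.

```python
def flash_sort_in_place(values):
    """Sort a list of numbers in place using a Flash Sort style permutation."""
    n = len(values)
    if n < 2:
        return
    low, high = min(values), max(values)
    if low == high:
        return  # all elements are equal, nothing to do

    number_of_classes = max(int(0.43 * n), 2)  # same rule as the commit
    coefficient = (number_of_classes - 1) / (high - low)

    def class_of(value):
        return int(coefficient * (value - low))

    counts = [0] * number_of_classes
    for value in values:
        counts[class_of(value)] += 1

    # next_slot[k]: next unfilled position of class k
    # class_end[k]: one past the last position of class k
    next_slot = [0] * number_of_classes
    class_end = [0] * number_of_classes
    total = 0
    for k in range(number_of_classes):
        next_slot[k] = total
        total += counts[k]
        class_end[k] = total

    # Swap each misplaced element into the next free slot of its own class
    for k in range(number_of_classes):
        while next_slot[k] < class_end[k]:
            i = next_slot[k]
            target = class_of(values[i])
            if target == k:
                next_slot[k] += 1  # already in the right class
            else:
                j = next_slot[target]
                values[i], values[j] = values[j], values[i]
                next_slot[target] += 1

    # Final insertion sort; elements only move within their own class
    for i in range(1, n):
        key = values[i]
        j = i - 1
        while j >= 0 and values[j] > key:
            values[j + 1] = values[j]
            j -= 1
        values[j + 1] = key


data = [0.42, 0.07, 0.93, 0.55, 0.07, 0.31]
flash_sort_in_place(data)
print(data)
```

This variant needs only the two per-class index lists as extra memory, at the cost of mutating its argument, which is the opposite trade-off from the committed function.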
