Comparison sort

Sorting a set of unlabelled weights by weight using only a balance scale requires a comparison sort algorithm.

A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation (often a "less than or equal to" operator or a three-way comparison) that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with:

  1. if ab and bc then ac (transitivity)
  2. for all a and b, ab or ba (connexity).

It is possible that both ab and ba; in this case either may come first in the sorted list. In a stable sort, the input order determines the sorted order in this case.

Comparison sorts studied in the literature are "comparison-based".[1] Elements a and b can be swapped or otherwise re-arranged by the algorithm only when the order between these elements has been established based on the outcomes of prior comparisons. This is the case when the order between a and b can be derived via the transitive closure of these prior comparison outcomes.

For comparison-based sorts the decision to execute basic operations other than comparisons is based on the outcome of comparisons. Hence in a time analysis the number of executed comparisons is used to determine upper bound estimates for the number of executed basic operations such as swaps or assignments.[1]

A metaphor for thinking about comparison sorts is that someone has a set of unlabelled weights and a balance scale. Their goal is to line up the weights in order by their weight without any information except that obtained by placing two weights on the scale and seeing which one is heavier (or if they weigh the same).

Examples

Quicksort in action on a list of numbers. The horizontal lines are pivot values.

Some of the most well-known comparison sorts include:

Performance limits and advantages of different sorting techniques

There are fundamental limits on the performance of comparison sorts. A comparison sort must have an average-case lower bound of Ω(n log n) comparison operations,[2] which is known as linearithmic time. This is a consequence of the limited information available through comparisons alone — or, to put it differently, of the vague algebraic structure of totally ordered sets. In this sense, mergesort, heapsort, and introsort are asymptotically optimal in terms of the number of comparisons they must perform, although this metric neglects other operations. Non-comparison sorts (such as the examples discussed below) can achieve O(n) performance by using operations other than comparisons, allowing them to sidestep this lower bound (assuming elements are constant-sized).

Comparison sorts may run faster on some lists; many adaptive sorts such as insertion sort run in O(n) time on an already-sorted or nearly-sorted list. The Ω(n log n) lower bound applies only to the case in which the input list can be in any possible order.

Real-world measures of sorting speed may need to take into account the ability of some algorithms to optimally use relatively fast cached computer memory, or the application may benefit from sorting methods where sorted data begins to appear to the user quickly (and then user's speed of reading will be the limiting factor) as opposed to sorting methods where no output is available until the whole list is sorted.

Despite these limitations, comparison sorts offer the notable practical advantage that control over the comparison function allows sorting of many different datatypes and fine control over how the list is sorted. For example, reversing the result of the comparison function allows the list to be sorted in reverse; and one can sort a list of tuples in lexicographic order by just creating a comparison function that compares each part in sequence:

function tupleCompare((lefta, leftb, leftc), (righta, rightb, rightc))
    if lefta ≠ righta
        return compare(lefta, righta)
    else if leftb ≠ rightb
        return compare(leftb, rightb)
    else
        return compare(leftc, rightc)

Comparison sorts generally adapt more easily to complex orders such as the order of floating-point numbers. Additionally, once a comparison function is written, any comparison sort can be used without modification; non-comparison sorts typically require specialized versions for each datatype.

This flexibility, together with the efficiency of the above comparison sorting algorithms on modern computers, has led to widespread preference for comparison sorts in most practical work.

Alternatives

Some sorting problems admit a strictly faster solution than the Ω(n log n) bound for comparison sorting by using non-comparison sorts; an example is integer sorting, where all keys are integers. When the keys form a small (compared to n) range, counting sort is an example algorithm that runs in linear time. Other integer sorting algorithms, such as radix sort, are not asymptotically faster than comparison sorting, but can be faster in practice.

The problem of sorting pairs of numbers by their sum is not subject to the Ω(n² log n) bound either (the square resulting from the pairing up); the best known algorithm still takes O(n² log n) time, but only O(n²) comparisons.

Number of comparisons required to sort a list

n Minimum
1 0 0
2 1 1
3 3 3
4 5 5
5 7 7
6 10 10
7 13 13
8 16 16
9 19 19
10 22 22
11 26 26
12 29 30[3][4]
13 33 34[5][6][7]
14 37 38[7]
15 41 42[8][9][10]
16 45 46[11]
17 49 50[11]
18 53 54[11]
19 57 58[10]
20 62 62
21 66 66
22 70 71[7]
n
10 22 19
100 525 521
1 000 8 530 8 524
10 000 118 459 118 451
100 000 1 516 705 1 516 695
1 000 000 18 488 885 18 488 874
Above: A comparison of the lower bound to the actual minimum number of comparisons (from OEISA036604) required to sort a list of n items (for the worst case). Below: Using Stirling's approximation, this lower bound is well-approximated by .

The number of comparisons that a comparison sort algorithm requires increases in proportion to , where is the number of elements to sort. This bound is asymptotically tight.

Given a list of distinct numbers (we can assume this because this is a worst-case analysis), there are factorial permutations exactly one of which is the list in sorted order. The sort algorithm must gain enough information from the comparisons to identify the correct permutation. If the algorithm always completes after at most steps, it cannot distinguish more than cases because the keys are distinct and each comparison has only two possible outcomes. Therefore,

, or equivalently

By looking at the first factors of , we obtain

This provides the lower-bound part of the claim. A more precise bound can be given via Stirling's approximation. An upper bound of the same form, with the same leading term as the bound obtained from Stirling's approximation, follows from the existence of the algorithms that attain this bound in the worst case, like merge sort.

The above argument provides an absolute, rather than only asymptotic lower bound on the number of comparisons, namely comparisons. This lower bound is fairly good (it can be approached within a linear tolerance by a simple merge sort), but it is known to be inexact. For example, , but the minimal number of comparisons to sort 13 elements has been proved to be 34.

Determining the exact number of comparisons needed to sort a given number of entries is a computationally hard problem even for small n, and no simple formula for the solution is known. For some of the few concrete values that have been computed, see OEISA036604.

Lower bound for the average number of comparisons

A similar bound applies to the average number of comparisons. Assuming that

  • all keys are distinct, i.e. every comparison will give either a>b or a<b, and
  • the input is a random permutation, chosen uniformly from the set of all possible permutations of n elements,

it is impossible to determine which order the input is in with fewer than log2(n!) comparisons on average.

This can be most easily seen using concepts from information theory. The Shannon entropy of such a random permutation is log2(n!) bits. Since a comparison can give only two results, the maximum amount of information it provides is 1 bit. Therefore, after k comparisons the remaining entropy of the permutation, given the results of those comparisons, is at least log2(n!) − k bits on average. To perform the sort, complete information is needed, so the remaining entropy must be 0. It follows that k must be at least log2(n!) on average.

The lower bound derived via information theory is phrased as 'information-theoretic lower bound'. Information-theoretic lower bound is correct but is not necessarily the strongest lower bound. And in some cases, the information-theoretic lower bound of a problem may even be far from the true lower bound. For example, the information-theoretic lower bound of selection is whereas comparisons are needed by an adversarial argument. The interplay between information-theoretic lower bound and the true lower bound are much like a real-valued function lower-bounding an integer function. However, this is not exactly correct when the average case is considered.

To unearth what happens while analyzing the average case, the key is that what does 'average' refer to? Averaging over what? With some knowledge of information theory, the information-theoretic lower bound averages over the set of all permutations as a whole. But any computer algorithms (under what are believed currently) must treat each permutation as an individual instance of the problem. Hence, the average lower bound we're searching for is averaged over all individual cases.

To search for the lower bound relating to the non-achievability of computers, we adopt the Decision tree model. Let's rephrase a bit of what our objective is. In the Decision tree model, the lower bound to be shown is the lower bound of the average length of root-to-leaf paths of an -leaf binary tree (in which each leaf corresponds to a permutation). The minimum average length of a binary tree with a given number of leaves is achieved by a balanced full binary tree, because any other binary tree can have its path length reduced by moving a pair of leaves to a higher position. With some careful calculations, for a balanced full binary tree with leaves, the average length of root-to-leaf paths is given by

For example, for n = 3, the information-theoretic lower bound for the average case is approximately 2.58, while the average lower bound derived via Decision tree model is 8/3, approximately 2.67.

In the case that multiple items may have the same key, there is no obvious statistical interpretation for the term "average case", so an argument like the above cannot be applied without making specific assumptions about the distribution of keys.

n log n maximum number of comparisons for array-size in format 2^k

Can easy compute for real algorithm sorted-list-merging (array are sorted n-blocks with size 1, merge to 1–1 to 2, merge 2–2 to 4...).

(1) = = = = = = = =

(2) =   =   =   =     // max 1 compares (size1+size2-1), 4x repeats to concat 8 arrays with size 1 and 1
   === === === ===

(3)   =       =       // max 7 compares, 2x repeats to concat 4 arrays with size 2 and 2
     ===     ===  
    =====   ===== 
   ======= =======

(4)                   // max 15 compares, 1x repeats to concat 2 arrays with size 4 and 4

Formula extraction:
n = 256 = 2^8 (array size in format 2^k, for simplify)
On = (n-1) + 2(n/2-1) + 4(n/4-1) + 8(n/8-1) + 16(n/16-1) + 32(n/32-1) + 64(n/64-1) + 128(n/128-1)
On = (n-1) + (n-2) + (n-4) + (n-8) + (n-16) + (n-32) + (n-64) + (n-128)
On = n+n+n+n+n+n+n+n - (1+2+4+8+16+32+64+128)   | 1+2+4... = formula for geometric sequence Sn = a1 * (q^i - 1) / (n - 1), n is number of items, a1 is first item
On = 8*n - 1 * (2^8 - 1) / (2 - 1)
On = 8*n - (2^8 - 1)   | 2^8 = n
On = 8*n - (n - 1)
On = (8-1)*n + 1   | 8 = ln(n)/ln(2) = ln(256)/ln(2)
On = (ln(n)/ln(2) - 1) * n + 1

Example:
n = 2^4 = 16, On ~= 3*n
n = 2^8 = 256, On ~= 7*n
n = 2^10 = 1.024, On ~= 9*n
n = 2^20 = 1.048.576, On ~= 19*n

Sorting a pre-sorted list

If a list is already close to sorted, according to some measure of sortedness, the number of comparisons required to sort it can be smaller. An adaptive sort takes advantage of this "presortedness" and runs more quickly on nearly-sorted inputs, often while still maintaining an worst case time bound. An example is adaptive heap sort, a sorting algorithm based on Cartesian trees. It takes time , where is the average, over all values in the sequence, of the number of times the sequence jumps from below to above or vice versa.[12]

Notes

[13]

Notes

  1. ^ a b Wilkes, M. V. (1974-11-01). "The Art of Computer Programming, Volume 3, Sorting and Searching". The Computer Journal. 17 (4): 324–324. doi:10.1093/comjnl/17.4.324. ISSN 0010-4620.
  2. ^ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. pp. 191–193. ISBN 0-262-03384-4.
  3. ^ Mark Wells, Applications of a language for computing in combinatorics, Information Processing 65 (Proceedings of the 1965 IFIP Congress), 497–498, 1966.
  4. ^ Mark Wells, Elements of Combinatorial Computing, Pergamon Press, Oxford, 1971.
  5. ^ Takumi Kasai, Shusaku Sawato, Shigeki Iwata, Thirty four comparisons are required to sort 13 items, LNCS 792, 260-269, 1994.
  6. ^ Marcin Peczarski, Sorting 13 elements requires 34 comparisons, LNCS 2461, 785–794, 2002.
  7. ^ a b c Marcin Peczarski, New results in minimum-comparison sorting, Algorithmica 40 (2), 133–145, 2004.
  8. ^ Marcin Peczarski, Computer assisted research of posets, PhD thesis, University of Warsaw, 2006.
  9. ^ Peczarski, Marcin (2007). "The Ford-Johnson algorithm is still unbeaten for less than 47 elements". Inf. Process. Lett. 101 (3): 126–128. doi:10.1016/j.ipl.2006.09.001.
  10. ^ a b Cheng, Weiyi; Liu, Xiaoguang; Wang, Gang; Liu, Jing (October 2007). "最少比较排序问题中S(15)和S(19)的解决" [The results of S(15) and S(19) to minimum-comparison sorting problem]. Journal of Frontiers of Computer Science and Technology (in Chinese). 1 (3): 305–313.
  11. ^ a b c Stober, F., & Weiß, A. (2023). Lower Bounds for Sorting 16, 17, and 18 Elements. In 2023 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX) (pp. 201-213). Society for Industrial and Applied Mathematics
  12. ^ Levcopoulos, Christos; Petersson, Ola (1989), "Heapsort - Adapted for Presorted Files", WADS '89: Proceedings of the Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, vol. 382, London, UK: Springer-Verlag, pp. 499–509, doi:10.1007/3-540-51542-9_41.
  13. ^ Knuth, Donald E. (1998). The art of computer programming, volume 3: (2nd ed.) sorting and searching. USA: Addison Wesley Longman Publishing Co., Inc. ISBN 978-0-201-89685-5.

References

Read other articles:

هذه المقالة يتيمة إذ تصل إليها مقالات أخرى قليلة جدًا. فضلًا، ساعد بإضافة وصلة إليها في مقالات متعلقة بها. (سبتمبر 2020) ألقى كلا جانبي حرب عام 2008 بين روسيا وجورجيا باللوم على الطرف الآخر في بدء الحرب. خلص عدد من التقارير والباحثين (من بينهم خبراء روس مستقلون) إلى أن النزاع بدأ ...

 

This article may lack focus or may be about more than one topic. Please help improve this article, possibly by splitting the article and/or by introducing a disambiguation page, or discuss this issue on the talk page. (December 2023) First aid procedure Abdominal thrustsPerforming the Heimlich maneuver[edit on Wikidata] Abdominal thrusts, also known as the Heimlich maneuver or Heimlich manoeuvre, is a first-aid procedure used to treat upper-airway obstructions (or choking) by foreign obje...

 

Politburo of the 14th Congress← Politburo,13th CongressPolitburo,15th Congress →1 January 1926 – 19 December 19271 January 1926 – 19 December 1927 The Politburo of the 14th Congress of the All-Union Communist Party (Bolsheviks) was in session from 1 January 1926 to 19 December 1927. Composition Members Members of the Politburo of the 14th Congress of the All-Union Communist Party (Bolsheviks)[1] Name Cyrillic 13th POL 15th POL Birth Death PM Ethnicity Gender ...

هذه المقالة يتيمة إذ تصل إليها مقالات أخرى قليلة جدًا. فضلًا، ساعد بإضافة وصلة إليها في مقالات متعلقة بها. (فبراير 2019) هارولد سكوت معلومات شخصية تاريخ الميلاد 4 سبتمبر 1907  تاريخ الوفاة 29 يناير 1997 (89 سنة)   الجنسية المملكة المتحدة المملكة المتحدة لبريطانيا العظمى وأيرلند...

 

Tetapeh Chaetodon trifascialis AdultStatus konservasiHampir terancamIUCN165712 TaksonomiKerajaanAnimaliaFilumChordataKelasActinopteriOrdoChaetodontiformesFamiliChaetodontidaeGenusChaetodonSpesiesChaetodon trifascialis Quoy dan Gaimard, 1825 Tata namaSinonim takson Megaprotodon trifascialis (Quoy & Gaimard, 1825) Chaetodon bifascialis Cuvier, 1829 Chaetodon triangularis Rüppell, 1829 Chaetodon strigangulus Cuvier, 1831 Megaprotodon strigangulus (Cuvier, 1831)[1] lbs Ikan kepe-kepe...

 

Andreas Ernst Etlinger (* 9. Februar 1756 in Kulmbach, Fürstentum Bayreuth; † 26. Juni 1785 ebenda) war ein deutscher Arzt und Botaniker. Sein offizielles botanisches Autorenkürzel lautet „Etl.“. Inhaltsverzeichnis 1 Leben 2 Dedikationsname 3 Schriften 4 Literatur 5 Einzelnachweise Leben Etlinger war Sohn von Johann Leonhard aus Kulmbach. Am 21. April 1763 besuchte Etlinger im Alter von sieben Jahren das Lyceum in Kulmbach und begann 1773 mit dem Studium der Medizin an der Friedrich-A...

この項目には、一部のコンピュータや閲覧ソフトで表示できない文字(Microsoftコードページ932(はしご高))が含まれています(詳細)。 株式会社遠鉄百貨店[1]Entetsu Department Store Co., Ltd. 遠鉄百貨店本館(本社所在地)種類 株式会社市場情報 非上場略称 エンデパ[2]本社所在地 日本〒430-8588静岡県浜松市中区砂山町320-2[3]設立 1987年(昭和62年)4月[3&...

 

The former Carlos'n Charlie's in Oranjestad, Aruba A seafood dish at Mul Yam restaurant, located at Tel Aviv Port, Tel Aviv, Israel Stuffed blue crab shells known as Casquinha de Siri being enjoyed in Tropicana Restaurant at Rio de Janeiro City A bobó de camarão dish at a Rio de Janeiro restaurant The following is a list of notable seafood restaurants. A seafood restaurant typically specializes in seafood cuisine and seafood dishes, such as fish and shellfish. Seafood restaurants This is a ...

 

Campeonato Mundial deTaekwondo de 2022 Homens Mulheres   54 kg     46 kg     58 kg     49 kg     63 kg     53 kg     68 kg     57 kg     74 kg     62 kg     80 kg     67 kg     87 kg     73 kg     + 87 kg     + 73 kg   A categoria até 53 kg feminino foi um evento do Campeonato Mundial de Taekwondo de 2022, que ocorreu no Centro Aquát...

British Army general (1819–1890) Sir Henry Errington LongdenBorn14 January 1819[1]London, EnglandDied29 January 1890(1890-01-29) (aged 71)Bournemouth, DorsetAllegiance United KingdomService/branch British ArmyYears of service1836–1880RankGeneralBattles/warsFirst Anglo-Sikh WarSecond Anglo-Sikh WarIndian RebellionAwardsKnight Commander of the Order of the BathCompanion of the Order of the Star of India Memorial to General Sir Henry Errington Longden, Lincolnshir...

 

American baseball player Baseball player Cowan HydeLeft fielderBorn: (1909-04-10)April 10, 1909Pontotoc, Mississippi, United StatesDied: November 20, 2003(2003-11-20) (aged 94)St. Louis, Missouri, U.S.Batted: RightThrew: RightNegro league baseball debut1924, for the Memphis Red SoxLast Negro league appearance1951, for the Chicago American GiantsCareer statisticsBatting average.266Hits218Home runs1Runs batted in85Stolen bases45 Teams Memphis Red Sox (1924, 1927, 1938...

 

The Indices of Deprivation 2010 (ID 2010) is a deprivation index at the small area level, created by the British Department for Communities and Local Government (DCLG) and released on 24 March 2011. It follows the ID2007 and because much of the datasets are the same or similar between indices allows a comparison of relative deprivation of an area between the two indices.[1] While it is known as the ID2010, most of the data actually dates from 2008. Key results Jaywick, Essex has the h...

Species of insect White-spotted satyr nominate, Tambopata National Reserve, Peru M. h. maculata, Panama Scientific classification Kingdom: Animalia Phylum: Arthropoda Class: Insecta Order: Lepidoptera Family: Nymphalidae Subfamily: Satyrinae Tribe: Melanitini Genus: ManatariaKirby, [1904] Species: M. hercyna Binomial name Manataria hercyna(Hübner, [1821]) Synonyms Tisiphone hercyna Hübner, [1821] Morpho anosia Godart, [1824] Tisiphone maculata Hopffer, 1874 Tisiphone hercyna var. fasci...

 

Municipality in Gelderland, Netherlands This article is about the municipality in the Netherlands. For a community in Suriname, see Wageningen, Suriname. Municipality in Gelderland, NetherlandsWageningenMunicipalityWageningen market square FlagCoat of armsMotto: City of Life SciencesLocation in GelderlandWageningenLocation within the NetherlandsShow map of NetherlandsWageningenLocation within EuropeShow map of EuropeCoordinates: 51°58′N 5°40′E / 51.967°N 5.667°E&#...

 

В Википедии есть статьи о других людях с такой фамилией, см. Обухов; Обухов, Александр. Александр Кириллович Обухов Заместитель Наркома/Министра рыбной промышленности СССР 1941 — 1949 Министр рыбной промышленности РСФСР 12 мая 1954 — 4 июля 1956 Предшественник Мурашов, Паве...

Italian pay-television channel Television channel Disney ChannelCountryItalyBroadcast area Italy Malta San Marino Vatican City HeadquartersMilanProgrammingLanguage(s) Italian (dubbing) English Picture format HDTV 1080i SDTV 576i (downscaled) Timeshift serviceDisney Channel +1OwnershipOwnerThe Walt Disney Company (Italia) S.R.L.Disney Channels WorldwideSister channelsDisney JuniorHistoryLaunched3 October 1998; 25 years ago (1998-10-03)Closed1 May 2020; 3 years ag...

 

A page from the psalter of the Aberdeen Breviary.(National Library of Scotland). The Chepman and Myllar Press was the first printing press to be established in Scotland.[1] The press was founded in 1508 in Edinburgh by Walter Chepman and Androw Myllar, both burgesses of the Scottish capital. The two partners operated under a charter of King James IV issued in 1507 which gave them a monopoly in printed books within Scotland.[2] Very few products of the press are preserved today...

 

Indian film Garuda PuranaTheatrical release posterDirected byManjunath B NagbaScreenplay byManjunath B NagbaStory byManjunath B NagbaProduced bySindhu K MStarringManjunath B Nagba Santhosh Karki Disha Shetty Rajkumar BhagawanthCinematographySuniel NarasimhamurthyEdited byManjunath B NagbaMusic byRakesh AcharyaProductioncompany27 Frame's CreationsRelease date 3 November 2023 (2023-11-03) CountryIndiaLanguageKannada Garuda Purana is a 2023 Indian Kannada language crime thriller f...

Not to be confused with Antipope Gregory VIII. Head of the Catholic Church in 1187 This article's lead section may be too short to adequately summarize the key points. Please consider expanding the lead to provide an accessible overview of all important aspects of the article. (June 2016) PopeGregory VIIIBishop of RomeChurchCatholic ChurchPapacy began21 October 1187Papacy ended17 December 1187PredecessorUrban IIISuccessorClement IIIOrdersConsecration25 October 1187Created cardinal1156by Adria...

 

This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: Hindmarsh River – news · newspapers · books · scholar · JSTOR (July 2019) (Learn how and when to remove this template message) River in South Australia, AustraliaHindmarshHindmarsh River bridge at Victor Harbor Rowboat on Hindmarsh River, ca. 1915Location of th...

 

Strategi Solo vs Squad di Free Fire: Cara Menang Mudah!