Proxmap sort

Proxmap sort
Insertion sorting into buckets during proxmap.
Insertion sorting into buckets during proxmap.
Example of insertion sort sorting a list of random numbers.
ClassSorting algorithm
Data structureArray
Worst-case performance
Best-case performance
Average performance
Worst-case space complexity
Elements are distributed among bins
Unlike bucket sorting which sorts after all the buckets are filled, the elements are insertion sorted as they are inserted

ProxmapSort, or Proxmap sort, is a sorting algorithm that works by partitioning an array of data items, or keys, into a number of "subarrays" (termed buckets, in similar sorts). The name is short for computing a "proximity map," which indicates for each key K the beginning of a subarray where K will reside in the final sorted order. Keys are placed into each subarray using insertion sort. If keys are "well distributed" among the subarrays, sorting occurs in linear time. The computational complexity estimates involve the number of subarrays and the proximity mapping function, the "map key," used. It is a form of bucket and radix sort.

Once a ProxmapSort is complete, ProxmapSearch can be used to find keys in the sorted array in time if the keys were well distributed during the sort.

Both algorithms were invented in the late 1980s by Prof. Thomas A. Standish at the University of California, Irvine.

Overview

Basic strategy

In general: Given an array A with n keys:

  • map a key to a subarray of the destination array A2, by applying the map key function to each array item
  • determine how many keys will map to the same subarray, using an array of "hit counts," H
  • determine where each subarray will begin in the destination array so that each bucket is exactly the right size to hold all the keys that will map to it, using an array of "proxmaps," P
  • for each key, compute the subarray it will map to, using an array of "locations," L
  • for each key, look up its location, place it into that cell of A2; if it collides with a key already in that position, insertion sort the key into place, moving keys greater than this key to the right by one to make a space for this key. Since the subarray is big enough to hold all the keys mapped to it, such movement will never cause the keys to overflow into the following subarray.

Simplied version: Given an array A with n keys

  1. Initialize: Create and initialize 2 arrays of n size: hitCount, proxMap, and 2 arrays of A.length: location, and A2.
  2. Partition: Using a carefully chosen mapKey function, divide the A2 into subarrays using the keys in A
  3. Disperse: Read over A, dropping each key into its bucket in A2; insertion sorting as needed.
  4. Collect: Visit the subarrays in order and put all the elements back into the original array, or simply use A2.

Note: "keys" may also contain other data, for instance an array of Student objects that contain the key plus a student ID and name. This makes ProxMapSort suitable for organizing groups of objects, not just keys themselves.

Example

Consider a full array: A[0 to n-1] with n keys. Let i be an index of A. Sort A's keys into array A2 of equal size.

The map key function is defined as mapKey(key) = floor(K).

Array table
A 6.7 5.9 8.4 1.2 7.3 3.7 11.5 1.1 4.8 0.4 10.5 6.1 1.8
H 1 3 0 1 1 1 2 1 1 0 1 1
P 0 1 -9 4 5 6 7 9 10 -9 11 12
L 7 6 10 1 9 4 12 1 5 0 11 7 1
A2 0.4 1.1 1.2 1.8 3.7 4.8 5.9 6.1 6.7 7.3 8.4 10.5 11.5

A demonstration of ProxMapSort, a bucket sort variant that uses intermediate parallel arrays to efficiently index and size its sublists.

Pseudocode

// compute hit counts
for i = 0 to 11 // where 11 is n
{
    H[i] = 0;
}
for i = 0 to 12 // where 12 is A.length
{
    pos = MapKey(A[i]);
    H[pos] = H[pos] + 1;
}

runningTotal = 0; // compute prox map – location of start of each subarray
for i = 0 to 11
    if H[i] = 0
        P[i] = -9;
    else
        P[i] = runningTotal;
        runningTotal = runningTotal + H[i];

for i = 0 to 12 // compute location – subarray – in A2 into which each item in A is to be placed
    L[i] = P[MapKey(A[i])];

for I = 0 to 12; // sort items
    A2[I] = <empty>;
for i = 0 to 12 // insert each item into subarray beginning at start, preserving order
{
    start = L[i]; // subarray for this item begins at this location
    insertion made = false;
    for j = start to (<the end of A2 is found, and insertion not made>)
    {
        if A2[j] == <empty> // if subarray empty, just put item in first position of subarray
            A2[j] = A[i];
            insertion made = true;
        else if A[i] < A2[j] // key belongs at A2[j]
            int end = j + 1; // find end of used part of subarray – where first <empty> is
            while (A2[end] != <empty>)
                end++;
            for k = end -1 to j // move larger keys to the right 1 cell
                A2[k+1] = A2[k];
                A2[j] = A[i];
            insertion made = true; // add in new key
    }
}

Here A is the array to be sorted and the mapKey functions determines the number of subarrays to use. For example, floor(K) will simply assign as many subarrays as there are integers from the data in A. Dividing the key by a constant reduces the number of subarrays; different functions can be used to translate the range of elements in A to subarrays, such as converting the letters A–Z to 0–25 or returning the first character (0–255) for sorting strings. Subarrays are sorted as the data comes in, not after all data has been placed into the subarray, as is typical in bucket sorting.

Proxmap searching

ProxmapSearch uses the proxMap array generated by a previously done ProxmapSort to find keys in the sorted array A2 in constant time.

Basic strategy

  • Sort the keys using ProxmapSort, keeping the MapKey function, and the P and A2 arrays
  • To search for a key, go to P[MapKey(k)], the start of the subarray that contains the key, if that key is in the data set
  • Sequentially search the subarray; if the key is found, return it (and associated information); if find a value greater than the key, the key is not in the data set
  • Computing P[MapKey(k)] takes time. If a map key that gives a good distribution of keys was used during the sort, each subarray is bounded above by a constant c, so at most c comparisons are needed to find the key or know it is not present; therefore ProxmapSearch is . If the worst map key was used, all keys are in the same subarray, so ProxmapSearch, in this worst case, will require comparisons.

Pseudocode

function mapKey(key) is
    return floor(key)
    proxMap ← previously generated proxmap array of size n
    A2 ← previously sorted array of size n
function proxmap-search(key) is
    for i = proxMap[mapKey(key)] to length(array) − 1 do
        if sortedArray[i].key == key then
            return sortedArray[i]

Analysis

Performance

Computing H, P, and L all take time. Each is computed with one pass through an array, with constant time spent at each array location.

  • Worst case: MapKey places all items into one subarray, resulting in a standard insertion sort, and time of .
  • Best case: MapKey delivers the same small number of items to each subarray in an order where the best case of insertion sort occurs. Each insertion sort is , c the size of the subarrays; there are p subarrays thus p * c = n, so the insertion phase take O(n); thus, ProxmapSort is .
  • Average case: Each subarray is at most size c, a constant; insertion sort for each subarray is then O(c^2) at worst – a constant. (The actual time can be much better, since c items are not sorted until the last item is placed in the bucket). Total time is the number of buckets, (n/c), times = .

Having a good MapKey function is imperative for avoiding the worst case. We must know something about the distribution of the data to come up with a good key.

Optimizations

  1. Save time: Save the MapKey(i) values so they don't have to be recomputed (as they are in the code above)
  2. Save space: The proxMaps can be stored in the hitCount array, as the hit counts are not needed once the proxmap is computed; the data can be sorted back into A, instead of using A2, if one takes care to note which A values have been sorted so far, and which not.

JavaScript code implementation:

Array.prototype.ProxmapSort = function()
{//-- Edit date: 2019/11/13 Taiwan --//
  var start = 0;
  var end = this.length;
  var A2 = new Array(end);
  var MapKey = new Array(end);
  var hitCount = new Array(end);
  
  for (var i = start; i < end; i++) { hitCount[i] = 0; }
  var min = this[start];
  var max = this[start];
  for (var i = start+1; i < end; i++) {
    if (this[i] < min) { min = this[i]; }
    else {if (this[i] > max) { max = this[i]; }}
  }
  //Optimization 1.Save the MapKey[i].
  for (var i = start; i < end; i++) {
    MapKey[i] = Math.floor(((this[i] - min ) / (max - min )) * (end - 1));
    hitCount[MapKey[i]]++;
  }
  //Optimization 2.ProxMaps store in the hitCount.
  hitCount[end-1] = end - hitCount[end-1];
  for (var i = end-1; i > start; i--){
    hitCount[i-1] = hitCount[i] - hitCount[i-1];
  }
  //insert A[i]=this[i] to A2 correct position
  var insertIndex = 0;
  var insertStart = 0;
  for (var i = start; i < end; i++) {
    insertIndex = hitCount[MapKey[i]];
    insertStart = insertIndex;
    while (A2[insertIndex] != null) { insertIndex++; }
    while (insertIndex > insertStart && this[i] < A2[insertIndex-1]) {
      A2[insertIndex] = A2[insertIndex-1];
      insertIndex--;
    }
    A2[insertIndex] = this[i];
  }
  for (var i = start; i < end; i++) { this[i] = A2[i]; }
};

Comparison with other sorting algorithms

Since ProxmapSort is not a comparison sort, the Ω(n log n) lower bound is inapplicable.[citation needed] Its speed can be attributed to it not being comparison-based and using arrays instead of dynamically allocated objects and pointers that must be followed, such as is done with when using a binary search tree.

ProxmapSort allows for the use of ProxmapSearch. Despite the O(n) build time, ProxMapSearch makes up for it with its average access time, making it very appealing for large databases. If the data doesn't need to be updated often, the access time may make this function more favorable than other non-comparison sorting based sorts.

Like ProxmapSort, bucket sort generally operates on a list of n numeric inputs between zero and some maximum key or value M and divides the value range into n buckets each of size M/n. If each bucket is sorted using insertion sort, ProxmapSort and bucket sort can be shown to run in predicted linear time.[1][original research?] However, the performance of this sort degrades with clustering (or too few buckets with too many keys); if many values occur close together, they will all fall into a single bucket and performance will be severely diminished. This behavior also holds for ProxmapSort: if the buckets are too large, its performance will degrade severely.

References

  1. ^ Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 8.4: Bucket sort, pp.174–177.

Read other articles:

Montalto Di Castro AirfieldPart of Twelfth Air ForceCoordinates42°22′27.58″N 011°36′05.07″E / 42.3743278°N 11.6014083°E / 42.3743278; 11.6014083TypeMilitary airfieldSite informationControlled byUnited States Army Air ForcesSite historyBuilt1944In use1944 Montalto Di Castro Airfieldclass=notpageimage| Location of Montalto Di Castro Airfield, Italy Montalto Di Castro Airfield is an abandoned World War II military airfield in Italy, located approxima...

 

Rhino Información generalTipo de programa intérprete de JavaScriptDesarrollador Fundación MozillaLanzamiento inicial 1997Licencia MPL 1.1 / GPL 2.0Información técnicaProgramado en JavaVersionesÚltima versión estable 1.7R418 de junio de 2012Enlaces Sitio web oficial Repositorio de código [editar datos en Wikidata] Rhino es un intérprete de JavaScript de código abierto desarrollado en lenguaje de programación Java. Historia Rhino comenzó a ser desarrollado en 1997 por Norr...

 

Lambang Kristiansand Kristiansand Kristiansand (awalnya Christianssand ) ialah sebuah kota dan kotamadya, dan ibu kota provinsi Vest-Agder, Norwegia. Menurut penduduknya, inilah kota terbesar di Norwegia,[1] dan kota terbesar di kawasan geografis Sørlandet. Sejak 1 Januari 2006, kotamadya ini berpenduduk 76.917.[2] Kristiansand didirikan oleh Raja Christian IV, yang pada 1641 mengucapkan di sini sebuah kota harus berdiri. Awalnya kota ini berdiri sebagai kota pasar untuk mend...

Osmanlı İmparatorluğu Padişahları Eski Monarşi İmparatorluk Osmanlı İmparatorluğu arması Osman Gazi'den V. Mehmed'e kadar Osmanlı İmparatorluğu padişahları montajı İlk hükümdar Osman Gazi Son hükümdar VI. Mehmed Unvan Sultân-ı ÂzamSultân-es SelâtinHüdâvendigârHakanKağanİslam HalifesiKayser-i RumKendi unvanı[a] Resmi konum İstanbul’daki saraylar: 1460-1853: Topkapı 1853-1889: Dolmabahçe 1889-1909: Yıldız 1909-1922: Dolmabahçe Monarşi başlangıcı 1299...

 

Para el botánico inglés del siglo XVIII, véase John Lightfoot (biólogo). John Lightfoot. John Lightfoot (Stoke-on-Trent, 29 de marzo de 1602 - Ely, 6 de diciembre de 1675) fue un eclesiástico inglés, erudito rabínico, vicerrector de la Universidad de Cambridge y director del Saint Catharine's College de Cambridge. Vida Nació en Stoke-on-Trent, hijo de Thomas Lightfoot, vicario de Uttoxeter, en Staffordshire. Fue educado en Morton Green, cerca de Congleton, Cheshire, y en el Chris...

 

Золота підвіскаказ. Алтын алқа Країна  КазахстанТип орденВручається: багатодітним матерямСтатус вручається Нагородження Засновано: 1993Нагороджені: Черговість Молодша нагорода Срібна підвіскаВідповідає Мати-героїня Золота́ підві́ска (каз. Алтын алқа) — державна ...

2007 Indian filmMission 90 DaysDVD release of Mission 90 Days.Directed byMajor RaviScreenplay by Major Ravi S. Tirru Shiju Nambiath (dialogues) Story byMajor RaviProduced bySasi AyyanchiraStarringMammootty Tulip Joshi Lalu Alex Innocent Ravi MariyaBaburaj RadhikaCinematographyTirruEdited byJayashanjarMusic byJaison J NairRelease date 12 July 2007 (2007-07-12) CountryIndiaLanguageMalayalam Mission 90 Days is a 2007 Indian Malayalam-language film directed by retired Army officer ...

 

Ancient Roman law Politics of ancient Rome Periods Roman Kingdom753–509 BC Roman Republic509–27 BC Roman Empire27 BC – AD 395 Principate27 BC – AD 284 DominateAD 284–641 WesternAD 395–476 EasternAD 395–1453 Timeline Constitution Kingdom Republic Sullan republic Empire Augustan reforms Late Empire Political institutions Imperium Collegiality Auctoritas Roman citizenship Cursus honorum Assemblies Centuriate Curiate Plebeian Tribal Ordinary magistrates Consul Praetor Quaestor Proma...

 

19th-century British military officer and diplomat (1813-1883) The unusual twin grave of William Taylour Thomson and his wife, Warriston Cemetery, Edinburgh Sir William Taylour Thomson KCMG CB (1813-1883) was a British military officer and diplomat. Military career He was a gifted military officer. When the British ship Tigris sank in the Euphrates river he was one of the survivors. In 1839 he participated in taking of Herat. He served in Iran in 1849 and 1853 to 1855.[1] Diplomat...

Susah Jaga Keperawanan di JakartaPoster filmSutradara Joko Nugroho ProduserDitulis olehPemeranMasayu AnastasiaAulia SarahSarah RizkyaRifky BalweelIndra BirowoTessy SrimulatDistributorMultivision PlusTanggal rilis2 Desember 2010Durasi90 menit (1 Jam 30 Menit)Negara Indonesia Bahasa Indonesia Susah Jaga Keperawanan di Jakarta adalah film komedi Indonesia yang dirilis pada 2 Desember 2010 dengan disutradarai oleh Joko Nugroho yang dibintangi oleh Masayu Anastasia dan Aulia Sarah. Sinopsis Di seb...

 

South Korean actress In this Korean name, the family name is Moon. Moon Geun YoungMoon Geun-young in 2019Born (1987-05-06) May 6, 1987 (age 36)Gwangju, South Korea[1]Alma materSungkyunkwan University[2]OccupationActressYears active1997–presentAgentCree Company[3]Korean nameHangul문근영Hanja文瑾瑩Revised RomanizationMun Geun-yeongMcCune–ReischauerMun Kŭnyŏng Moon Geun-young (Korean: 문근영; Hanja: 文瑾瑩; born May 6...

 

BBC local radio station for Newcastle, England BBC Radio NewcastleNewcastle upon TyneBroadcast areaTyne and Wear, Northumberland and north-east County DurhamFrequencyFM: 95.4 MHz (Tyne and Wear, County Durham and South Northumberland)FM: 96.0 MHz (Northumberland)FM: 103.7 MHz (Tyne Valley)FM: 104.4 MHz (Gateshead and Newcastle-upon-Tyne)DAB: 11C Freeview: 719RDSBBC NWCLProgrammingLanguage(s)EnglishFormatLocal news, talk, music and sportOwnershipOwnerBBC Local Radio,BBC North East and CumbriaH...

隆子女王続柄 章明親王第一王女出生 不明死去 天延2年閏10月16日(974年12月2日)伊勢国多気郡斎宮(現:三重県多気郡明和町)埋葬 三重県多気郡明和町父親 章明親王母親 藤原敦敏女役職 伊勢斎宮テンプレートを表示 隆子女王(たかこじょおう、生年不詳 - 天延2年閏10月16日(974年12月2日))は、平安時代中期の皇族。伊勢斎宮。章明親王第1王女(醍醐天皇の皇孫)[...

 

Renault Laguna Общие данные Производитель Renault Годы производства 1994—2015 Класс Среднеразмерный На рынке Похожие модели Nissan Primera, Hyundai i40, Opel Insignia, Chevrolet Malibu Сегмент D-сегмент Renault 21Renault Talisman Медиафайлы на Викискладе Renault Laguna — среднеразмерный автомобиль французского автопроиз...

 

Bygone Texas music hall in Austin The Armadillo World Headquarters (behind the Skating Palace) in 1976 Armadillo World Headquarters (The 'Dillo or Armadillo WHQ) was an influential Texas music hall and beer garden in Austin at 5251⁄2 Barton Springs Road – at South First Street – just south of the Colorado River and downtown Austin. The 'Dillo flourished from 1970 to 1980.[1][2][3][4] The structure that housed it, an old National Guard Armory, was demo...

Ice hockey team in Kearney, NebraskaTri-City StormCityKearney, NebraskaLeagueUSHLConferenceWesternFounded1979Home arenaViaero CenterColorsPurple, black and silver     Owner(s)Kirk W. BrooksHead coachAnthony NoreenMediaGrand Island IndependentKHGI-TVKSNB-TVKearney HubFranchise history1979–1984Bloomington Jr. Stars1985–1986Minneapolis Stars1986–1995St. Paul Vulcans1995–2000Twin Cities Vulcans2000–presentTri-City StormChampionshipsRegular season titles3 Anderson Cups (20...

 

Artikel ini bukan mengenai Ford Transit. Ford Transit ConnectGenerasi pertama Transit ConnectInformasiProdusenFord Motor CompanyJuga disebutFord Tourneo ConnectMasa produksi2002-sekarangBodi & rangkaKelasMPV kompak/minivanBentuk kerangka4/5-pintu panel van4/5-pintu minivan Ford Transit Connect adalah MPV kompak yang diproduksi oleh Ford Eropa. Generasi pertama didesain oleh Peter Horbury dan diperkenalkan tahun 2002 untuk menggantikan Ford Escort dan Courier berbasis Fiesta. Ref...

 

Clément Moreau Joseph Carl Meffert (26 March 1903 in Koblenz, Germany – 27 December 1988 in Sirnach, Switzerland), better known by his nom de plume Clément Moreau, was a politically and socially conscious graphic designer and artist. His best-known work is the wordless novel Night over Germany. Personal life Josef Carl Meffert was born out-of-wedlock in Koblenz, Germany on 26 March 1903. After a difficult childhood, he spent 1914 to 1918 in two hospitals in Westphalia. In 1927, he moved t...

Artikel ini mungkin mengandung riset asli. Anda dapat membantu memperbaikinya dengan memastikan pernyataan yang dibuat dan menambahkan referensi. Pernyataan yang berpangku pada riset asli harus dihapus. (Pelajari cara dan kapan saatnya untuk menghapus pesan templat ini) Artikel ini membutuhkan rujukan tambahan agar kualitasnya dapat dipastikan. Mohon bantu kami mengembangkan artikel ini dengan cara menambahkan rujukan ke sumber tepercaya. Pernyataan tak bersumber bisa saja dipertentangkan dan...

 

McBride, 2009 Daniel Richard Danny McBride (lahir 12 Mei 1976) adalah seorang pemeran berkebangsaan Inggris. Pranala luar Wikimedia Commons memiliki media mengenai Danny McBride. Danny McBride di IMDb (dalam bahasa Inggris) Artikel bertopik biografi pemeran ini adalah sebuah rintisan. Anda dapat membantu Wikipedia dengan mengembangkannya.lbs

 

Strategi Solo vs Squad di Free Fire: Cara Menang Mudah!