Han


38

Is anyone familiar with Yijie Han's O(n log log n) time, linear space, integer sorting algorithm? This result appears in a rather short paper (Deterministic sorting in O(n log log n) time and linear space. J. Alg. 50: 96-105, 2004), which basically glues together many previous results, with appropriate adaptations. My problem is that it is written in a rather hand-waving manner, without going into particulars. It relies heavily on previous papers, prominent among which is another paper by Han (Improved fast integer sorting in linear space. Information and Computation 170(1): 81–94), written in much the same style. I am having considerable difficulty understanding these two papers, especially how they adapt and use previous results. I would appreciate any help.

This is, of course, too broad and vague to count as a proper question, but I am hoping to develop a discussion through several focused, well-defined questions and answers.

To get started, here is my first specific question. Lemma 2 of the Inf. & Comp. paper gives a recursive O(n/k log k) time algorithm for finding the smallest integer in a set of n small integers, packed k to a word into RAM words. The description of the algorithm does not mention how the base case k = O(n) is handled. In this case the selection needs to be done in O(log k) time. How can this be done?


13
It would be entirely appropriate to write to him: hanyij@umkc.edu.
Joseph O'Rourke

Yes. We have discussed this general issue before, and the right way to resolve it is to email the author.
Suresh Venkat

17
This includes a specific question about a paper that's 7 years old and has already gone through the peer review process. While Ari could email the author, this seems like an ideal question for this site. I don't understand the deflection.
Huck Bennett

18
Of course the first thing I did was write Han. No answer. Then I reached out through a contact to someone else who has done integer-sorting research, and he said that upon perusal he had found the papers to be too messy to merit further investment of his time. That's when I came here. If there's anyone out there who knows Han and can get his attention on my behalf, that would be great too.
Ari

4
General sorting does not have an Ω(nlogn) lower bound. Quite the opposite---it is sorting restricted to comparisons that has this bound. The issue here is not restricting the input but rather enhancing the computational model. My computational model is any of the unit cost RAM flavors, and I'll allow any reasonable assumptions (such as the availability of constants that depend on the word length).
Ari

Answers:


18

I was just wondering the same thing.

Fortunately, I was able to find a journal article published in 2011 that explains this very thing; what's more, you don't need a subscription to view it: Implementation and Performance Analysis of Exponential Tree Sorting

I recommend reading the entire article to learn how it can be implemented and to better understand the underlying theory. It also shows how exponential trees stack up against quicksort and binary trees. Here is the excerpt related to Han's O(n log log n) time, linear space, integer sorting algorithm:

Yijie Han has given an idea which reduces the complexity to expected time in linear space.[6] The technique used by him is coordinated pass down of integers on the Andersson’s exponential search tree[8] and the linear time multi-dividing of the bits of integers. Instead of inserting integer one at a time into the exponential search tree, he passed down all integers one level of the exponential search tree at a time. Such coordinated passing down provides the chance of performing multi-dividing in linear time and therefore speeding up the algorithm. This idea may provide speed up, but in practical implementation it is very difficult to handle integers in batches.

[6] Y. Han, Deterministic sorting in O(n log log n) time and linear space, 34th STOC, 2002.

[8] A. Andersson, Fast deterministic sorting and searching in linear space, IEEE Symposium on Foundations of Computer Science, 1996.


Why the downvote?
Suresh Venkat

1
I just added this journal-article link to the Exponential tree Wikipedia page. FYI: this article might have been published after the question was asked.
A T

@AT, could you please expand your answer a little bit and explain how it answers the question. Right now the only thing it gives is a link to an article in some journal.
Kaveh

1
Well, I've already given up on Han's paper, so I'm glad you've been able to provide this help. I didn't really expect to see anything when I came back here today. Thanks! I'll read this new paper and see if it helps me make progress on Han's paper.
Ari

2
Well, I've now read it, and I'll allow that possibly I completely misunderstood it, but barring that, there appears to be a slight problem. The authors claim their tree has height O(log log n), but if the tree has height h, then it has (h+1)! leaves, and therefore fewer than 2(h+1)! nodes in total. Let's generously assume each node holds h+2 keys. Then the tree contains fewer than 2(h+2)! keys. If 2(h+2)! = n, then h = Ω(log n / log log n). Anyway, even if the authors are correct, they neither achieve O(log log n) sorting, nor explain Han, so not useful.
Ari
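The counting argument in the comment above can be checked numerically. The following is only a sketch under that comment's own assumption (a tree of height h holds fewer than 2(h+2)! keys); the function name `min_height` is made up for illustration.

```python
import math

def min_height(n):
    """Smallest h with 2 * (h + 2)! >= n, i.e. the least height that could
    hold n keys under the bound assumed in the comment."""
    h = 0
    while 2 * math.factorial(h + 2) < n:
        h += 1
    return h

# Compare the required height with log2(log2(n)) for a few input sizes.
for n in (10**3, 10**6, 10**9, 10**12):
    print(n, min_height(n), round(math.log2(math.log2(n)), 2))
```

Under this bound the required height grows like log n / log log n, so it pulls away from log log n as n increases.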

1

I'm not sure about the answer (I haven't gone through the paper), but I think this should help. The numbers are packed into a single word, so operations on a single word take O(1) time. If there are, say, k numbers of h bits each, then the word size depends on k and h, which in turn depend on the range of the numbers. So we use range-reduction techniques that shrink the range of the numbers so that many of them fit in a single word. Then, by creating suitable bit masks, we can separate the larger integers from the smaller ones, considering two words at a time. This can be done in O(1) time. (Intuition: each number stored in a word has a flag bit associated with it, and we subtract the two words; if a field's flag bit is cleared by the subtraction, that field held the smaller number.)
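A minimal sketch of the flag-bit comparison, simulating a RAM word with a Python integer. The helper names (`pack`, `unpack`, `flags`, `field_min_max`) and the field layout (L bits per field, top bit reserved as the flag) are assumptions for illustration, not taken from Han's paper.

```python
def pack(values, L):
    """Pack values into one integer; values[0] lands in the most significant field."""
    word = 0
    for v in values:
        assert 0 <= v < (1 << (L - 1)), "value must fit below the flag bit"
        word = (word << L) | v
    return word

def unpack(word, count, L):
    """Inverse of pack: list the fields, most significant first."""
    mask = (1 << L) - 1
    return [(word >> (L * i)) & mask for i in reversed(range(count))]

def flags(count, L):
    """Word with only the flag (top) bit of every field set."""
    return sum(1 << (L * i + L - 1) for i in range(count))

def field_min_max(a, b, count, L):
    """Field-wise min and max of two packed words in a constant number of word operations."""
    k1 = flags(count, L)
    ge = ((a | k1) - b) & k1      # a flag survives exactly where a_i >= b_i
    mask = ge - (ge >> (L - 1))   # spread each surviving flag over the L-1 data bits
    lo = (b & mask) | (a & ~mask)
    hi = (a & mask) | (b & ~mask)
    return lo, hi

a = pack([3, 0, 5, 2], 4)
b = pack([1, 4, 5, 7], 4)
lo, hi = field_min_max(a, b, 4, 4)
print(unpack(lo, 4, 4), unpack(hi, 4, 4))
```

Setting the flag bits before subtracting keeps borrows from crossing field boundaries, which is why one subtraction compares all fields at once.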

Similarly, using the above, we can also sort any word containing k numbers in O(log k) time (bitonic sort).

Edit: Algorithm to sort 2k numbers in the range 0 to m-1 packed in a word, where each number occupies a field of L = log(m+k)+2 bits.

Let K1 be 1:000000 1:000000 1:000000 1:000000 ... and so on, where the bit before each colon is called the flag bit, each field is L bits long, and the pattern is repeated 2k times in the word K1. (The colon is only there for readability.)

K2 is (2k-1)(2k-2)...1 written in binary, one number per field.

Sketch of the algorithm:

Repeat for t = log k down to 0.

Part 1 - separate the original word Z into two words A and B.

  1. Let T be obtained by shifting K2 left by L-1-t positions and ANDing the result with K1. Let M = T - (T shifted L-1 places).

  2. AND Z with M and shift the result (2^t)L places to the right. This gives A.

  3. B=Z-(Z&M).

Part 2

  1. M=((A OR K1)-B)& K1

  2. M=M-(M shifted right L-1 places).

  3. MIN=(B&M) OR (A-(A&M))

  4. MAX=(A&M) OR (B-(B&M))

  5. MAX is shifted left by (2^t)L places.

  6. Finally, ORing MAX and MIN together gives back Z.

I have given the sketch; I hope you can fill in the necessary details.
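One way to fill in the details of the sketch above, again simulating a word with a Python integer. Assumptions made here: both "shifted L-1 places" steps are read as right shifts, 2tL is read as (2^t)·L, the number of fields is a power of two, and field i of K2 holds the index i. The names `pack`, `unpack` and `merge_passes` are made up for illustration.

```python
def pack(values, L):
    """Pack values into one integer; values[0] lands in the most significant field."""
    word = 0
    for v in values:
        assert 0 <= v < (1 << (L - 1)), "value must fit below the flag bit"
        word = (word << L) | v
    return word

def unpack(word, count, L):
    """List the fields of a packed word, most significant first."""
    mask = (1 << L) - 1
    return [(word >> (L * i)) & mask for i in reversed(range(count))]

def merge_passes(z, count, L):
    """Run the passes t = log2(count)-1, ..., 0 of the sketch on `count` fields."""
    k1 = sum(1 << (L * i + L - 1) for i in range(count))  # flag bit of every field
    k2 = sum(i << (L * i) for i in range(count))          # field i holds its own index
    for t in reversed(range(count.bit_length() - 1)):
        dist = (1 << t) * L
        # Part 1: fields whose index has bit t set go to A, the rest stay in B.
        T = (k2 << (L - 1 - t)) & k1
        m = T - (T >> (L - 1))
        a = (z & m) >> dist
        b = z - (z & m)
        # Part 2: field-wise compare, then route max up and min down.
        m = ((a | k1) - b) & k1
        m = m - (m >> (L - 1))
        lo = (b & m) | (a - (a & m))
        hi = (a & m) | (b - (b & m))
        z = (hi << dist) | lo
    return z

# Read from the least significant field upward, 0 2 3 1 is 1, 3, 2, 0 (bitonic).
print(unpack(merge_passes(pack([0, 2, 3, 1], 4), 4, 4), 4, 4))
# A non-bitonic input for comparison.
print(unpack(merge_passes(pack([3, 0, 2, 0], 4), 4, 4), 4, 4))
```

With these readings, the passes compare each field with the partner 2^t fields away, which is the compare-exchange pattern of a bitonic merge: the first input comes out as 3 2 1 0, while 3 0 2 0 comes out unchanged.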


I'm not clear on what you are suggesting. The assumption is that the integers are already small and k of them are already packed into a single word. Are you proposing to further reduce their size? If so, what do you do then? Also, I know how to sort a bitonic sequence packed into a single word in O(log k) time, or to sort a general (non-bitonic) sequence in O(log^2 k) time. If you know an algorithm that sorts a general sequence in O(log k) time, could you please describe it in more detail? (Such an algorithm would of course solve the selection problem.)
Ari

I'm not further reducing the size; I was suggesting how to reduce the size, which was not required in your case. Sorry for the confusion.
singhsumit

Unless I've misunderstood it, this looks like the algorithm for sorting bitonic sequences. It doesn't sort general sequences. For example, does it sort the sequence 3,0,2,0, where the 3 is in the leftmost (most significant) field?
Ari

3 0 2 0 is separated and we get A = 3 2 and B = 0 0; then MAX becomes 3 2 and MIN is 0 0. Then we have the new Z as 3 2 0 0. Any general sequence is made up of bitonic sequences of size 2. With each iteration these sizes double, and finally in log k time we have our answer.
singhsumit

No. The numbers don't get compacted, only shifted down. In the first iteration we split pairs of numbers differing in the high bit of their position so we get A=0 3 0 2 and B=0 0 0 0, so MIN=0 0 0 0, MAX=0 3 0 2, and Z=3 0 2 0. In the second iteration we split pairs differing in the low bit of their position, so again we get A=0 3 0 2, B=0 0 0 0, and again Z remains unchanged.
Ari
Licensed under cc by-sa 3.0 with attribution required.