The set $\Omega(d,n)$ of distinct identifiable outcomes in $n$ independent rolls of a die with $d=6$ faces has $d^n$ elements. When the die is fair, each outcome of one roll has probability $1/d,$ and independence means that each of these outcomes has probability $(1/d)^n$: that is, they have a uniform distribution $\mathbb{P}_{d,n}.$
Suppose you have devised some procedure $t$ that either determines $m$ outcomes of a $c$-sided die (here $c=150$), that is, an element of $\Omega(c,m),$ or else reports failure (which means you will have to repeat it in order to obtain an outcome). That is,
$$t:\Omega(d,n)\to\Omega(c,m)\cup\{\text{Failure}\}.$$
Let $F$ be the probability that $t$ results in failure, and note that $F$ is some integer multiple of $d^{-n},$ say
$$F=\Pr(t(\omega)=\text{Failure})=N_F\,d^{-n}.$$
(For future reference, note that the expected number of times $t$ must be invoked before it does not fail is $1/(1-F).$)
The requirement that these outcomes in $\Omega(c,m)$ be uniform and independent, conditional on $t$ not reporting failure, means that $t$ preserves probability in the sense that for every event $A\subset\Omega(c,m),$
$$\frac{\mathbb{P}_{d,n}(t^{*}A)}{1-F}=\mathbb{P}_{c,m}(A)\tag{1}$$
where
$$t^{*}(A)=\{\omega\in\Omega(d,n)\mid t(\omega)\in A\}$$
is the set of die rolls that the procedure $t$ assigns to the event $A.$
Consider an atomic event $A=\{\eta\}\subset\Omega(c,m),$ which must have probability $c^{-m}.$ Let $t^{*}(A)$ (the dice rolls associated with $\eta$) have $N_\eta$ elements. Then $(1)$ becomes
$$\frac{N_\eta\,d^{-n}}{1-N_F\,d^{-n}}=\frac{\mathbb{P}_{d,n}(t^{*}A)}{1-F}=\mathbb{P}_{c,m}(A)=c^{-m}.\tag{2}$$
It is immediate that the $N_\eta$ are all equal to some integer $N.$ It remains only to find the most efficient procedures $t.$ The expected number of non-failures per roll of the $c$-sided die is
$$\frac{1}{m}(1-F).$$
There are two immediate and obvious implications. One is that if we can keep $F$ small as $m$ grows large, then the effect of reporting a failure is asymptotically zero. The other is that for any given $m$ (the number of rolls of the $c$-sided die to simulate), we want to make $F$ as small as possible.
Let's take a closer look at $(2)$ by clearing the denominators:
$$N c^{m}=d^{n}-N_F>0.$$
This makes it obvious that in a given context (determined by $c,d,n,m$), $F$ is made as small as possible by making $d^n-N_F$ equal the largest multiple of $c^m$ that is less than or equal to $d^n.$ We may write this in terms of the greatest integer function (or "floor") $\lfloor * \rfloor$ as
$$N=\left\lfloor\frac{d^n}{c^m}\right\rfloor.$$
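The quantities $N,$ $N_F$ and $F$ are easy to compute exactly for any particular $(d,n,c,m).$ The following is a minimal sketch, not part of the original argument; the function name `failure_stats` is a hypothetical helper introduced here for illustration, and exact integer and rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def failure_stats(d, n, c, m):
    """Return (N, N_F, F) for a procedure that maps n rolls of a
    d-sided die to m rolls of a c-sided die.

    N   = floor(d**n / c**m), the number of input sequences per output
    N_F = d**n - N * c**m, the number of input sequences sent to Failure
    F   = N_F / d**n, the failure probability (an exact Fraction)
    """
    total = d ** n   # size of Omega(d, n)
    block = c ** m   # size of Omega(c, m)
    if block > total:
        raise ValueError("need c**m <= d**n, otherwise no such procedure exists")
    N = total // block
    N_F = total - N * block
    return N, N_F, Fraction(N_F, total)


if __name__ == "__main__":
    for n, m in [(3, 1), (4, 1), (14, 5)]:
        N, N_F, F = failure_stats(6, n, 150, m)
        print(n, m, N, N_F, float(F))
```

For $d=6,$ $c=150,$ the pair $(n,m)=(3,1)$ gives $N=1$ and $N_F=66,$ while $(4,1)$ gives $N=8$ and $N_F=96.$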
Finally, it is clear that $N$ ought to be as small as possible for highest efficiency, because it measures redundancy in $t.$ Specifically, the expected number of rolls of the $d$-sided die needed to produce one roll of the $c$-sided die is
$$N\times\frac{n}{m}\times\frac{1}{1-F}.$$
Thus, our search for high-efficiency procedures ought to focus on the cases where $d^n$ is equal to, or just barely greater than, some power $c^m.$
The analysis ends by showing that for given $d$ and $c,$ there is a sequence of pairs $(n,m)$ for which this approach approximates perfect efficiency. This amounts to finding $(n,m)$ for which $d^n/c^m\ge 1$ approaches $N=1$ in the limit (automatically guaranteeing $F\to 0$). One such sequence is obtained by taking $n=1,2,3,\ldots$ and determining
$$m=\left\lfloor\frac{n\log d}{\log c}\right\rfloor.\tag{3}$$
The proof is straightforward.
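One piece of this that is easy to check directly is that the choice in $(3)$ always gives $c^m\le d^n$ and that the ratio $m/n$ tends to $\log d/\log c.$ The short derivation below uses only the definition of the floor function and $c>1$; it bounds the ratio $m/n$ only and says nothing by itself about the size of $F.$

$$\frac{n\log d}{\log c}-1<m\le\frac{n\log d}{\log c}
\quad\Longrightarrow\quad
c^{m}\le d^{n}<c^{m+1}
\quad\text{and}\quad
\frac{\log d}{\log c}-\frac{1}{n}<\frac{m}{n}\le\frac{\log d}{\log c}.$$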
This all means that when we are willing to roll the original $d$-sided die a sufficiently large number of times $n,$ we can expect to simulate nearly $\log d/\log c=\log_c d$ outcomes of a $c$-sided die per roll. Equivalently,
It is possible to simulate a large number $m$ of independent rolls of a $c$-sided die using a fair $d$-sided die using an average of $\log(c)/\log(d)+\epsilon=\log_d(c)+\epsilon$ rolls per outcome, where $\epsilon$ can be made arbitrarily small by choosing $m$ sufficiently large.
Examples and algorithms
In the question, $d=6$ and $c=150,$ whence
$$\log_d(c)=\frac{\log(c)}{\log(d)}\approx 2.796489.$$
Thus, the best possible procedure will require, on average, at least $2.796489$ rolls of a d6 to simulate each d150 outcome.
The analysis shows how to do this. We don't need to resort to number theory to carry it out: we can just tabulate the powers $d^n=6^n$ and the powers $c^m=150^m$ and compare them to find where $c^m\le d^n$ are close. This brute-force calculation gives $(n,m)$ pairs
$$(n,m)\in\{(3,1),(14,5),\ldots\}$$
for instance, corresponding to the numbers
$$(6^n,150^m)\in\{(216,150),(78364164096,75937500000),\ldots\}.$$
In the first case $t$ would associate $216-150=66$ of the outcomes of three rolls of a d6 to Failure, and the other $150$ outcomes would each be associated with a single outcome of a d150.
In the second case $t$ would associate $78364164096-75937500000$ of the outcomes of 14 rolls of a d6 to Failure (about 3.1% of them all) and otherwise would output a sequence of 5 outcomes of a d150.
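The brute-force tabulation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the exact search used for the answer: the function name `record_pairs` is hypothetical, "close" is interpreted as "sets a new record for the expected number of d6 rolls per d150 outcome", and the cost is computed for the $N=1$ procedure of the two examples (every value at or above $c^m$ is a Failure), so the cost is $(n/m)\times d^n/c^m.$

```python
from fractions import Fraction


def record_pairs(d, c, max_n):
    """List (n, m, cost) for each n <= max_n that sets a new record low
    for cost = (n / m) * d**n / c**m, where m is the largest exponent
    with c**m <= d**n.  Assumes the N = 1 procedure: a batch of n rolls
    fails whenever its value is >= c**m."""
    records = []
    best = None
    for n in range(1, max_n + 1):
        total = d ** n
        m = 0
        while c ** (m + 1) <= total:   # largest m with c**m <= d**n
            m += 1
        if m == 0:
            continue                   # too few rolls for even one outcome
        cost = Fraction(n, m) * Fraction(total, c ** m)
        if best is None or cost < best:
            best = cost
            records.append((n, m, cost))
    return records


if __name__ == "__main__":
    for n, m, cost in record_pairs(6, 150, 30):
        print(n, m, float(cost))
```

With `max_n=30` the only records are the two pairs discussed above, at an expected $4.32$ and roughly $2.89$ rolls of a d6 per d150 outcome, respectively.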
A simple algorithm to implement $t$ labels the faces of the $d$-sided die with the numerals $0,1,\ldots,d-1$ and the faces of the $c$-sided die with the numerals $0,1,\ldots,c-1.$ The $n$ rolls of the first die are interpreted as an $n$-digit number in base $d.$ This is converted to a number in base $c.$ If it has at most $m$ digits, the sequence of its last $m$ digits (padded with leading zeros if necessary) is the output. Otherwise, $t$ has failed, and it handles the failure by invoking itself recursively.
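The algorithm just described can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `simulate_rolls` and its optional `roll` argument are hypothetical, `random.randrange` stands in for a physical fair die, and a loop is used in place of the recursive call so that long runs of failures cannot exhaust the call stack.

```python
import random


def simulate_rolls(d, c, n, m, roll=None):
    """Turn batches of n rolls of a d-sided die (faces 0..d-1) into
    m rolls of a c-sided die (faces 0..c-1), retrying on Failure.

    `roll` is any zero-argument callable returning one face in 0..d-1;
    by default a pseudo-random fair die is used (an assumption here).
    """
    if c ** m > d ** n:
        raise ValueError("need c**m <= d**n")
    if roll is None:
        roll = lambda: random.randrange(d)
    limit = c ** m
    while True:
        # Read n rolls as an n-digit number in base d.
        value = 0
        for _ in range(n):
            value = value * d + roll()
        if value < limit:      # fits in m base-c digits: success
            break
        # Otherwise this batch is a Failure; roll a fresh batch.
    # Write the value as exactly m base-c digits, most significant first.
    digits = []
    for _ in range(m):
        value, digit = divmod(value, c)
        digits.append(digit)
    return digits[::-1]


if __name__ == "__main__":
    print(simulate_rolls(6, 150, 3, 1))    # one d150 outcome from d6 triples
    print(simulate_rolls(6, 150, 14, 5))   # five d150 outcomes per batch of 14
```

Passing a deterministic `roll` makes the behavior easy to check by hand: the batch $(5,5,5)$ encodes $215\ge 150$ and is rejected, while $(2,3,4)$ encodes $2\cdot 36+3\cdot 6+4=94$ and is accepted.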
For much longer sequences, you can find suitable pairs $(n,m)$ by considering every other convergent $n/m$ of the continued fraction expansion of $x=\log(c)/\log(d).$ The theory of continued fractions shows that these convergents alternate between being less than $x$ and greater than it (assuming $x$ is not already rational). Choose those that are greater than $x,$ because $n/m>x$ is equivalent to $d^n>c^m.$
In the question, the first few such convergents are
$$3,\ 14/5,\ 165/59,\ 797/285,\ 4301/1538,\ 89043/31841,\ 279235/99852,\ 29036139/10383070,\ \ldots.$$
In the last case, a sequence of 29,036,139 rolls of a d6 will produce a sequence of 10,383,070 rolls of a d150 with a failure rate less than $2\times 10^{-8},$ for an efficiency of $2.79649,$ which is indistinguishable from the asymptotic limit.
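The convergents quoted above can be reproduced with a short computation. The sketch below is one way to do it, under stated assumptions: the function name `upper_convergents` is hypothetical, the logarithms are evaluated with Python's `decimal` module at a fixed working precision (60 digits by default), and the results are therefore only trustworthy for the first several convergents, while the denominators stay far below what that precision can resolve.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def upper_convergents(c, d, count, precision=60):
    """Return the first `count` convergents (n, m) of log(c)/log(d)
    that lie above it, i.e. every other convergent starting with the
    second one.  The logarithms are approximated to `precision` digits."""
    with localcontext() as ctx:
        ctx.prec = precision
        x = Fraction(Decimal(c).ln() / Decimal(d).ln())
    h_prev, k_prev = 0, 1   # numerator/denominator two steps back
    h, k = 1, 0             # numerator/denominator one step back
    found = []
    index = 0
    while len(found) < count:
        a = x.numerator // x.denominator      # next partial quotient
        h, h_prev = a * h + h_prev, h
        k, k_prev = a * k + k_prev, k
        if index % 2 == 1:                    # odd-indexed convergents exceed x
            found.append((h, k))
        remainder = x - a
        if remainder == 0:
            break                             # expansion of the approximation ended
        x = 1 / remainder
        index += 1
    return found


if __name__ == "__main__":
    for n, m in upper_convergents(150, 6, 5):
        print(f"{n}/{m}")
```

Each returned pair satisfies $6^n>150^m,$ which can be confirmed with exact integer arithmetic independently of the floating-point logarithms.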