How to approach the Vertical Sticks challenge

23

This problem is taken from interviewstreet.com

We are given an array of integers $Y=\{y_1,...,y_n\}$ that represents $n$ line segments, such that the endpoints of segment $i$ are $(i, 0)$ and $(i, y_i)$ . Imagine that from the top of each segment a horizontal ray is shot to the left, and that this ray stops when it touches another segment or hits the y-axis. We construct an array of $n$ integers, $v_1, ..., v_n$ , where $v_i$ is the length of the ray shot from the top of segment $i$ . We define $V(y_1, ..., y_n) = v_1 + ... + v_n$ .

For example, if we have $Y=[3,2,5,3,3,4,1,2]$ , then $[v_1, ..., v_8] = [1,1,3,1,1,3,1,2]$ , as shown in the picture below:

[figure: the eight sticks from the example, each with its leftward ray]
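The definition of $v_i$ can be checked mechanically. The following is a minimal sketch, not part of the original problem statement; the function name `ray_lengths` is made up for illustration, and it assumes (as in the example) that a stick of equal height also stops the ray.

```python
def ray_lengths(heights):
    """Return [v_1, ..., v_n] for sticks placed at x = 1..n in the given order."""
    lengths = []
    for i, h in enumerate(heights):
        length = 1  # the ray always covers at least one unit
        j = i - 1
        # Keep moving left while the sticks we pass are strictly shorter;
        # running off the left end means the ray reached the y-axis.
        while j >= 0 and heights[j] < h:
            length += 1
            j -= 1
        lengths.append(length)
    return lengths


print(ray_lengths([3, 2, 5, 3, 3, 4, 1, 2]))  # [1, 1, 3, 1, 1, 3, 1, 2]
```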

For each permutation $p$ of $[1,...,n]$ , we can compute $V(y_{p_1}, ..., y_{p_n})$ . If we choose a uniformly random permutation $p$ of $[1,...,n]$ , what is the expected value of $V(y_{p_1}, ..., y_{p_n})$ ?

If we solve this problem using the naive approach, it will not be efficient and will run practically forever for $n=50$ . I believe we can approach this problem by independently computing the expected value of $v_i$ for each stick, but I would still like to know whether there is another efficient approach to this problem. On what basis can we compute the expected value for each stick independently?
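For reference, the naive approach mentioned here might look like the sketch below. It is only an illustration under my own naming (`total_ray_length` and `expected_v_bruteforce` are hypothetical helpers); it enumerates all $n!$ orderings, so it is usable only for very small $n$.

```python
from fractions import Fraction
from itertools import permutations


def total_ray_length(heights):
    """V for one fixed ordering of the sticks."""
    total = 0
    for i, h in enumerate(heights):
        j = i - 1
        while j >= 0 and heights[j] < h:
            j -= 1
        # i - j is the distance to the blocking stick, or to the y-axis if j == -1
        total += i - j
    return total


def expected_v_bruteforce(heights):
    """Exact average of V over all n! orderings (equal heights count as separate sticks)."""
    perms = list(permutations(heights))
    return Fraction(sum(total_ray_length(p) for p in perms), len(perms))


print(expected_v_bruteforce([1, 2, 3]))      # 13/3
print(expected_v_bruteforce([10, 2, 4, 4]))  # 6
```

Because the work grows as $n!$, this is mainly useful for cross-checking a faster method on small inputs.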

algorithms probability-theory

— Raphael

You can use linearity of expectation. This question is probably more appropriate on math.SE.

23

Imagine a different problem: if you had to place $k$ sticks of equal height in $n$ slots, then the expected distance between sticks (and the expected distance between the first stick and a notional slot $0$ , and the expected distance between the last stick and a notional slot $n+1$ ) is $\frac{n+1}{k+1}$ , since there are $k+1$ gaps to fit into a length of $n+1$ .

Returning to this problem, a particular stick is interested in how many sticks (including itself) are as high or higher. If this number is $k$ , then the expected gap to its left is also $\frac{n+1}{k+1}$ .

So the algorithm is simply to find this value for each stick and add up the expectations. For example, starting with heights $[3,2,5,3,3,4,1,2]$ , the numbers of sticks with greater or equal height are $[5,7,1,5,5,2,8,7]$ , so the expectation is $\frac96+\frac98+\frac92+\frac96+\frac96+\frac93+\frac99+\frac98 = 15.25$ .

This is easy to program: for example, a single line in R

V <- function(Y){ (length(Y) + 1) * sum( 1 / (rowSums(outer(Y, Y, "<=")) + 1) ) }

gives the values in the sample cases of the original problem

> V(c(1,2,3))
[1] 4.333333
> V(c(3,3,3))
[1] 3
> V(c(2,2,3))
[1] 4
> V(c(10,2,4,4))
[1] 6
> V(c(10,10,10,5,10))
[1] 5.8
> V(c(1,2,3,4,5,6))
[1] 11.15
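For readers who do not use R, the same computation could be sketched in Python as follows. This is an illustrative translation rather than the answer's own code; the name `expected_v` is hypothetical, and exact fractions are used so the sample values can be compared without rounding.

```python
from fractions import Fraction


def expected_v(heights):
    """Sum of (n + 1) / (k + 1) over all sticks, where k counts sticks at least as tall."""
    n = len(heights)
    total = Fraction(0)
    for h in heights:
        # k includes the stick itself, exactly as in the R one-liner
        k = sum(1 for other in heights if other >= h)
        total += Fraction(n + 1, k + 1)
    return total


print(expected_v([3, 2, 5, 3, 3, 4, 1, 2]))  # 61/4
print(float(expected_v([10, 10, 10, 5, 10])))  # 5.8
```

Counting $k$ this way takes $O(n^2)$ time in total; sorting the heights first would reduce that to $O(n \log n)$, which is not needed for $n = 50$.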

— Henry

1

Very interesting. Could you elaborate a bit on why the expected distance between sticks is $(n+1)/(k+1)$ ? It is not clear (at least to me) how this was computed. Thanks.

— M. Alaggan

In my first case of $k$ sticks of equal height, a length of $n+1$ has to be filled by $k+1$ gaps, so the average gap comes from dividing one by the other. This is the expected gap (or horizontal ray) before any particular stick (and from the last stick to $n+1$ ). It carries over to the original question by taking into account only the sticks which are as high as or higher than any particular stick.

— Henry

Very nice. This completely subsumes my solution; if all heights are distinct, then $E[V] = \sum_{k=1}^n \frac{n+1}{k+1} = (n+1)(H_{n+1} - 1) = (n+1)H_n - n$ .

— JeffE

2

@Henry: For the problem of k sticks of equal height in n slots, what was your reasoning for average length = (n + 1) / (k + 1)? If I have k sticks and I want to know the average ray length of one of those sticks over every permutation of those k sticks in n slots, it does in fact equal your result, but I don't understand why. Is there logic behind it, or did you deduce it mathematically by doing what I described for 1 stick and n slots, then 2 sticks and n slots, ..., k sticks and n slots, and noticing that it equaled (n + 1) / (k + 1)? You mention adding a slot n + 1. That seems very counter-intuitive.

— Alexandre

3

It is a question I have dealt with before. Start with a round table with $n+1$ seats and $k+1$ people and seat them at random. Distances between individuals are obviously i.i.d. with mean $(n+1)/(k+1)$ . Now break the table at the $n+1^{\text{th}}$ person, remove that person and their seat, and straighten the table. Now you have the question here with $n$ seats and $k$ people but the same i.i.d. property and same mean. (Spot the rare rhyme for month)

— Henry
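The claim about equal sticks can also be confirmed by exhaustive enumeration for small cases. The sketch below is my own illustration, not something posted in the thread; `mean_gaps` is a hypothetical helper that averages each of the $k+1$ gaps over all placements of $k$ equal sticks in slots $1..n$, using notional slots $0$ and $n+1$ as the two ends.

```python
from fractions import Fraction
from itertools import combinations


def mean_gaps(n, k):
    """Mean of each of the k + 1 gaps over all placements of k equal sticks in n slots."""
    sums = [0] * (k + 1)
    count = 0
    for positions in combinations(range(1, n + 1), k):
        marks = (0,) + positions + (n + 1,)  # add the notional slots 0 and n + 1
        for g in range(k + 1):
            sums[g] += marks[g + 1] - marks[g]
        count += 1
    return [Fraction(s, count) for s in sums]


# Every gap, including the first and the last, should average (n + 1) / (k + 1).
for n, k in [(5, 2), (4, 1), (7, 3), (6, 4)]:
    assert all(g == Fraction(n + 1, k + 1) for g in mean_gaps(n, k))
print(mean_gaps(4, 1))  # [Fraction(5, 2), Fraction(5, 2)]
```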

11

Henry's solution is both simpler and more general than this one!

$E[V]$ is roughly half the expected number of comparisons performed by randomized quicksort.

Assuming the sticks have distinct heights, we can derive a closed-form solution for $E[V]$ as follows.

For any indices $i\le j$ , let $X_{ij} = 1$ if $Y_j = \max\{Y_i,...,Y_j\}$ and $X_{ij}=0$ otherwise. (If the elements of $Y$ are not distinct, then $X_{ij}=1$ means that $Y_j$ is strictly greater than every element of $\{Y_i, \dots, Y_{j-1}\}$ .) Then for any index $j$ , we have $v_j = \sum_{i=1}^j X_{ij}$ , and therefore

$V = \sum_{j=1}^n v_j = \sum_{j=1}^n \sum_{i=1}^j X_{ij}.$

Linearity of expectation immediately implies that

$E[V] = E\bigg[\sum_{1\le i\le j\le n} X_{ij}\bigg] = \sum_{1\le i\le j\le n} E[X_{ij}].$

Because $X_{ij}$ is either $0$ or $1$ , we have $E[X_{ij}] = \Pr[X_{ij}=1]$ .

Finally—and this is the important bit—because the values in $Y$ are distinct and permuted uniformly, each element of the subset $\{Y_i,...,Y_j\}$ is equally likely to be the largest element in that subset. Thus, $\Pr[X_{ij}=1] = \frac{1}{j-i+1}$ . (If the elements of $Y$ are not distinct, we still have $\Pr[X_{ij}=1] \le \frac{1}{j-i+1}$ .)

And now we just have some math.

$\begin{aligned} E[V] &= \sum_{j=1}^n \sum_{i=1}^j E[X_{ij}] & [\text{linearity}]\\ &= \sum_{j=1}^n \sum_{i=1}^j \frac{1}{j-i+1} & [\text{uniformity}]\\ &= \sum_{j=1}^n \sum_{h=1}^j \frac{1}{h} & [h = j-i+1]\\ &= \sum_{h=1}^n \sum_{j=h}^n \frac{1}{h} & [1\le h\le j\le n]\\ &= \sum_{h=1}^n \frac{n-h+1}{h}\\ &= \left((n+1)\sum_{h=1}^n \frac{1}{h}\right) - \left(\sum_{h=1}^n 1\right)\\ &= (n+1)H_n - n \end{aligned}$

where $H_n$ denotes the $n$ th harmonic number.

Now it should be trivial to compute $E[V]$ (up to floating point precision) in $O(n)$ time.
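As a rough sketch of that computation (valid only under the distinct-heights assumption of this answer), the closed form needs just a running harmonic sum. The function name `expected_v_distinct` is hypothetical, and I use exact fractions here instead of floating point purely so the result can be compared exactly; plain floats would work the same way.

```python
from fractions import Fraction


def expected_v_distinct(n):
    """(n + 1) * H_n - n, the expected V for n sticks of distinct heights."""
    harmonic = sum(Fraction(1, h) for h in range(1, n + 1))  # H_n
    return (n + 1) * harmonic - n


print(expected_v_distinct(3))         # 13/3
print(float(expected_v_distinct(6)))  # 11.15
```

These agree with the values 4.333333 and 11.15 reported earlier in the thread for the heights 1..3 and 1..6.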

— JeffE

Does this assume that the sticks are of distinct height?

— Aryabhata

Yes, it does assume distinct heights. (Apparently, I misread the question.) The equivalence with randomized quicksort still stands when there are ties, but not the closed-form solution.

— JeffE

4

As mentioned in the comments, you can use Linearity of Expectation.

Sort the $y$ : $y_1 \le y_2 \le \dots \le y_n$ .

For each $y_i$ consider the expected value of $v_i$ , that is, $E[v_i]$ .

Then $E[\sum_{i=1}^{n} v_i] = \sum_{i=1}^{n} E[v_i]$

One straightforward and naive way to compute $E[v_i]$ would be to first fix a position for $y_i$ , say $j$ .

Now compute the probability that at position $j-1$ you have a value $\ge y_i$ .

Then the probability that at $j-1$ you have a value $\lt y_i$ and at $j-2$ you have a value $\ge y_i$

and so on which will allow you to compute $E[v_i]$ .

You can probably make it faster by actually doing the math and getting a formula (I haven't tried it myself, though).
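One possible way to carry out this position-by-position computation is sketched below. It is my own reading of the outline, not the answerer's code: instead of the probability of stopping at each exact distance, it sums the probabilities that the ray reaches at least each distance (the tail-sum form of the expectation), which gives the same result. The name `expected_ray_by_position` and its parameters are assumptions made for illustration.

```python
from fractions import Fraction


def expected_ray_by_position(n, k):
    """E[v_i] for a stick that has k sticks (itself included) at least as tall, out of n."""
    shorter = n - k  # number of strictly shorter sticks
    total = Fraction(0)
    for j in range(1, n + 1):  # position of the stick, uniform over 1..n
        prob_all_shorter = Fraction(1)  # P(the t sticks just to its left are all shorter)
        expected_given_j = Fraction(0)
        for t in range(j):  # t = 0 .. j - 1
            expected_given_j += prob_all_shorter  # adds P(v_i >= t + 1)
            if n - 1 - t > 0:
                # draw one more neighbour without replacement from the other n - 1 sticks
                prob_all_shorter *= Fraction(max(shorter - t, 0), n - 1 - t)
        total += expected_given_j
    return total / n


print(expected_ray_by_position(3, 2))  # 4/3
```

For small inputs the result coincides with $\frac{n+1}{k+1}$, the value given in Henry's answer, which is a useful cross-check on the reasoning.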

Hope that helps.

— Aryabhata

3

Expanding on the answer of @Aryabhata:

Fix an $i$ , and assume the item $y_i$ is at position $j$ . The exact value of the height is immaterial, what matters is whether the items are greater than or equal to $y_i$ or not. Therefore consider the set of items $Z^{(i)}$ , where $z_k^{(i)}$ is 1 if $y_k\geq y_i$ , and $z_k^{(i)}$ is 0 otherwise.

A permutation on the set $Z^{(i)}$ induces a corresponding permutation on the set $Y$ . Consider for instance the following permutation of the set $Z^{(i)}$ : "01000(1) $\dots$ ". The item $z_i^{(i)}$ is the one in brackets, at position $j$ , and the items denoted by " $\dots$ " don't matter.

The value of $v_i$ is then 1 plus the length of the run of consecutive zeros just to the left of $z_i^{(i)}$ . It follows that $\mathbb{E}\left(v_i\right)$ is actually 1 plus the expected length of the run of consecutive zeros until the first "1" is met, if we pick at most $j-1$ bits from the set $Z^{(i)} \setminus \{z_i^{(i)}\}$ (without replacement). This is reminiscent of the geometric distribution, except that it would be without replacement (and with a bounded number of draws). The expectation is to be taken over $j$ as well, as a uniform choice on the set of positions $\{1,\dots,n \}$ .

Once this is computed (along these lines), we can follow the lines of @Aryabhata's answer.

— M. Alaggan

-2

I don't really understand what you are asking for; from the tags it seems you are looking for an algorithm.

If so, what is the expected time complexity? By saying "If we solve this problem using the naive approach it will not be efficient and run practically forever for n=50," you suggest that your naive approach solves it in exponential time.

I do have an O(n^2) algorithm in mind, though.

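Assume int y[n], v[n], with the heights in y as described in the question (this computes v for one fixed ordering):
for (int i = 0; i < n; i++) {
    v[i] = 1;  /* every ray is at least one unit long */
    for (int j = i - 1; j >= 0 && y[j] < y[i]; j--) v[i]++;  /* extend past strictly shorter sticks */
}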