How to derive the likelihood function for binomial distribution for parameter estimation?

22

According to Miller and Freund's Probability and Statistics for Engineers, 8ed (pp.217-218), the likelihood function to be maximised for the binomial distribution (Bernoulli trials) is given as

$L(p) = \prod_{i=1}^np^{x_i}(1-p)^{1-x_i}$

How do you arrive at this equation? It seems pretty clear to me for the other distributions, Poisson and Gaussian:

$L(\theta) = \prod_{i=1}^n \text{PDF or PMF of dist.}$

But the one for the binomial is a little different. To be straightforward, how did

$nC_x~p^x(1-p)^{n-x}$

become

$p^{x_i}(1-p)^{1-x_i}$

in the above likelihood function?

— Ébe Isaac

25

In maximum likelihood estimation, you are trying to maximize $nC_x~p^x(1-p)^{n-x}$; however, maximizing this is equivalent to maximizing $p^x(1-p)^{n-x}$ for a fixed $x$.

Actually, the likelihoods for the Gaussian and the Poisson also do not involve their leading constants, so this case is just like those.

Addressing the OP's comment

Here is a bit more detail:

First, $x$ is the total number of successes, whereas $x_i$ is the outcome of a single trial (0 or 1). Therefore:

$\prod_{i=1}^n p^{x_i}(1-p)^{1-x_i} = p^{\sum_{i=1}^n x_i}(1-p)^{\sum_{i=1}^n (1-x_i)} = p^{x}(1-p)^{n-x}$
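The identity above can be checked numerically. The following is only a minimal sketch: the trial outcomes, the value of `p`, and the function names `bernoulli_product` and `binomial_kernel` are made up for illustration and do not come from the book or the question.

```python
import math

def bernoulli_product(p, trials):
    """Multiply the Bernoulli PMF p^x_i * (1-p)^(1-x_i) over all trials."""
    result = 1.0
    for x_i in trials:
        result *= p ** x_i * (1 - p) ** (1 - x_i)
    return result

def binomial_kernel(p, trials):
    """Compute p^x * (1-p)^(n-x), where x is the total number of successes."""
    n = len(trials)
    x = sum(trials)
    return p ** x * (1 - p) ** (n - x)

# Hypothetical data: x = 6 successes in n = 10 Bernoulli trials.
trials = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
p = 0.3  # arbitrary value of the parameter, chosen only for the check

lhs = bernoulli_product(p, trials)
rhs = binomial_kernel(p, trials)
print(math.isclose(lhs, rhs, rel_tol=1e-12))
```

The two values agree up to floating-point rounding for any `p` strictly between 0 and 1, because the product only collects the exponents of `p` and `1 - p`.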

That shows how you get the factors in the likelihood (by running the above steps backwards).

The leading constant $nC_x$ does not affect the value of $p$ that maximizes the likelihood.

We can show this by taking the log of the likelihood function and finding where its derivative is zero:

$\ln\left(nC_x~p^x(1-p)^{n-x}\right) = \ln(nC_x)+x\ln(p)+(n-x)\ln(1-p)$

Take the derivative with respect to $p$ and set it to $0$:

$\frac{d}{dp}\left[\ln(nC_x)+x\ln(p)+(n-x)\ln(1-p)\right] = \frac{x}{p}- \frac{n-x}{1-p} = 0$

$\implies \frac{n}{x} = \frac{1}{p} \implies p = \frac{x}{n}$
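As a rough numerical check of this result, one can maximize the log-likelihood over a grid of candidate values of $p$, once with the $\ln(nC_x)$ term and once without it. This is only a sketch under assumed values ($n = 10$, $x = 6$); the function name `log_likelihood` and the grid resolution are illustrative choices, and a grid search stands in for the calculus above.

```python
import math

def log_likelihood(p, x, n, include_constant=True):
    """Binomial log-likelihood; the ln(nCx) term is optional."""
    ll = x * math.log(p) + (n - x) * math.log(1 - p)
    if include_constant:
        ll += math.log(math.comb(n, x))
    return ll

# Hypothetical data: x = 6 successes in n = 10 trials.
n, x = 10, 6

# Candidate values of p strictly inside (0, 1), in steps of 0.001.
grid = [i / 1000 for i in range(1, 1000)]

p_hat_full = max(grid, key=lambda p: log_likelihood(p, x, n, True))
p_hat_kernel = max(grid, key=lambda p: log_likelihood(p, x, n, False))

print(p_hat_full, p_hat_kernel, x / n)
```

Both searches pick the same grid point, which coincides with $x/n$ here because $0.6$ lies on the grid. Adding $\ln(nC_x)$ shifts every value of the log-likelihood by the same amount, so it cannot move the maximizer.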

Notice that the leading constant dropped out of the calculation of the MLE.

More philosophically, a likelihood is only meaningful for inference up to a multiplying constant, such that if we have two likelihood functions $L_1,L_2$ and $L_1=kL_2$ , then they are inferentially equivalent. This is called the Law of Likelihood. Therefore, if we are comparing different values of $p$ using the same likelihood function, the leading term becomes irrelevant.

At a practical level, inference using the likelihood function is actually based on the likelihood ratio, not the absolute value of the likelihood. This is due to the asymptotic theory of likelihood ratios (which are asymptotically chi-square -- subject to certain regularity conditions that are often appropriate). Likelihood ratio tests are favored due to the Neyman-Pearson Lemma. Therefore, when we attempt to test two simple hypotheses, we will take the ratio and the common leading factor will cancel.

NOTE: This will not happen if you are comparing two different models, say a binomial and a Poisson. In that case, the constants are important.
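The cancellation of the common leading factor within a single binomial model can be illustrated with a small sketch. The data ($n = 10$, $x = 6$) and the two candidate values `p0` and `p1` are assumptions chosen only for the example, as are the function names.

```python
import math

def binomial_likelihood(p, x, n):
    """Full binomial likelihood, including the leading constant nCx."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def kernel(p, x, n):
    """Likelihood with the leading constant dropped."""
    return p ** x * (1 - p) ** (n - x)

# Hypothetical data and two simple hypotheses about p.
n, x = 10, 6
p0, p1 = 0.5, 0.6

ratio_full = binomial_likelihood(p1, x, n) / binomial_likelihood(p0, x, n)
ratio_kernel = kernel(p1, x, n) / kernel(p0, x, n)

print(math.isclose(ratio_full, ratio_kernel, rel_tol=1e-12))
```

The factor `math.comb(n, x)` appears in both the numerator and the denominator of `ratio_full`, so the two ratios agree up to rounding. This relies on both hypotheses sharing the same model and the same data.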

Of the above reasons, the first (irrelevance to finding the maximizer of L) most directly answers your question.

2

We can see that's the idea. But could you explain a little more on how $nC_x$ is removed and $n$ is replaced with 1?

— Ébe Isaac

@ÉbeIsaac added some more details

2

$x_i$ in the product refers to each individual trial. For each individual trial, $x_i$ can be 0 or 1, and $n$ is always equal to 1. Therefore, trivially, the binomial coefficient is equal to 1. Hence, in the product formula for the likelihood, the product of the binomial coefficients is 1, and so there is no $nC_x$ in the formula. Realised this while working it out step by step :) (Sorry about the formatting, not used to answering with mathematical expressions in answers...yet :) )
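A quick way to see this is to evaluate the binomial coefficient and the binomial PMF with $n = 1$. This is only a sketch; the value of `p` and the function names are illustrative assumptions.

```python
import math

def binomial_pmf(x, n, p):
    """Binomial PMF: nCx * p^x * (1-p)^(n-x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def bernoulli_pmf(x_i, p):
    """Bernoulli PMF for a single trial: p^x_i * (1-p)^(1-x_i)."""
    return p ** x_i * (1 - p) ** (1 - x_i)

p = 0.3  # arbitrary parameter value for the check

# With n = 1, the coefficient 1Cx is 1 for both possible outcomes.
coefficients = [math.comb(1, x_i) for x_i in (0, 1)]

# The binomial PMF with n = 1 matches the Bernoulli PMF for each outcome.
matches = [
    math.isclose(binomial_pmf(x_i, 1, p), bernoulli_pmf(x_i, p))
    for x_i in (0, 1)
]

print(coefficients, matches)
```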

— Abhishek Tiwari