Chinese restaurant table distribution

From HandWiki
Short description: probability distribution modeling number of tables in Chinese restaurant process
Chinese restaurant table
Parameters [math]\displaystyle{ \theta \gt 0 }[/math], [math]\displaystyle{ m \in \{0,1,2,\ldots \} }[/math]
Support [math]\displaystyle{ L \in \{0,1,2,\ldots,m\} }[/math]
pmf [math]\displaystyle{ \frac{\Gamma(\theta)}{\Gamma(m+\theta)} |s(m,\ell)| \theta^{\ell} }[/math]
Mean [math]\displaystyle{ \theta (\psi(\theta+m)-\psi(\theta)) }[/math] (see digamma function)

In probability theory and statistics, the Chinese restaurant table distribution (CRT) is the distribution of the number of occupied tables in the Chinese restaurant process after m customers have been seated.[1] It can be understood as the sum of m independent Bernoulli random variables, each with a different success probability:

[math]\displaystyle{ \begin{align} L & = \sum_{n=1}^m b_n \\[4pt] b_n & \sim \operatorname{Bernoulli} \left( \frac \theta {n-1+\theta}\right) \end{align} }[/math]
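
The Bernoulli representation above translates directly into a sampler. The following is a minimal sketch in Python, not a reference implementation; the function names `sample_crt` and `crt_mean` are illustrative assumptions and do not come from the cited sources.

```python
import random


def sample_crt(m, theta, rng=random):
    """Draw one CRT(m, theta) variate as a sum of independent Bernoulli draws.

    `sample_crt` is a hypothetical helper name chosen for this sketch.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    tables = 0
    for n in range(1, m + 1):
        # b_n ~ Bernoulli(theta / (n - 1 + theta)): customer n opens a new table.
        if rng.random() < theta / (n - 1 + theta):
            tables += 1
    return tables


def crt_mean(m, theta):
    """Exact mean of L: the sum of the Bernoulli success probabilities."""
    return sum(theta / (n - 1 + theta) for n in range(1, m + 1))
```

Because the first success probability is θ/θ = 1, every draw with m ≥ 1 returns at least 1. Averaging many calls to `sample_crt(m, theta)` should approach `crt_mean(m, theta)`, which equals θ(ψ(θ+m) − ψ(θ)).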

The probability mass function of L is given by[2]

[math]\displaystyle{ f(\ell) = \frac{\Gamma(\theta)}{\Gamma(m+\theta)} |s(m,\ell)| \theta^\ell }[/math]

where [math]\displaystyle{ |s(m,\ell)| }[/math] denotes the unsigned Stirling numbers of the first kind.
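
The probability mass function can be evaluated numerically once the unsigned Stirling numbers are available. The sketch below, under the assumption that m is small enough for exact integer arithmetic to be practical, builds them with the standard recurrence |s(n+1,k)| = n|s(n,k)| + |s(n,k−1)| and works in log space for the gamma ratio. The names `unsigned_stirling_first` and `crt_pmf` are hypothetical helpers introduced for illustration.

```python
from math import exp, lgamma, log


def unsigned_stirling_first(m):
    """Return [|s(m, 0)|, ..., |s(m, m)|] as exact Python integers."""
    row = [1]  # |s(0, 0)| = 1
    for n in range(m):
        new = [0] * (n + 2)
        for k, value in enumerate(row):
            new[k] += n * value      # term n * |s(n, k)|
            new[k + 1] += value      # term |s(n, k - 1)| shifted by one
        row = new
    return row


def crt_pmf(m, theta):
    """Return [f(0), ..., f(m)] for the CRT(m, theta) distribution."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    # log of Gamma(theta) / Gamma(m + theta)
    log_norm = lgamma(theta) - lgamma(m + theta)
    pmf = []
    for ell, count in enumerate(unsigned_stirling_first(m)):
        if count == 0:
            pmf.append(0.0)
        else:
            pmf.append(exp(log_norm + log(count) + ell * log(theta)))
    return pmf
```

As a quick check, `crt_pmf(2, 1.0)` gives probabilities close to 0, 0.5 and 0.5 for ℓ = 0, 1, 2, which agrees with the Bernoulli representation: the first customer always opens a table and the second opens one with probability 1/2.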

See also

  • Ewens sampling formula

References

  1. Zhou, Mingyuan; Carin, Lawrence (2012). "Negative Binomial Process Count and Mixture Modeling". https://arxiv.org/abs/1209.3442.
  2. Antoniak, Charles E (1974). "Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems". The Annals of Statistics.