Probability distribution on a torus
Samples from the cosine variant of the bivariate von Mises distribution. The green points are sampled from a distribution with high concentration and no correlation ($\kappa_{1}=\kappa_{2}=200$, $\kappa_{3}=0$), the blue points are sampled from a distribution with high concentration and negative correlation ($\kappa_{1}=\kappa_{2}=200$, $\kappa_{3}=100$), and the red points are sampled from a distribution with low concentration and no correlation ($\kappa_{1}=\kappa_{2}=20$, $\kappa_{3}=0$).
In probability theory and statistics, the bivariate von Mises distribution is a probability distribution describing values on a torus. It may be thought of as an analogue on the torus of the bivariate normal distribution. The distribution belongs to the field of directional statistics. The general bivariate von Mises distribution was first proposed by Kanti Mardia in 1975.[1][2] One of its variants is today used in the field of bioinformatics to formulate a probabilistic model of protein structure in atomic detail,[3][4] such as backbone-dependent rotamer libraries.
Definition
The bivariate von Mises distribution is a probability distribution defined on the torus, $S^{1}\times S^{1}$ in $\mathbb{R}^{3}$.
The probability density function of the general bivariate von Mises distribution for the angles $\phi,\psi\in[0,2\pi]$ is given by[1]

$$f(\phi,\psi)\propto\exp[\kappa_{1}\cos(\phi-\mu)+\kappa_{2}\cos(\psi-\nu)+(\cos(\phi-\mu),\sin(\phi-\mu))\mathbf{A}(\cos(\psi-\nu),\sin(\psi-\nu))^{T}],$$
where $\mu$ and $\nu$ are the means for $\phi$ and $\psi$, $\kappa_{1}$ and $\kappa_{2}$ their concentrations, and the matrix $\mathbf{A}\in\mathbb{M}(2,2)$ is related to their correlation.
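As an illustration, the general density can be evaluated up to its normalizing constant directly from the formula. The following is a minimal sketch using only the Python standard library; the function name `bvm_general_unnormalized` and the representation of $\mathbf{A}$ as a nested list are assumptions made for this example and are not taken from the cited sources.

```python
import math


def bvm_general_unnormalized(phi, psi, mu, nu, kappa1, kappa2, A):
    """Unnormalized general bivariate von Mises density at angles (phi, psi).

    A is a 2x2 nested sequence [[a11, a12], [a21, a22]] (hypothetical layout).
    """
    # Unit vectors (cos, sin) of the centered angles.
    u = (math.cos(phi - mu), math.sin(phi - mu))
    v = (math.cos(psi - nu), math.sin(psi - nu))
    # Bilinear interaction term u A v^T.
    interaction = sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))
    # kappa1 * cos(phi - mu) + kappa2 * cos(psi - nu) + interaction term.
    return math.exp(kappa1 * u[0] + kappa2 * v[0] + interaction)
```

When every entry of `A` is zero the interaction term vanishes and the value factorizes into the product of two univariate von Mises kernels, so the two angles are independent in that case.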
Two commonly used variants of the bivariate von Mises distribution are the sine and cosine variants.
The cosine variant of the bivariate von Mises distribution[3] has the probability density function

$$f(\phi,\psi)=Z_{c}(\kappa_{1},\kappa_{2},\kappa_{3})\ \exp[\kappa_{1}\cos(\phi-\mu)+\kappa_{2}\cos(\psi-\nu)-\kappa_{3}\cos(\phi-\mu-\psi+\nu)],$$
where $\mu$ and $\nu$ are the means for $\phi$ and $\psi$, $\kappa_{1}$ and $\kappa_{2}$ their concentrations, and $\kappa_{3}$ is related to their correlation. $Z_{c}$ is the normalization constant. This distribution with $\kappa_{3}=0$ has been used for kernel density estimates of the distribution of the protein dihedral angles $\phi$ and $\psi$.[4]
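The normalization constant $Z_{c}$ can be approximated numerically by integrating the exponential factor over the torus. The sketch below uses a simple rectangle rule on a periodic grid; the function names and the grid size `n` are assumptions chosen for illustration, and the approach is a generic quadrature rather than a method taken from the cited references. Because shifting both angles leaves the integral over the torus unchanged, $Z_{c}$ does not depend on $\mu$ or $\nu$, so the grid is evaluated with both set to zero.

```python
import math


def cosine_unnormalized(phi, psi, mu, nu, kappa1, kappa2, kappa3):
    """Exponential factor of the cosine variant (without Z_c)."""
    return math.exp(
        kappa1 * math.cos(phi - mu)
        + kappa2 * math.cos(psi - nu)
        - kappa3 * math.cos(phi - mu - psi + nu)
    )


def cosine_normalizer(kappa1, kappa2, kappa3, n=200):
    """Approximate Z_c with an n-by-n rectangle rule on [0, 2*pi)^2."""
    h = 2.0 * math.pi / n  # grid spacing in each angle
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += cosine_unnormalized(
                i * h, j * h, 0.0, 0.0, kappa1, kappa2, kappa3
            )
    # Z_c multiplies the exponential, so it is the reciprocal of the integral.
    return 1.0 / (total * h * h)
```

For $\kappa_{3}=0$ the density factorizes into two univariate von Mises densities, so the result can be checked against $1/(4\pi^{2}I_{0}(\kappa_{1})I_{0}(\kappa_{2}))$, where $I_{0}$ is the modified Bessel function of the first kind of order zero. For strongly concentrated distributions, such as $\kappa_{1}=\kappa_{2}=200$, a finer grid than the default would likely be needed.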
The sine variant has the probability density function[5]

$$f(\phi,\psi)=Z_{s}(\kappa_{1},\kappa_{2},\kappa_{3})\ \exp[\kappa_{1}\cos(\phi-\mu)+\kappa_{2}\cos(\psi-\nu)+\kappa_{3}\sin(\phi-\mu)\sin(\psi-\nu)],$$
where the parameters have the same interpretation.
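Both variants can be recovered from the general density by a particular choice of the matrix $\mathbf{A}$. The short derivation below is a sketch that writes $a_{ij}$ for the entries of $\mathbf{A}$ and abbreviates $\alpha=\phi-\mu$ and $\beta=\psi-\nu$; these symbols are introduced here for illustration only.

```latex
\begin{aligned}
(\cos\alpha,\ \sin\alpha)\,\mathbf{A}\,(\cos\beta,\ \sin\beta)^{T}
  &= a_{11}\cos\alpha\cos\beta + a_{12}\cos\alpha\sin\beta
   + a_{21}\sin\alpha\cos\beta + a_{22}\sin\alpha\sin\beta, \\
\text{sine variant: } \mathbf{A} = \begin{pmatrix} 0 & 0 \\ 0 & \kappa_{3} \end{pmatrix}
  &\;\Rightarrow\; \kappa_{3}\sin\alpha\sin\beta, \\
\text{cosine variant: } \mathbf{A} = -\kappa_{3}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
  &\;\Rightarrow\; -\kappa_{3}(\cos\alpha\cos\beta + \sin\alpha\sin\beta)
   = -\kappa_{3}\cos(\alpha-\beta).
\end{aligned}
```

Since $\alpha-\beta=\phi-\mu-\psi+\nu$, the last line matches the interaction term of the cosine variant term by term.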
References
[1] Mardia, Kanti (1975). "Statistics of directional data". J. R. Stat. Soc. B. 37 (3): 349–393. doi:10.1111/j.2517-6161.1975.tb01550.x. JSTOR 2984782.
[2] Mardia, K. V.; Frellsen, J. (2012). "Statistics of Bivariate von Mises Distributions". Bayesian Methods in Structural Bioinformatics. Statistics for Biology and Health. pp. 159. doi:10.1007/978-3-642-27225-7_6. ISBN 978-3-642-27224-0.
[3] Boomsma, W.; Mardia, K. V.; Taylor, C. C.; Ferkinghoff-Borg, J.; Krogh, A.; Hamelryck, T. (2008). "A generative, probabilistic model of local protein structure". Proceedings of the National Academy of Sciences. 105 (26): 8932–8937. Bibcode:2008PNAS..105.8932B. doi:10.1073/pnas.0801715105. PMC 2440424. PMID 18579771.
[4] Shapovalov, M. V.; Dunbrack, R. L. (2011). "A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions". Structure. 19 (6): 844–858. doi:10.1016/j.str.2011.03.019. PMC 3118414. PMID 21645855.
[5] Singh, H. (2002). "Probabilistic model for two dependent circular variables". Biometrika. 89 (3): 719–723. doi:10.1093/biomet/89.3.719.