Stan is a probabilistic programming language for statistical inference written in C++.[2] The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.[2]
Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method.[2]
Stan was created by a development team consisting of 52 members[3] that includes Andrew Gelman, Bob Carpenter, Daniel Lee, Ben Goodrich, and others.
A simple linear regression model can be described as $y_n = \alpha + \beta x_n + \epsilon_n$, where $\epsilon_n \sim \text{normal}(0, \sigma)$. This can also be expressed as $y_n \sim \text{normal}(\alpha + \beta x_n, \sigma)$. The latter form can be written in Stan as the following:
data {
  int<lower=0> N;       // number of observations
  vector[N] x;          // predictor
  vector[N] y;          // outcome
}
parameters {
  real alpha;           // intercept
  real beta;            // slope
  real<lower=0> sigma;  // error scale
}
model {
  y ~ normal(alpha + beta * x, sigma);
}
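To make concrete what "calculating the log probability density function" means for this model, the following Python sketch evaluates the log density contributed by the sampling statement above. The function name `log_density` and the toy data are illustrative assumptions and are not part of Stan. The sketch keeps the normalizing constants, whereas Stan drops constant terms for `~` statements and samples `sigma` on an unconstrained scale.

```python
import math

def log_density(alpha, beta, sigma, x, y):
    """Log density of y[n] ~ normal(alpha + beta * x[n], sigma), summed over n.

    Illustrative helper, not a Stan API. No priors are declared in the Stan
    program, so only the likelihood contributes.
    """
    if sigma <= 0:
        # Outside the support declared by real<lower=0> sigma.
        return -math.inf
    total = 0.0
    for x_n, y_n in zip(x, y):
        mu_n = alpha + beta * x_n          # linear predictor
        z = (y_n - mu_n) / sigma           # standardized residual
        total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total

# Toy data generated exactly by y = 1 + 2 * x (an assumption for illustration).
x = [0.0, 1.0, 2.0]
y = [1.0, 3.0, 5.0]

exact_fit = log_density(1.0, 2.0, 1.0, x, y)  # residuals are all zero
poor_fit = log_density(0.0, 0.0, 1.0, x, y)   # ignores the predictor
```

Parameter values that fit the data better receive a higher log density, which is the quantity the inference algorithms explore.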
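The Stan language itself can be accessed through several interfaces, including CmdStan (command line), RStan and CmdStanR (R), PyStan and CmdStanPy (Python), and Stan.jl (Julia).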
In addition, higher-level interfaces are provided by packages that use Stan as a backend, primarily in the R language, such as brms and rstanarm.[4]
Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.
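As a rough illustration of what a gradient-based MCMC algorithm does, the sketch below implements a plain Hamiltonian Monte Carlo sampler for a one-dimensional target in Python. It is a minimal sketch under stated assumptions: the function name `hmc_sample`, the fixed step size, the fixed number of leapfrog steps, and the standard normal target are all chosen for illustration. It is not Stan's implementation, which adapts its tuning parameters automatically.

```python
import math
import random

def hmc_sample(log_p, grad_log_p, q0, n_samples, step=0.2, n_leapfrog=10, seed=0):
    """Plain HMC for a scalar parameter (illustrative, fixed tuning)."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                 # resample the momentum
        h_old = -log_p(q) + 0.5 * p * p         # Hamiltonian at the start
        q_new, p_new = q, p
        # Leapfrog integration: half step in momentum, full steps in
        # position, full momentum steps in between, final half step.
        p_new += 0.5 * step * grad_log_p(q_new)
        for i in range(n_leapfrog):
            q_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_p(q_new)
        p_new += 0.5 * step * grad_log_p(q_new)
        h_new = -log_p(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integration error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples

# Assumed target for the demonstration: a standard normal density.
draws = hmc_sample(lambda q: -0.5 * q * q, lambda q: -q, q0=0.0, n_samples=5000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The gradient of the log density steers each proposal along a trajectory, which is why such samplers depend on an efficient way to compute derivatives of the model.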
Stan implements reverse-mode automatic differentiation to calculate gradients of the model, which are required by Hamiltonian Monte Carlo (HMC), the No-U-Turn Sampler (NUTS), L-BFGS, BFGS, and variational inference.[2] The automatic differentiation within Stan can be used outside of the probabilistic programming language.
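The idea behind reverse-mode automatic differentiation can be sketched in a few lines of Python: each operation records its inputs and local partial derivatives, and a backward sweep accumulates the gradient of the output with respect to every input. The class `Var` and the functions `log` and `backward` below are hypothetical names for illustration only. Stan's own automatic differentiation is written in C++ and is far more elaborate.

```python
import math

class Var:
    """Node in a computation graph for reverse-mode differentiation."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuple of (parent_node, local_partial)
        self.grad = 0.0          # d(output)/d(this node), filled by backward()

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __radd__ = __add__
    __rmul__ = __mul__

def log(v):
    """Natural logarithm of a Var, with local derivative 1 / value."""
    return Var(math.log(v.value), ((v, 1.0 / v.value),))

def backward(output):
    """Propagate gradients from the output back to all of its inputs."""
    order, seen = [], set()

    def visit(node):
        # Depth-first search that lists every node after its parents.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, partial in node.parents:
            parent.grad += partial * node.grad

# f(a, b) = a * b + log(a), so df/da = b + 1/a and df/db = a.
a = Var(2.0)
b = Var(3.0)
f = a * b + log(a)
backward(f)
```

A single backward sweep yields the derivative with respect to every input at once, which is what makes the reverse mode attractive for models with many parameters and a scalar log density.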
Stan is used in fields including social science,[9] pharmaceutical statistics,[10] market research,[11] and medical imaging.[12]