This is a set of Matlab routines I wrote for the course STAT535D: Statistical
Computing and Monte Carlo Methods
by A. Doucet. It implements
different Markov Chain Monte Carlo (MCMC) strategies for sampling from the
posterior distribution over the
parameter values for binary Probit and Logistic Regression models with a
Gaussian prior on the parameter values. Specifically, we are sampling from:
P(y=1|w,x) = f(w'x)
w ~ N(0,v)
In the above, x is a set of p features, y is the class label (-1
or 1), w is the set of parameters we want to estimate, and N(0,v) denotes
the prior Normal distribution on w (with mean 0 and prior
covariance matrix v). In Logistic Regression, f(a) is the
sigmoid function 1/(1+exp(-a)), while for Probit Regression it is the
Gaussian cumulative distribution function.
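As a quick illustration of the two link functions, here is a small Python sketch (the package itself is Matlab; the function names below are just for this example):

```python
import math

def sigmoid(a):
    """Logistic link: P(y=1 | w, x) = 1/(1+exp(-a)) with a = w'x."""
    return 1.0 / (1.0 + math.exp(-a))

def probit(a):
    """Probit link: the Gaussian cumulative distribution function at a = w'x."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

# P(y=1 | w, x) = f(w'x) under either link
w = [0.5, -0.25]
x = [1.0, 2.0]
a = sum(wi * xi for wi, xi in zip(w, x))  # w'x = 0.0
print(sigmoid(a), probit(a))              # both links give 0.5 at a = 0
```

Both links map w'x to a probability in (0, 1); they differ mainly in tail behavior.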
The second Probit Regression sampling strategy (probit2Sample.m) uses the same model, but implements an alternative approach of Holmes and Held ("Bayesian auxiliary variable models for binary and multinomial regression", 2006). This approach jointly samples w and z, by directly sampling z from its marginal distribution (integrating over w).
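For intuition, here is a minimal NumPy sketch of the basic auxiliary-variable Gibbs scheme that the joint sampler improves on, alternating z given w and w given z (the function name and the rejection-sampling step are illustrative choices for this sketch, not code from the package):

```python
import numpy as np

def probit_gibbs(X, y, v, n_samples=500, seed=0):
    """Illustrative auxiliary-variable Gibbs sampler for the Probit model.
    This alternates z | w and w | z; probit2Sample.m instead samples z from
    its marginal (integrating over w), then w | z.
    X: n-by-p features, y in {-1,+1}, v: prior covariance of w ~ N(0, v)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    V = np.linalg.inv(np.linalg.inv(v) + X.T @ X)  # Cov(w | z)
    L = np.linalg.cholesky(V)
    samples = np.empty((n_samples, p))
    for t in range(n_samples):
        # z_i | w ~ N(x_i'w, 1), truncated to the side of 0 given by y_i;
        # simple rejection sampling keeps the sketch short (slow in extreme tails).
        m = X @ w
        z = m + rng.standard_normal(n)
        bad = y * z <= 0
        while bad.any():
            z[bad] = m[bad] + rng.standard_normal(int(bad.sum()))
            bad = y * z <= 0
        # w | z ~ N(V X'z, V)
        w = V @ (X.T @ z) + L @ rng.standard_normal(p)
        samples[t] = w
    return samples
```

The appeal of the joint (marginal) update is that it avoids the slow mixing this simple alternating scheme can exhibit when z and w are strongly correlated.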
Logistic Regression: There are 3 core strategies implemented for sampling from the Logistic model. The first strategy (logist2SampleMH.m) uses the Metropolis-Hastings algorithm outlined in Johnson and Albert ("Ordinal Data Modeling", Springer, 1999). The Iteratively-Reweighted Least Squares (IRLS) algorithm is used to find the Maximum a Posteriori (MAP) estimate of w, and this value is used to initialize the Markov Chain. The Asymptotic Covariance Matrix and an adaptively updated kernel width parameter are used to generate proposals.
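A hedged Python sketch of this kind of random-walk Metropolis-Hastings step follows; the function names are made up for the example, w0 stands in for the MAP estimate, and Sigma for the (scaled) asymptotic covariance, while logist2SampleMH.m additionally adapts the kernel width as it runs:

```python
import numpy as np

def log_posterior(w, X, y, v_inv):
    """Unnormalized log posterior for Logistic Regression with prior w ~ N(0, v);
    with y in {-1,+1}, P(y | w, x) = sigmoid(y * w'x)."""
    a = y * (X @ w)
    return -np.sum(np.logaddexp(0.0, -a)) - 0.5 * w @ v_inv @ w

def logistic_mh(X, y, v, w0, Sigma, n_samples=1000, seed=0):
    """Random-walk Metropolis-Hastings: start at w0 (ideally the MAP estimate)
    and propose from N(w, Sigma)."""
    rng = np.random.default_rng(seed)
    v_inv = np.linalg.inv(v)
    C = np.linalg.cholesky(Sigma)
    w, lp = w0.copy(), log_posterior(w0, X, y, v_inv)
    samples = np.empty((n_samples, len(w0)))
    for t in range(n_samples):
        w_prop = w + C @ rng.standard_normal(len(w))
        lp_prop = log_posterior(w_prop, X, y, v_inv)
        if np.log(rng.uniform()) < lp_prop - lp:  # symmetric proposal, so no Hastings ratio
            w, lp = w_prop, lp_prop
        samples[t] = w
    return samples
```

Initializing at the MAP estimate and shaping proposals by the asymptotic covariance keeps the acceptance rate reasonable without a long burn-in.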
The 2nd strategy for the Logistic model (logist2Sample.m) is the Logistic variant of the Holmes and Held Probit Regression sampler. Rather than having a unit variance as in the Probit model, in the Logistic model the variances of the z variables (lambda) are obtained in this case by sampling from a Kolmogorov-Smirnov distribution. This block-Gibbs sampler updates z and w jointly conditioned on lambda (as in the Probit model), then samples lambda conditioned on z and w.
The 3rd strategy for the Logistic model (logist2Sample2.m) is the 2nd block-Gibbs sampling strategy of Holmes and Held. In this second approach, z and lambda are updated jointly given w (z is sampled from a truncated Logistic distribution), then w is sampled conditioned on z and lambda.
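Sampling from a truncated Logistic distribution is straightforward by inverse-CDF sampling, since the Logistic CDF and its inverse have closed forms. A small illustrative Python sketch (not the package's code; the function name is made up):

```python
import numpy as np

def truncated_logistic(mu, s, lo, hi, rng):
    """Draw from Logistic(mu, s) truncated to (lo, hi) by inverse-CDF sampling.
    The Logistic CDF is F(z) = 1/(1 + exp(-(z - mu)/s))."""
    F = lambda z: 1.0 / (1.0 + np.exp(-(z - mu) / s))
    a, b = F(lo), F(hi)
    u = a + rng.uniform() * (b - a)      # uniform on (F(lo), F(hi))
    return mu + s * np.log(u / (1.0 - u))  # closed-form inverse CDF

rng = np.random.default_rng(0)
draws = [truncated_logistic(0.0, 1.0, 0.0, np.inf, rng) for _ in range(2000)]
```

Every draw lands in the truncation interval by construction, with no rejection loop.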
Sparse Logistic Regression: A 4th strategy is implemented for a slightly different Logistic model (logist_FS_Sample.m). In this model, we have an additional set of variables gamma that indicate whether a variable is included in the model. The effect of this is that each sample only depends on a subset of the variables, and sampling gamma lets us examine the posterior distribution over whether each variable is 'relevant' to the classification. This function implements the method described in Holmes and Held, which augments the 2nd Logistic strategy above with reversible-jump trans-dimensional moves to update gamma.
Also included is IRLS code that computes the MAP
estimate of the Logistic model (and optionally the Asymptotic Covariance matrix). This code has the
interface:
w = L2LogReg_IRLS(X,y,v)
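For reference, here is a minimal Python sketch of the same IRLS (Newton) computation; the Matlab routine is the actual implementation, and the function name, defaults, and convergence test below are assumptions for this sketch:

```python
import numpy as np

def l2_logreg_irls(X, y, v, n_iters=50, tol=1e-8):
    """MAP estimate of w for Logistic Regression with prior w ~ N(0, v),
    via Newton's method / Iteratively-Reweighted Least Squares.
    X: n-by-p features, y in {-1,+1}, v: prior covariance matrix."""
    n, p = X.shape
    v_inv = np.linalg.inv(v)
    w = np.zeros(p)
    for _ in range(n_iters):
        mu = 1.0 / (1.0 + np.exp(-(X @ w)))                # P(y=1 | w, x_i)
        g = X.T @ (mu - (y + 1) / 2.0) + v_inv @ w          # gradient of neg. log posterior
        H = X.T @ (X * (mu * (1 - mu))[:, None]) + v_inv    # Hessian
        step = np.linalg.solve(H, g)
        w = w - step
        if np.max(np.abs(step)) < tol:
            break
    return w  # inv(H) at the returned w estimates the asymptotic covariance
```

The inverse Hessian at the MAP estimate is what the samplers above use as the Asymptotic Covariance matrix for shaping proposals.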
The complete set of .m files is available here. The report for this class project is available here. Some of the samplers also use RANDRAW.
The blogreg package contains many sub-directories that must be present on the Matlab path for the files to work. You can add these sub-directories to the Matlab path by typing (in Matlab) 'addpath(genpath(blogreg_dir))', where 'blogreg_dir' is the directory that the zip file was extracted to. We show that large-scale probit models can be estimated with sparse matrix representations and Gibbs sampling of a truncated multivariate normal distribution with the ...
Note that a bug in the sampleLambda.m function was fixed in the summer of 2013 (thanks to Shalom Chiang for pointing this out).