8 Markov Chain Monte Carlo

“This chapter introduces one of the more marvelous examples of how Fortuna and Minerva cooperate: the estimation of posterior probability distributions using a stochastic process known as Markov chain Monte Carlo (MCMC) estimation” (McElreath, 2015, p. 241). Though we’ve been using MCMC via the brms package for several chapters now, this chapter should clarify more of the questions you may have about the details.

8.1 Good King Markov and his island kingdom

Here we simulate King Markov’s journey. In this version of the code, we’ve added set.seed(), which makes the exact results reproducible.

library(tidyverse)

set.seed(8)

num_weeks <- 1e5
positions <- rep(0, num_weeks)
current   <- 10
for (i in 1:num_weeks) {
  # record actual position
  positions[i] <- current
  # flip coin to generate proposal
  proposal <- current + sample(c(-1, 1), size = 1)
  # now make sure he loops around the archipelago
  if (proposal < 1) proposal <- 10
  if (proposal > 10) proposal <- 1
  # move?
  prob_move <- proposal / current
  current   <- ifelse(runif(1) < prob_move, proposal, current)
}

In this chapter, we’ll borrow a theme, theme_ipsum(), from the hrbrthemes package (Rudis, 2020).

# install.packages("hrbrthemes", dependencies = T)
# macOS users may need to install XQuartz, too (https://www.xquartz.org/)
library(hrbrthemes)

Figure 8.2.a.

library(tidyverse)

tibble(week   = 1:1e5,
       island = positions) %>%

  ggplot(aes(x = week, y = island)) +
  geom_point(shape = 1) +
  scale_x_continuous(breaks = seq(from = 0, to = 100, by = 20)) +
  scale_y_continuous(breaks = seq(from = 0, to = 10, by = 2)) +
  coord_cartesian(xlim = c(0, 100)) +
  labs(title    = "Behold: The Metropolis algorism in action!",
       subtitle = "The dots show the king's path over the foremost 100 weeks.") +
  theme_ipsum()

Figure 8.2.b.

tibble(week   = 1:1e5,
       island = positions) %>%
  mutate(island = factor(island)) %>%

  ggplot(aes(x = island)) +
  geom_bar() +
  labs(title    = "Old Metropolis scheinen in to long run.",
       subtitle = "Sure suffice, the start the king spent set each island was\nproportional to its population size.") +
  theme_ipsum()

8.2 Markov chain Monte Carlo

“The Metropolis algorithm is the grandparent of several different strategies for getting samples from unknown posterior distributions” (p. 245). If you’re interested, Robert & Casella (2011) wrote a good historical overview of MCMC.

8.2.1 Gibbs sampling.

The Gibbs sampler (Casella & George, 1992; Geman & Geman, 1984) uses conjugate pairs (i.e., pairs of priors and likelihoods that have analytic solutions for the posterior of an individual parameter) to efficiently sample from the posterior. Gibbs was the workhorse algorithm during the rise of Bayesian computation in the 1990s. However, it’s limited in that (a) you might not want to use conjugate priors and (b) it can be quite inefficient with complex hierarchical models, which we’ll be fitting soon.

We will not be using the Gibbs sampler in this project. It’s still available for use in R. For an extensive applied introduction, check out Kruschke’s (2015) text.
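Though we won’t use Gibbs in this ebook, a toy example may help clarify what conjugate pairs buy you. Below is a minimal sketch of a two-step Gibbs sampler for Gaussian data with a conjugate normal prior on the mean and a conjugate inverse-gamma prior on the variance. This is our own illustration, not code from the text or from any package, and all object names are made up for the example.

# Gaussian data with unknown mean and variance
set.seed(8)
y <- rnorm(50, mean = 2, sd = 1.5)
n <- length(y)

# conjugate prior hyperparameters: mu ~ Normal(m0, s0sq), sigma^2 ~ Inv-Gamma(a0, b0)
m0   <- 0
s0sq <- 100
a0   <- 1
b0   <- 1

n_iter    <- 2000
mu        <- numeric(n_iter)
sigma2    <- numeric(n_iter)
mu[1]     <- 0
sigma2[1] <- 1

for (i in 2:n_iter) {
  # draw mu from its analytic conditional posterior (normal-normal conjugacy)
  v     <- 1 / (n / sigma2[i - 1] + 1 / s0sq)
  m     <- v * (sum(y) / sigma2[i - 1] + m0 / s0sq)
  mu[i] <- rnorm(1, mean = m, sd = sqrt(v))
  
  # draw sigma^2 from its analytic conditional posterior (inverse-gamma conjugacy)
  a         <- a0 + n / 2
  b         <- b0 + sum((y - mu[i])^2) / 2
  sigma2[i] <- 1 / rgamma(1, shape = a, rate = b)
}

# the posterior means should land near the sample mean and SD
c(mean(mu), mean(sqrt(sigma2)))
c(mean(y), sd(y))

Because each conditional posterior has a known form, every draw is accepted. That efficiency is exactly what Gibbs trades for the restriction to conjugate priors.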

8.2.2 Hamiltonian Monte Carlo.

Hamiltonian Monte Carlo (HMC) is more computationally costly than Gibbs sampling, but it is more efficient at sampling from the posterior. It needs fewer samples, especially when fitting models with many parameters. To learn more about how HMC works, check out McElreath’s lecture on the topic from January 2019 or one of these lectures (here, here, or here) by Michael Betancourt.

8.3 Easy HMC: map2stan brm()

Much like McElreath’s rethinking package, brms offers an easy interface to HMC via Stan. Other packages that provide Stan interfaces include rstanarm (Brilleman et al., 2018; Gabry & Goodrich, 2022) and blavaan (Merkle et al., 2021, 2022; Merkle & Rosseel, 2018). I’m not aware of any up-to-date comparisons across the packages. If you’re ever inclined to make one, let the rest of us know!

Here we load the rugged data.

library(rethinking)
data(rugged)
d <- rugged

Switch from rethinking to brms.

detach(package:rethinking)
library(brms)
rm(rugged)

It takes just a sec to do a little data wrangling.

d <- 
  d %>%
  mutate(log_gdp = log(rgdppc_2000))

dd <-
  d %>%
  drop_na(rgdppc_2000)

In the context of this chapter, it doesn’t make sense to translate McElreath’s m8.1 map() code to brm() code. Below, we’ll just go directly to the brm() variant of his m8.1stan.

8.3.1 Preparation.

When working with brms, you don’t need to do the data processing McElreath did on pages 248 and 249. If you wanted to, though, here’s how you might do it within the tidyverse.

dd.trim <-
  dd %>%
  select(log_gdp, rugged, cont_africa)

str(dd.trim)

8.3.2 Estimation.

Finally, we get to work that sweet HMC via brms::brm().

b8.1 <-
  brm(data = dd, 
      family = gaussian,
      log_gdp ~ 1 + rugged + cont_africa + rugged:cont_africa,
      prior = c(prior(normal(0, 100), class = Intercept),
                prior(normal(0, 10), class = b),
                prior(cauchy(0, 2), class = sigma)),
      seed = 8,
      file = "fits/b08.01")

Now we have officially ditched the uniform distribution for \(\sigma\). Never again! Here’s the posterior.

print(b8.1)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: log_gdp ~ 1 + rugged + cont_africa + rugged:cont_africa 
##    Data: dd (Number of observations: 170) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept              9.23      0.14     8.94     9.51 1.00     3351     3290
## rugged                -0.20      0.08    -0.36    -0.05 1.00     3375     3389
## cont_africa           -1.95      0.23    -2.40    -1.51 1.00     3007     3101
## rugged:cont_africa     0.39      0.13     0.13     0.65 1.00     2854     3320
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.95      0.05     0.85     1.06 1.00     4356     2577
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Do note a couple things: If you look closely at the summary information at the top, you’ll see that the brm() function defaults to chains = 4. If you check the manual, you’ll see it also defaults to cores = 1, as well as iter = 2000 and warmup = 1000. Also of note, McElreath’s rethinking::precis() returns highest posterior density intervals (HPDIs) when summarizing map2stan() models. Not so with brms. If you want HPDIs, you’ll have to use the convenience functions from the tidybayes package. Here’s an example.

library(tidybayes)

post <- as_draws_df(b8.1)

post %>% 
  select(b_Intercept:sigma) %>% 
  gather() %>% 
  group_by(key) %>% 
  mean_hdi(value, .width = .89)  # note our rare use of 89% intervals
## # A tibble: 5 × 7
##   key                   value .lower  .upper .width .point .interval
##   <chr>                 <dbl>  <dbl>   <dbl>  <dbl> <chr>  <chr>    
## 1 b_cont_africa        -1.95  -2.31  -1.59     0.89 mean   hdi      
## 2 b_Intercept           9.23   9.00   9.45     0.89 mean   hdi      
## 3 b_rugged             -0.203 -0.327 -0.0743   0.89 mean   hdi      
## 4 b_rugged:cont_africa  0.391  0.167  0.586    0.89 mean   hdi      
## 5 sigma                 0.950  0.870  1.04     0.89 mean   hdi

There’s one more important difference in our brms summary output compared to McElreath’s rethinking::precis() output. In the text we learn precis() returns n_eff values for each parameter. Earlier versions of brms used to have a direct analogue named Eff.Sample. Both were estimates of the effective number of samples (a.k.a. the effective sample size) for each parameter. As with typical sample size, the more the merrier. Starting with version 2.10.0, brms returns two columns instead: Bulk_ESS and Tail_ESS. These originate from a (2021) paper by Stan-team all-stars Vehtari, Gelman, Simpson, Carpenter, and Bürkner. From their paper, we read:

If you plan to report quantile estimates or posterior intervals, we strongly suggest assessing the convergence of the chains for these quantiles. In Section 4.3 we show that convergence of Markov chains is not uniform across the parameter space and propose diagnostics and effective sample sizes specifically for extreme quantiles. This is different from the standard ESS estimate (which we refer to as the “bulk-ESS”), which mainly assesses how well the centre of the distribution is resolved. Instead, these “tail-ESS” measures allow the user to estimate the MCSE for interval estimates. (p. 5, emphasis in the original)

For more technical details, see the paper. In short, Bulk_ESS in the output from brms 2.10.0+ is what was previously referred to as Eff.Sample in earlier versions. It’s also what corresponds to what McElreath calls n_eff. This indexes the number of effective samples in the center of the posterior distribution (i.e., the posterior mean or median). But since we also care about uncertainty in our parameters, we care about stability in the 95% intervals and such. The newer Tail_ESS in the brms output allows us to gauge the effective sample size for those intervals.
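If you’d like to compute those quantities yourself, the posterior package (which brms uses under the hood) exposes ess_bulk() and ess_tail() directly. Here’s a quick sketch applying them to the draws from b8.1; think of it as a cross-check on the print() output rather than part of the standard workflow.

library(posterior)

# compute bulk and tail effective sample sizes by hand
as_draws_df(b8.1) %>% 
  summarise_draws(ess_bulk, ess_tail)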

8.3.3 Sampling again, in parallel.

Here we sample in parallel by adding cores = 4.

b8.1b <- 
  update(b8.1, 
         cores = 4,
         seed = 8,
         file = "fits/b08.01b")

This model sampled so fast that it honestly didn’t matter whether we sampled in parallel or not. It will for others.

print(b8.1b)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: log_gdp ~ 1 + rugged + cont_africa + rugged:cont_africa 
##    Data: dd (Number of observations: 170) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept              9.23      0.14     8.94     9.51 1.00     3351     3290
## rugged                -0.20      0.08    -0.36    -0.05 1.00     3375     3389
## cont_africa           -1.95      0.23    -2.40    -1.51 1.00     3007     3101
## rugged:cont_africa     0.39      0.13     0.13     0.65 1.00     2854     3320
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.95      0.05     0.85     1.06 1.00     4356     2577
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

8.3.4 Visualization.

Unlike the way rethinking’s extract.samples() yields a list, brms’s as_draws_df() returns a data frame.

post <- as_draws_df(b8.1)
str(post)
## draws_df [4,000 × 10] (S3: draws_df/draws/tbl_df/tbl/data.frame)
##  $ b_Intercept         : num [1:4000] 9.31 9.14 9.07 9.1 9.21 ...
##  $ b_rugged            : num [1:4000] -0.27 -0.205 -0.256 -0.194 -0.223 ...
##  $ b_cont_africa       : num [1:4000] -2.14 -2.19 -1.87 -1.85 -2 ...
##  $ b_rugged:cont_africa: num [1:4000] 0.554 0.48 0.404 0.412 0.395 ...
##  $ sigma               : num [1:4000] 0.931 0.942 0.946 0.899 0.996 ...
##  $ lprior              : num [1:4000] -16.6 -16.6 -16.6 -16.5 -16.6 ...
##  $ lp__                : num [1:4000] -247 -248 -250 -247 -247 ...
##  $ .chain              : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
##  $ .iteration          : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
##  $ .draw               : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...

Though we won’t be doing so in this ebook, you could instead use the as_draws() or as_draws_list() functions if you really wanted your posterior draws in a list format.
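For a quick sense of what that looks like, here’s a peek at the structure of the list format. This is just a sketch of our own; we won’t use it again.

as_draws_list(b8.1) %>% 
  str(max.level = 2)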

As with McElreath’s rethinking, brms allows users to put the post data frame or the brmsfit object directly into pairs().

pairs(b8.1,
      off_diag_args = list(size = 1/5, alpha = 1/5))

Another good way to customize your pairs plot is with the GGally package.

library(GGally)
post %>%
  select(b_Intercept:sigma) %>%
  ggpairs()

Since GGally::ggpairs() returns a ggplot2 object, you can customize it as you please.

my_upper <- function(data, mapping, ...) {
  
  # get the x and y data to use in the code below
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)
  
  r  <- unname(cor.test(x, y)$estimate)
  rt <- format(r, digits = 2)[1]
  tt <- as.character(rt)
  
  # plot the cor value
  ggally_text(
    label = tt, 
    mapping = aes(),
    size = 4,
    color = "grey20") +
    theme_void() 
}

my_diag <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) + 
    geom_density(fill = "grey50") +
    theme_ipsum()
}

my_lower <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) + 
    geom_point(shape = 1, size = 1/2, alpha = 1/6) +
    theme_ipsum()
  }

post %>%
  select(b_Intercept:sigma) %>%

  ggpairs(upper = list(continuous = my_upper),
          diag  = list(continuous = my_diag),
          lower = list(continuous = my_lower)) +
  labs(subtitle = "My custom coupled plot") +
  theme(strip.text = element_text(size = 8))

For more ideas on customizing a ggpairs() plot, go here.

8.3.5 Using the samples.

Older versions of brms allowed users to include information criteria as part of the model summary by adding loo = T and/or waic = T to the summary() function (e.g., summary(b8.1, loo = T, waic = T)). However, this is no longer the case. E.g.,

summary(b8.1, loo = T, waic = T)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: log_gdp ~ 1 + rugged + cont_africa + rugged:cont_africa 
##    Data: dd (Number of observations: 170) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept              9.23      0.14     8.94     9.51 1.00     3351     3290
## rugged                -0.20      0.08    -0.36    -0.05 1.00     3375     3389
## cont_africa           -1.95      0.23    -2.40    -1.51 1.00     3007     3101
## rugged:cont_africa     0.39      0.13     0.13     0.65 1.00     2854     3320
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.95      0.05     0.85     1.06 1.00     4356     2577
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Although R didn’t bark at us for adding loo = T and waic = T, it didn’t do anything with them. Now if you want that information, you’ll have to use the waic() and/or loo() functions.

waic(b8.1)
## 
## Computed from 4000 by 170 log-likelihood matrix
## 
##           Estimate   SE
## elpd_waic   -234.7  7.4
## p_waic         5.1  0.9
## waic         469.3 14.8
## 
## 2 (1.2%) p_waic estimates greater than 0.4. We recommend trying loo instead.
(l_b8.1 <- loo(b8.1))
## 
## Computed from 4000 by 170 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -234.7  7.4
## p_loo         5.2  0.9
## looic       469.5 14.9
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

The recommended workflow since brms version 2.8.0 is to save the information criteria with your brm() fit objects using the add_criterion() function.

b8.1 <- add_criterion(b8.1, c("waic", "loo"))

You retrieve that information by subsetting the fit object.

b8.1$criteria$waic
## 
## Computed from 4000 by 170 log-likelihood matrix
## 
##           Estimate   SE
## elpd_waic   -234.7  7.4
## p_waic         5.1  0.9
## waic         469.3 14.8
## 
## 2 (1.2%) p_waic estimates greater than 0.4. We recommend trying loo instead.
b8.1$criteria$loo
## 
## Computed from 4000 by 170 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -234.7  7.4
## p_loo         5.2  0.9
## looic       469.5 14.9
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

In response to the brms version 2.8.0 update, which itself accommodated updates to the loo package and both of which occurred years after McElreath published the first edition of his text, we’ve been bantering on about the \(\text{elpd}\) and its connection to the WAIC and the LOO since Chapter 6. This is a fine place to go into some detail.

The elpd values returned by loo() and waic() are the expected log pointwise predictive density for new data. It follows the formula

\[\text{elpd} = \sum_{i = 1}^n \int p_t (\tilde y_i) \log p (\tilde y_i | y) d \tilde y_i,\]

where \(p_t (\tilde y_i)\) is the distribution representing the true data-generating process for \(\tilde y_i\). The \(p_t (\tilde y_i)\)’s are unknown, and we will use cross-validation or WAIC to approximate. In a regression, these distributions are also implicitly conditioned on the values of the predictors in the model. (Vehtari et al., 2017, p. 2)

Later in the paper, we learn the elpd_loo (i.e., the Bayesian LOO estimate of out-of-sample predictive fit) is defined as

\[\text{elpd}_\text{loo} = \sum_{i = 1}^n \log p (y_i | y_{-i}),\]

where

\[p (y_i | y_{-i}) = \int p (y_i | \theta) p (\theta | y_{-i}) d \theta\]

“is the leave-one-out predictive density given the data without the \(i\)th data point” (p. 3). And recall, you can convert the \(\text{elpd}\) to the conventional information criteria metric by multiplying it by -2.
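To see that conversion in action, here’s a quick check of our own using the l_b8.1 object from above. The looic row in the estimates matrix should simply be \(-2\) times the elpd_loo row, within rounding.

l_b8.1$estimates

# looic is just -2 * elpd_loo
-2 * l_b8.1$estimates["elpd_loo", "Estimate"]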

To learn more about the \(\text{elpd}\), read the rest of the paper and the other works referenced by the loo package team. And if you prefer watching video lectures to reading technical papers, check out Vehtari’s Model assessment, selection and averaging.

8.3.6 Checking the chain.

Using plot() with a brm() fit returns both density and trace plots for the parameters.

plot(b8.1, widths = c(2, 3))

The bayesplot package allows a little more control. Here, we use bayesplot’s mcmc_trace() to make the trace plots with our custom theme. Note that mcmc_trace() works with data frames, not brmsfit objects.

library(bayesplot)

post <- as_draws_df(b8.1)

mcmc_trace(post, 
           pars = vars(b_Intercept:sigma),
           facet_args = list(ncol = 3), 
           size = 0.15) +
  scale_color_ipsum() +
  labs(title = "My custom trace plots") +
  theme_ipsum() +
  theme(legend.position = c(.95, .2))

The bayesplot package offers a variety of diagnostic plots. Here we make autocorrelation plots for all model parameters, one for each HMC chain.

mcmc_acf(post, 
         pars = vars(b_Intercept:sigma),
         lags = 5) +
  scale_color_ipsum() +
  theme_ipsum()

That’s just what we like to see: nice L-shaped autocorrelation plots. Those are the kinds of shapes you’d expect when you have reasonably large effective samples.
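The connection between autocorrelation and the effective sample size is direct: the more autocorrelated a chain is, the fewer effectively independent draws it contains. As a rough sketch of our own (not how brms computes Bulk_ESS), here’s one way to eyeball that for b_rugged using base R’s acf() and the familiar approximation \(N_\text{eff} \approx N / (1 + 2 \sum_k \rho_k)\), truncating the sum at lag 20.

post %>% 
  select(.chain, b_rugged) %>% 
  group_by(.chain) %>% 
  summarise(rho_sum      = sum(acf(b_rugged, lag.max = 20, plot = FALSE)$acf[-1]),
            n_eff_approx = n() / (1 + 2 * rho_sum))

With low autocorrelation, each chain’s approximate effective sample size stays close to its 1,000 post-warmup iterations.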

Before we move on, there’s an important difference between the trace plots McElreath showed in the text and the ones we just made. McElreath’s trace plots include the warmup iterations. Ours did not. That’s why his \(x\)-axis values ranged from 1 to 2,000 and ours only ranged from 1 to 1,000. To my knowledge, neither brms::plot() nor bayesplot::mcmc_trace() supports including warmups in their trace plots. A quick way to get them is with the ggmcmc package (Fernández i Marín, 2016, 2021).

library(ggmcmc)

The ggmcmc package has a variety of convenience functions for working with MCMC chains. The ggs() function extracts the posterior draws, including the warmups, and arranges them in a tidy tibble.

ggs(b8.1) %>% 
  str()
## tibble [48,000 × 4] (S3: tbl_df/tbl/data.frame)
##  $ Iteration: int [1:48000] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Chain    : int [1:48000] 1 1 1 1 1 1 1 1 1 1 ...
##  $ Parameter: Factor w/ 6 levels "b_cont_africa",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ value    : num [1:48000] 0.815 0.815 0.815 0.815 0.871 ...
##  - attr(*, "nChains")= int 4
##  - attr(*, "nParameters")= int 6
##  - attr(*, "nIterations")= int 2000
##  - attr(*, "nBurnin")= num 1000
##  - attr(*, "nThin")= num 1
##  - attr(*, "description")= chor "644a2f8c915761c6340d991526fba09f"

With this in hand, we can now include those warmup draws in our trace plots. Here’s how to do so by hand with ggplot2, rather than with convenience functions like bayesplot::mcmc_trace().

ggs(b8.1) %>%
  filter(Parameter != "lprior") %>% 
  mutate(chain = factor(Chain)) %>% 
  
  ggplot(aes(x = Iteration, y = value)) +
  # this marks off the warmups
  annotate(geom = "rect", 
           xmin = 0, xmax = 1000, ymin = -Inf, ymax = Inf,
           fill = "grey80", alpha = 1/2, linewidth = 0) +
  geom_line(aes(color = chain),
            linewidth = 0.15) +
  scale_color_ipsum() +
  labs(title = "My custom trace plots with warmups across ggmcmc::ggs()",
       x = NULL, y = NULL) +
  theme_ipsum() +
  theme(legend.position = c(.95, .2)) +
  facet_wrap(~Parameter, scales = "free_y")

Following brms defaults, we won’t include warmup iterations in the trace plots for other models in this book. A nice thing about plots that do include them, though, is they reveal how quickly our HMC chains transition away from their start values into the posterior. To get a better sense of this, let’s make those trace plots once more, but this time zooming in on the first 100 iterations.

ggs(b8.1) %>%
  filter(Parameter != "lprior") %>% 
  mutate(chain = factor(Chain)) %>% 
  
  ggplot(aes(x = Iteration, y = value)) +
  # this marks off the warmups
  annotate(geom = "rect", 
           xmin = 0, xmax = 1000, ymin = -Inf, ymax = Inf,
           fill = "grey80", alpha = 1/2, linewidth = 0) +
  geom_line(aes(color = chain),
            linewidth = 0.5) +
  scale_color_ipsum() +
  labs(title = "My custom hint plots with warmups via ggmcmc::ggs()",
       x = NULL, y = NULL) +
  coord_cartesian(xlim = c(1, 100)) +
  theme_ipsum() +
  theme(legend.position = c(.95, .2)) +
  facet_wrap(~Parameter, scales = "free_y")

For each parameter, all four chains had moved away from their start values and converged on the marginal posteriors by the 50th iteration or so.

8.3.6.1 Overthinking: Raw Stan model code.

The stancode() function works in brms much like it does in rethinking.

brms::stancode(b8.1)
## // generated with brms 2.18.0
## functions {
## }
## data {
##   int<lower=1> N;  // total number of observations
##   vector[N] Y;  // response variable
##   int<lower=1> K;  // number of population-level effects
##   matrix[N, K] X;  // population-level design matrix
##   int prior_only;  // should the likelihood be ignored?
## }
## transformed data {
##   int Kc = K - 1;
##   matrix[N, Kc] Xc;  // centered version of X without an intercept
##   vector[Kc] means_X;  // column means of X before centering
##   for (i in 2:K) {
##     means_X[i - 1] = mean(X[, i]);
##     Xc[, i - 1] = X[, i] - means_X[i - 1];
##   }
## }
## parameters {
##   vector[Kc] b;  // population-level effects
##   real Intercept;  // temporary intercept for centered predictors
##   real<lower=0> sigma;  // dispersion parameter
## }
## transformed parameters {
##   real lprior = 0;  // prior contributions to the log posterior
##   lprior += normal_lpdf(b | 0, 10);
##   lprior += normal_lpdf(Intercept | 0, 100);
##   lprior += cauchy_lpdf(sigma | 0, 2)
##     - 1 * cauchy_lccdf(0 | 0, 2);
## }
## model {
##   // likelihood including constants
##   if (!prior_only) {
##     target += normal_id_glm_lpdf(Y | Xc, Intercept, b, sigma);
##   }
##   // priors including constants
##   target += lprior;
## }
## generated quantities {
##   // actual population-level intercept
##   real b_Intercept = Intercept - dot_product(means_X, b);
## }

You can also get that information with b8.1$model or b8.1$fit@stanmodel.

8.4 Care and feeding of your Markov chain.

Markov chain Monte Carlo estimation is a highly technical and usually automated procedure. Most people who use it don’t really understand what it is doing. That’s okay, up to a point. Science requires division of labor, and if every one of us had to write our own Markov chains from scratch, a lot less research would get done in the aggregate. (p. 255)

Still, if you do want to learn more about HMC, McElreath has some nice introductory lectures on the topic (see here and here). To dive even deeper, Michael Betancourt from the Stan team has given many lectures on the topic (e.g., here and here).

8.4.1 How many samples do you need?

The brms defaults for iter and warmup match those of McElreath’s rethinking.

If all you want are posterior means, it doesn’t take many samples at all to get very good estimates. Even a couple hundred samples will do. But if you care about the exact shape in the extreme tails of the posterior, the 99th percentile or so, then you’ll need many many more. So there is no universally useful number of samples to aim for. In most typical regression applications, you can get a very good estimate of the posterior mean with as few as 200 effective samples. And if the posterior is approximately Gaussian, then all you need in addition is a good estimate of the variance, which can be had with one order of magnitude more, in most cases. For highly skewed posteriors, you’ll have to think more about which region of the distribution interests you. (p. 255)
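To demonstrate the point about posterior means, here’s a little check of our own comparing the means based on all 4,000 post-warmup draws from b8.1 with the means based on a random subset of 200 draws. They should agree quite closely.

set.seed(8)

post_all <- as_draws_df(b8.1) %>% as_tibble()
post_200 <- post_all %>% slice_sample(n = 200)

# posterior means from the full set of draws versus a random 200
bind_rows(all_4000 = post_all, random_200 = post_200, .id = "draws") %>% 
  group_by(draws) %>% 
  summarise(across(b_Intercept:sigma, mean))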

And remember, as of the changes in brms version 2.10.0, we now have both Bulk_ESS and Tail_ESS to consult when thinking about the effective sample size.

8.4.2 How many chains do you need?

“Using 3 or 4 chains is conventional, and quite often more than enough to reassure us that the sampling is working properly” (p. 257).

8.4.2.1 Convergence diagnostics.

Times have changed. In the text, we read:

The default diagnostic output from Stan includes two metrics, n_eff and Rhat. The first is a measure of the effective number of samples. The second is the Gelman-Rubin convergence diagnostic, \(\widehat R\). When n_eff is much lower than the actual number of iterations (minus warmup) of your chains, it means the chains are inefficient, but possibly still okay. When Rhat is above 1.00, it usually indicates that the chain has not yet converged, and probably you shouldn’t trust the samples. If you draw more iterations, it could be fine, or it could never converge. See the Stan user manual for more details. It’s important however not to rely too much on these diagnostics. Like all heuristics, there are cases in which they provide poor advice. (p. 257)

We’ve already covered how brms has expanded the traditional notion of effective samples (i.e., n_eff) to Bulk_ESS and Tail_ESS. Times are changing for the \(\widehat R\), too. As it turns out, the Stan team has found some deficiencies with the \(\widehat R\), for which they’ve made recommendations that will be implemented in the Stan ecosystem sometime soon (see here for a related thread on the Stan Forums). In the meantime, you can read all about it in Vehtari et al. (2021) and in one of Dan Simpson’s blog posts.
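If you want to compute the improved \(\widehat R\) yourself, the posterior package includes an rhat() function implementing the Vehtari et al. (2021) rank-normalized split-\(\widehat R\). Here’s a brief sketch; the values should match the Rhat column in our print() output.

library(posterior)

# the rank-normalized split R-hat for each parameter in b8.1
as_draws_df(b8.1) %>% 
  summarise_draws(rhat)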

For more on these topics, you might also check out Gabry and Modrák’s (2022) vignette, Visual MCMC diagnostics using the bayesplot package.

8.4.3 Taming a wild chain.

As with rethinking, brms can take data in the form of a list. Recall, however, that in order to specify starting values, you need to specify a list of lists with the init argument rather than with start.

b8.2 <-
  brm(data = list(y = c(-1, 1)), 
      family = gaussian,
      y ~ 1,
      prior = c(prior(uniform(-1e10, 1e10), class = Intercept),
                prior(uniform(0, 1e10), class = sigma, ub = 1e10)),
      init = list(list(Intercept = 0, sigma = 1),
                  list(Intercept = 0, sigma = 1)),
      iter = 4000, warmup = 1000, chains = 2,
      seed = 8,
      file = "fits/b08.02")

Those were some silly flat priors. Check the damage.

post <- as_draws_df(b8.2)

mcmc_trace(post,
           pars = vars(b_Intercept:sigma),
           size = 0.25) +
  labs(title = "My version of Figure 8.5.a.",
       subtitle = "These trace plots do not look like this fuzzy caterpillars we generally hope for.") +
  scale_color_ipsum() +
  theme_ipsum() +
  theme(legend.direction = "horizontal",
        legend.position = c(.85, 1.5))

Let’s peek at the summary.

print(b8.2)
## Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing
## the results! We recommend running more iterations and/or setting stronger priors.
## Warning: There were 276 divergent transitions after warmup. Increasing adapt_delta above 0.8 may
## help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: y ~ 1 
##    Data: list(y = c(-1, 1)) (Number of observations: 2) 
##   Draws: 2 chains, each with iter = 4000; warmup = 1000; thin = 1;
##          total post-warmup draws = 6000
## 
## Population-Level Effects: 
##             Estimate  Est.Error     l-95% CI   u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -546879.04 4143195.56 -13079947.72 6389605.41 1.06       24       16
## 
## Family Specific Parameters: 
##          Estimate    Est.Error l-95% CI    u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 14987222.47 126012619.27  1220.06 85397775.81 1.04       26       86
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Holy smokes, those parameters are a mess! Plus we got some mean warning messages, too. Watch our reasonable priors save the day.

b8.3 <-
  brm(data = list(y = c(-1, 1)), 
      family = gaussian,
      y ~ 1,
      prior = c(prior(normal(0, 10), class = Intercept),
                prior(cauchy(0, 1), class = sigma)),
      init = list(list(Intercept = 0, sigma = 1),
                  list(Intercept = 0, sigma = 1)),
      iter = 4000, warmup = 1000, chains = 2,
      seed = 8,
      file = "fits/b08.03")
print(b8.3)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: y ~ 1 
##    Data: list(y = c(-1, 1)) (Number of observations: 2) 
##   Draws: 2 chains, each with iter = 4000; warmup = 1000; thin = 1;
##          total post-warmup draws = 6000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     0.04      1.74    -3.28     3.63 1.00     1638     1069
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     2.05      2.13     0.62     6.84 1.00     1307     1940
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

As in the text, no more warning signs and no more silly estimates. The trace plots look great, too.

post <- as_draws_df(b8.3)

mcmc_trace(post,
           pars = vars(b_Intercept:sigma),
           size = 0.25) +
  labs(title = "My model of Figure 8.5.b",
       subtitle = "Oh man. This looks so much better.") +
  scale_color_ipsum() +
  theme_ipsum() +
  theme(legend.direction = "horizontal",
        legend.position = c(.85, 1.5))

Now behold our version of Figure 8.6.

# left
p1 <-
  post %>%
  select(b_Intercept) %>%
  
  ggplot(aes(x = b_Intercept)) +
  stat_density(geom = "line") +
  geom_line(data = data.frame(x = seq(from = min(post$b_Intercept),
                                      to = max(post$b_Intercept),
                                      length.out = 50)),
            aes(x = x, y = dnorm(x = x, mean = 0, sd = 10)),
            color = ipsum_pal()(1), linetype = 2) +
  theme_ipsum()

# right
p2 <-
  post %>%
  select(sigma) %>%
  
  ggplot(aes(x = sigma)) +
  stat_density(geom = "line") +
  geom_line(data = data.frame(x = seq(from = 0,
                                      to = max(post$sigma),
                                      length.out = 50)),
            aes(x = x, y = dcauchy(x = x, location = 0, scale = 1)*2),
            color = ipsum_pal()(2)[2], linetype = 2) +
  coord_cartesian(xlim = c(0, 10)) +
  ylab(NULL) +
  theme_ipsum()

# combine the two
library(patchwork)

p1 + p2 + plot_annotation(title = "Prior (dashed) and posterior (solid) distributions for the\nmodel with weakly-informative priors, b8.3",
                          theme = theme_ipsum())

8.4.3.1 Overthinking: Cauchy distribution.

Behold the beautiful Cauchy probability density:

\[p(x|x_0, \gamma) = \left ( \pi \gamma \left [ 1 + \left ( \frac{x - x_0}{\gamma} \right ) ^2 \right ] \right ) ^{-1}.\]

The Cauchy has no mean and variance, but \(x_0\) is the location and \(\gamma\) is the scale. Here’s our version of the simulation. Note our use of the cummean() function.

n <- 1e4

set.seed(8)
tibble(y     = rcauchy(n, location = 0, scale = 5),
       mu    = cummean(y),
       index = 1:n) %>% 
  
  ggplot(aes(x = index, y = mu)) +
  geom_line() +
  theme_ipsum()

The whole thing is quite remarkable. Just for kicks, here we do it again with nine simulations.

n <- 1e4

set.seed(8)
tibble(a = rcauchy(n, location = 0, scale = 5),
       b = rcauchy(n, location = 0, scale = 5),
       c = rcauchy(n, location = 0, scale = 5),
       d = rcauchy(n, location = 0, scale = 5),
       e = rcauchy(n, location = 0, scale = 5),
       f = rcauchy(n, location = 0, scale = 5),
       g = rcauchy(n, location = 0, scale = 5),
       h = rcauchy(n, location = 0, scale = 5),
       i = rcauchy(n, location = 0, scale = 5)) %>% 
  gather() %>% 
  group_by(key) %>% 
  mutate(mu = cummean(value)) %>% 
  ungroup() %>% 
  mutate(index = rep(1:n, times = 9)) %>% 

  ggplot(aes(x = index, y = mu)) +
  geom_line(aes(color = key)) +
  scale_color_manual(values = ipsum_pal()(9)) +
  scale_x_continuous(breaks = c(0, 5000, 10000)) +
  theme_ipsum() +
  theme(legend.position = "none") +
  facet_wrap(~key, ncol = 3, scales = "free")

8.4.4 Non-identifiable parameters.

It turns out that the only way to get a brms version of McElreath’s m8.4 and m8.5 is to augment the data. In addition to the Gaussian y vector, we’ll add two constants to the data, intercept_1 = 1 and intercept_2 = 1.

set.seed(8)
y <- rnorm(100, mean = 0, sd = 1)
b8.4 <-
  brm(data = list(y           = y,
                  intercept_1 = 1,
                  intercept_2 = 1), 
      family = gaussian,
      y ~ 0 + intercept_1 + intercept_2,
      prior = c(prior(uniform(-1e10, 1e10), class = b),
                prior(cauchy(0, 1), class = sigma)),
      init = list(list(intercept_1 = 0, intercept_2 = 0, sigma = 1),
                  list(intercept_1 = 0, intercept_2 = 0, sigma = 1)),
      iter = 4000, warmup = 1000, chains = 2,
      seed = 8,
      file = "fits/b08.04")

Our model results don’t perfectly mirror McElreath’s, but they’re the same in spirit.

print(b8.4)
## Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing
## the results! We recommend running more iterations and/or setting stronger priors.
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: y ~ 0 + intercept_1 + intercept_2 
##    Data: list(y = y, intercept_1 = 1, intercept_2 = 1) (Number of observations: 100) 
##   Draws: 2 chains, each with iter = 4000; warmup = 1000; thin = 1;
##          total post-warmup draws = 6000
## 
## Population-Level Effects: 
##             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## intercept_1  -118.40   1513.42 -2685.34  2723.23 2.54        2       21
## intercept_2   118.30   1513.43 -2723.26  2685.36 2.54        2       21
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.10      0.08     0.96     1.26 1.05       24       67
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Note the frightening warning message. The results are a mess! Let’s try again.

b8.5 <-
  brm(data = list(y           = y,
                  intercept_1 = 1,
                  intercept_2 = 1),
      family = gaussian,
      y ~ 0 + intercept_1 + intercept_2,
      prior = c(prior(normal(0, 10), class = b),
                prior(cauchy(0, 1), class = sigma)),
      init = list(list(intercept_1 = 0, intercept_2 = 0, sigma = 1),
                  list(intercept_1 = 0, intercept_2 = 0, sigma = 1)),
      iter = 4000, warmup = 1000, chains = 2,
      seed = 8,
      file = "fits/b08.05")
print(b8.5)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: y ~ 0 + intercept_1 + intercept_2 
##    Data: list(y = y, intercept_1 = 1, intercept_2 = 1) (Number of observations: 100) 
##   Draws: 2 chains, each with iter = 4000; warmup = 1000; thin = 1;
##          total post-warmup draws = 6000
## 
## Population-Level Effects: 
##             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## intercept_1    -0.27      6.88   -13.79    13.69 1.00     1513     1714
## intercept_2     0.18      6.88   -13.80    13.70 1.00     1514     1728
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     1.09      0.08     0.95     1.25 1.00     2300     1996
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Much better. For our version of Figure 8.7, we’ll make the trace plots for the two models separately and combine them with patchwork.

post <- as_draws_df(b8.4)

p1 <-
  mcmc_trace(post,
             pars = vars(b_intercept_1:sigma),
             size = 0.25,
             facet_args = c(ncol = 1)) +
  scale_color_ipsum() +
  labs(subtitle = "flat priors") +
  theme_ipsum() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10))

post <- as_draws_df(b8.5)

p2 <-
  mcmc_trace(post,
             pars = vars(b_intercept_1:sigma),
             size = 0.25,
             facet_args = c(ncol = 1)) +
  scale_color_ipsum() +
  labs(subtitle = "weakly-informative priors") +
  theme_ipsum() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10))

p1 + p2 + plot_annotation(title = "Prior strength matters",
                          theme = theme_ipsum())

The central message in the text, default to weakly-regularizing priors, holds for brms just as it does in rethinking. For more on the topic, see the recommendations from the Stan team. If you want to dive deeper, check out Simpson’s post on Gelman’s blog and their corresponding (2017) paper with Betancourt, The prior can often only be understood in the context of the likelihood.

Session info

sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.1.2      ggmcmc_1.5.1.1       bayesplot_1.10.0     GGally_2.1.2        
##  [5] tidybayes_3.0.2      brms_2.18.0          Rcpp_1.0.9           cmdstanr_0.5.3      
##  [9] rstan_2.21.8         StanHeaders_2.21.0-7 extrafont_0.18       hrbrthemes_0.8.0    
## [13] forcats_0.5.1        stringr_1.4.1        dplyr_1.0.10         purrr_1.0.1         
## [17] readr_2.1.2          tidyr_1.2.1          tibble_3.1.8         ggplot2_3.4.0       
## [21] tidyverse_1.3.2     
## 
## loaded through a namespace (and not attached):
##   [1] readxl_1.4.1         backports_1.4.1      systemfonts_1.0.4    plyr_1.8.7          
##   [5] igraph_1.3.4         svUnit_1.0.6         sp_1.5-0             splines_4.2.2       
##   [9] crosstalk_1.2.0      TH.data_1.1-1        rstantools_2.2.0     inline_0.3.19       
##  [13] digest_0.6.31        htmltools_0.5.3      rethinking_2.21      fansi_1.0.3         
##  [17] magrittr_2.0.3       checkmate_2.1.0      googlesheets4_1.0.1  tzdb_0.3.0          
##  [21] modelr_0.1.8         RcppParallel_5.1.5   matrixStats_0.63.0   xts_0.12.1          
##  [25] sandwich_3.0-2       extrafontdb_1.0      prettyunits_1.1.1    colorspace_2.0-3    
##  [29] rvest_1.0.2          ggdist_3.2.1         rgdal_1.5-30         haven_2.5.1         
##  [33] xfun_0.35            callr_3.7.3          crayon_1.5.2         jsonlite_1.8.4      
##  [37] lme4_1.1-31          survival_3.4-0       zoo_1.8-10           glue_1.6.2          
##  [41] gtable_0.3.1         gargle_1.2.0         emmeans_1.8.0        distributional_0.3.1
##  [45] pkgbuild_1.3.1       Rttf2pt1_1.3.10      shape_1.4.6          abind_1.4-5         
##  [49] scales_1.2.1         mvtnorm_1.1-3        DBI_1.1.3            miniUI_0.1.1.1      
##  [53] xtable_1.8-4         HDInterval_0.2.4     stats4_4.2.2         DT_0.24             
##  [57] htmlwidgets_1.5.4    httr_1.4.4           threejs_0.3.3        RColorBrewer_1.1-3  
##  [61] arrayhelpers_1.1-0   posterior_1.3.1      ellipsis_0.3.2       reshape_0.8.9       
##  [65] pkgconfig_2.0.3      loo_2.5.1            farver_2.1.1         sass_0.4.2          
##  [69] dbplyr_2.2.1         utf8_1.2.2           tidyselect_1.2.0     labeling_0.4.2      
##  [73] rlang_1.0.6          reshape2_1.4.4       later_1.3.0          munsell_0.5.0       
##  [77] cellranger_1.1.0     tools_4.2.2          cachem_1.0.6         cli_3.6.0           
##  [81] generics_0.1.3       broom_1.0.2          evaluate_0.18        fastmap_1.1.0       
##  [85] processx_3.8.0       knitr_1.40           fs_1.5.2             nlme_3.1-160        
##  [89] projpred_2.2.1       mime_0.12            xml2_1.3.3           compiler_4.2.2      
##  [93] shinythemes_1.2.0    rstudioapi_0.13      gamm4_0.2-6          reprex_2.0.2        
##  [97] bslib_0.4.0          stringi_1.7.8        highr_0.9            ps_1.7.2            
## [101] Brobdingnag_1.2-8    gdtools_0.2.4        lattice_0.20-45      Matrix_1.5-1        
## [105] nloptr_2.0.3         markdown_1.1         shinyjs_2.1.0        tensorA_0.36.2      
## [109] vctrs_0.5.1          pillar_1.8.1         lifecycle_1.0.3      jquerylib_0.1.4     
## [113] bridgesampling_1.1-2 estimability_1.4.1   httpuv_1.6.5         R6_2.5.1            
## [117] bookdown_0.28        promises_1.2.0.1     gridExtra_2.3        codetools_0.2-18    
## [121] boot_1.3-28          colourpicker_1.1.1   MASS_7.3-58.1        gtools_3.9.4        
## [125] assertthat_0.2.1     withr_2.5.0          shinystan_2.6.0      multcomp_1.4-20     
## [129] mgcv_1.8-41          hms_1.1.1            grid_4.2.2           minqa_1.2.5         
## [133] coda_0.19-4          rmarkdown_2.16       googledrive_2.0.0    shiny_1.7.2         
## [137] lubridate_1.8.0      base64enc_0.1-3      dygraphs_1.1.1.6

References

Brilleman, S., Crowther, M., Moreno-Betancur, M., Buros Novik, J., & Wolfe, R. (2018). Joint longitudinal and time-to-event models via Stan. https://github.com/stan-dev/stancon_talks/
Casella, G., & George, E. I. (1992). Explaining the Gibbs sampler. The American Statistician, 46(3), 167–174. https://doi.org/10.1080/00031305.1992.10475878
Fernández i Marín, X. (2016). ggmcmc: Analysis of MCMC samples and Bayesian inference. Journal of Statistical Software, 70(9), 1–20. https://doi.org/10.18637/jss.v070.i09
Fernández i Marín, X. (2021). ggmcmc: Tools for analyzing MCMC simulations from Bayesian inference [Manual]. https://CRAN.R-project.org/package=ggmcmc
Gabry, J., & Goodrich, B. (2022). rstanarm: Bayesian applied regression modeling via Stan [Manual]. https://CRAN.R-project.org/package=rstanarm
Gabry, J., & Modrák, M. (2022). Visual MCMC diagnostics using the bayesplot package. https://CRAN.R-project.org/package=bayesplot/vignettes/plotting-mcmc-draws.html
Gelman, A., Simpson, D., & Betancourt, M. (2017). The prior can often only be understood in the context of the likelihood. Entropy, 19(10), 555. https://doi.org/10.3390/e19100555
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6(6), 721–741. https://doi.org/10.1109/TPAMI.1984.4767596
Kruschke, J. K. (2015). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press. https://sites.google.com/site/doingbayesiandataanalysis/
McElreath, R. (2015). Statistical rethinking: A Bayesian course with examples in R and Stan. CRC Press. https://xcelab.net/rm/statistical-rethinking/
Merkle, E. C., Fitzsimmons, E., Uanhoro, J., & Goodrich, B. (2021). Efficient Bayesian structural equation modeling in Stan. Journal of Statistical Software, 100(6), 1–22. https://doi.org/10.18637/jss.v100.i06
Merkle, E. C., & Rosseel, Y. (2018). blavaan: Bayesian structural equation models via parameter expansion. Journal of Statistical Software, 85(4), 1–30. https://doi.org/10.18637/jss.v085.i04
Merkle, E. C., Rosseel, Y., & Goodrich, B. (2022). blavaan: Bayesian latent variable analysis. https://CRAN.R-project.org/package=blavaan
Robert, C., & Casella, G. (2011). A short history of Markov chain Monte Carlo: Subjective recollections from incomplete data. Statistical Science, 26(1), 102–115. https://arxiv.org/pdf/0808.2902.pdf
Rudis, B. (2020). hrbrthemes: Additional themes, theme components and utilities for 'ggplot2' [Manual]. https://CRAN.R-project.org/package=hrbrthemes
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2021). Rank-normalization, folding, and localization: An improved \(\widehat{R}\) for assessing convergence of MCMC (with Discussion). Bayesian Analysis, 16(2), 667–718. https://doi.org/10.1214/20-BA1221