ZooarchMixMod Model Supplement.Rmd

---
title: "**MODEL SUPPLEMENT: A BAYESIAN MULTILEVEL MIXTURE MODEL FOR ZOOARCHAEOLOGICAL MEASUREMENTS**"
author:
  - "Jesse Wolfhagen^1,2^"
  - ^1^Department of Anthropology, Purdue University, West Lafayette, Indiana, USA
  - ^2^Department of Archaeology, Max Planck Institute for Geoanthropology, Jena, Germany
  - " "
  - "email: jl.wolfhagen@gmail.com"
date:
  - " "
  - " "
  - "*R Markdown version last compiled on `r lubridate::today()`*"
indent: yes
header-includes:
    - \usepackage{setspace}\doublespacing
    - \usepackage[labelfont=bf,width=1\textwidth,justification=raggedright,singlelinecheck=false]{caption}
    - \captionsetup[figure]{labelfont=bf}
    - \usepackage[left]{lineno}
output:
  pdf_document:
    latex_engine: xelatex
    keep_md: false
  word_document: default
  html_document: default
link-citations: yes
bibliography: zooarch_mixmod_library.bib
---

```{r supplement setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
options(kableExtra.auto_format = FALSE)
library("boot")
library("Cairo")
library("cmdstanr")
library("data.table")
library("doParallel")
library("ggdist")
library("ggplot2")
library("ggpubr")
library("ggrepel")
library("kableExtra")
library("knitr")
library("mixtools")
library("parallel")
library("readxl")
library("rnaturalearth")
library("rnaturalearthdata")
library("rstan")
library("sf")
library("zoolog")
#setting resources to do simulations with multicore processing (on Windows)
num_cores <- detectCores(logical = T)
cl <- makeCluster(num_cores - 4) #creating a virtual cluster
registerDoParallel(cl) #registering the cluster
```
\newpage

# *Model Supplement: A Bayesian Multilevel Mixture Model for Zooarchaeological Measurements*

The Bayesian model developed for this paper describes assemblages of faunal measurements as a mixture of immature animals, (adult-sized) females, and (adult-sized) males that have distinct average body sizes and expected variation around that average size. The model uses multiple measured dimensions [e.g., humerus distal breadth "humerus Bd", radius proximal breadth "radius Bp", abbreviations following @vondendriesch_1976], which are first converted into a logarithmic size index, or LSI, values with a natural logarithm base [@meadow_1999; @wolfhagen_2020]. LSI observations from measurement sets are grouped together within a specimen to create individuals grouped into defined “element portions” that serve as the basis for the mixture model analysis. Element portions relate to categories used for element fusion (e.g., distal humerus) to relate biometry and mortality profiles; specimens that contain multiple element portions—like complete limb bones—are grouped into the latest-fusing element portion [compare to "skeletal part type" in @breslawski_2023].

The multilevel structure of the model uses partial pooling to allow the parameters of the mixture model to vary between element portions while resisting overfitting. These element portion-specific parameters are related to each other through hyper-parameters, which describe the average value of the model parameters and the variability of model parameters across element portions [@wolfhagen_2020]. The following sections describe the details of the multilevel mixture model, including the model’s likelihood, the ways that the direct observations of measurements and demographic data are used by the model to account for measurement error, and the development of prior distributions for the model’s hyper-parameters and for parameters that govern the model’s multilevel structure. Finally, this supplement provides the full sets of equations for the single-site and multisite Bayesian multilevel mixture models and the prior distribution definitions used in the model applications described in the main text.

## *1. Definition of the Bayesian Multilevel Model*

The central likelihood of the mixture model uses parameters that are specific to each element portion. These parameters include the relative proportions for the different animal groups: immature animals, females, and males ($\pi_{1}$, $\pi_{2}$, $\pi_{3}$), the average size for each group ($\mu_{1}$, $\mu_{2}$, $\mu_{3}$), and the standard deviation for each group ($\sigma_{1}$, $\sigma_{2}$, $\sigma_{3}$). For each element portion, immature animals are described with the first set of parameters ($\pi_{1}$, $\mu_{1}$, $\sigma_{1}$), adult-sized females with the second set of parameters ($\pi_{2}$, $\mu_{2}$, $\sigma_{2}$), and adult-sized males with the third set of parameters ($\pi_{3}$, $\mu_{3}$, $\sigma_{3}$). This results in both a set of parameters that describe the composition of the assemblage (of measurements from that element portion) and an equation to estimate the probability that a particular specimen comes from an immature, adult female, or adult male individual.

*Mixture Model Likelihood Equation:*
\begin{equation}
\begin{split}
P(x|\pi_1,\pi_2,\pi_3,\mu_1,\mu_2,\mu_3,\sigma_1,\sigma_2,\sigma_3) = \\
& \pi_1 * \operatorname{Normal}(x, \mu_1, \sigma_1) + \\
& \pi_2 * \operatorname{Normal}(x, \mu_2, \sigma_2) + \\
& \pi_3 * \operatorname{Normal}(x, \mu_3, \sigma_3)
\end{split}
\end{equation}

In addition to a specimen’s LSI value, the model needs two additional observed variables to address the potential presence of immature animals in the assemblage. First, an indicator variable $\operatorname{Immature}[\operatorname{specimen}]$ describes whether the specimen could be from an immature animal based on the body part and the fusion characteristics (1 = potentially immature, 0 = cannot be immature). Data from known-age Shetland sheep show that specimens killed at younger than one year of age are significantly smaller than those killed at older ages, regardless of fusion status [@popkin_etal_2012]. Thus, any measurement from an element with an unfused epiphysis or from an element that does not fuse or could fuse before one year of age is considered potentially immature. Measurements from specimens with fused epiphyses that fuse after one year of age are considered ineligible to be immature, so the model does not consider that probability (it considers $\pi_{1}$ = 0 for fitting that specimen).

Second, the proportion of specimens from an element portion that could be immature ($\operatorname{proportion}_{\operatorname{immature}}$) determines how to re-weight the mixture components ($\pi_1$, $\pi_2$, and $\pi_3$) for potentially-immature specimens from that element portion. The mixture components describe the entire assemblage for an element portion (a combination of potentially-immature and non-immature specimens), meaning that if $\pi_{1}$ = 0.25 we should expect 25% of the specimens to be from immature animals. If every specimen could be immature—say, for specimens from an early-fusing element—then the mixture components do not need to be re-weighted. If, however, only half of the specimens could be immature, then the mixture components must be re-weighted to ensure that the whole-assemblage proportions are correct. In such a case, we would expect half of the potentially-immature animals to be from immature animals if $\pi_{1}$ = 0.25 for the whole assemblage and we know that $\operatorname{proportion}_{\operatorname{immature}}$ = $\left( \tfrac {0.25}{0.50} = 0.50 \right)$; this same re-weighting would make it less likely that potentially-immature animals are from adult-sized female or adult-sized male animals. The code includes checks to ensure that $\pi_{1}$ can never exceed 1.00 after accounting for the proportion of immature specimens in cases where there are very few potentially-immature specimens and/or a relatively high expected proportion of immature animals in the assemblage.

## *2. Measurement Error and Observations*

The model estimates measurement error for different observed quantities that are input into the model. Measurements on both the archaeological specimens and the standard values used to calculate LSI values are assumed to have a 1% measurement error [@breslawski_byers_2015; @popkin_etal_2012: Figure 6]. This 1% value comes from an evaluation of the @breslawski_byers_2015 measurement data, where the average standard deviation of repeated measurements on bison radius proximal breadth measurements was 1.1% the average value of the measurement. This means that each measurement is given a standard deviation based on the observed value, which is used to estimate the “modeled” measurement value based on the observation. These modeled measurements are then used to calculate the LSI value for that observation ($\operatorname{LSI}_{\operatorname{Measurement}}$).

Because specimens can have multiple measured dimensions that are included in the mixture model on them (e.g., a distal humerus with both Bd and BT observations or a complete radius with Bp and Bd observations), the mixture model calculates specimen-specific LSI values ($\operatorname{LSI}_{\operatorname{Specimen}}$) that are related to the observed measurement-specific LSI values ($\operatorname{LSI}_{\operatorname{Measurement}}$). $\operatorname{LSI}_{\operatorname{Measurement}}$ values are the “observations” with a standard deviation of 0.01 (in $\operatorname{LSI}_e$ scale) based on intra-individual variation of $\operatorname{LSI}_e$ values for the @popkin_etal_2012 sheep using the *Ovis orientalis* female standard animal (FMC 57951) from Uerpmann and Uerpmann [-@uerpmann_uerpmann_1994, Table 12].

*Observation Error Equations for Measurements:*
\begin{equation}
\begin{split}
\sigma_{\operatorname{measurement}} & = \operatorname{Measurement}_{\operatorname{observed}} * 0.01 \\
\operatorname{Measurement}_{\operatorname{observed}} & \sim \operatorname{Normal}(\operatorname{Measurement}_{\operatorname{modeled}}, \sigma_{\operatorname{measurement}}) \\
\sigma_{\operatorname{reference}} & = \operatorname{Reference}_{\operatorname{observed}} * 0.01 \\
\operatorname{Reference}_{\operatorname{observed}} & \sim \operatorname{Normal}(\operatorname{Reference}_{\operatorname{modeled}}, \sigma_{\operatorname{reference}}) \\
\operatorname{LSI}_{\operatorname{measurement}} & = \log_e(\operatorname{Measurement}_{\operatorname{modeled}}) - \log_e(\operatorname{Reference}_{\operatorname{modeled}}) \\
\operatorname{LSI}_{\operatorname{measurement}} & \sim \operatorname{Normal}(\operatorname{LSI}_{\operatorname{specimen}}, 0.01)
\end{split}
\end{equation}

The model also uses observations of sex ratios and fusion rates to estimate assemblage-level demographic proportions. This allows relevant data to inform the model about the expected relative proportions of different animal groups while still allowing these proportions to vary across different element portions. These observations are interpreted as binomial data: counts of some quantity (e.g., immature specimens) out of a total count of relevant specimens (e.g., total ageable specimens); this approach lets the model incorporate the uncertainty caused by small sample sizes. The observation of the average proportion of immature specimens ($\mu_{\pi_{1}}$) is based on the fusion rate of proximal and middle phalanges (the number of unfused phalanges $\operatorname{N}_{\operatorname{Unfused}}$ out of the total number of phalanges with fusion data $\operatorname{N}_{\operatorname{Ageable}}$), which fuse at around the same time as the estimated time that animals reach adult body size in the Shetland sheep population [@popkin_etal_2012]. The observation of the average adult sex ratio—the proportion of females among mature animals $\left( \tfrac {\mu_{\pi_2}}{\mu_{\pi_2} + \mu_{\pi_3}} \right)$—is based on the sex ratio of fused pelvises (the number of female pelvises $\operatorname{N}_{\operatorname{Female}}$ out of the total number of pelvises with a sex assignment $\operatorname{N}_{\operatorname{Sexable}}$). In each case, the number of observable specimens determines the measurement error using the binomial distribution. While this paper uses these quantities to estimate the relevant hyper-parameters, relevant observations from other elements can be incorporated into the model in the same fashion if there is a clear sense of the total number of specimens that could have potentially been immature or female.

*Observation Error Equations for Demographic Estimates:*
\begin{equation}
\begin{split}
N_{\operatorname{unfused}} & \sim \operatorname{Binomial}(N_{\operatorname{ageable}}, \mu_{\pi_1}) \\
N_{\operatorname{female}} & \sim \operatorname{Binomial}(N_{\operatorname{sexable}}, \dfrac{\mu_{\pi_2}} {\mu_{\pi_2} + \mu_{\pi_3}})
\end{split}
\end{equation}

## *3. Prior Distributions*

The prior distributions in this model are focused on describing previous beliefs about the value of the mixture model hyper-parameters, as the element portion-specific parameters are derived from these distributions. Mixture modeling performs well with unconstrained parameters values because it is more straightforward to estimate variation across element portions, meaning that constrained parameters-those where the range of possible values depends on the values of other parameters—must first be transformed into related unconstrained parameters [@betancourt_2017] . The following sub-sections describe the necessary transformations for different sets of the mixture model parameters, describing the unconstrained parameters that can be modeled and the transformations that result in the mixture model parameters. While these sections use the mixture model parameter notations, prior distributions are for the ‘central tendency’ hyper-parameter for the described unconstrained parameter.

It is important to remember that for all of these prior distributions are arbitrary choices made by the researcher, regardless of whether the distributions are based on specific animal populations or on reference priors. Other researchers could and should use different prior distributions to best reflect their intuition about likely parameter values for particular case studies. This also highlight the importance of reporting the prior distributions used in a Bayesian analysis to ensure replicability. Examining the implications of different prior distributions is an important step in the development of Bayesian models, one that should be regularly tested even before models are fit to datasets [@gelman_etal_2020_workflow].

### *3.1 Mixture Proportion Priors*

Prior distributions for the mixture proportions reflect our prior beliefs about the relative proportions of the three animal groups in the assemblage (immature, adult-sized females, and adult-sized males). The three mixture proportions ($\pi_{1}$, $\pi_{2}$, $\pi_{3}$) are a three-value unit simplex, meaning that the values are constrained as a group to sum up to one. Thus, the simplex can be described by only two variables because the third value cannot vary once those two values are known. The model uses two unconstrained variables ($\theta_{1}$ and $\theta_{2}$) to describe the $\pi$ values. These $\theta$ values are related back to $\pi$ values using a ‘stick-breaking’ transformation that iteratively estimates the relative proportions of the simplex taken up by each $\theta$ value [@stan_language_2.29: Section 10.7].

*Stick-Breaking Transformations:*
\begin{equation}
\begin{split}
\pi_1 & = \operatorname{logit}^{-1}(\theta_1 + \log(0.5)) \\
\pi_2 & = (1 - \pi_1) * \operatorname{logit}^{-1}(\theta_2 + \log(1)) \\
\pi_3 & = 1 - (\pi_1 + \pi_2)
\end{split}
\end{equation}

The $\theta_{1}$ value can be directly related to the $\pi_{1}$ value using the first line of the stick-breaking transformation, meaning that one can examine the associated $\pi_{1}$ estimate for a given $\theta_{1}$ value. Within the stick-breaking transformation, $\theta_{2}$ relates to the relative proportions of $\pi_{2}$ and $\pi_{3}$ after $\pi_{1}$ has been estimated, which is effectively the adult sex ratio. Just as we could examine the expected $\pi_{1}$ estimates from a distribution of $\theta_{1}$ values, we can thus use expected adult sex ratios $\left( \tfrac {\pi_2}{\pi_2 + \pi_3} \right)$ estimates from a particular prior distribution for $\theta_{2}$. Relating these $\theta$ values back to observable phenomena makes it easier to define reasonable prior distribution definitions for the parameters from domain expertise (see Section 2.2 and Figure 2 of the main text).

### *3.2 Average Body Size and Size Variability Priors*

While the average body sizes of the different components ($\mu_{1}$, $\mu_{2}$, $\mu_{3}$) are not intrinsically linked in the same way that $\pi$ values are, the model still requires some structure to aid interpretability. Bayesian mixture models that are fit using Markov Chain Monte Carlo (MCMC) methods, like the model in this paper, can suffer from an issue called “label switching” if $\mu$ values are not ordered in some way [@jasra_etal_2005]. MCMC methods rely on running multiple “chains”—separate iterations of the model that are independently fit and then combined together—to show that the results are independent of the initial conditions. Label switching describes a scenario where different chains fit the data well, but the parameter labels relate to different specimens (e.g., smaller specimens are assigned to $\mu_{1}$ in one chain and to $\mu_{2}$ in another). To avoid label switching, the average body sizes are strictly ordered, meaning that $\mu_{1} < \mu_{2} < \mu_{3}$ must be maintained. Note that this only affects average values, individual immature specimens can still be larger than female specimens or male specimens and individual female specimens can be larger than male specimens. This ordering is achieved by only estimating $\mu_{2}$ directly (average $\operatorname{LSI}_{e}$ value for females) and estimating the average $\operatorname{LSI}_{e}$ value for immature and male animals with offsets ($\delta_{1}$, $\delta_{2}$) from the female average. The $\delta$ values must be positive to maintain the ordering of the $\mu$ values, so each $\delta$ is modeled in a log-transformed space.

*Offset Equations for $\mu$ Values:*
\begin{equation}
\begin{split}
\mu_1 & = \mu_2 - \delta_1 \\
\mu_3 & = \mu_2 + \delta_2
\end{split}
\end{equation}

Conceptually, this expression of animal body size defines female animals as the standard “body size” that is subject to various selective pressures, with the offset for male animals reflecting sex-specific pressures on males. This interpretation of body size broadly fits the general pressures affecting adult body size in females and males across many ungulate taxa, including domestic herd animals [@perezbarberia_etal_2002; @tchernov_horwitz_1991]. The body size offset between immature animals and adult-sized females ($\delta_{1}$) is admittedly an *ad hoc* definition rather than one under strict biological constraints, as it can be affected by the age immature animals reach before being killed [@gillis_etal_2014]. The computational advantages of this definition arguably outweigh the awkwardness of the definition, however. Further, evaluation of $\delta_{1}$ and $\delta_{2}$ values across different sites could conceivably highlight variation in the timing of the killing of immature animals ($\delta_{1}$) and the degree of adult sexual dimorphism ($\delta_{2}$); both variables can be related to models of hunting intensity, animal domestication, and herd management [@gillis_etal_2014; @marom_baroz_2013; @zeder_hesse_2000].

The $\operatorname{LSI}_{e}$ size variability of animals within a group ($\sigma_{1}$, $\sigma_{2}$, $\sigma_{3}$) is a key variable in the Bayesian mixture model. The values of these standard deviation parameters play a major role in ensuring that the mixture components reflect biological entities rather than overfitting to specific sample noise. Coefficients of variance (CVs) for raw mammal bone measurements from a single sex have been found to be relatively consistent [@davis_1996; @popkin_etal_2012]. When transforming these measurements using a logarithm, this produces consistent standard deviations of the transformed measurement values [@wolfhagen_2020: Figure 1], suggesting that $\sigma$ values should be relatively stable across elements. While $\sigma$ parameters from different animal groups within the model are not directly related to each other, the values still need some transformations to be modeled consistently by the multilevel model. These values must be positive, which conflicts with the multilevel model’s need for unconstrained variables. To achieve this, the model uses the same log-transformation technique used for size offsets to create an unconstrained parameter, $\log_e \sigma$, that is then transformed to actual $\sigma$ values after estimating variation across element portions.

### *3.3 Developing Priors from a Simulated Prior Assemblage*

```{r supplement prior distributions for biological parameters, include = FALSE, eval = TRUE}
####SET SEED: REMOVE TO GENERALIZE####
set.seed(1234567890)
#Import data
sheep_data <- fread("./Data/1-s2.0-S0305440312000301-mmc1.csv", nrows = 356) #accessible from https://doi.org/10.1016/j.jas.2012.01.018 supplementary table, resaved as a .CSV

#Make variable for immature animals (`Days at Death` <= 365), female animals (not-immature), and male/castrated animals (not-immature)
sheep_data[, Immature := ifelse(`Days at Death` <= 365, 1, 0)]
sheep_data[, Female := ifelse(`Days at Death` > 365 & Sex %in% "Female", 1, 0)]
sheep_data[, Male := ifelse(`Days at Death` > 365 & Sex %in% c("Castrate", "Male"), 1, 0)]

#Reframe the sheep data to be set in a long format
sheep_longdata <- melt(sheep_data, id.vars = c("Accession  no", "Sex", "Immature", "Female", "Male", "Sca_coracoid_fus", "Hum_prox_fus", "Hum_dist_fus", "Rad_prox_fus", "Rad_dist_fus", "Mtc_dist_fus", "Pel_acetabulum_fus", "Fem_caput_fus", "Fem_dist_fus", "Tib_prox_fus", "Tib_dist_fus", "Mtt_dist_fus", "Cal_prox_fus"), measure.vars = names(sheep_data[, Sca_GLP:Cal_GDde]), variable.name = "Measurement", value.name = "Measurement_value")
sheep_longdata[, Element := tstrsplit(Measurement, "_")[1]]

#Get the standard animal from zoolog
sheep_standard_animal <- data.table(referencesDatabase$`Ovis orientalis`$Uerpmann)

#Keep only limb breadth bone measurements for the mixture model
sheep_mixmod_data <- sheep_longdata[Measurement %in% c("Sca_GLP", "Hum_Bd", "Hum_BT", "Rad_Bp", "Rad_Bd", "Mtc_Bp", "Mtc_BFd", "Fem_Bd", "Tib_Bd", "Ast_Bd", "Mtt_Bp", "Mtt_BFd")]

#Rename standard animal measurements to match the names in the dataset
sheep_standard_animal[EL %in% "Scapula", c("Element", "Measurement") := .("Sca", paste("Sca", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Humerus", c("Element", "Measurement") := .("Hum", paste("Hum", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Radius", c("Element", "Measurement") := .("Rad", paste("Rad", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Metacarpus", c("Element", "Measurement") := .("Mtc", paste("Mtc", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Femur", c("Element", "Measurement") := .("Fem", paste("Fem", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Tibia", c("Element", "Measurement") := .("Tib", paste("Tib", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Astragalus", c("Element", "Measurement") := .("Ast", paste("Ast", Measure, sep = "_"))]
sheep_standard_animal[EL %in% "Metatarsus", c("Element", "Measurement") := .("Mtt", paste("Mtt", Measure, sep = "_"))]

#Rename some measurement name mismatches
sheep_mixmod_data[Measurement %in% "Mtc_BFd", Measurement := "Mtc_Bd"]
sheep_mixmod_data[Measurement %in% "Mtt_BFd", Measurement := "Mtt_Bd"]

#Join the standard animal reference to the dataset (allows for easier identification of relevant standard measurement)
sheep_mixmod_data <- sheep_mixmod_data[sheep_standard_animal[, .(Measurement, Reference_value = Standard)], on = "Measurement"][!is.na(`Accession  no`)]

#Create "element portion" and "measurement set" numbers for the model (needs numeric labels)
#note that, by default, these are ordered alphabetically. Can also assign specific orders if desirable
sheep_mixmod_data[, c("Dimension", "Element_Portion") := .(as.numeric(as.factor(Measurement)), as.numeric(as.factor(Element)))]

#Create "specimen" labels to link meausrements from the same bone specimen (model needs numeric labels)
sheep_mixmod_data[, "Specimen_No" := as.numeric(as.factor(paste(`Accession  no`, Element, sep = "_")))]
sheep_mixmod_data[, "Individual_No" := as.numeric(as.factor(`Accession  no`))]

#Calculate measurement error for observed measurements and reference data (2%, based on Breslawksi and Byers 2015)
sheep_mixmod_data[, c("Measurement_sd", "Reference_sd") := .((Measurement_value * 0.01), (Reference_value * 0.01))]

#Summarize the Popkin, et al. (2012) sheep data
popkin_sheep_summary <- list(
  total_individual_n = sheep_mixmod_data[, .N, .(Individual_No)][, .N],
  immature_individual_n = sheep_mixmod_data[, .N, .(Individual_No, Immature, Female, Male)][, sum(Immature)],
  female_individual_n = sheep_mixmod_data[, .N, .(Individual_No, Immature, Female, Male)][, sum(Female)],
  male_individual_n = sheep_mixmod_data[, .N, .(Individual_No, Immature, Female, Male)][, sum(Male)],
  total_specimen_n = sheep_mixmod_data[, .N, .(Specimen_No)][, .N],
  immature_age_range = sheep_data[Immature %in% 1, max(`Days at Death`) - min(`Days at Death`)]
)

#Subsampling the measurements for measurement error since this may be causing an issue
subsample_n <- 150
mixmod_data_msrmnt_error <- rbind(
  sheep_mixmod_data[Specimen_No %in% sheep_mixmod_data[Immature %in% 1, .N, Specimen_No][, sample(Specimen_No, subsample_n, replace = F)]],
  sheep_mixmod_data[Specimen_No %in% sheep_mixmod_data[Female %in% 1, .N, Specimen_No][, sample(Specimen_No, subsample_n, replace = F)]],
  sheep_mixmod_data[Specimen_No %in% sheep_mixmod_data[Male %in% 1, .N, Specimen_No][, sample(Specimen_No, subsample_n, replace = F)]]
)
mixmod_data_msrmnt_error[, Dimension := as.numeric(as.factor(Dimension))]
mixmod_data_msrmnt_error[, Element_Portion := as.numeric(as.factor(Element_Portion))]
mixmod_data_msrmnt_error[, Specimen_No := as.numeric(as.factor(Specimen_No))]

#Set up the data for Stan modeling
shetland_sheep_mixmod_standata_msrmnt_error <- list(
  N_Specimens = mixmod_data_msrmnt_error[, .N, Specimen_No][, .N],
  N_Measurements = mixmod_data_msrmnt_error[, .N],
  N_Element_Portions = mixmod_data_msrmnt_error[, .N, Element_Portion][, .N],
  N_Dimensions = mixmod_data_msrmnt_error[, .N, Dimension][, .N],
  Element_Portion = mixmod_data_msrmnt_error[, .N, .(Element_Portion, Specimen_No)][order(Specimen_No), Element_Portion],
  Group = mixmod_data_msrmnt_error[, .N, .(Specimen_No, Group = ifelse(Immature %in% 1, 1, ifelse(Female %in% 1, 2, 3)))][order(Specimen_No), Group],
  Immature = mixmod_data_msrmnt_error[, .N, .(Immature, Female, Male, Specimen_No)][order(Specimen_No), Immature],
  Female = mixmod_data_msrmnt_error[, .N, .(Immature, Female, Male, Specimen_No)][order(Specimen_No), Female],
  Male = mixmod_data_msrmnt_error[, .N, .(Immature, Female, Male, Specimen_No)][order(Specimen_No), Male],
  Measurement_obs = mixmod_data_msrmnt_error[, Measurement_value],
  Measurement_sd = mixmod_data_msrmnt_error[, Measurement_sd],
  Reference_obs = mixmod_data_msrmnt_error[, .N, .(Reference_value, Reference_sd, Dimension)][order(Dimension), Reference_value],
  Reference_sd = mixmod_data_msrmnt_error[, .N, .(Reference_value, Reference_sd, Dimension)][order(Dimension), Reference_sd],
  Dimension = mixmod_data_msrmnt_error[, Dimension],
  Specimen = mixmod_data_msrmnt_error[, Specimen_No]
)

#Run the Stan model
reference_pop_msrmnt_error_LSI_model <- cmdstan_model("./Scripts/LSI_known_specimens.stan")
shetland_sheep_samples <- reference_pop_msrmnt_error_LSI_model$sample(
  data = shetland_sheep_mixmod_standata_msrmnt_error,
  chains = 4,
  parallel_chains = 4,
  refresh = 250,
  adapt_delta = 0.80,
  seed = 122839029, #remove this line to generalize/if changing the random seed for the dataset
  max_treedepth = 15
)

#Save the posterior results
#Function to save the results from a cmdstan object
cmdstanr_extract_samples <- function(fit_obj) {
  #credit to Andrew Johnson for this code (source: https://discourse.mc-stan.org/t/rstan-read-stan-csv-throwing-error-with-cmdstan-models-versions-2-35/35665/7)
  vars <- fit_obj$metadata()$stan_variables
  draws <- posterior::as_draws_rvars(fit_obj$draws())
  
  lapply(vars, \(var_name){  
    posterior::draws_of(draws[[var_name]], with_chains = FALSE)
  }) |> setNames(vars)
}
shetland_sheep_post <- cmdstanr_extract_samples(shetland_sheep_samples)
```

Relevant prior estimates of the biologically-relevant parameters were derived from the biometry of `r popkin_sheep_summary$total_individual_n` known-age and known-sex Shetland sheep described in @popkin_etal_2012 (see Section 2.2). Castrated individuals were included as males and animals killed under one year of age were considered immature; this resulted in an assemblage of `r popkin_sheep_summary$immature_individual_n` immature animals, `r popkin_sheep_summary$female_individual_n` females, and `r popkin_sheep_summary$male_individual_n` males/castrates. The element portion definitions and included measurements are shown Table 4 of the main text. From the `r popkin_sheep_summary$total_specimen_n` element portions in the full assemblage, `r subsample_n` immature, female, and male element portions were randomly selected to create an assemblage of `r shetland_sheep_mixmod_standata_msrmnt_error$N_Specimens` element portions for analysis. $\operatorname{LSI}_e$ values are calculated using the *Ovis orientalis* female standard animal (FMC 57951) from Uerpmann and Uerpmann [-@uerpmann_uerpmann_1994: Table 12]. These specimens were modeled using a Bayesian multilevel mixture model that used their known identities to estimate the biologically relevant parameters directly ($\mu_2$, $\delta_1$, $\delta_2$, $\sigma_1$, $\sigma_2$, $\sigma_3$). The resulting hyper-parameters, which average across anatomical variation in the parameter values, are used as a baseline for defining prior distributions of the parameter in the archaeological model (see Figure 3 in the main text).

The prior distributions used in this Bayesian multilevel mixture model on the reference population are more straightforward. While the same transformations to create unconstrained parameters are necessary (e.g., modeling average size as $\mu_2$ with offsets for immature and male animals), the definition of these prior distributions can be broadly described as weakly-informative priors [@gelman_etal_2008]. These weakly-informative prior distributions are reasonable in this case—and not in the archaeological case—because all the mixture model parameters have direct observations rather than relying on latent state estimations. That is, parameters like the size difference between males and females ($\delta_2$) and the size variability in female animals ($\sigma_2$) can be directly observed because the group identities of every specimen are known. With these direct observations, the prior distributions have a more muted influence on the resulting posterior distributions. This does not mean that the prior distributions have no effect, however, which is why objective priors can have undesirable impacts on modeling results [@gabry_etal_2019].

*Prior Distribution Definitions for the Known-Identity Model Hyper-Parameters*
\begin{equation}
\begin{split}
\mu_2 & \sim \operatorname{Normal}(-0.1, 0.1) \\
\log \delta_1 & \sim \operatorname{Normal}(-3, 0.5) \\
\log \delta_2 & \sim \operatorname{Normal}(-3, 0.5) \\
\log \sigma_1 & \sim \operatorname{Normal}(-3, 0.1) \\
\log \sigma_2 & \sim \operatorname{Normal}(-3, 0.1) \\
\log \sigma_3 & \sim \operatorname{Normal}(-3, 0.1) \\
\sigma_{\operatorname{Element}}[1,2,3,4,5,6] & \sim \operatorname{Half-Normal}(0, 0.05) \\
\end{split}
\end{equation}

The average $\operatorname{LSI}_e$ value for females ($\mu_2$) is likely to vary across contexts in reaction to different selective pressures, both anthropogenic and ecological [e.g., @davis_1981; @manning_etal_2015; @wright_vinerdaniels_2015]. While the posterior distribution is extremely focused on this specific population, there is no reason to think that this value should be centered at any particular value since that relates to the standard animal used [@meadow_1999; @wolfhagen_2020]. Therefore, the prior distribution used in archaeological models for $\mu_2$ uses a larger standard deviation, $\mu_2 \sim \operatorname{Normal}(0, 0.1)$, to encompass likely $\operatorname{LSI}_e$ values (Figure 3A in the main text). Under this definition, there is a 95% probability that the $\mu_2$ value lies within the range of `r format(round(qnorm(0.025, 0, 0.1), 2), nsmall = 2)` and `r format(round(qnorm(0.975, 0, 0.1), 2), nsmall = 2)` on the $\operatorname{LSI}_e$ scale, which translates to roughly `r paste0(round(100 * exp(qnorm(0.025, 0, 0.1)), 0), "-", round(100 * (exp(qnorm(0.975, 0, 0.1))), 0), "%")` the size of the standard animal’s measurement.

For the average size difference between immature and female animals ($\delta_1$), the narrowness of the posterior distribution likely reflects the fact that immature animals in the sample cover a narrow age range. Animals killed under one year of age span only `r popkin_sheep_summary$immature_age_range` days and the youngest animals are nearly half a year old [178-214 days: @popkin_etal_2012]. Thus while the posterior results provide a useful starting point for estimating this offset, there is a good potential for larger $\delta_1$ values (i.e., greater size differences between immature and adult-sized female animals) in other contexts that could include animals killed at a younger age (Figure 3B in the main text). To capture this possibility, the archaeological model uses a prior distribution with a larger standard deviation and a slightly higher center, $\log \delta_1 \sim \operatorname{Normal}(-3.5, 0.4)$, which results in an average size difference of `r round(exp(-3.5), 2)` on the $\operatorname{LSI}_e$ scale and a 95% probability that the size difference is between `r round(exp(qnorm(0.025, -3.5, 0.4)), 2)` and `r round(exp(qnorm(0.975, -3.5, 0.4)), 2)`. This translates into expecting the average body size of immature animals in an assemblage being `r paste0(round(100 * (exp(exp(-3.5)) - 1), 0), "%")` smaller than the average body size of adult-sized female animals, but also plausibly believing that this size difference could range from `r paste0(round(100 * (exp(exp(qnorm(0.025, -3.5, 0.4))) - 1), 0), "-", round(100 * (exp(exp(qnorm(0.975, -3.5, 0.4))) - 1), 0), "%")` smaller.

The average size difference between adult males and females ($\delta_2$), also known as the index of sexual dimorphism [@fernandez_monchot_2007], is likely to be under stricter biological control than the other “average body size” parameters in the model. This does not mean that this difference could not vary between contexts, however. Some models of animal domestication argue that initial domestication removed sexual selective pressures on male body size, reducing sexual dimorphism [e.g., @tchernov_horwitz_1991]. In a similar fashion, specialized hunting strategies could also reduce sexual dimorphism by targeting large-bodied males, for example [@zeder_2012; @proaktor_etal_2007; @milner_etal_2007]. Again, the posterior distribution of the extent of sexual dimorphism in the Shetland sheep population provides a useful starting point to describe a prior distribution for the model (Figure 3C in the main text). Increasing the standard deviation of the distribution slightly, $\log \delta_2 \sim \operatorname{Normal}(-2.7, 0.1)$, produces a distribution centered at `r round(exp(-2.7), 2)` $\operatorname{LSI}_e$ units with a 95% probability that the value is between `r paste0(round(exp(qnorm(0.025, -2.7, 0.1)), 2), "-", round(exp(qnorm(0.975, -2.7, 0.1)), 2))`, translating to the average male being `r paste0(round(100 * (exp(exp(qnorm(0.025, -2.7, 0.1))) - 1), 0), "-", round(100 * (exp(exp(qnorm(0.975, -2.7, 0.1))) - 1), 0), "%")` larger than the average female relative to a standard measurement. The smaller standard deviation in the prior distribution of $\delta_2$ than for $\delta_1$ reflects our understanding that the extent of sexual dimorphism, as a biological phenomenon, is less likely to have extreme values than the average size difference between immature and female animals, since $\delta_2$ is unaffected by the specific age structure of the assemblage.

As in the average body size parameters, prior distributions for the size variability model parameters are developed from the Bayesian model of known-identity Shetland sheep measurements. The resulting $\sigma$ hyper-parameters provide a baseline for establishing hyper-parameter prior distributions in archaeological cases. Figure 3D-F of the main text shows the posterior distributions of these $\sigma$ hyper-parameters in both the log-transformed values and associated $\operatorname{LSI}_e$ values. Average size variability within an element portion for immature animals ($\sigma_1$) is higher, on average, than for females ($\sigma_2$) and males ($\sigma_3$). The immature category includes both male and female animals, so larger size variability makes sense; again, it is possible that $\sigma_1$ is relatively low in this population relative to other contexts given the narrow age range of immature animals in the Shetland sheep population. Unlike the average body size parameters, there are not compelling reasons to believe that size variability parameters for females and males ($\sigma_2$ and $\sigma_3$) should vary widely in different contexts given the consistency of coefficients of variation in mammals broadly [@davis_1996]. Thus, the results of this analysis are used for the prior distributions of $\log \sigma_2$ and $\log \sigma_3$, while the prior distribution of $\log \sigma_1$ is given an increased standard deviation and slightly increased average value. Overall, however, these prior distributions suggest that the average size variability within an element portion is between `r paste0(round(exp(qnorm(0.025, -3.05, 0.1)), 2), "-", round(exp(qnorm(0.975, -3.05, 0.1)), 2))` for females and males and is between `r paste0(round(exp(qnorm(0.025, -3.1, 0.1)), 2), "-", round(exp(qnorm(0.975, -3.1, 0.1)), 2))` for immature animals. Note that even though $\sigma_2$ and $\sigma_3$ have the same prior distributions, these values can still vary from each other in different contexts.

## *4. Multilevel Structure of the Model*

The previous section described prior distributions that describe the *average* value for different mixture model parameters across all element portions. To create parameter estimates that are specific to different element portions, it is necessary to estimate the *variation* around these average values that different parameters can have among different element portions. The model uses a Multivariate Normal definition of the model parameters to allow for correlations between different parameters; effectively, the possibility that multiple model parameters will covary from element portion to element portion. To do this, each hyper-parameter has an associated $\sigma_{\operatorname{element}}$ parameter that describes inter-element variation in parameter values. The model uses a non-centered parameterization, wherein the Multivariate Normal distribution is centered at zero to calculate offsets, $\nu_{\operatorname{element}}$, that are added to the average hyper-parameters to calculate model parameters for each element portion. This definition provides computational stability and makes it more straightforward to incorporate other levels of multilevel structure.

*Equations for Defining Inter-Element Variation (Multilevel Modeling):*
\begin{equation}
\begin{split}
\nu_{\operatorname{Element}} & \sim \operatorname{MultivariateNormal} \left(\begin{bmatrix} 0 \\ \vdots \\ 0 \\ \end{bmatrix}, \Sigma_{\operatorname{Element}}\right) \\
\Sigma_{\operatorname{Element}} & =
\begin{pmatrix}
\sigma_{\operatorname{Element}}[1] & \dots & 0 \\
\vdots & \ddots & \vdots \\
0 & \dots & \sigma_{\operatorname{Element}}[8] \\
\end{pmatrix}
\rho_{\operatorname{Element}}
\begin{pmatrix}
\sigma_{\operatorname{Element}}[1] & \dots & 0 \\
\vdots & \ddots & \vdots \\
0 & \dots & \sigma_{\operatorname{Element}}[8] \\
\end{pmatrix} \\
\rho_{\operatorname{Element}} & = LKJcorr(2) \\
\theta_1[\operatorname{Element}] & = \theta_1 + \nu_{\operatorname{Element}}[1] \\
\theta_2[\operatorname{Element}] & = \theta_2 + \nu_{\operatorname{Element}}[2] \\
\mu_2[\operatorname{Element}] & = \mu_2 + \nu_{\operatorname{Element}}[3] \\
\log \delta_1 [\operatorname{Element}] & = \log \delta_1 + \nu_{\operatorname{Element}}[4] \\
\log \delta_2 [\operatorname{Element}] & = \log \delta_2 + \nu_{\operatorname{Element}}[5] \\
\log \sigma_1 [\operatorname{Element}] & = \log \sigma_1 + \nu_{\operatorname{Element}}[6] \\
\log \sigma_2 [\operatorname{Element}] & = \log \sigma_2 + \nu_{\operatorname{Element}}[7] \\
\log \sigma_3 [\operatorname{Element}] & = \log \sigma_3 + \nu_{\operatorname{Element}}[8]
\end{split}
\end{equation}

The multilevel structure used to allow variation in parameter estimates across element portions can also be expanded to create multisite models that can directly compare sex-specific biometric estimates alongside the age/sex composition of different assemblages. Such comparisons can highlight variation in herd management strategies or diachronic body size trends related to population turnover [e.g., @arbuckle_atici_2013; @arbuckle_etal_2016]. To do this, an additional multilevel structure can be applied to the same mixture model parameters, using $\sigma_{Site}$ rather than $\sigma_{\operatorname{Element}}$ parameters. However, an additional set of multilevel structure parameters, $\sigma_{Interaction}$, are also necessary to ensure that elemental variation is different at different sites (e.g., the difference between $\mu_2$ for the distal humerus and $\mu_2$ for the distal radius is not necessarily the same at different sites). Again, weakly-informative priors are appropriate for both sets of parameters. Each additional term is included in the sum to create specific mixture model parameter values.

*Example of Parameter Definition for Inter-Site and Inter-Element Variation:*
$$\theta_1[\operatorname{Site}, \operatorname{Element}] = \theta_1 + \nu_{\operatorname{Site}}[\operatorname{Site}] + \nu_{\operatorname{Element}}[\operatorname{Element}] + \nu_{\operatorname{Interaction}}[\operatorname{Site}, \operatorname{Element}]$$

The inclusion of multiple sites changes the definition of the ‘grand mean’ variable ($\theta_1$ in the example equation) from a site-level estimate to an overall mean across the sites and elements. These parameter estimates thus describe the average composition of the entire set of assemblages. The details of the assemblages included in the analyses would affect how useful these estimates are for interpretation. Assemblage-specific estimates can be calculated for each model parameter by adding the relevant $\nu_{\operatorname{site}}$ estimate to the ‘grand mean’ parameter, which would again act to describe the average composition of the assemblage regardless of its elemental composition.

Prior distributions for $\sigma_{\operatorname{Element}}$ values (and $\sigma_{\operatorname{Site}}$ and $\sigma_{\operatorname{Interaction}}$ values in multisite models) are weakly-informative priors based on the scale of the parameter and the expectation for variation for the parameter. For example, there is likely more variation in $\theta$ parameters—that govern the relative composition of immature, female, and male animals—among element portions than variation in $\sigma$ parameters that govern size variability within each group. Similarly, it is expected that average body sizes of females $\mu_{2}$ will vary more between sites $\sigma_{\operatorname{Site}}[3]$ than between elements within a site $\sigma_{\operatorname{Element}}[3]$. The impacts of these prior distribution definitions were evaluated using prior predictive checking, suggesting that these prior distributions allow enough variability to encompass reasonable size estimates without providing too much prior support to implausible or impossible values (see Section 6 of the Model Supplement).

*Prior Distributions for Element-level Variation (Multilevel Component):*
\begin{equation}
\begin{split}
\sigma_{\operatorname{Element}}[1,2] & \sim \operatorname{Half-Normal}(0, 0.5) \\
\sigma_{\operatorname{Element}}[3,4,5,6,7,8] & \sim \operatorname{Half-Normal}(0, 0.05) \\
\end{split}
\end{equation}

*Prior Distributions for Site-level Variation (Multilevel Component):*
\begin{equation}
\begin{split}
\sigma_{\operatorname{Site}}[1,2] & \sim \operatorname{Half-Normal}(0, 0.5) \\
\sigma_{\operatorname{Site}}[3,4,5] & \sim \operatorname{Half-Normal}(0, 0.1) \\
\sigma_{\operatorname{Site}}[6,7,8] & \sim \operatorname{Half-Normal}(0, 0.05) \\
\rho_{\operatorname{Site}} & = LKJcorr(2)
\end{split}
\end{equation}

*Prior Distributions for Interaction Effect (Multilevel Component):*
\begin{equation}
\begin{split}
\sigma_{\operatorname{Interaction}}[1,2] & \sim \operatorname{Half-Normal}(0, 0.25) \\
\sigma_{\operatorname{Interaction}}[3,4,5] & \sim \operatorname{Half-Normal}(0, 0.1) \\
\sigma_{\operatorname{Interaction}}[6,7,8] & \sim \operatorname{Half-Normal}(0, 0.05) \\
\rho_{\operatorname{Interaction}} & = LKJcorr(2)
\end{split}
\end{equation}

## *5. Prior Distributions for the Model Hyper-Parameters (Simulations and Archaeological Cases)*

*Prior Distribution Definitions for the Single Assemblage Simulation Model Hyper-Parameters*
\begin{equation}
\begin{split}
\theta_1 & \sim \operatorname{Normal}(-0.5, 1.5) \\
\theta_2 & \sim \operatorname{Normal}(0.0, 1.5) \\
\mu_2 & \sim \operatorname{Normal}(0.0, 0.1) \\
\log \delta_1 & \sim \operatorname{Normal}(-3.5, 0.4) \\
\log \delta_2 & \sim \operatorname{Normal}(-2.7, 0.1) \\
\log \sigma_1 & \sim \operatorname{Normal}(-3.05, 0.1) \\
\log \sigma_2 & \sim \operatorname{Normal}(-3.10, 0.1) \\
\log \sigma_3 & \sim \operatorname{Normal}(-3.10, 0.1)
\end{split}
\end{equation}

*Prior Distribution Definitions for the Multisite Simulation Model Hyper-Parameters*
\begin{equation}
\begin{split}
\theta_1 & \sim \operatorname{Normal}(-0.5, 1.5) \\
\theta_2 & \sim \operatorname{Normal}(0.0, 1.5) \\
\mu_2 & \sim \operatorname{Normal}(0.0, 0.2) \\
\log \delta_1 & \sim \operatorname{Normal}(-3.5, 0.5) \\
\log \delta_2 & \sim \operatorname{Normal}(-2.7, 0.2) \\
\log \sigma_1 & \sim \operatorname{Normal}(-3.05, 0.1) \\
\log \sigma_2 & \sim \operatorname{Normal}(-3.10, 0.1) \\
\log \sigma_3 & \sim \operatorname{Normal}(-3.10, 0.1)
\end{split}
\end{equation}

*Prior Distribution Definitions for the Pinarbaşı B Sheep Model Hyper-Parameters*
\begin{equation}
\begin{split}
\theta_1 & \sim \operatorname{Normal}(-0.5, 1.5) \\
\theta_2 & \sim \operatorname{Normal}(0.0, 1.5) \\
\mu_2 & \sim \operatorname{Normal}(0.0, 0.1) \\
\log \delta_1 & \sim \operatorname{Normal}(-3.5, 0.4) \\
\log \delta_2 & \sim \operatorname{Normal}(-2.7, 0.1) \\
\log \sigma_1 & \sim \operatorname{Normal}(-3.05, 0.1) \\
\log \sigma_2 & \sim \operatorname{Normal}(-3.10, 0.1) \\
\log \sigma_3 & \sim \operatorname{Normal}(-3.10, 0.1)
\end{split}
\end{equation}

*Prior Distribution Definitions for the Northwest Anatolian Cattle Model Hyper-Parameters*
\begin{equation}
\begin{split}
\theta_1 & \sim \operatorname{Normal}(-0.5, 1.5) \\
\theta_2 & \sim \operatorname{Normal}(0.0, 1.5) \\
\mu_2 & \sim \operatorname{Normal}(-0.1, 0.1) \\
\log \delta_1 & \sim \operatorname{Normal}(-3.5, 0.5) \\
\log \delta_2 & \sim \operatorname{Normal}(-2.0, 0.5) \\
\log \sigma_1 & \sim \operatorname{Normal}(-3.05, 0.1) \\
\log \sigma_2 & \sim \operatorname{Normal}(-3.10, 0.1) \\
\log \sigma_3 & \sim \operatorname{Normal}(-3.10, 0.1)
\end{split}
\end{equation}

## *6. Simulating Assemblages from the Prior Distributions (Prior Predictive Checking)*

Prior predictive checks are a critical component of Bayesian model development workflows, ensuring that reasonable prior definitions are chosen [@gabry_etal_2019; @gelman_etal_2020_workflow]. This process uses the model's prior distribution definitions to simulate data, which can then be evaluated against domain knowledge and observed data. This is particularly important when dealing with model parameters that are difficult to examine in isolation, like parameters that govern inter-element variation ($\sigma_{\operatorname{Element}}$) in a multilevel model structure. Prior predictive checking is an iterative process, informing researchers about the potential consequences of their prior distribution definitions; in particular, it can highlight how excessively imprecise definitions can provide considerable prior weight on implausible and even impossible values for data [@gabry_etal_2019: Figure 4]. Thus, prior predictive checking allows researchers to create more accurate summaries of their domain knowledge but also produces more efficient MCMC performance because less time is spent evaluating parameter values that are inconsistent with even cursory prior knowledge about the problem being modeled.

```{r prior predictive checking, include = FALSE, eval = TRUE}
#Functions to run the prior predictive checks#

#samples from a normal distribution, but only accepts positive values
rposnorm <- function(n, mu, sigma) {
  #samples from a defined normal distribution until it finds a positive value
  accept = FALSE
  while(!accept) {
    value <- rnorm(n, mean = mu, sd = sigma)
    if(sum(value > 0) == length(value)) accept <- TRUE
  }
  value
}

#creates a correlation matrix using the sampled marginal correlation values
rho_matrix_maker <- function(lkjcorr, K) {
  Rho_matrix <- matrix(NA, ncol = K, nrow = K)
  Rho_matrix[upper.tri(Rho_matrix)] <- lkjcorr
  diag(Rho_matrix) <- 1
  for(i in 1:K) {
    for(j in 1:K) {
      if(is.na(Rho_matrix[i,j])) {
        Rho_matrix[i,j] <- Rho_matrix[j,i]
      }
    }
  }
  Rho_matrix
}

#Single Site simulation#
LSI_mixture_singlesite_prior_predictive_check <- function(N_Element_Portions = 5, grand_prior = singlesite_grand_prior_list, element_sigma_vals = singlesite_element_sigma_values, std_animal = priorpredict_sheep_standard_animal) {
  #Simulating from the priors, start at the top
  grand_theta_raw = c(rnorm(1, grand_prior$prior_theta_raw_1[1], grand_prior$prior_theta_raw_1[2]), rnorm(1, grand_prior$prior_theta_raw_2[1], grand_prior$prior_theta_raw_2[2]))
  grand_mu_female = rnorm(1, grand_prior$prior_mu_female[1], grand_prior$prior_mu_female[2])
  grand_logdelta_immature = rnorm(1, grand_prior$prior_logdelta_immature[1], grand_prior$prior_logdelta_immature[2])
  grand_logdelta_male = rnorm(1, grand_prior$prior_logdelta_male[1], grand_prior$prior_logdelta_male[2])
  grand_logsigma_immature = rnorm(1, grand_prior$prior_logsigma_immature[1], grand_prior$prior_logsigma_immature[2])
  grand_logsigma_female = rnorm(1, grand_prior$prior_logsigma_female[1], grand_prior$prior_logsigma_female[2])
  grand_logsigma_male = rnorm(1, grand_prior$prior_logsigma_male[1], grand_prior$prior_logsigma_male[2])
  #Internal priors for the multilevel modeling
  element_sigma <- c(rposnorm(1, 0, element_sigma_vals[1]),
                     rposnorm(1, 0, element_sigma_vals[2]),
                     rposnorm(1, 0, element_sigma_vals[3]),
                     rposnorm(1, 0, element_sigma_vals[4]),
                     rposnorm(1, 0, element_sigma_vals[5]),
                     rposnorm(1, 0, element_sigma_vals[6]),
                     rposnorm(1, 0, element_sigma_vals[7]),
                     rposnorm(1, 0, element_sigma_vals[8]))
  #
  z_element <- matrix(rnorm(8 * N_Element_Portions, 0, 1), ncol = N_Element_Portions, nrow = 8)
  #number of random correlations: ((Number of parameters)^2 - (Number of parameters)) / 2
  Rho_element <- rlkjcorr_marginal((length(element_sigma)^2 - length(element_sigma)) / 2, K = 8, eta = 2)
  
  #Calculate a single set of the parameter values
  v_element <- t(diag(element_sigma) %*% rho_matrix_maker(Rho_element, 8) %*% diag(element_sigma) %*% z_element)
  #
  
  #calculating the output variables (untransformed)
  theta_raw <- matrix(NA, nrow = 2, ncol = N_Element_Portions)
  theta_raw[1, ] <- grand_theta_raw[1] + v_element[, 1]
  theta_raw[2, ] <- grand_theta_raw[2] + v_element[, 2]
  mu_female = grand_mu_female + v_element[, 3]
  logdelta_immature = grand_logdelta_immature + v_element[, 4];
  logdelta_male = grand_logdelta_male + v_element[, 5];
  logsigma_immature = grand_logsigma_immature + v_element[, 6];
  logsigma_female = grand_logsigma_female + v_element[, 7];
  logsigma_male = grand_logsigma_male + v_element[, 8];
  
  #mu and sigma transformations
  mu_immature = mu_female - exp(logdelta_immature)
  mu_male = mu_female + exp(logdelta_male)
  sigma_immature = exp(logsigma_immature)
  sigma_female = exp(logsigma_female)
  sigma_male = exp(logsigma_male)
  #theta transformations
  grand_theta <- rep(NA, 3)
  grand_theta[1] = boot::inv.logit(grand_theta_raw[1] + log(0.5));
  grand_theta[2] = (1 - grand_theta[1]) * boot::inv.logit(grand_theta_raw[2] + log(1));
  grand_theta[3] = 1 - sum(grand_theta[1:2]);
  theta <- matrix(NA, nrow = 3, ncol = N_Element_Portions)
  for(i in 1:N_Element_Portions) {
    theta[1, i] = boot::inv.logit(theta_raw[1, i] + log(0.5));
    theta[2, i] = (1 - theta[1, i]) * boot::inv.logit(theta_raw[2, i] + log(1));
    theta[3, i] = 1 - sum(theta[1:2, i]);
  }
  p_immature = grand_theta[1]
  theta_female = grand_theta[2] / sum(grand_theta[2:3])
  
  #
  grand_mu_immature = grand_mu_female - exp(grand_logdelta_immature)
  grand_mu_male = grand_mu_female + exp(grand_logdelta_male)
  grand_sigma_immature = exp(grand_logsigma_immature)
  grand_sigma_female = exp(grand_logsigma_female)
  grand_sigma_male = exp(grand_logsigma_male)
  
  #use the proportions to get a set number of immature, female, and male specimens (25 per element)
  specimen_LSI_values <- data.table(Specimen_No = 1:(25 * N_Element_Portions), rbindlist(lapply(1:N_Element_Portions, function(x) data.table(Element_Portion = x, Sex = sample(c("Immature", "Female", "Male"), 25, prob = theta[, x], replace = T)))))
  specimen_LSI_values[Sex %in% "Immature", LSI_specimen := rnorm(1, mu_immature[Element_Portion], sigma_immature[Element_Portion]), Specimen_No]
  specimen_LSI_values[Sex %in% "Female", LSI_specimen := rnorm(1, mu_female[Element_Portion], sigma_female[Element_Portion]), Specimen_No]
  specimen_LSI_values[Sex %in% "Male", LSI_specimen := rnorm(1, mu_male[Element_Portion], sigma_male[Element_Portion]), Specimen_No]
  
  #turn these into broad measurements
  #Element_Portion 1: Humerus Bd and Humerus BT
  #Element_Portion 2: Radius Bp
  #Element_Portion 3: Metacarpal Bp
  #Element_Portion 4: Metatarsal Bp
  #Element_Portion 5: Astragalus Bd
  measurement_LSI_calc <- function(Specimen_No, Element_Portion, LSI, N_measurements) {
    data.table(Specimen_No, Element_Portion, Measurement = paste0(Element_Portion, ".", 1:N_measurements), LSI_measurement = rnorm(N_measurements, LSI, 0.02))
  }
  measurement_LSI_values <- rbind(
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 1, Specimen_No], function(x) measurement_LSI_calc(Specimen_No = x, Element_Portion = 1, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 2))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 2, Specimen_No], function(x) measurement_LSI_calc(Specimen_No = x, Element_Portion = 2, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 3, Specimen_No], function(x) measurement_LSI_calc(Specimen_No = x, Element_Portion = 3, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 4, Specimen_No], function(x) measurement_LSI_calc(Specimen_No = x, Element_Portion = 4, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 5, Specimen_No], function(x) measurement_LSI_calc(Specimen_No = x, Element_Portion = 5, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1)))
  )
  
  measurement_values <- measurement_LSI_values[std_animal[!is.na(Measurement), .(EL, Measurement, Measure, Reference_value = Standard)], on = "Measurement"]
  measurement_values[, Measurement_value := exp(LSI_measurement) * Reference_value]
  #
  list(Overall_Parameters = c(grand_theta, grand_mu_immature, grand_mu_female, grand_mu_male, grand_sigma_immature, grand_sigma_female, grand_sigma_male, p_immature, theta_female), Element_Parameters = cbind(t(theta), mu_immature, mu_female, mu_male, sigma_immature, sigma_female, sigma_male), Specimens = specimen_LSI_values, Measurements = measurement_values)
}

#Multisite simulation#
LSI_mixture_multisite_prior_predictive_check <- function(N_Element_Portions = 5, N_Sites = 3, grand_prior = multisite_grand_prior_list, site_sigma_vals = multisite_site_sigma_values, element_sigma_vals = multisite_element_sigma_values, interaction_sigma_vals = multisite_interaction_sigma_values, std_animal = priorpredict_sheep_standard_animal) {
  #Simulating from the priors, start at the top
  grand_theta_raw = c(rnorm(1, grand_prior$prior_theta_raw_1[1], grand_prior$prior_theta_raw_1[2]), rnorm(1, grand_prior$prior_theta_raw_2[1], grand_prior$prior_theta_raw_2[2]))
  grand_mu_female = rnorm(1, grand_prior$prior_mu_female[1], grand_prior$prior_mu_female[2])
  grand_logdelta_immature = rnorm(1, grand_prior$prior_logdelta_immature[1], grand_prior$prior_logdelta_immature[2])
  grand_logdelta_male = rnorm(1, grand_prior$prior_logdelta_male[1], grand_prior$prior_logdelta_male[2])
  grand_logsigma_immature = rnorm(1, grand_prior$prior_logsigma_immature[1], grand_prior$prior_logsigma_immature[2])
  grand_logsigma_female = rnorm(1, grand_prior$prior_logsigma_female[1], grand_prior$prior_logsigma_female[2])
  grand_logsigma_male = rnorm(1, grand_prior$prior_logsigma_male[1], grand_prior$prior_logsigma_male[2])
  #Internal priors for the multilevel modeling
  element_sigma <- c(rposnorm(1, 0, element_sigma_vals[1]),
                     rposnorm(1, 0, element_sigma_vals[2]),
                     rposnorm(1, 0, element_sigma_vals[3]),
                     rposnorm(1, 0, element_sigma_vals[4]),
                     rposnorm(1, 0, element_sigma_vals[5]),
                     rposnorm(1, 0, element_sigma_vals[6]),
                     rposnorm(1, 0, element_sigma_vals[7]),
                     rposnorm(1, 0, element_sigma_vals[8]))
  #
  z_element <- matrix(rnorm(8 * N_Element_Portions, 0, 1), ncol = N_Element_Portions, nrow = 8)
  #number of random correlations: ((Number of parameters)^2 - (Number of parameters)) / 2
  Rho_element <- rlkjcorr_marginal((length(element_sigma)^2 - length(element_sigma)) / 2, K = 8, eta = 2)
  #
  site_sigma <- c(rposnorm(1, 0, site_sigma_vals[1]),
                     rposnorm(1, 0, site_sigma_vals[2]),
                     rposnorm(1, 0, site_sigma_vals[3]),
                     rposnorm(1, 0, site_sigma_vals[4]),
                     rposnorm(1, 0, site_sigma_vals[5]),
                     rposnorm(1, 0, site_sigma_vals[6]),
                     rposnorm(1, 0, site_sigma_vals[7]),
                     rposnorm(1, 0, site_sigma_vals[8]))
  #
  z_site <- matrix(rnorm(8 * N_Sites, 0, 1), ncol = N_Sites, nrow = 8)
  Rho_site <- rlkjcorr_marginal((length(site_sigma)^2 - length(site_sigma)) / 2, K = 8, eta = 2)
  #
  interaction_sigma <- c(rposnorm(1, 0, interaction_sigma_vals[1]),
                     rposnorm(1, 0, interaction_sigma_vals[2]),
                     rposnorm(1, 0, interaction_sigma_vals[3]),
                     rposnorm(1, 0, interaction_sigma_vals[4]),
                     rposnorm(1, 0, interaction_sigma_vals[5]),
                     rposnorm(1, 0, interaction_sigma_vals[6]),
                     rposnorm(1, 0, interaction_sigma_vals[7]),
                     rposnorm(1, 0, interaction_sigma_vals[8]))
  #
  z_interaction <- matrix(rnorm(8 * N_Sites * N_Element_Portions, 0, 1), ncol = N_Sites * N_Element_Portions, nrow = 8)
  Rho_interaction <- rlkjcorr_marginal((length(interaction_sigma)^2 - length(interaction_sigma)) / 2, K = 8, eta = 2)
  #Calculate a single set of the parameter values
  v_element <- t(diag(element_sigma) %*% rho_matrix_maker(Rho_element, 8) %*% diag(element_sigma) %*% z_element)
  v_site <- t(diag(site_sigma) %*% rho_matrix_maker(Rho_site, 8) %*% diag(site_sigma) %*% z_site)
  v_interaction <- t(diag(interaction_sigma) %*% rho_matrix_maker(Rho_interaction, 8) %*% diag(interaction_sigma) %*% z_interaction)
  #
  
  #calculating site-level outputs variables (untransformed)
  site_theta_raw <- matrix(NA, nrow = 2, ncol = N_Sites)
  site_theta_raw[1, ] <- grand_theta_raw[1] + v_site[, 1]
  site_theta_raw[2, ] <- grand_theta_raw[2] + v_site[, 2]
  site_mu_female = grand_mu_female + v_site[, 3]
  site_logdelta_immature = grand_logdelta_immature + v_site[, 4]
  site_logdelta_male = grand_logdelta_male + v_site[, 5]
  site_logsigma_immature = grand_logsigma_immature + v_site[, 6]
  site_logsigma_female = grand_logsigma_female + v_site[, 7]
  site_logsigma_male = grand_logsigma_male + v_site[, 8]
  
  #calculating the site + element output variables (untransformed)
  theta_raw <- array(NA, dim = c(2, N_Sites, N_Element_Portions))
  mu_female <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  logdelta_immature <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  logdelta_male <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  logsigma_immature <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  logsigma_female <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  logsigma_male <- matrix(NA, nrow = N_Sites, ncol = N_Element_Portions)
  for(i in 1:N_Sites) {
    for(j in 1:N_Element_Portions) {
      theta_raw[1, i, j] = grand_theta_raw[1] + v_site[i, 1] + v_element[j, 1] + v_interaction[N_Element_Portions * (i - 1) + j, 1];
      theta_raw[2, i, j] = grand_theta_raw[2] + v_site[i, 2] + v_element[j, 2] + v_interaction[N_Element_Portions * (i - 1) + j, 2];
      mu_female[i, j] = grand_mu_female + v_site[i, 3] + v_element[j, 3] + v_interaction[N_Element_Portions * (i - 1) + j, 3];
      logdelta_immature[i, j] = grand_logdelta_immature + v_site[i, 4] + v_element[j, 4] + v_interaction[N_Element_Portions * (i - 1) + j, 4];
      logdelta_male[i, j] = grand_logdelta_male + v_site[i, 5] + v_element[j, 5] + v_interaction[N_Element_Portions * (i - 1) + j, 5];
      logsigma_immature[i, j] = grand_logsigma_immature + v_site[i, 6] + v_element[j, 6] + v_interaction[N_Element_Portions * (i - 1) + j, 6];
      logsigma_female[i, j] = grand_logsigma_female + v_site[i, 7] + v_element[j, 7] + v_interaction[N_Element_Portions * (i - 1) + j, 7];
      logsigma_male[i, j] = grand_logsigma_male + v_site[i, 8] + v_element[j, 8] + v_interaction[N_Element_Portions * (i - 1) + j, 8];
    }
  }

  #mu and sigma transformations
  site_mu_immature = site_mu_female - exp(site_logdelta_immature)
  site_mu_male = site_mu_female + exp(site_logdelta_male)
  site_sigma_immature = exp(site_logsigma_immature)
  site_sigma_female = exp(site_logsigma_female)
  site_sigma_male = exp(site_logsigma_male)
  mu_immature = mu_female - exp(logdelta_immature)
  mu_male = mu_female + exp(logdelta_male)
  sigma_immature = exp(logsigma_immature)
  sigma_female = exp(logsigma_female)
  sigma_male = exp(logsigma_male)
  #theta transformations
  grand_theta <- rep(NA, 3)
  grand_theta[1] = boot::inv.logit(grand_theta_raw[1] + log(0.5));
  grand_theta[2] = (1 - grand_theta[1]) * boot::inv.logit(grand_theta_raw[2] + log(1));
  grand_theta[3] = 1 - sum(grand_theta[1:2]);
  #site-level and site+element-level
  site_theta <- matrix(NA, nrow = 3, ncol = N_Sites)
  theta <- array(NA, dim = c(3, N_Sites, N_Element_Portions))
  site_p_immature <- rep(NA, N_Sites)
  site_theta_female <- rep(NA, N_Sites)
  for(i in 1:N_Sites) {
    site_theta[1, i] = boot::inv.logit(site_theta_raw[1, i] + log(0.5));
    site_theta[2, i] = (1 - site_theta[1, i]) * boot::inv.logit(site_theta_raw[2, i] + log(1));
    site_theta[3, i] = 1 - sum(site_theta[1:2, i]);
    site_p_immature[i] = site_theta[1, i];
    site_theta_female[i] = site_theta[2, i] / sum(site_theta[2:3, i]);
    for(j in 1:N_Element_Portions) {
      theta[1, i, j] = boot::inv.logit(theta_raw[1, i, j] + log(0.5));
      theta[2, i, j] = (1 - theta[1, i, j]) * boot::inv.logit(theta_raw[2, i, j] + log(1));
      theta[3, i, j] = 1 - sum(theta[1:2, i, j]);
    }
  }
  #
  grand_mu_immature = grand_mu_female - exp(grand_logdelta_immature)
  grand_mu_male = grand_mu_female + exp(grand_logdelta_male)
  grand_sigma_immature = exp(grand_logsigma_immature)
  grand_sigma_female = exp(grand_logsigma_female)
  grand_sigma_male = exp(grand_logsigma_male)
  
  #use the proportions to get a set number of immature, female, and male specimens (25 per element per site)
  specimen_LSI_simulate <- function(Site_No, N_Element_Portions, theta, mu_immature, mu_female, mu_male, sigma_immature, sigma_female, sigma_male) {
    specimen_LSI_values <- data.table(Site_No = Site_No, Specimen = paste0(Site_No, ".", 1:(25 * N_Element_Portions)), rbindlist(lapply(1:N_Element_Portions, function(x) data.table(Element_Portion = x, Sex = sample(c("Immature", "Female", "Male"), 25, prob = theta[, Site_No, x], replace = T)))))
    specimen_LSI_values[Sex %in% "Immature", LSI_specimen := rnorm(1, mu_immature[Site_No, Element_Portion], sigma_immature[Site_No, Element_Portion]), Specimen]
    specimen_LSI_values[Sex %in% "Female", LSI_specimen := rnorm(1, mu_female[Site_No, Element_Portion], sigma_female[Site_No, Element_Portion]), Specimen]
    specimen_LSI_values[Sex %in% "Male", LSI_specimen := rnorm(1, mu_male[Site_No, Element_Portion], sigma_male[Site_No, Element_Portion]), Specimen]
    specimen_LSI_values
  }
  specimen_LSI_values <- rbindlist(lapply(1:N_Sites, function(x) specimen_LSI_simulate(Site_No = x, N_Element_Portions = N_Element_Portions, theta = theta, mu_immature = mu_immature, mu_female = mu_female, mu_male = mu_male, sigma_immature = sigma_immature, sigma_female = sigma_female, sigma_male = sigma_male)))
  specimen_LSI_values[, Specimen_No := as.numeric(as.factor(Specimen))]

  #turn these into broad measurements
  #Element_Portion 1: Humerus Bd and Humerus BT
  #Element_Portion 2: Radius Bp
  #Element_Portion 3: Metacarpal Bp
  #Element_Portion 4: Metatarsal Bp
  #Element_Portion 5: Astragalus Bd
  measurement_LSI_calc <- function(Site_No, Specimen_No, Element_Portion, LSI, N_measurements) {
    data.table(Site_No, Specimen_No, Element_Portion, Measurement = paste0(Element_Portion, ".", 1:N_measurements), LSI_measurement = rnorm(N_measurements, LSI, 0.02))
  }
  measurement_LSI_values <- rbind(
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 1, Specimen_No], function(x) measurement_LSI_calc(Site_No = specimen_LSI_values[Specimen_No %in% x, Site_No], Specimen_No = x, Element_Portion = 1, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 2))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 2, Specimen_No], function(x) measurement_LSI_calc(Site_No = specimen_LSI_values[Specimen_No %in% x, Site_No], Specimen_No = x, Element_Portion = 2, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 3, Specimen_No], function(x) measurement_LSI_calc(Site_No = specimen_LSI_values[Specimen_No %in% x, Site_No], Specimen_No = x, Element_Portion = 3, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 4, Specimen_No], function(x) measurement_LSI_calc(Site_No = specimen_LSI_values[Specimen_No %in% x, Site_No], Specimen_No = x, Element_Portion = 4, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1))),
    rbindlist(lapply(specimen_LSI_values[Element_Portion %in% 5, Specimen_No], function(x) measurement_LSI_calc(Site_No = specimen_LSI_values[Specimen_No %in% x, Site_No], Specimen_No = x, Element_Portion = 5, LSI = specimen_LSI_values[Specimen_No %in% x, LSI_specimen], 1)))
  )
  
  measurement_values <- measurement_LSI_values[std_animal[!is.na(Measurement), .(EL, Measurement, Measure, Reference_value = Standard)], on = "Measurement"]
  measurement_values[, Measurement_value := exp(LSI_measurement) * Reference_value]
  #
  list(Overall_Parameters = c(grand_theta, grand_mu_immature, grand_mu_female, grand_mu_male, grand_sigma_immature, grand_sigma_female, grand_sigma_male), Site_Parameters = cbind(t(site_theta), site_mu_immature, site_mu_female, site_mu_male, site_sigma_immature, site_sigma_female, site_sigma_male, site_p_immature, site_theta_female), Element_Parameters = list(theta, mu_immature, mu_female, mu_male, sigma_immature, sigma_female, sigma_male), Specimens = specimen_LSI_values, Measurements = measurement_values)
}

priorpredict_sheep_standard_animal <- data.table(referencesDatabase$`Ovis orientalis`$Uerpmann)
priorpredict_sheep_standard_animal[EL %in% "Humerus" & Measure %in% "Bd", Measurement := "1.1"]
priorpredict_sheep_standard_animal[EL %in% "Humerus" & Measure %in% "BT", Measurement := "1.2"]
priorpredict_sheep_standard_animal[EL %in% "Radius" & Measure %in% "Bp", Measurement := "2.1"]
priorpredict_sheep_standard_animal[EL %in% "Metacarpus" & Measure %in% "Bp", Measurement := "3.1"]
priorpredict_sheep_standard_animal[EL %in% "Metatarsus" & Measure %in% "Bp", Measurement := "4.1"]
priorpredict_sheep_standard_animal[EL %in% "Astragalus" & Measure %in% "Bd", Measurement := "5.1"]
```

```{r prior predict (single site), include = FALSE, eval = TRUE}
#Single assemblage run#
priorpredict_iterations <- 1000
singlesite_N_Element_Portions <- 5
singlesite_grand_prior_list <- list(
  prior_theta_raw_1 = c(-0.5, 1.5),
  prior_theta_raw_2 = c(0, 1.5),
  prior_mu_female = c(0, 0.1),
  prior_logdelta_immature = c(-3.5, 0.4),
  prior_logdelta_male = c(-2.7, 0.1),
  prior_logsigma_immature = c(-3.05, 0.1),
  prior_logsigma_female = c(-3.1, 0.1),
  prior_logsigma_male = c(-3.1, 0.1)
)
singlesite_element_sigma_values <- c(
  0.5, #theta_1
  0.5, #theta_2
  0.05, #mu_female
  0.05, #delta_immature
  0.05, #delta_male
  0.05, #sigma_immature
  0.05, #sigma_female
  0.05 #sigma_male
)

clusterExport(cl, list('data.table', 'rbindlist', 'rposnorm', 'rho_matrix_maker', 'rlkjcorr_marginal', 'inv.logit', 'LSI_mixture_singlesite_prior_predictive_check', 'singlesite_N_Element_Portions', 'singlesite_grand_prior_list', 'singlesite_element_sigma_values', 'priorpredict_sheep_standard_animal'))

singlesite_predicted_assemblages <- parLapply(cl = cl, 1:priorpredict_iterations, fun = function(x) LSI_mixture_singlesite_prior_predictive_check(N_Element_Portions = singlesite_N_Element_Portions))
```

```{r prior predict (multisite), include = FALSE, eval = TRUE}
multisite_N_Sites <- 3
multisite_N_Element_Portions <- 5
multisite_grand_prior_list <- list(
  prior_theta_raw_1 = c(-0.5, 1.5),
  prior_theta_raw_2 = c(0, 1.5),
  prior_mu_female = c(0, 0.2),
  prior_logdelta_immature = c(-3.5, 0.5),
  prior_logdelta_male = c(-2.7, 0.2),
  prior_logsigma_immature = c(-3.05, 0.1),
  prior_logsigma_female = c(-3.1, 0.1),
  prior_logsigma_male = c(-3.1, 0.1)
)
multisite_site_sigma_values <- c(
  0.5, #theta_1
  0.5, #theta_2
  0.1, #mu_female
  0.1, #delta_immature
  0.1, #delta_male
  0.05, #sigma_immature
  0.05, #sigma_female
  0.05 #sigma_male
)
multisite_element_sigma_values <- c(
  0.5, #theta_1
  0.5, #theta_2
  0.05, #mu_female
  0.05, #delta_immature
  0.05, #delta_male
  0.05, #sigma_immature
  0.05, #sigma_female
  0.05 #sigma_male
)
multisite_interaction_sigma_values <- c(
  0.25, #theta_1
  0.25, #theta_2
  0.05, #mu_female
  0.05, #delta_immature
  0.05, #delta_male
  0.05, #sigma_immature
  0.05, #sigma_female
  0.05 #sigma_male
)

clusterExport(cl, list('data.table', 'rbindlist', 'rposnorm', 'rho_matrix_maker', 'rlkjcorr_marginal', 'inv.logit', 'LSI_mixture_multisite_prior_predictive_check', 'multisite_N_Sites', 'multisite_N_Element_Portions', 'multisite_grand_prior_list', 'multisite_site_sigma_values', 'multisite_element_sigma_values', 'multisite_interaction_sigma_values', 'priorpredict_sheep_standard_animal'))

multisite_predicted_assemblages <- parLapply(cl = cl, 1:1000, fun = function(x) LSI_mixture_multisite_prior_predictive_check(N_Sites = multisite_N_Sites, N_Element_Portions = multisite_N_Element_Portions))
```

```{r making the prior prediction tables and figures, include = FALSE, eval = TRUE}
#Model Supplement Table 1: the element portions and related measurements
model_supplement_table_1 <- data.table(
  `Element Portion` = c("Humerus", "Humerus", "Radius", "Metacarpus", "Metatarsus", "Astragalus"),
  `Measured Dimension` = c("Bd", "BT", "Bp", "Bp", "Bp", "Bd"),
  `Reference Value (mm)` = c(priorpredict_sheep_standard_animal[!is.na(Measurement)][order(Measurement), Standard])
)

#Model Supplement Figure 1#
singlesite_proportions <- rbindlist(lapply(singlesite_predicted_assemblages, function(x) x$Specimens[, .(Proportion = .N / 125), Sex]))
singlesite_proportion_plot <- ggplot(singlesite_proportions[, .(group = factor(Sex, levels = c("Immature", "Female", "Male")), Proportion)]) + aes(x = group, y = Proportion) +
  stat_slab(normalize = "groups", aes(fill = group, fill_ramp = stat(cut_cdf_qi(cdf, .width = c(1, 0.95, 0.80), labels = scales::percent_format())))) +
  stat_pointinterval(.width = c(0.8, 0.95), aes(color = group), position = position_dodge(width = 0.2, preserve = "single")) +
  scale_fill_ramp_discrete(name = "Interval", range = c(0.8, 0.3), na.translate = F) +
  scale_color_manual(name = "Group", values = c("black", "blue", "red"), na.translate = F) +
  scale_fill_manual(name = "Group", values = c("black", "blue", "red"), na.translate = F) +
  scale_x_discrete(name = NULL, breaks = NULL) +
  scale_y_continuous(name = "Proportion", limits = c(0, 1)) +
  labs(title = "Single Assemblage") +
  theme_classic() + theme(legend.text = element_text(size = 10), legend.position = "bottom", legend.justification = c("center"), legend.box.just = "right", legend.margin = margin(6, 6, 6, 6), axis.text.x = element_text(size = 10), axis.title = element_text(size = 16), axis.text.y = element_text(angle = 90, size = 10, hjust = 0.5))
#
multisite_proportions <- rbindlist(lapply(multisite_predicted_assemblages, function(x) x$Specimens[, .(Proportion = .N / (3 * 125)), Sex]))
multisite_proportion_plot <- ggplot(multisite_proportions[, .(group = factor(Sex, levels = c("Immature", "Female", "Male")), Proportion)]) + aes(x = group, y = Proportion) +
  stat_slab(normalize = "groups", aes(fill = group, fill_ramp = stat(cut_cdf_qi(cdf, .width = c(1, 0.95, 0.80), labels = scales::percent_format())))) +
  stat_pointinterval(.width = c(0.8, 0.95), aes(color = group), position = position_dodge(width = 0.2, preserve = "single")) +
  scale_fill_ramp_discrete(name = "Interval", range = c(0.8, 0.3), na.translate = F) +
  scale_color_manual(name = "Group", values = c("black", "blue", "red"), na.translate = F) +
  scale_fill_manual(name = "Group", values = c("black", "blue", "red"), na.translate = F) +
  scale_x_discrete(name = NULL, breaks = NULL) +
  scale_y_continuous(name = "Proportion", limits = c(0, 1)) +
  labs(title = "Multisite") +
  theme_classic() + theme(legend.text = element_text(size = 10), legend.position = "bottom", legend.justification = c("center"), legend.box.just = "right", legend.margin = margin(6, 6, 6, 6), axis.text.x = element_text(size = 10), axis.title = element_text(size = 16), axis.text.y = element_text(angle = 90, size = 10, hjust = 0.5))
#
singlesite_simulated_proportions <- list(
  simulated_p_immature = paste0(100 * round(quantile(singlesite_proportions[Sex %in% "Immature", Proportion], 0.025), 2), "-", 100 * round(quantile(singlesite_proportions[Sex %in% "Immature", Proportion], 0.975), 2), "%"),
  simulated_p_female = paste0(100 * round(quantile(singlesite_proportions[Sex %in% "Female", Proportion], 0.025), 2), "-", 100 * round(quantile(singlesite_proportions[Sex %in% "Female", Proportion], 0.975), 2), "%"),
  simulated_p_male = paste0(100 * round(quantile(singlesite_proportions[Sex %in% "Male", Proportion], 0.025), 2), "-", 100 * round(quantile(singlesite_proportions[Sex %in% "Male", Proportion], 0.975), 2), "%")
)
#
multisite_simulated_proportions <- list(
  simulated_p_immature = paste0(100 * round(quantile(multisite_proportions[Sex %in% "Immature", Proportion], 0.025), 2), "-", 100 * round(quantile(multisite_proportions[Sex %in% "Immature", Proportion], 0.975), 2), "%"),
  simulated_p_female = paste0(100 * round(quantile(multisite_proportions[Sex %in% "Female", Proportion], 0.025), 2), "-", 100 * round(quantile(multisite_proportions[Sex %in% "Female", Proportion], 0.975), 2), "%"),
  simulated_p_male = paste0(100 * round(quantile(multisite_proportions[Sex %in% "Male", Proportion], 0.025), 2), "-", 100 * round(quantile(multisite_proportions[Sex %in% "Male", Proportion], 0.975), 2), "%")
)
#
singlesite_humbd <- data.table(Model = "Single Assemblage", Hum_Bd = c(sapply(singlesite_predicted_assemblages, function(x) x$Measurements[Measurement %in% "1.1", Measurement_value])))
multisite_humbd <- data.table(Model = "Multisite", Hum_Bd = c(sapply(multisite_predicted_assemblages, function(x) x$Measurements[Measurement %in% "1.1", Measurement_value])))

singlesite_humbd_histogram <- ggplot(data = singlesite_humbd) + aes(x = Hum_Bd) +
  geom_histogram(binwidth = 2.5, alpha = 0.8) +
  geom_vline(data = data.table(Hum_Bd = priorpredict_sheep_standard_animal[Measurement %in% "1.1", Standard]), aes(xintercept = Hum_Bd), col = "black", linetype = "dashed") +
  geom_vline(data = data.table(Hum_Bd = c(25, 39)), aes(xintercept = Hum_Bd), col = "black", linetype = "dotted") +
  stat_pointinterval(.width = c(0.8, 0.95), y = -50) +
  scale_x_continuous(limits = c(10, 75)) +
  labs(x = "Simulated Humerus Bd (mm)", y = "Frequency", title = "", subtitle = "") +
  theme_classic() + theme(legend.text = element_text(size = 10), legend.position = "bottom", legend.justification = c("center"), legend.box.just = "right", legend.margin = margin(6, 6, 6, 6), axis.text.x = element_text(angle = 90, size = 10), axis.title = element_text(size = 16), axis.text.y = element_text(angle = 90, size = 10, hjust = 0.5))

multisite_humbd_histogram <- ggplot(data = multisite_humbd) + aes(x = Hum_Bd) +
  geom_histogram(binwidth = 2.5, alpha = 0.8) +
  geom_vline(data = data.table(Hum_Bd = priorpredict_sheep_standard_animal[Measurement %in% "1.1", Standard]), aes(xintercept = Hum_Bd), col = "black", linetype = "dashed") +
  geom_vline(data = data.table(Hum_Bd = c(25, 39)), aes(xintercept = Hum_Bd), col = "black", linetype = "dotted") +
  stat_pointinterval(.width = c(0.8, 0.95), y = -50) +
  scale_x_continuous(limits = c(10, 75)) +
  labs(x = "Simulated Humerus Bd (mm)", y = "Frequency", title = "", subtitle = "") +
  theme_classic() + theme(legend.text = element_text(size = 10), legend.position = "bottom", legend.justification = c("center"), legend.box.just = "right", legend.margin = margin(6, 6, 6, 6), axis.text.x = element_text(angle = 90, size = 10), axis.title = element_text(size = 16), axis.text.y = element_text(angle = 90, size = 10, hjust = 0.5))

model_supplement_figure_1 <- ggarrange(singlesite_proportion_plot, multisite_proportion_plot, singlesite_humbd_histogram, multisite_humbd_histogram, nrow = 2, ncol = 2, labels = c("A", "", "B", ""), common.legend = T, legend = "bottom")
Cairo(width = 8, height = 10, units = "in", file = "./Output/Figures/model_supplement_figure_1.png", type = "png", dpi = 600)
model_supplement_figure_1
dev.off()

#text summary#
humbd_summary <- list(
  singlesite_humbd_95CI = paste0(round(quantile(singlesite_humbd[, Hum_Bd], c(0.025)), 0), "-", round(quantile(singlesite_humbd[, Hum_Bd], c(0.975)), 0)),
  multisite_humbd_95CI = paste0(round(quantile(multisite_humbd[, Hum_Bd], c(0.025)), 0), "-", round(quantile(multisite_humbd[, Hum_Bd], c(0.975)), 0))
)
```

Prior predictive checks were developed for both the single-assemblage and multisite model fits, using the prior distribution definitions used in the sheep simulations in the main text (see Section 5 of the Model Supplement). In each simulation, 25 $\operatorname{LSI}_{\operatorname{Specimen}}$ values were calculated for each of the `r singlesite_N_Element_Portions` element portions based on the relevant prior distributions and model structures; for the multisite simulation, this was done for `r multisite_N_Sites` assemblages. To evaluate the feasibility of these prior distribution definitions, the $\operatorname{LSI}_{\operatorname{Specimen}}$ were converted into simulated measurement values based on the reference values of the *Ovis orientalis* female standard animal (FMC 57951) from Uerpmann and Uerpmann [-@uerpmann_uerpmann_1994: Table 12]. Table 1 shows how the five element portions were converted into measurements from specific dimensions, following the equations in Section 2 of this model supplement. Each simulation was run `r priorpredict_iterations` times, producing `r priorpredict_iterations` assemblages of relevant simulated measurement values.

```{r model_supplement_table_1, echo = FALSE}
kbl(model_supplement_table_1, caption = "Values used to convert simulated LSI values into measurements in the prior predictive simulations. Dimension definitions follow von den Driesch (1976). Reference value refers to the female standard mouflon FMC 57951 (Uerpmann and Uerpmann 1994: Table 12).", booktabs = T, escape = FALSE, linesep = c('', '\\addlinespace', rep('\\addlinespace', 4)))
```

Figure 1 shows two results for the single assemblage (left) and multisite (right) prior prediction simulations. The top row shows the simulated proportions of immature, female, and male specimens in the simulated assemblages (Figure 1A). These proportions were sampled directly from the element-specific mixture proportion variables, showing the range of potential distributions the model is expecting before seeing any data. The 95% quantiles of the single assemblage model's proportion of immature specimens ranges from `r singlesite_simulated_proportions$simulated_p_immature`, `r singlesite_simulated_proportions$simulated_p_female` for female specimens, and `r singlesite_simulated_proportions$simulated_p_male` for male specimens. It is noteworthy that the simulated proportion of immature specimens for the single assemblage model does not reach the extreme value seen in the Pinarbaşı B sheep data, which may explain some of the long tails in the element-specific compositional estimates (Figure 9 of the main text). The multisite model has similar expected ranges (immature: `r multisite_simulated_proportions$simulated_p_immature`, female: `r multisite_simulated_proportions$simulated_p_female`, male: `r multisite_simulated_proportions$simulated_p_male`).


```{r model_supplement_figure_1, echo = FALSE, message = FALSE, warning = FALSE, fig.align = "center", out.width = "\\textwidth", fig.cap = "Single Assemblage and Multisite Prior Predictive Checks. Top row (A): Estimates of the proportion of immature, female, and male specimens. Bottom row (B): Histograms of simulated humerus Bd measurements from the simulations. Vertical lines show the value of the standard reference value (dashed) and two extreme archaeological samples (dotted)."}
knitr::include_graphics("./Output/Figures/model_supplement_figure_1.png")
```

The bottom row shows the distribution of simulated humerus Bd measurements for the assemblages (Figure 1B). These plots include all simulated specimens, showing the range of sheep humerus Bd values the model expects before seeing any data. Several vertical lines on the plots give a sense of domain knowledge about sheep humerus Bd values. First, the standard value (33.0 mm) from the *Ovis orientalis* female standard animal (FMC 57951) from Uerpmann and Uerpmann [-@uerpmann_uerpmann_1994: Table 12] is shown in the red dashed line. Second, vertical blue lines show ranges of observed archaeological sheep humerus Bd values: the largest sheep (technically *Ovis orientalis*) humerus Bd (39.0 mm) from the 10th millennium BP site of Körtik Tepe, in southeastern Anatolia [@arbuckle_ozkaya_2006: Table b] and the smallest sheep (*Ovis aries*) humerus Bd (25.0 mm) from the fifth-sixth century CE site of West Stow, United Kingdom [@crabtree_1990: Table 29]. These plots highlight how the multisite model's structure allows for much more variation in body size, especially large measurements. While the 95% quantiles of the single assemblage model's simulated measurements (`r humbd_summary$singlesite_humbd_95CI` mm) do not exactly encapsulate the range of the observed extreme measurements, the multisite model's simulated measurements go well beyond the range (`r humbd_summary$multisite_humbd_95CI` mm). While one may not expect this full range of measurements in a single assemblage, the results of the prior predictive checks show that the multisite model could plausibly encapsulate variation in body size across diverse assemblages, though possibly at the cost of being somewhat inefficient (i.e., evaluating parameter values that are somewhat beyond reasonable expectations).

These prior predictive checks show that the Bayesian multilevel mixture model's structure is robust for diverse archaeological applications. The chosen prior distribution definitions for the multilevel variation components of the model encapsulate a reasonable range of expected variation and thus do not require extensive retooling as they are applied to new archaeological situations or taxa. Moreover, the multilevel model appears to have enough variability to model drastic changes in body size, making the models' structure relevant for examining broad spatial and temporal variation in animal biometry. While experts are encouraged to use domain knowledge to update and adjust the models to better fit the questions they ask, these models are widely-applicable tools that are suitable for asking many questions about animal body size and the composition of zooarchaeological assemblages.

## References

<div id = "refs"></div>