Statistical modelling, model fitting and BIC for dummies

Are you preparing for a departmental journal club? Do you find yourself stumped by the many different ‘models’ and ‘statistical tests’ described in the statistical analysis section?

You are not alone!

Ironically, although the statistical analysis is the most important part of a research paper, most of us tend to rush through it, if not skip it entirely.

Some of the questions I have always found myself getting stuck on are:

  • What exactly is a statistical model?

  • How do the authors know to do these particular statistical tests anyway?

  • And how do they determine if these tests are actually the best fit for the particular study?

In the simplest terms,

  • A model is a relationship between variables

  • A model is a simplified representation of reality or real-world phenomenon

  • A model is a mathematical equation which helps us understand and/or predict real life events using a sample of data available at hand (and some heavy-duty statistics).

For example, if we know the weight (x) of a person, we can predict their height (y) using the mathematical model y = mx + b, where m is the slope and b is the intercept. This is the simplest form of the regression model, with one dependent and one independent variable, as shown below.
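
To make the y = mx + b idea concrete, here is a minimal Python sketch of an ordinary least-squares fit. The weights and heights are invented purely for illustration, and the helper name fit_line is my own, not something from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data: weights in kg and heights in cm (not real measurements).
weights = [55, 60, 65, 70, 75, 80]
heights = [158, 163, 166, 171, 174, 180]

m, b = fit_line(weights, heights)
predicted_height = m * 68 + b  # prediction for a hypothetical 68 kg person
print(f"slope m = {m:.2f} cm per kg, intercept b = {b:.1f} cm")
print(f"predicted height at 68 kg: {predicted_height:.1f} cm")
```

The fitted m and b are the "model"; plugging a new weight into the equation gives the prediction.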


Once we have the data (e.g., in the NephJC paper being discussed, a cohort of 350 study participants with CKD in the Leon and Chinandega regions of Nicaragua), we need to identify the mathematical representation that best suits it, given the assumptions we are willing to make, the predictions we are trying to make, and the data itself.

In this case, the authors used growth mixture modelling (GMM), and the latent classes within this model were selected using the Bayesian Information Criterion (BIC).

In general, mixture models aim to uncover unobserved heterogeneity in a population and to find clinically meaningful groups of people that are similar in their responses to measured variables.

Latent class growth analysis allows identification of classes of individuals who follow similar progressions of the outcome over time.

Even CKD itself is a general term for heterogeneous disorders affecting the structure and function of the kidney, arising from a variety of underlying causes, from congenital kidney disease to glomerulonephritis (GN) to diabetes mellitus (DM). But do all patients with existing CKD progress at the same rate of GFR decline? Not really. A typical, commonly used linear mixed model assumes a homogeneous population, i.e. only one mean trajectory within the population. It also assumes that covariates influence each individual in the same way. But we know that not all patients with CKD have a steady GFR decline over time.

Boucquemont et al. have done an excellent review of the different statistical analyses used to investigate the association between various CKD outcomes and their risk factors, and have described a number of methodological issues along the way.

Describing an entire population using a single trajectory estimate oversimplifies the complex patterns of continuity and change among members of different groups. Instead, a latent class or growth mixture modelling (GMM) approach seems to be the most appropriate method here for fully capturing interindividual differences and the heterogeneity of different groups within a larger population.

Growth mixture modelling (GMM) is used to identify subpopulations with distinct trajectories of renal function and to identify factors discriminating these subpopulations. In the current study, using GMM, the authors were able to identify three different subgroups in men and two subgroups in women on the basis of the model intercept (baseline eGFR) and slope (change in eGFR over time).
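
To get a feel for what "finding latent classes" means, here is a minimal Python sketch. It is not the growth mixture model the authors fitted: a real GMM works on every repeated eGFR measurement and is usually fitted with dedicated software. As a simplified stand-in, the sketch assumes we already have one eGFR slope per person and fits a two-class Gaussian mixture to those slopes with the EM algorithm. The slope values and the function name fit_two_class_mixture are invented for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def fit_two_class_mixture(values, n_iter=200, min_sd=0.05):
    """EM for a two-class 1-D Gaussian mixture.

    Returns (weights, means, sds, responsibilities), where each
    responsibility is a person's probability of belonging to each class.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_sd = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n)
    # Deterministic start: class 0 near the lowest value, class 1 near the highest.
    means = [min(values), max(values)]
    sds = [overall_sd, overall_sd]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each person belongs to each class.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate class size, mean and spread from those probabilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), min_sd)  # floor keeps a class from collapsing
    return weights, means, sds, resp


# Hypothetical eGFR slopes (mL/min/1.73 m2 per year), one per person.
slopes = [-0.2, -4.8, 0.1, -5.5, -0.5, 0.3, -4.1, -0.1, -5.0, -0.4, 0.2, -4.6, 0.0]

weights, means, sds, resp = fit_two_class_mixture(slopes)
for k in range(2):
    print(f"class {k}: share = {weights[k]:.2f}, mean slope = {means[k]:.2f}")
for slope, r in zip(slopes, resp):
    print(f"slope {slope:+.1f} -> most likely class {0 if r[0] > r[1] else 1}")
```

Nobody tells the algorithm who belongs where; the "rapid decline" and "stable" classes emerge from the data, which is the core idea behind latent classes.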

Assuming there are multiple classes, how does one determine how many there are? This can be done using the Bayesian Information Criterion (BIC).

Bayesian statistics or Bayesian inference is an approach to statistics that acknowledges that the weight of any observed data should be balanced against the pre-existing support for a given hypothesis. Traditional (frequentist) statistics describe how unlikely the data is, assuming there is no real relationship between the variables. Bayesian statistics tell you how likely the relationship between the variables is, given the data. This is a subtle but critical difference. Broadly, we are all “Bayesians” when we read a medical study, because we examine the data but interpret it in the light of all the studies that have come before. A new study suggesting that smoking doesn’t cause lung cancer would be looked at quite skeptically, regardless of how statistically significant the presented data was. Bayesian inference lets you update your guesses based on any new information or facts that become available.
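
Here is a small worked example of that updating, as a Python sketch. It applies Bayes' rule to a diagnostic test; the 90% sensitivity and specificity and the two prior probabilities are hypothetical numbers chosen only to show how much the prior matters.

```python
def posterior_probability(prior, sensitivity, specificity):
    """P(disease | positive test), computed with Bayes' rule."""
    true_positive = prior * sensitivity
    false_positive = (1 - prior) * (1 - specificity)
    return true_positive / (true_positive + false_positive)


# Same hypothetical test result, two different levels of prior belief.
for prior in (0.01, 0.30):
    post = posterior_probability(prior, sensitivity=0.90, specificity=0.90)
    print(f"prior {prior:.0%} -> posterior {post:.0%}")
```

With these numbers, the same positive result moves a 1% prior to about 8%, but a 30% prior to about 79%. The data are identical; what we believed beforehand changes what we conclude.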

Bayesian statistics are named after Thomas Bayes, an 18th-century English statistician (and philosopher and minister). Interestingly, Bayes’ rule is an important principle in Artificial Intelligence (AI), where it is used to calculate the probability of a robot’s next steps given the steps it has already taken: based on prior knowledge that ‘X’ is true, ‘Y’ might be true, and we can predict the outcome with this kind of conditional probability. Alan Turing, the World War II scientist hero, used Bayesian reasoning in his work to crack secret German codes and help win the war (check out The Imitation Game if you haven’t already).

The Bayesian Information Criterion (BIC) is an index, rooted in Bayesian reasoning, used to choose between two or more alternative models. Model selection is the task of selecting a statistical model from a set of candidate models, given the data. The goodness of fit of a statistical model describes how well it fits a set of observations. A good model selection technique balances goodness of fit against how simple or complex the model is, given the data.

Source: ‘When a Good Fit can be Bad’, Pitt and Myung, 2002

In practice, a BIC value is calculated for each candidate model from a formula that rewards goodness of fit and penalises the number of parameters. The model with the lowest BIC is considered the best.
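
The standard formula is BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n is the number of observations and L is the maximised likelihood of the model. The Python sketch below applies it to a deliberately simple choice, a flat line versus a sloping line, rather than to the number of latent classes, but the logic is the same. The eGFR series are invented, Gaussian errors are assumed, and counting the error variance as a parameter is one common convention.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def gaussian_log_likelihood(residuals):
    """Maximised log-likelihood assuming Gaussian errors (variance = RSS / n)."""
    n = len(residuals)
    variance = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * variance) + 1)


def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def compare_models(years, egfr):
    """Return (BIC of flat model, BIC of straight-line model)."""
    n = len(egfr)
    mean_egfr = sum(egfr) / n
    flat_residuals = [y - mean_egfr for y in egfr]
    m, b = fit_line(years, egfr)
    line_residuals = [y - (m * x + b) for x, y in zip(years, egfr)]
    # Parameter counts include the error variance:
    # flat = mean + variance (2); line = slope + intercept + variance (3).
    bic_flat = bic(gaussian_log_likelihood(flat_residuals), 2, n)
    bic_line = bic(gaussian_log_likelihood(line_residuals), 3, n)
    return bic_flat, bic_line


# Hypothetical yearly eGFR values for two imaginary patients.
years = list(range(8))
declining = [90, 86, 83, 77, 75, 70, 66, 63]
stable = [80, 82, 79, 81, 80, 82, 79, 81]

for label, egfr in (("declining", declining), ("stable", stable)):
    bic_flat, bic_line = compare_models(years, egfr)
    best = "line" if bic_line < bic_flat else "flat"
    print(f"{label}: BIC flat = {bic_flat:.1f}, BIC line = {bic_line:.1f}, preferred = {best}")
```

For the declining series the sloping line fits so much better that it wins despite its extra parameter. For the stable series the slope adds nothing to the fit, so the ln(n) penalty tips the choice to the simpler flat model. Choosing between two, three or four latent classes works the same way: fit each candidate, compute its BIC, and prefer the lowest.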

In the end, though, model-building is as much art as science. Statisticians and researchers often create their model using clinical knowledge. While they may look at the Akaike information criterion (AIC) or BIC to decide between models when there is no subjective reason to pick one over the other, these criteria are simply tools in the armamentarium for creating models that do what we want models to do: either explain an observed phenomenon or predict a future outcome.

Commentary by Manasi Bapat, Nephrologist, California

NSMC Intern, Class of 2018