Plain maximum likelihood underestimates variances because it measures spread around fitted quantities, which sit closer to the data than the truth does; it is the same bias that makes dividing by n wrong and dividing by n - 1 right for a sample variance. In mixed models, where several fixed effects are estimated alongside the variance components, the distortion grows with the number of coefficients.
REML removes the bias by averaging the likelihood over all possible fixed-effect values instead of maximizing at the best one, which adds a determinant term charging the variance components for the estimation of the mean. It is the default in statsmodels’ MixedLM and R’s lme4, with one operational rule: models differing in fixed effects must be compared with ML, not REML, because REML likelihoods are not comparable across different mean structures.
