▶ Load required packages
library(lavaan); library(semTools); library(MASS); library(ggplot2)
library(dplyr); library(tidyr); library(knitr)

Almost every scale used in behavioral research has bounds — rating scales run from 1 to 7, percentages from 0 to 100, frequency measures from “never” to “always.” These bounds are usually treated as a minor formatting detail. In practice, they introduce a systematic bias that most researchers never think about.
Bounded scales are ubiquitous across the social, health, economic, and statistical sciences — and so is the bias they create. But the problem appears under different names depending on the discipline: ceiling and floor effects in psychometrics, corner solutions and censored data in econometrics, bounded outcomes in health statistics, item difficulty effects in item response theory. Despite the different terminology, the underlying mathematics is identical: a hard limit on what a scale can record truncates the true distribution, and the truncated mean departs systematically from the truth.
This measurement-layer problem cascades upward through the rest of the inferential chain. Biased observed means corrupt hypothesis tests — because what is being tested is not the construct being theorized about. Those corrupted tests in turn compromise causal inference — because treatment-effect estimates compare biased observations rather than true latent values. A researcher who finds a null result comparing a control group (true mean 50) to a treated group (true mean 75, both on a 0–100 scale with SD 20) may be observing a genuine effect masked by differential truncation bias: the treated group’s distribution is clipped more severely because it sits closer to the ceiling. The measurement layer is not preliminary housekeeping — it shapes every conclusion drawn above it.
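The masked-effect scenario above can be checked with a quick simulation — a minimal sketch using the text's illustrative numbers (control true mean 50, treated true mean 75, SD 20, 0–100 scale), and assuming out-of-range values pile up at the bounds:

```r
# Two groups on a 0-100 scale: control true mean 50, treated true mean 75,
# both SD 20. Draws beyond the bounds pile up at the bounds, clipping the
# treated group more severely because it sits closer to the ceiling.
set.seed(42)
n <- 1e5
control <- pmin(pmax(rnorm(n, mean = 50, sd = 20), 0), 100)
treated <- pmin(pmax(rnorm(n, mean = 75, sd = 20), 0), 100)
effect <- mean(treated) - mean(control)
effect  # noticeably smaller than the true 25-point difference
```

The observed gap falls short of 25 points even with a huge sample, because the shortfall comes from the scale, not from sampling error.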
Here is the mechanism. Suppose a participant’s true level of some construct is 85 on a 0–100 scale, and they have a genuine tendency to vary ±15 points around that true value due to measurement noise (mood, attention, question framing). On the low side, they can express that variation freely — they might score 70, 75, 80. On the high side, they hit a ceiling. Scores of 95, 100, 105 are impossible — those attempts to go higher just pile up at 100. The result: the observed distribution of scores is not centered at 85. It is pulled toward the center of the scale. The ceiling cuts off the upper tail, making the observed mean lower than the true mean.
The same thing happens at the lower bound: if the true value is close to the floor, the floor cuts off the lower tail and pulls the observed mean upward. Both effects push observations toward the middle of the scale.
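A short simulation makes the pull concrete — a sketch using the text's numbers (true score 85, the ±15 variation modeled as normal noise with SD 15), again assuming out-of-range draws pile up at the bounds:

```r
# True score 85 on a 0-100 scale, normal noise with SD 15. Draws beyond
# the bounds pile up at 0 and 100, so the observed mean is pulled below
# the true mean by the clipped upper tail.
set.seed(1)
true_scores <- rnorm(1e5, mean = 85, sd = 15)
observed <- pmin(pmax(true_scores, 0), 100)
m_true <- mean(true_scores)  # close to 85
m_obs  <- mean(observed)     # pulled toward the scale midpoint
```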
The direction and severity of this bias depend on how the scale is bounded:
Scales bounded on both ends (1–7 Likert, 0–100 slider) produce symmetric pull toward the center: ceiling effects at the top and floor effects at the bottom, each biasing estimates toward the midpoint.
Scales bounded on one end only — like response time (lower bound at 0, no upper bound), log-transformed variables, or count data — produce asymmetric pull toward the bounded end only. The unbounded tail is free; the bounded tail is clipped. This asymmetry means the direction of bias depends on which bound participants approach.
For right-skewed distributions common in bounded-at-zero measures (reaction time, income, sales), the floor effect is usually the active constraint, and the bias pushes mean estimates upward.
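The one-sided case can be sketched the same way — here with an illustrative true value of 2 and noise SD of 3 on a floor-at-zero measure, so only the lower tail is clipped:

```r
# One-sided bound: true value 2, noise SD 3, floor at 0, no ceiling.
# Only the lower tail piles up at the bound, so the observed mean is
# pulled upward, away from the floor.
set.seed(11)
x <- rnorm(1e5, mean = 2, sd = 3)
m_floor <- mean(pmax(x, 0))  # above the true value of 2
```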
The simulator below overlays two density curves. The gray shaded curve is the true distribution — what scores would look like if the scale had no bounds. The blue shaded curve is the observed distribution — what researchers actually record after the bounds clip the tails. Vertical reference lines mark the true mean (dashed black), the observed mean (solid blue), and the observed median (dashed red). Adjust the sliders to see how the gap between true and observed means grows as the true mean approaches a bound or as variability increases.
Means are biased toward the scale midpoint. When comparing groups or conditions, if one group has a true mean closer to a bound than the other, the truncation bias differs between groups — the apparent treatment effect is distorted by the scale itself, not just the construct. This is particularly dangerous in pre–post designs, where an intervention may move participants toward a ceiling, making the post-treatment distribution more severely clipped and therefore making the estimated gain look smaller than it truly is.
The problem gets worse as variability increases. High within-person variability (which researchers often interpret as low reliability) amplifies truncation bias. Ironically, a noisy measure produces more bias, not just more noise. This also means that between-group comparisons of variance — common in psychometric analyses — are confounded whenever the groups sit at different distances from a bound.
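The variability effect is easy to verify: hold the true mean fixed near the ceiling and vary the noise SD (a sketch with assumed values; the true mean of 85 matches the earlier example):

```r
# Fix the true mean at 85 on a 0-100 scale and vary the noise SD:
# the bias of the observed mean grows (more negative) with variability.
set.seed(7)
bias_at <- function(noise_sd, mu = 85, n = 1e5) {
  x <- rnorm(n, mean = mu, sd = noise_sd)
  mean(pmin(pmax(x, 0), 100)) - mu
}
biases <- sapply(c(5, 10, 15, 20), bias_at)
biases  # increasingly negative as SD grows
```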
Transformations can help but do not fully eliminate the problem. Logit-transforming proportions (e.g., log(p/(1-p))) or arcsine-transforming bounded scores can partially stabilize the bias. Tobit regression (Wooldridge, 2010) models the censored distribution explicitly and estimates the latent mean directly — it was developed in econometrics for exactly this structure, where household expenditure is zero for non-buyers and positive (and therefore bounded below) for buyers.
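The Tobit idea can be sketched without any specialized package by maximizing the censored-normal log-likelihood directly with `optim` — a hand-rolled illustration under assumed values (latent N(85, 15), ceiling at 100), not a substitute for packaged routines such as those in the AER or survival packages:

```r
# Ceiling-censored data: latent N(85, 15), recorded as min(latent, 100).
# Uncensored observations contribute the normal density; censored ones
# contribute the probability of exceeding the ceiling. Maximizing this
# likelihood recovers the latent mean that the naive mean underestimates.
set.seed(3)
latent   <- rnorm(5000, mean = 85, sd = 15)
observed <- pmin(latent, 100)            # right-censored at the ceiling

negll <- function(par) {
  mu <- par[1]
  sd <- exp(par[2])                      # log-scale keeps sd positive
  cens <- observed >= 100
  ll_uncens <- dnorm(observed[!cens], mu, sd, log = TRUE)
  ll_cens   <- pnorm(100, mu, sd, lower.tail = FALSE, log.p = TRUE)
  -(sum(ll_uncens) + sum(cens) * ll_cens)
}

fit <- optim(c(mean(observed), log(sd(observed))), negll)
mu_hat <- fit$par[1]           # close to the latent mean of 85
sd_hat <- exp(fit$par[2])      # close to the latent SD of 15
```

The naive `mean(observed)` sits below 85, while the censored-likelihood estimate recovers the latent mean — the same correction a packaged Tobit routine performs.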
The problem has different names across disciplines — but the same solution. In econometrics it is censored data; in health research it is ceiling and floor effects in patient-reported outcome (PRO) instruments; in psychometrics it appears as item difficulty in item response theory; in statistics it is truncated distribution estimation. Each field has converged on similar remedies: model the latent distribution explicitly (Tobit / truncated regression), use response scales whose range substantially exceeds the likely spread of true scores, and — as a minimum — report the proportion of respondents at or near each bound before interpreting means and treatment comparisons.
Pieters, R., Srivastava, J., & Bagchi, R. (2025). Improving the discriminant validation of multi-item scales. Journal of Marketing Research. https://doi.org/10.1177/00222437251322089
Henseler, J., Ringle, C. M., & Sarstedt, M. (2015). A new criterion for assessing discriminant validity in variance-based structural equation modeling. Journal of the Academy of Marketing Science, 43(1), 115–135.
Millsap, R. E., & Kwok, O.-M. (2004). Evaluating the impact of partial factor loading and intercept invariance on selection in two populations. Psychological Methods, 9(1), 93–115.
Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71–90.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36.
Scrucca, L., Fop, M., Murphy, T. B., & Raftery, A. E. (2016). mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models. The R Journal, 8(1), 289–317.
Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000). LOF: Identifying density-based local outliers. ACM SIGMOD Record, 29(2), 93–104.
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), 226–231.
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). Springer.
Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). MIT Press.