The Logic of Meta-Analysis
Quantitative Research Synthesis
Why Measurement Matters
“If you cannot measure it, you cannot improve it.” — Lord Kelvin
1. Approaches to Research Synthesis
We can compare two dominant approaches: Narrative Reviews and Quantitative Reviews (Meta-Analysis).
Narrative Reviews
Focus: Differences in statistical significance (\(p < .05\)).
| Advantages | Critiques |
|---|---|
| 1. Able to give greater attention to high quality studies. | 1. Looks at the wrong results (\(p\) vs Effect Size). |
| 2. Sensitive to ancillary findings and design idiosyncrasies. | 2. Danger: Misinterprets sampling error as substantive patterns. |
| 3. Ideal for telling a nuanced story. | 3. Vulnerable to selection bias and confirmation of author’s belief. |
Quantitative Reviews (Meta-Analysis)
Focus: Magnitude of Effect Size (\(\delta\)) and Precision (\(N\)).
| Advantages | Critiques |
|---|---|
| 1. Exhaustive search avoids selection bias; definitive summary. | 1. “Apples and Oranges”: Combines different studies. |
| 2. Statistically defensible summary of findings. | 2. Lack of discrimination: Equal weight to well- and poorly-designed studies. |
| 3. Tests if differences exceed Sampling Error. | 3. Inattention to nuances and indirect evidence. |
| 4. Identifies gaps for future research. | |
2. The Logic of NHST
Null Hypothesis Significance Testing (NHST) follows a specific logic:
- Assume \(H_0\): usually \(\delta = 0\) (No Effect).
- Compute: The observed differences.
- Compare: Observed \(d\) to hypothetical sampling distribution under \(H_0\).
- Decision:
- If improbable (\(p < .05\)), Reject \(H_0\).
  - If probable (\(p \geq .05\)), Fail to Reject \(H_0\).
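The decision steps above can be sketched in a short simulation. This is a minimal illustration, not anyone's published code: the sample sizes, the true effect of \(d = 0.5\), and the seed are all assumptions chosen for the example, and the two-tailed critical value of 2.00 is the approximate \(t\) cutoff for \(df = 58\) at \(\alpha = .05\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30  # per-group sample size (hypothetical)

# Simulated data: a control group and a treatment group with a
# true standardized effect of 0.5 (an assumption for illustration).
control = rng.normal(0.0, 1.0, n)
treatment = rng.normal(0.5, 1.0, n)

# Compute the observed standardized difference (Cohen's d, pooled SD).
pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
d = (treatment.mean() - control.mean()) / pooled_sd

# Compare: the t statistic for two independent groups of size n.
t_stat = d * np.sqrt(n / 2)

# Decision: two-tailed .05 critical value for df = 58 is ~2.00.
decision = "Reject H0" if abs(t_stat) > 2.00 else "Fail to reject H0"
print(f"d = {d:.2f}, t = {t_stat:.2f} -> {decision}")
```

Note that the decision is dichotomous: the continuous information in \(d\) is collapsed into reject / fail-to-reject, which is exactly the reporting convention the next section criticizes.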
The Error Matrix
| | Population \(\delta = 0\) (Null True) | Population \(\delta \neq 0\) (Effect Exists) |
|---|---|---|
| Result: Not Sig | Correct Decision | Type II Error (Missed Effect) |
| Result: Sig (\(p<.05\)) | Type I Error (False Positive) | Correct Decision |
3. Schmidt’s Critique
The Problem: “We act as though Type II Errors (Missed Effects) are of no consequence.”
- Over-protection: In most areas, we are obsessed with avoiding Type I errors.
- The Cost: This leads to reporting conventions (dichotomous decisions) that obscure the findings.
- The Reality: With typical sample sizes (\(N=30\)), statistical power is low (~50%). We expect inconsistent significance results purely due to Sampling Error.
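The ~50% power claim can be checked by simulation. This sketch assumes a "medium" true effect of \(\delta = 0.5\) and two groups of \(N = 30\) each; under those assumptions the long-run rejection rate lands near .47, so roughly half of all studies of this size miss a real effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n, delta, reps = 30, 0.5, 5000  # assumptions: medium effect, typical N

rejections = 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)        # control group
    b = rng.normal(delta, 1.0, n)      # treatment group, true effect 0.5
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    t = (b.mean() - a.mean()) / pooled_sd * np.sqrt(n / 2)
    rejections += abs(t) > 2.00        # two-tailed .05 cutoff, df = 58

power = rejections / reps
print(f"Estimated power: {power:.2f}")
```

With power near .50, a literature of identical studies would split roughly evenly into "significant" and "non-significant" results, and a narrative reviewer would call them "conflicting" when the only thing varying is sampling error.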
Sampling Error Visualized
Every group mean (\(M\)) has error: \(SE_M = \sigma / \sqrt{N}\). When we look at differences (\(d\)), this error is compounded. Meta-Analysis allows us to see through this noise by aggregating the samples.
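The aggregation claim follows directly from the formula \(SE_M = \sigma / \sqrt{N}\): pooling \(k\) studies of \(N\) participants each behaves like one study of \(kN\) participants. A minimal numeric sketch, with \(\sigma\), \(N\), and \(k\) chosen arbitrarily for illustration:

```python
import numpy as np

sigma, n_per_study, k = 1.0, 30, 20  # hypothetical values

# Standard error of a single study's mean: sigma / sqrt(N).
se_single = sigma / np.sqrt(n_per_study)

# Pooling k such studies is equivalent to one sample of k * N
# observations, so the standard error shrinks by sqrt(k).
se_pooled = sigma / np.sqrt(k * n_per_study)

print(f"Single study SE: {se_single:.3f}")
print(f"Pooled SE (k = {k}): {se_pooled:.3f}")
```

Aggregating 20 studies here cuts the standard error by a factor of \(\sqrt{20} \approx 4.5\), which is what "seeing through the noise" means quantitatively.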
Definition (Glass, 1976): “The statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings.”
Key Differences
| Feature | Narrative Review | Meta-Analysis |
|---|---|---|
| Focus | Statistical Significance (\(p\)) | Effect Size Magnitude (\(d, r\)) |
| Handling Differences | “Conflicting Results” | “Sampling Error” + Moderators |
| Goal | Qualitative Summary | Quantitative Estimation |
| Precision | Ignored | Central (Weight by \(N\)) |
The “Apples and Oranges” Critique
Critique: “Meta-analysis mixes different studies (Apples and Oranges) together.”
Responses:
1. We exclude “Rotten Fruit” (bad studies) via strict inclusion criteria.
2. We often want to generalize to “Fruit” (a broad concept).
3. We can code “Fruit Type” as a moderator to test whether Apples differ from Oranges.
4. Sampling Error & Confidence Intervals
In Meta-Analysis, we treat each study’s Effect Size (ES) as an estimate of the Population Parameter.
- \(\hat{\rho}_i\) = Observed Correlation in Study \(i\)
- \(\sigma_{\hat{\rho}_i}\) = Standard Error (Precision) of Study \(i\)
We don’t just look at the point estimate. We look at the Confidence Interval (CI).
- Wide CI: Low precision (Small \(N\)). Little information.
- Narrow CI: High precision (Large \(N\)). Much information.
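The wide-vs-narrow contrast is just \(SE = \sigma / \sqrt{N}\) again. A quick numeric sketch, using an assumed \(\sigma = 1\) and two hypothetical study sizes:

```python
import numpy as np

sigma = 1.0  # assumed population SD
half_widths = {}
for n in (25, 400):                 # small study vs large study
    se = sigma / np.sqrt(n)         # standard error of the mean
    half_widths[n] = 1.96 * se      # 95% CI half-width
    print(f"N = {n}: 95% CI half-width = \u00b1{half_widths[n]:.3f}")
```

The small study's interval is four times wider than the large study's (\(\sqrt{400/25} = 4\)), which is why the large study carries more information about the population parameter.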
Meta-analysis is essentially a weighted average where Precision is the Weight.
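The weighted-average idea can be sketched with an inverse-variance weighted mean, the standard fixed-effect combination rule. The four effect sizes and sample sizes below are hypothetical, and the sampling-variance formula is the common large-sample approximation for \(d\) with two equal groups.

```python
import numpy as np

# Hypothetical study-level effect sizes (d) and total sample sizes.
d = np.array([0.61, 0.20, 0.45, 0.33])
n = np.array([24, 310, 58, 150])

# Large-sample sampling variance of d for two equal groups of n/2:
# var(d) ~ 4/n + d^2 / (2n)
var_d = 4 / n + d**2 / (2 * n)

# Precision (inverse variance) is the weight.
w = 1 / var_d
d_bar = np.sum(w * d) / np.sum(w)          # weighted mean effect
se_bar = np.sqrt(1 / np.sum(w))            # SE of the weighted mean

print(f"Weighted mean d = {d_bar:.3f} (SE = {se_bar:.3f})")
```

Two things to notice: the \(N = 310\) study dominates the weighted mean, pulling it toward its small effect, and the pooled standard error is smaller than that of any single study, illustrating both "Precision is the Weight" and the gain from aggregation.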