Correlation Coefficient Calculator
Calculate Pearson r and determine correlation strength
One point per line. Separate x and y with a comma or space.
| Statistic | Value |
|-----------|-------|
| Pearson correlation coefficient (r) | 0.9962 (very strong positive correlation) |
| r² | 0.9923 |
| n | 10 |
| t-statistic | 32.2077 |
| p-value | < 0.0001 |
| Mean x | 10.4000 |
| Mean y | 11.7000 |
| Std x | 6.0773 |

Significance (two-tailed, df = 8): p < 0.001, highly significant
## Correlation Coefficient Calculator — Pearson r & Significance Test
The Pearson correlation coefficient (r) is a widely used statistical measure in quantitative research. It summarizes the strength and direction of the linear relationship between two variables as a single number between −1 and +1.
### The Formula
$$r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \cdot \sum(y_i - \bar{y})^2}}$$
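This can be written more compactly using the sums of squares and cross-products:
$$S_{xy} = \sum(x_i - \bar{x})(y_i - \bar{y}), \quad S_{xx} = \sum(x_i - \bar{x})^2, \quad S_{yy} = \sum(y_i - \bar{y})^2$$
$$r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}}$$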
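The S-notation maps directly onto code. The following is a minimal Python sketch, not the calculator's actual implementation; the function name `pearson_r` and the sample points are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the centered sums S_xy, S_xx, S_yy (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up sample: S_xy = 6, S_xx = 10, S_yy = 6, so r = 6 / sqrt(60)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 4))  # 0.7746
```

The guard for a constant variable matters because the denominator is then zero and r has no defined value.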
### Interpreting Correlation Strength
| Absolute value of r | Strength | Description |
|---------|----------|-------------|
| 0.9 – 1.0 | Very strong | Near-perfect linear relationship |
| 0.7 – 0.9 | Strong | Clear linear trend |
| 0.5 – 0.7 | Moderate | Noticeable but scattered |
| 0.3 – 0.5 | Weak | Small trend, lots of variability |
| 0.1 – 0.3 | Very weak | Barely detectable |
| 0.0 – 0.1 | Negligible | Essentially no linear relationship |
Negative values indicate an inverse relationship: as x increases, y tends to decrease. The strength labels apply to the absolute value of r, so r = −0.8 is as strong as r = +0.8.
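As a rough illustration, the table above can be turned into a lookup. This is a sketch under one stated assumption: the table's bands share their end points, so a value that falls exactly on a boundary (for example 0.7) is assigned here to the stronger band. The function name `correlation_strength` is hypothetical.

```python
def correlation_strength(r):
    """Map r to the strength labels in the table (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)
    # (lower bound, label) pairs, checked from strongest to weakest.
    # Assumption: a boundary value such as 0.7 goes to the stronger band.
    bands = [
        (0.9, "Very strong"),
        (0.7, "Strong"),
        (0.5, "Moderate"),
        (0.3, "Weak"),
        (0.1, "Very weak"),
        (0.0, "Negligible"),
    ]
    label = next(name for lower, name in bands if magnitude >= lower)
    if label == "Negligible":
        return label
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"

print(correlation_strength(0.9962))  # Very strong positive
print(correlation_strength(-0.42))   # Weak negative
```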
### Statistical Significance (t-test)
A correlation in a sample may be due to chance. To test significance:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
This t-statistic has n − 2 degrees of freedom. The two-tailed p-value is the probability of observing a correlation at least this far from zero if the true (population) correlation were zero. Common thresholds: p < 0.05 (significant) and p < 0.01 (highly significant).
**Note**: With large n, even very small correlations (r = 0.1) become statistically significant. Statistical significance ≠ practical significance. Always consider the actual magnitude of r alongside the p-value.
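The test can be sketched in Python using only the standard library. Everything below is an illustrative assumption rather than the calculator's own method: the function names are hypothetical, and the p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is adequate for illustration but loses precision for extremely small p-values.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 points so that n - 2 >= 1")
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def t_pdf(u, df):
    """Student's t density; lgamma avoids overflow for large df."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(u * u / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * integral of the density over [0, |t|].

    Assumption: composite Simpson's rule (steps must be even) is accurate
    enough for illustration.
    """
    t = abs(t)
    if math.isinf(t):
        return 0.0
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

# A moderate r with a small sample versus a tiny r with a large sample
for r, n in [(0.5, 27), (0.1, 1000)]:
    t = t_statistic(r, n)
    p = two_tailed_p(t, n - 2)
    print(f"r={r}, n={n}: r^2={r * r:.2f}, t={t:.3f}, p={p:.4f}")
```

The second case illustrates the note above: r = 0.1 with n = 1000 falls below the 0.01 threshold even though r² is only 0.01.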
### Common Applications
- **Medical research**: Correlation between diet and health outcomes
- **Finance**: Correlation between asset returns (portfolio diversification)
- **Education**: Correlation between study hours and exam scores
- **Engineering**: Correlation between process inputs and quality outputs
- **Social science**: Correlation between socioeconomic variables
### Limitations of Pearson r
1. **Linear only** — Pearson r measures linear association. Non-linear relationships may show low r even if the variables are strongly related.
2. **Sensitive to outliers** — A single extreme data point can dramatically inflate or deflate r.
3. **No causation** — Correlation does not imply causation.
4. **Normality assumption** — Strictly requires bivariate normality for inference (significance testing).
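The first two limitations are easy to reproduce on small made-up datasets. The sketch below is only illustrative; `pearson_r` is a hypothetical helper and the data points are invented for the demonstration.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from centered sums (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Linear only: y is fully determined by x, yet r is 0 because the
# x values are symmetric around zero and the two halves cancel.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_parabola = pearson_r(xs, ys)
print("parabola:", r_parabola)

# Sensitive to outliers: five points with no linear trend...
base_x = [1, 2, 3, 4, 5]
base_y = [2, 1, 3, 1, 2]
r_base = pearson_r(base_x, base_y)
print("without outlier:", round(r_base, 4))

# ...plus a single extreme point at (20, 20).
r_outlier = pearson_r(base_x + [20], base_y + [20])
print("with outlier:", round(r_outlier, 4))
```

A scatter plot would reveal both situations at a glance, which the single number r cannot.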
### Frequently Asked Questions
#### What does a Pearson r of 0 mean?
r = 0 means no linear relationship between x and y. However, there could still be a non-linear relationship. For example, a perfect parabola (y = x²) sampled symmetrically around x = 0 has r = 0, because the contributions from the negative and positive sides cancel out. Always look at a scatter plot alongside the correlation coefficient.
#### What is the difference between correlation and causation?
Correlation measures the statistical association between two variables, but does not prove that one causes the other. Both could be caused by a third variable (confounding), or the relationship could be coincidental. Pearson r only detects linear association.
#### How do I know if a correlation is statistically significant?
A t-test is used: t = r√(n−2) / √(1−r²), with n − 2 degrees of freedom. The resulting p-value is the probability of observing a correlation at least this strong by random chance if the true correlation is zero. If p < 0.05, the correlation is statistically significant at the 5% level.
#### What is the difference between Pearson and Spearman correlation?
Pearson measures linear correlation using raw values. Spearman measures monotonic correlation using ranks — it is less sensitive to outliers and can detect non-linear but monotonically increasing or decreasing relationships. Use Spearman when data is not normally distributed or has outliers.
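One way to see the difference is to compute Spearman's coefficient as Pearson r applied to ranks. The sketch below assumes that definition, with tied values sharing their average rank; the helper names and the exponential sample data are illustrative, not part of the calculator.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from centered sums (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman correlation computed as Pearson r on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Monotonic but strongly non-linear relationship: y = e^x
xs = [1, 2, 3, 4, 5, 6]
ys = [math.exp(x) for x in xs]
print("Pearson: ", round(pearson_r(xs, ys), 4))
print("Spearman:", round(spearman_rho(xs, ys), 4))
```

Because y always increases with x, the ranks match exactly and Spearman reports a perfect monotonic relationship, while Pearson is pulled below 1 by the curvature.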
#### What does r² mean?
r² (the square of Pearson r) is the coefficient of determination. It gives the proportion of the variance in y that is accounted for by a straight-line fit on x. For example, r = 0.8 gives r² = 0.64, meaning 64% of the variance in y is explained by the linear relationship with x.
#### Can correlation be used with non-numeric data?
Computing Pearson r requires numeric data, and its significance test additionally assumes approximately normally distributed data. For ordinal data (rankings), use Spearman correlation. When one variable is binary and the other is numeric, use the point-biserial correlation; for the association between two categorical variables, use Cramér's V.