
P-Value Calculator: Convert Test Statistics to Probability

The p-value is one of the most important concepts in inferential statistics. It measures the probability of obtaining a test statistic at least as extreme as what you observed, assuming the null hypothesis is true. A small p-value (typically below 0.05) indicates the data are unlikely under the null hypothesis — strong evidence to reject it.

This calculator converts any test statistic (z, t, chi-square, or F) into a precise p-value, saving you from looking up statistical tables or writing code. Whether you are a researcher, data scientist, or student, this tool makes hypothesis testing fast and reliable.

How to Use the P-Value Calculator

  1. Select distribution type — Choose Z for large samples or known variance, T for small samples with unknown variance, chi-square for goodness-of-fit or independence tests, or F for ANOVA and regression F-tests.
  2. Choose the tail type — Two-tailed tests check for any difference; left-tailed tests check if the parameter is smaller than hypothesized; right-tailed tests check if it is larger.
  3. Enter the test statistic — This is the value computed from your sample (z-score, t-statistic, chi-square statistic, or F-ratio).
  4. Set degrees of freedom — Required for T, chi-square, and F distributions. For F, enter both numerator and denominator df.
  5. Set significance level (alpha) — Usually 0.05 (5%) or 0.01 (1%). The calculator compares your p-value to this threshold automatically.
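
Under the hood, steps 2–5 reduce to a few lines of logic. Here is a minimal Python sketch (standard library only, not the calculator's actual implementation): `normal_cdf` stands in for whichever distribution's CDF you selected in step 1, and the two-tailed rule shown applies to symmetric distributions.

```python
from math import erf, sqrt
from typing import Callable

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def p_value(stat: float, cdf: Callable[[float], float], tail: str) -> float:
    """Convert a test statistic to a p-value for the chosen tail type."""
    left = cdf(stat)          # P(X <= stat)
    right = 1.0 - left        # P(X > stat)
    if tail == "left":
        return left
    if tail == "right":
        return right
    # two-tailed (symmetric distributions): double the smaller tail
    return 2.0 * min(left, right)

def decide(p: float, alpha: float = 0.05) -> str:
    """Step 5: compare the p-value to the significance level."""
    return "reject H0" if p <= alpha else "fail to reject H0"
```

For z = 1.96 the two-tailed p-value comes out at roughly 0.05, the conventional significance boundary.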

Understanding the Results

P-value is the main output: the probability of seeing a test statistic this extreme (or more) if H0 is true. The left-tail and right-tail probabilities break this down by direction.

Significance decision: When p ≤ alpha, the result is statistically significant — you reject the null hypothesis. When p > alpha, you fail to reject it (you do not "accept" the null; absence of evidence is not evidence of absence).

Distribution Formulas

Z-distribution: Uses the standard normal CDF. Appropriate when population standard deviation is known or n > 30.
T-distribution: Uses Student's t-CDF with df = n - 1. Wider tails than Z, accounting for uncertainty in estimating population variance.
Chi-square: Right-skewed distribution used for count data. Both goodness-of-fit and independence tests are right-tailed; left-tailed chi-square p-values arise mainly in variance tests, when checking whether a variance is smaller than hypothesized.
F-distribution: Ratio of two chi-square variables scaled by their df. Used in ANOVA and linear regression F-tests.
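
None of these p-values requires special software: each is just an area under the corresponding density curve. As an illustration (a sketch, not the calculator's implementation), the F right-tail probability can be computed by numerically integrating the F density with Python's standard library; the `upper` bound and step count `n` are accuracy knobs, not part of any standard API.

```python
from math import gamma

def f_sf(x: float, d1: int, d2: int, upper: float = 500.0, n: int = 50_000) -> float:
    """P(F > x) for the F(d1, d2) distribution via Simpson's rule (n even)."""
    # Normalizing constant of the F density
    c = (gamma((d1 + d2) / 2) / (gamma(d1 / 2) * gamma(d2 / 2))
         * (d1 / d2) ** (d1 / 2))
    pdf = lambda t: c * t ** (d1 / 2 - 1) * (1 + d1 * t / d2) ** (-(d1 + d2) / 2)
    h = (upper - x) / n
    s = pdf(x) + pdf(upper)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * pdf(x + i * h)
    return s * h / 3
```

A useful consistency check: a squared t-statistic with ν df is an F(1, ν) statistic, so `f_sf(2.36**2, 1, 19)` reproduces the two-tailed t-test p-value of about 0.0295 from the drug-trial example below.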

Real-World Examples

Example 1 — Drug trial (two-tailed T-test): You compare blood pressure before and after treatment in 20 patients. The paired t-statistic is 2.36 with df = 19. This calculator gives p = 0.0295, which is less than 0.05 — you reject the null hypothesis that the treatment has no effect.

Example 2 — Website A/B test (Z-test): Two landing pages are compared. The z-statistic is 1.84 (right-tailed). The p-value is 0.033, so the new page's conversion rate is significantly higher at the 5% level.
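
Example 2's number is easy to reproduce with nothing but the standard normal CDF, which Python's `math.erf` provides (the 1.84 statistic itself comes from the hypothetical scenario above):

```python
from math import erf, sqrt

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

p_right = 1.0 - normal_cdf(1.84)   # right-tailed test
```

`p_right` evaluates to about 0.0329, which rounds to the 0.033 quoted above.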

Example 3 — Chi-square independence test: A chi-square statistic of 9.49 with 2 df gives p = 0.0087, indicating a significant association between two categorical variables.
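
For even degrees of freedom, the chi-square right-tail probability has an exact closed form, exp(-x/2) multiplied by a finite sum of (x/2)^j / j! terms, so Example 3 can be verified with a few lines of Python (a sketch, stdlib only):

```python
from math import exp

def chi2_sf_even_df(x: float, df: int) -> float:
    """P(X > x) for chi-square with even df: exp(-x/2) * sum((x/2)^j / j!)."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j        # builds (x/2)^j / j! incrementally
        total += term
    return exp(-half) * total

p = chi2_sf_even_df(9.49, 2)    # df = 2: the sum collapses to exp(-x/2)
```

With df = 2 this gives exp(-9.49/2) ≈ 0.0087, matching the example. As a cross-check, 9.49 is also the familiar 5% critical value for df = 4, and `chi2_sf_even_df(9.49, 4)` comes out near 0.05.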

Common Mistakes to Avoid

  • The p-value is NOT the probability that H0 is true. It assumes H0 is true and measures how surprising the data are.
  • Statistical significance does not imply practical significance. A tiny effect can be significant with a large sample size.
  • Reporting exact p-values (e.g., p = 0.023) is more informative than "p < 0.05".
  • Choose your significance threshold before running the test to avoid p-hacking.

Frequently Asked Questions

What is a p-value in simple terms?
A p-value tells you how likely it is that you would see results this extreme (or more extreme) purely by chance, assuming there is actually no real effect. A p-value of 0.03 means there is a 3% chance of seeing results at least this extreme if the null hypothesis is true.
What p-value counts as statistically significant?
The most commonly used threshold is 0.05 (5%). If your p-value is at or below 0.05, the result is considered statistically significant and you reject the null hypothesis. Some fields (physics, medical trials) use stricter thresholds such as 0.01 or 0.001.
When should I use a two-tailed vs. one-tailed test?
Use a two-tailed test when you have no pre-specified direction for the effect (e.g., "does the drug change blood pressure?"). Use one-tailed only when theory strongly predicts a direction before collecting data (e.g., "does the drug lower blood pressure?"). Two-tailed tests are generally more conservative and widely accepted.
What is the difference between a Z-test and a T-test?
A Z-test is used when the population standard deviation is known or the sample size is large (n > 30). A T-test is used with small samples where you estimate the standard deviation from the sample. With large samples, both give nearly identical results.
How do degrees of freedom affect the p-value?
Higher degrees of freedom pull in the tails of the T-distribution, so a given t-statistic corresponds to a smaller p-value. The chi-square distribution behaves differently: its mean equals its df, so the same statistic yields a larger p-value as df grows. With very large df, the T-distribution approaches the normal Z-distribution.
Can a p-value be greater than 1?
No. P-values are probabilities and must lie between 0 and 1 (inclusive). A p-value very close to 1 means the test statistic is exactly what you would expect under the null hypothesis — little to no evidence against it.
What does it mean to fail to reject the null hypothesis?
When p > alpha, you fail to reject H0. This does not prove H0 is true — it simply means the data do not provide sufficient evidence against it. The study may have been underpowered (too few observations) to detect a real effect.
What is the F-distribution used for?
The F-distribution is used in ANOVA (comparing means across three or more groups) and in linear regression to test whether the model explains significant variance. The F-statistic is the ratio of between-group variance to within-group variance.