Missing Values Analyzer
Find missing data by cell, row, or column in any CSV
Missing data is one of the most common problems in real-world datasets. Before handling missing values through imputation or exclusion, you need to know which columns or rows are affected, how severely, and what pattern the missingness follows. This tool performs that analysis instantly on any CSV dataset, entirely in your browser.
How to Use This Tool
- Paste your CSV — the first row is treated as the header.
- Missing tokens — enter a comma-separated list of values that represent missingness in your data. Defaults: blank, NA, N/A, null, none, undefined. Add custom values like "?" or "-".
- Analysis mode — choose how to report missing data:
  - Cell: counts every individual missing cell across the entire dataset.
  - Row: counts and lists rows where at least one value is missing.
  - Column: counts missing values per column, reports each column's missing rate, and identifies the most affected columns.
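The header and missing-token rules above can be sketched in Python. This is a minimal illustration, not the tool's actual implementation: the names `parse_tokens`, `is_missing`, and `read_csv` are hypothetical, and it assumes that matching ignores case and surrounding whitespace and that the word "blank" in the token list stands for an empty cell.

```python
import csv
import io

def parse_tokens(spec):
    """Turn a comma-separated token list into a set of lowercase tokens."""
    tokens = set()
    for raw in spec.split(","):
        token = raw.strip().lower()
        # Assumption: "blank" in the list means the empty string.
        tokens.add("" if token == "blank" else token)
    return tokens

# The default token list quoted above.
DEFAULT_TOKENS = parse_tokens("blank, NA, N/A, null, none, undefined")

def is_missing(value, tokens=DEFAULT_TOKENS):
    # Assumption: comparison ignores case and surrounding whitespace.
    return value.strip().lower() in tokens

def read_csv(text):
    """Return (header, rows); the first row is treated as the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip completely empty lines
    return header, rows

# Usage: add the custom tokens "?" and "-" to the defaults.
header, rows = read_csv("id,age,city\n1,34,Oslo\n2,NA,\n3,?,Lima\n")
tokens = DEFAULT_TOKENS | parse_tokens("?, -")
flags = [[is_missing(value, tokens) for value in row] for row in rows]
```

Here `flags` holds one list of booleans per data row, with `True` marking a cell that counts as missing.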
Understanding the Analysis Modes
Cell mode gives you the total missing cell count and overall missing rate — useful for a single-number summary of data completeness.
Row mode identifies incomplete records. Useful when you plan to use listwise deletion (removing rows with any missing value) and need to know how many rows you would lose.
Column mode shows per-column missing rates, helping you decide which columns are candidates for imputation vs. removal.
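As a rough sketch of what the three modes compute, the following Python function derives all three summaries from one pass over the data. The function name `analyze`, the shape of the returned dictionary, and the choice to pad short rows with empty cells are assumptions made for illustration; the tool itself may differ.

```python
import csv
import io

MISSING = {"", "na", "n/a", "null", "none", "undefined"}

def analyze(csv_text, tokens=MISSING):
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)                      # first row is the header
    rows = [row for row in reader if row]      # skip completely empty lines

    flags = []
    for row in rows:
        # Assumption: a short row is padded, so absent cells count as missing.
        padded = (row + [""] * len(header))[:len(header)]
        flags.append([value.strip().lower() in tokens for value in padded])

    n_rows = len(flags)
    total_cells = n_rows * len(header)
    missing_cells = sum(sum(row_flags) for row_flags in flags)
    # 1-based data-row numbers (the header is not counted).
    incomplete = [i + 1 for i, row_flags in enumerate(flags) if any(row_flags)]
    per_column = {
        name: sum(row_flags[j] for row_flags in flags)
        for j, name in enumerate(header)
    }

    def rate(count, total):
        return count / total if total else 0.0

    return {
        "cell": {"missing": missing_cells, "rate": rate(missing_cells, total_cells)},
        "row": {"incomplete": incomplete, "rate": rate(len(incomplete), n_rows)},
        "column": {name: {"missing": c, "rate": rate(c, n_rows)}
                   for name, c in per_column.items()},
    }

report = analyze("id,age,city\n1,34,Oslo\n2,NA,\n3,,Lima\n4,51,Quito\n")
```

For this four-row sample, the cell rate is 3 of 12 cells, the row summary lists data rows 2 and 3 (the rows listwise deletion would drop), and the column summary shows `age` as the most affected column.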
Types of Missingness
MCAR (Missing Completely At Random): The probability that a value is missing is unrelated to any value in the dataset, observed or missing. Listwise deletion gives unbiased estimates under MCAR but reduces sample size and statistical power.
MAR (Missing At Random): Missingness depends on other observed variables but not on the missing value itself. Imputation based on other columns is appropriate.
MNAR (Missing Not At Random): The probability of a value being missing depends on the missing value itself (e.g., high earners skip income questions). This requires careful modeling and cannot be fixed with simple imputation.
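To make the difference between MCAR and MNAR concrete, here is a small simulation on synthetic data. Everything in it is an assumption chosen for illustration: the log-normal "income" distribution, the missingness probabilities, and the function name `simulate`. It compares the mean of the values that remain after deletion with the mean of the full data.

```python
import random
import statistics

def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    # Synthetic, right-skewed "income" values (illustrative parameters).
    incomes = [rng.lognormvariate(10.5, 0.6) for _ in range(n)]
    cutoff = statistics.median(incomes)

    # MCAR: every value has the same 30% chance of being missing.
    mcar = [x for x in incomes if rng.random() >= 0.3]

    # MNAR: values above the median are missing 50% of the time,
    # values at or below it only 10% of the time.
    mnar = [x for x in incomes
            if rng.random() >= (0.5 if x > cutoff else 0.1)]

    return {
        "full": statistics.fmean(incomes),
        "mcar": statistics.fmean(mcar),
        "mnar": statistics.fmean(mnar),
    }

means = simulate()
```

In this setup the MCAR mean stays close to the full-data mean, while the MNAR mean comes out clearly lower because high values are under-represented among the observed ones. No amount of filling in from the observed values alone recovers that gap, which is why MNAR needs an explicit model of the missingness.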
Common Strategies for Handling Missing Data
- Remove rows — if few rows have missing values and the data is MCAR.
- Remove columns — if a column is missing more than roughly 40–50% of its values (a rule of thumb, not a fixed cutoff) and imputation would introduce too much noise.
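- Mean/median/mode imputation — fill numeric columns with the column mean or median and categorical columns with the mode. Simple, but it shrinks the column's variance, weakens correlations with other columns, and biases estimates when the data is not MCAR.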
- Model-based imputation — predict missing values using a regression model trained on observed columns. More accurate for MAR.
- Indicator variables — add a binary column flagging which values were originally missing, so a downstream model can use the missingness itself as a signal.
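Three of the strategies above (removing sparse columns, median imputation, and indicator variables) can be sketched with the Python standard library. The function names, the column-dictionary layout, the use of `None` to mark a missing value, and the 0.5 threshold are all assumptions for illustration, not recommendations for every dataset.

```python
import statistics

def drop_sparse_columns(columns, max_missing_rate=0.5):
    """Keep only columns whose missing rate is at most max_missing_rate.

    columns maps a column name to a list of values; None marks a missing value.
    """
    kept = {}
    for name, values in columns.items():
        rate = sum(v is None for v in values) / len(values) if values else 1.0
        if rate <= max_missing_rate:
            kept[name] = values
    return kept

def impute_median_with_indicator(values):
    """Fill missing numeric values with the median of the observed ones.

    Returns (imputed_values, indicator), where indicator is 1 for every
    position that was originally missing and 0 otherwise.
    Raises statistics.StatisticsError if no value is observed.
    """
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    imputed = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return imputed, indicator

# Usage on a tiny hypothetical dataset.
data = {
    "age": [34, None, 29, 51],
    "notes": [None, None, None, "follow up"],
}
kept = drop_sparse_columns(data)            # "notes" is 75% missing
age_filled, age_was_missing = impute_median_with_indicator(kept["age"])
```

Keeping the indicator list next to the imputed column preserves the information that a value was filled in, which would otherwise be lost once the gap is replaced by the median.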