How to remove duplicate values in excel
Tutor 5 (62 Reviews)
Excel Tutor
Still stuck with a Excel question
Ask this expertAnswer
To remove duplicate values in Excel, use the Remove Duplicates tool under the Data tab, Power Query for large datasets, or Conditional Formatting to highlight duplicates for manual deletion. These methods permanently alter or help clean your data by deleting duplicate rows based on selected columns while keeping the first occurrence.
What methods remove duplicate values in Excel?
Excel provides three primary methods for removing duplicate values: the Remove Duplicates tool, Power Query, and Conditional Formatting with manual deletion. The Remove Duplicates tool and Power Query retain the first occurrence of each duplicate row and delete subsequent matches. Conditional Formatting highlights duplicates visually, allowing users to review and delete them manually.
| Method | Function | Version Support |
|---|---|---|
| Remove Duplicates Tool | Deletes duplicate rows, keeps first instance | Excel 2007 and later |
| Power Query | Removes duplicates with refreshable queries | Excel 2016+, Excel 365 |
| Conditional Formatting | Highlights duplicates for manual deletion | Excel 2007 and later |
How to use the Remove Duplicates tool
To use the Remove Duplicates tool, select your data range and access the feature through the Data tab. This method permanently alters the original data, so work on a copy when handling critical datasets.
Step-by-step process
- Select the range of cells containing your data, or click any cell within a formatted table. Excel auto-detects the range.
- Navigate to the Data tab, then click Remove Duplicates in the Data Tools group.
- In the dialog box, check the columns to evaluate for duplicates. All columns are selected by default for full-row matching. Check "My data has headers" when your first row contains column labels to exclude it from duplication checks.
- Click OK. Excel removes duplicates, keeps the first occurrence, and displays a summary of removed duplicates and remaining unique values.
Column selection considerations
Selecting specific columns checks for matches only in those fields. This approach supports multi-key deduplication, such as checking patient ID and visit date together in clinical datasets. Uncheck "My data has headers" for headerless ranges to avoid unintended removals where labels are treated as data.
Data backup recommendations
The Remove Duplicates tool processes entire contiguous ranges and cannot undo changes after saving. Press Ctrl+Z immediately to reverse the operation when needed. Test on a subset first for large datasets before applying to the full range.
How to remove duplicates using Power Query
To remove duplicates using Power Query, load your data into the Power Query Editor and use the Remove Duplicates command. Power Query handles large datasets with refreshable queries and preserves the transformation steps for repeated use.
Step-by-step process
- Select any cell within your data range.
- Go to Data tab, click Get Data, then select From Table/Range. Your data loads into the Power Query Editor.
- Select the column or columns to check for duplicates by clicking the column header. Hold Ctrl to select multiple columns.
- Right-click the selected column header and choose Remove Duplicates, or go to Home tab and click Remove Rows, then Remove Duplicates.
- Click Close & Load to return the deduplicated data to Excel as a new table.
Advantages for large datasets
Power Query stores transformation steps in a query that refreshes when source data changes. This feature eliminates the need to repeat manual deduplication for recurring data imports. Power Query is standard in Excel 2016 and later versions, including Excel 365.
How to use Conditional Formatting to identify and remove duplicates
To use Conditional Formatting for duplicate removal, apply the Duplicate Values rule to highlight matching cells, then manually review and delete the flagged rows. This method provides visual verification before permanent deletion.
Step-by-step process
- Select the range or column to check for duplicates.
- Go to Home tab, click Conditional Formatting, then select Highlight Cells Rules and choose Duplicate Values.
- Select a formatting style from the dropdown (default highlights duplicates in light red with dark red text) and click OK.
- Review the highlighted cells, then manually delete duplicate rows by right-clicking and selecting Delete Row, or use the filter to isolate and delete duplicates.
When to use this method
Conditional Formatting suits scenarios requiring visual inspection before deletion. Users can verify which values Excel identifies as duplicates and decide which rows to remove. This method is available in Excel 2007 and later versions.
Which method should you use?
The Remove Duplicates tool provides the fastest solution for straightforward deduplication tasks on small to medium datasets. Power Query suits large datasets requiring repeated deduplication or integration with data transformation workflows. Conditional Formatting works best when manual review of duplicates is necessary before deletion.
| Scenario | Recommended Method |
|---|---|
| Quick one-time deduplication | Remove Duplicates Tool |
| Large datasets with recurring updates | Power Query |
| Visual review before deletion | Conditional Formatting |
| Audit trail requirements | Power Query (preserves steps) |
. Was this Helpful?
Get Online Tutoring or Questions answered by Experts.
You can post a question for a tutor or set up a tutoring session
Answers · 1
What is the formula to highlight duplicate values
Answers · 1
What is the formula to remove duplicates
Answers · 1
How to delete duplicate values in excel
Answers · 1
How to delete duplicate rows based on one column
Answers · 1