How to remove duplicates in excel based on two columns
Tutor 5 (54 Reviews)
Excel Tutor
Still stuck with a Excel question
Ask this expertAnswer
To remove duplicates in Excel based on two columns, you can use four primary methods: the built-in Remove Duplicates feature, formula-based approaches with helper columns, Power Query, or VBA macros. Each method identifies duplicate rows where values in both specified columns match exactly and keeps the first occurrence by default.
Using Excel's built-in remove duplicates feature
Excel's Remove Duplicates feature under the Data tab eliminates duplicate rows based on selected columns. This tool works efficiently for datasets with headers and identifies duplicates only where all chosen columns match exactly.
Step-by-step process:
- Select the range of data containing the two columns (include headers for accurate detection).
- Navigate to the Data tab, then click Remove Duplicates in the Data Tools group.
- In the Remove Duplicates dialog box, confirm "My data has headers" when applicable.
- Check only the two specific columns to base duplicates on and uncheck all others.
- Click OK; Excel displays a summary of removed duplicates and retained rows.
Back up your data first, as removals are permanent without Undo in some cases. The tool deletes entire rows where duplicates occur across the selected columns.
Formula-based methods to identify duplicates
Excel offers formula-based methods to identify or remove duplicates based on two columns without altering the original data. These approaches use helper columns with functions like COUNTIFS for identification or UNIQUE/FILTER for extraction in Excel 365/2021. They provide flexibility for review before deletion.
Identifying duplicates with COUNTIFS
Create a helper column (e.g., column D) with this formula to flag duplicates based on columns A and B, assuming data starts in row 2:
=IF(COUNTIFS($A$2:$A$1000,$A2,$B$2:$B$1000,$B2)>1,"Duplicate","Unique")
Drag the formula down to cover your range. The formula counts matching pairs and labels rows beyond the first occurrence. Filter or sort by this column to review duplicates before taking action.
Extracting unique rows in Excel 365/2021
Use the UNIQUE function in a new sheet to pull unique rows:
=UNIQUE(A2:C1000)
This extracts unique rows based on all columns. For two-column specificity, create a helper concatenation column first using =A2&"|"&B2, then apply UNIQUE to that column.
Helper column concatenation method
Combine two columns in a helper column to avoid false matches (e.g., "AB" matching "A" + "B"):
=A2&"|"&B2
Apply COUNTIFS on the concatenated column:
=COUNTIFS($D$2:$D$1000,D2)>1
Delete filtered duplicates manually or use Advanced Filter on the unique list for non-destructive removal.
Using Power Query to remove duplicates
Yes, Power Query in Excel provides a straightforward way to remove duplicates based on two specific columns. This method is non-destructive to the source data and refreshes automatically on updates.
Step-by-step process:
- Select your data range or table.
- Go to Data > From Table/Range to load it into Power Query Editor.
- Select only the two columns for duplicate detection (hold Ctrl to pick non-adjacent columns).
- Right-click a selected column header or go to Home > Remove Rows > Remove Duplicates.
- Click Close & Load to output the cleaned data as a new table or connection.
The status bar shows removed duplicate counts. Sorting beforehand ensures the first instance kept is predictable.
Power Query keeps other columns intact while basing removal solely on the chosen pair, unlike full-row checks. Use Group By for advanced scenarios requiring aggregation of duplicates first.
VBA macro method for automation
Yes, VBA macros provide an automated way to remove duplicates based on two specific columns. The Range.RemoveDuplicates method handles this efficiently by specifying columns via an array, keeping the first occurrence.
Basic VBA code
Press Alt+F11 to open the VBA editor, insert a new module, and use this code for a range like A1:C100 on the active sheet:
Sub RemoveDuplicatesTwoColumns()
ActiveSheet.Range("A1:C100").RemoveDuplicates Columns:=Array(1, 2), Header:=xlYes
End Sub
Run it via F5 or assign to a button. The macro checks columns 1 (A) and 2 (B) for matches, deleting entire duplicate rows.
Advanced automation for multiple sheets
For dynamic ranges or batch processing across multiple sheets, use this expanded code:
Sub BatchRemoveDuplicates()
Dim ws As Worksheet
For Each ws In ThisWorkbook.Worksheets
Dim rng As Range
Set rng = ws.UsedRange
If rng.Rows.Count > 1 Then
rng.RemoveDuplicates Columns:=Array(1, 2), Header:=xlYes
End If
Next ws
End Sub
This loops through all sheets and adjusts for data size automatically.
Test on a copy first, as changes are irreversible without Undo. Customize the array (e.g., Array(1,3)) for non-adjacent columns. This method works in Excel 2010 and later versions.
Get Online Tutoring or Questions answered by Experts.
You can post a question for a tutor or set up a tutoring session
Answers · 1
How to show duplicates in red
Answers · 1
How to apply duplicate conditional formatting rule
Answers · 1
How to duplicate conditional formatting excel
Answers · 1
What is excel conditional formatting duplicates
Answers · 1