How to Delete Duplicate Rows in SQL?

April 7th, 2026

Struggling with duplicate rows in your SQL database? Duplicates can compromise data integrity, slow down queries, produce inaccurate results and waste storage space. Removing them is a routine maintenance task that keeps data accurate and queries fast, so it is important to know how to identify and delete them reliably.

This article explains how to delete duplicate rows in SQL. It was prepared with the help of experienced professionals and covers every step in detail, from identifying duplicates to deleting them with different techniques. First, let's understand what duplicate rows are in SQL.

What are Duplicate Rows in SQL?

Duplicate rows are records that have identical values in one or more columns of a table. They usually appear because of user errors, repeated imports or missing constraints such as unique indexes or primary keys. Removing them is done with SQL queries: first identify the duplicates with functions like COUNT() or ROW_NUMBER(), then delete them using one of several techniques.

The presence of duplicate rows in SQL can lead to the following issues -

  • Inaccurate data reporting
  • Storage waste
  • Decreased query performance

Why Delete Duplicate Rows in SQL?

There are many reasons to delete duplicate rows in SQL. The main goals are accurate data and better database performance: removing redundant rows prevents misleading reports, frees storage and gives queries fewer rows to scan.

Now we will discuss how to delete duplicate rows in SQL. Deleting a duplicate row in SQL is a two-step process. First, we have to find the duplicate data and then delete it.

How to Find Duplicate Rows in SQL?

Companies now work with very large datasets, so duplicate rows cannot be spotted by eye; they have to be found with queries. SQL stores data in tables, and the same data can appear anywhere in a table.

The GROUP BY and HAVING clauses can be combined to find duplicate rows. GROUP BY collects the rows that share the same values in the specified columns into groups, and HAVING then keeps only the groups that satisfy a condition, such as containing more than one row.

1. Finding Duplicate Rows in SQL

Let's take an example table to understand this process. The query given below groups rows by Name and Section. The condition HAVING COUNT(*) > 1 keeps only the groups that appear more than once in the table.

Example -

Name | Section
XYZ | A1
ABC | A2
XYZ | A1

Query -

SELECT Name, Section, COUNT(*) AS CNT
FROM tbl
GROUP BY Name, Section
HAVING COUNT(*) > 1;

Output -

Name | Section | CNT
XYZ | A1 | 2
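
The grouped query returns one line per duplicated combination. To list every physical copy instead, a window count can be used. The following is a minimal sketch in SQL Server syntax, assuming a hypothetical temporary table #Students that mirrors the Name/Section example (the table name is an assumption, not part of the original example).

```sql
-- Hypothetical temp table mirroring the Name/Section example above.
CREATE TABLE #Students (Name VARCHAR(50), Section VARCHAR(10));

INSERT INTO #Students (Name, Section)
VALUES ('XYZ', 'A1'), ('ABC', 'A2'), ('XYZ', 'A1');

-- COUNT(*) OVER (...) attaches the group size to every row,
-- so each copy of a duplicate is listed rather than one row per group.
SELECT Name, Section, CNT
FROM (
    SELECT Name, Section,
           COUNT(*) OVER (PARTITION BY Name, Section) AS CNT
    FROM #Students
) AS counted
WHERE CNT > 1;

DROP TABLE #Students;
```

With the three sample rows, this returns both XYZ / A1 rows, each with CNT = 2.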

2. Finding Duplicate Values in a Column

Duplicates can also be checked on a single column. The query given below identifies names that occur more than once in the Name column, even when the rest of the row differs. The condition HAVING COUNT(Name) > 1 keeps only the names that appear more than once.

Example -

ID | Name
1 | SQL
2 | MySQL
3 | SQL

Query -

SELECT Name, COUNT(*) AS CNT
FROM Person
GROUP BY Name
HAVING COUNT(Name) > 1;

Output -

Name | CNT
SQL | 2

3. Finding Duplicate Values Using a Subquery (Derived Table)

The same check can be written with a subquery. The inner query acts as a temporary, derived table that counts how many times each Name occurs in the Person table. The outer query then returns the names whose count is greater than one.

Query -

SELECT Name
FROM (
    SELECT Name, COUNT(*) AS num
    FROM Person
    GROUP BY Name
) AS statistic
WHERE num > 1;

Output -

Name
SQL

How to Delete Duplicate Rows in SQL?

There are several ways to do it, and the best choice depends on the situation. Let's create a sample table and apply the different methods to it. Each method below starts from the same six rows.

Query -

CREATE TABLE Employee
(
    [ID] INT IDENTITY(1,1),
    [FirstName] VARCHAR(100),
    [LastName] VARCHAR(100),
    [Country] VARCHAR(100)
);
GO

INSERT INTO Employee ([FirstName], [LastName], [Country])
VALUES
    ('Virat', 'Kohli', 'India'),
    ('Virat', 'Kohli', 'India'),
    ('Arijit', 'Singh', 'India'),
    ('Sam', 'Curran', 'USA'),
    ('Sam', 'Curran', 'USA'),
    ('Sam', 'Curran', 'USA');

Output (Table):

ID | First Name | Last Name | Country
1 | Virat | Kohli | India
2 | Virat | Kohli | India
3 | Arijit | Singh | India
4 | Sam | Curran | USA
5 | Sam | Curran | USA
6 | Sam | Curran | USA

I. Using Group By & Having Clause

1. Start by identifying the duplicate rows with the same GROUP BY and HAVING approach discussed in the section above. The query returns each duplicated combination together with the number of times it occurs.

SELECT FirstName, LastName, Country, COUNT(*) AS CNT
FROM Employee
GROUP BY FirstName, LastName, Country
HAVING COUNT(*) > 1;

The output will be as follows -

First Name | Last Name | Country | CNT
Virat | Kohli | India | 2
Sam | Curran | USA | 3

2. Now we need to remove only the extra copies and keep one row per group. The SQL MAX function returns the highest ID within each group of identical rows, and that is the row we will keep.

3. Now use a DELETE statement to remove every row whose ID is not the maximum of its group, as shown below.

DELETE FROM Employee
WHERE ID NOT IN (
    SELECT MAX(ID)
    FROM Employee
    GROUP BY FirstName, LastName, Country
);

Final Output -

ID | First Name | Last Name | Country
2 | Virat | Kohli | India
3 | Arijit | Singh | India
6 | Sam | Curran | USA
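
Because DELETE is destructive, it can help to rehearse it inside a transaction first. The sketch below assumes SQL Server and works on a hypothetical temporary copy of the sample data, #EmployeeDemo (a name chosen for illustration). It runs the same MAX(ID) delete, shows what remains, and then rolls back.

```sql
-- Hypothetical temp copy of the sample data, so the real table is untouched.
CREATE TABLE #EmployeeDemo (
    ID INT IDENTITY(1,1),
    FirstName VARCHAR(100),
    LastName VARCHAR(100),
    Country VARCHAR(100)
);

INSERT INTO #EmployeeDemo (FirstName, LastName, Country)
VALUES ('Virat', 'Kohli', 'India'), ('Virat', 'Kohli', 'India'),
       ('Arijit', 'Singh', 'India'),
       ('Sam', 'Curran', 'USA'), ('Sam', 'Curran', 'USA'), ('Sam', 'Curran', 'USA');

BEGIN TRANSACTION;

-- Same technique as above: keep only the highest ID in each group.
DELETE FROM #EmployeeDemo
WHERE ID NOT IN (
    SELECT MAX(ID)
    FROM #EmployeeDemo
    GROUP BY FirstName, LastName, Country
);

-- Inspect what is left before deciding.
SELECT ID, FirstName, LastName, Country
FROM #EmployeeDemo
ORDER BY ID;

-- Undo the delete; use COMMIT TRANSACTION instead once the result looks right.
ROLLBACK TRANSACTION;

-- All rows are back after the rollback.
SELECT COUNT(*) AS RowsAfterRollback FROM #EmployeeDemo;

DROP TABLE #EmployeeDemo;
```

Inside the transaction the SELECT shows IDs 2, 3 and 6; after the rollback the count is 6 again.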

II. Using Common Table Expressions (CTE)

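Using a CTE together with ROW_NUMBER is one of the most convenient methods. Both features are available in SQL Server 2005 and later, and the DELETE FROM CTE form used here is SQL Server syntax that other database systems may not accept.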

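1. The ROW_NUMBER function assigns a sequential number to each row within a group of identical rows (a partition), so the first occurrence gets 1 and every further copy gets a higher number.

WITH CTE AS (
    SELECT ID, FirstName, LastName, Country,
           ROW_NUMBER() OVER (PARTITION BY FirstName, LastName, Country ORDER BY ID) AS DuplicateCount
    FROM Employee
)
SELECT ID, FirstName, LastName, Country, DuplicateCount
FROM CTE
ORDER BY FirstName, LastName, Country, DuplicateCount;

Output -

ID | First Name | Last Name | Country | Duplicate Count
3 | Arijit | Singh | India | 1
4 | Sam | Curran | USA | 1
5 | Sam | Curran | USA | 2
6 | Sam | Curran | USA | 3
1 | Virat | Kohli | India | 1
2 | Virat | Kohli | India | 2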

2. Now delete through the CTE every row whose number is greater than 1. This removes the extra copies and keeps the first row of each group.

WITH CTE ([FirstName], [LastName], [Country], DuplicateCount)
AS (
    SELECT [FirstName], [LastName], [Country],
           ROW_NUMBER() OVER (PARTITION BY [FirstName], [LastName], [Country]
                              ORDER BY ID) AS DuplicateCount
    FROM Employee
)
DELETE FROM CTE
WHERE DuplicateCount > 1;

Final Output -

ID | First Name | Last Name | Country
1 | Virat | Kohli | India
3 | Arijit | Singh | India
4 | Sam | Curran | USA
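
Note that this method keeps the lowest ID in each group, whereas the GROUP BY method above keeps the highest. If the newest copy should survive, one option is to reverse the sort order inside ROW_NUMBER. A minimal sketch, assuming SQL Server and a hypothetical temporary copy #EmployeeDemo of the sample data (the table and CTE names are chosen for illustration):

```sql
-- Hypothetical temp copy of the sample data.
CREATE TABLE #EmployeeDemo (
    ID INT IDENTITY(1,1),
    FirstName VARCHAR(100),
    LastName VARCHAR(100),
    Country VARCHAR(100)
);

INSERT INTO #EmployeeDemo (FirstName, LastName, Country)
VALUES ('Virat', 'Kohli', 'India'), ('Virat', 'Kohli', 'India'),
       ('Arijit', 'Singh', 'India'),
       ('Sam', 'Curran', 'USA'), ('Sam', 'Curran', 'USA'), ('Sam', 'Curran', 'USA');

-- ORDER BY ID DESC gives number 1 to the highest ID in each group,
-- so the newest copy is the one that is kept.
WITH Numbered AS (
    SELECT ID,
           ROW_NUMBER() OVER (PARTITION BY FirstName, LastName, Country
                              ORDER BY ID DESC) AS DuplicateCount
    FROM #EmployeeDemo
)
DELETE FROM Numbered
WHERE DuplicateCount > 1;

SELECT ID, FirstName, LastName, Country
FROM #EmployeeDemo
ORDER BY ID;

DROP TABLE #EmployeeDemo;
```

With the six sample rows, the rows that remain are IDs 2, 3 and 6.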

III. Using Rank Function

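The SQL RANK function assigns a rank to each row within a partition of identical rows. Because the rows are ordered by the unique ID column, no two rows in a partition tie, so every copy of a duplicate receives a distinct rank.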

1. The PARTITION BY clause splits the data into subsets of identical rows, and RANK numbers the rows within each subset.

SELECT ID, FirstName, LastName, Country,
       RANK() OVER (PARTITION BY FirstName, LastName, Country ORDER BY ID) AS [Rank]
FROM Employee
ORDER BY ID;

Output -

ID | First Name | Last Name | Country | Rank
1 | Virat | Kohli | India | 1
2 | Virat | Kohli | India | 2
3 | Arijit | Singh | India | 1
4 | Sam | Curran | USA | 1
5 | Sam | Curran | USA | 2
6 | Sam | Curran | USA | 3

2. Now use the following query to remove the rows whose rank is greater than 1 -

DELETE E
FROM Employee E
INNER JOIN (
    SELECT ID,
           RANK() OVER (PARTITION BY FirstName, LastName, Country ORDER BY ID) AS [rank]
    FROM Employee
) T ON E.ID = T.ID
WHERE T.[rank] > 1;

Final Output -

ID | First Name | Last Name | Country
1 | Virat | Kohli | India
3 | Arijit | Singh | India
4 | Sam | Curran | USA
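
One caveat with RANK: it gives tied rows the same rank, so it only separates duplicates when the ORDER BY column is unique, as ID is here. The sketch below illustrates the difference from ROW_NUMBER, assuming SQL Server 2008 or later and a hypothetical table #Imports with no unique column (both the table and its LoadDate column are assumptions for illustration).

```sql
-- Hypothetical table without a unique column: both 'SQL' rows share a LoadDate.
CREATE TABLE #Imports (Name VARCHAR(50), LoadDate DATE);

INSERT INTO #Imports (Name, LoadDate)
VALUES ('SQL', '20260101'), ('SQL', '20260101'), ('MySQL', '20260101');

-- RANK repeats the value for ties; ROW_NUMBER always counts 1, 2, 3, ...
SELECT Name, LoadDate,
       RANK()       OVER (PARTITION BY Name ORDER BY LoadDate) AS RankValue,
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY LoadDate) AS RowNum
FROM #Imports
ORDER BY Name, RowNum;

DROP TABLE #Imports;
```

Both 'SQL' rows get RankValue 1, so a filter such as rank > 1 would delete nothing, while RowNum is 1 and 2. When no unique ordering column is available, ROW_NUMBER is therefore the safer choice.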

Wrapping Up

Data-quality problems such as duplicate rows can cause inaccurate results and performance issues, and they affect every task that depends on the data, from decision-making to business intelligence. That is why it is important to find and delete duplicate rows in SQL promptly. This article covered three ways to do so: GROUP BY with HAVING, a CTE with ROW_NUMBER, and the RANK function.

FAQs

Q1. What are Duplicate Records in SQL?

Duplicate records in SQL are rows that appear more than once in a table. They are a data-quality problem that can distort reports and slow down queries, so they should be removed once they are found.

Q2. Are duplicate rows allowed in SQL?

Yes, a table allows duplicate rows unless a constraint forbids them. A primary key or unique constraint on the relevant columns prevents duplicates, and many practitioners treat this as a best practice. Without such constraints, duplicates can still creep in through human error, application bugs or uncleaned source data.
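
As an illustration of such a constraint, here is a minimal sketch assuming SQL Server and a hypothetical temporary table #EmployeeUnique (the name and the choice of columns in the constraint are assumptions). The second, identical INSERT is rejected by the UNIQUE constraint and the error is caught.

```sql
-- Hypothetical table: the UNIQUE constraint forbids two rows with the
-- same FirstName, LastName and Country.
CREATE TABLE #EmployeeUnique (
    ID INT IDENTITY(1,1) PRIMARY KEY,
    FirstName VARCHAR(100),
    LastName VARCHAR(100),
    Country VARCHAR(100),
    UNIQUE (FirstName, LastName, Country)
);

BEGIN TRY
    INSERT INTO #EmployeeUnique (FirstName, LastName, Country)
    VALUES ('Virat', 'Kohli', 'India');

    -- Identical values: this statement violates the constraint.
    INSERT INTO #EmployeeUnique (FirstName, LastName, Country)
    VALUES ('Virat', 'Kohli', 'India');
END TRY
BEGIN CATCH
    -- Show the message raised by the rejected insert.
    SELECT ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

-- Only the first row was stored.
SELECT COUNT(*) AS RowsStored FROM #EmployeeUnique;

DROP TABLE #EmployeeUnique;
```

On an existing table that already contains duplicates, the duplicates have to be deleted first, because a unique constraint cannot be created while conflicting rows exist.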

Q3. Why am I getting duplicate rows in SQL?

Duplicates often appear when data is reloaded from upstream systems or by ETL jobs. They can also come from incorrect joins or from duplicate data already present in the source tables.

About the Author
Nehal Somani

Nehal Somani is a technology writer specializing in Machine Learning, Artificial Intelligence, Deep Learning, and Robotic Process Automation. She simplifies complex concepts into clear, practical insights with an engaging style, helping beginners and professionals build knowledge, explore innovations, and stay updated in the fast-evolving tech landscape.
