1-800-7430-173 (US Toll Free)

+91-7240-740-740 (India)

info@igmguru.com

Blog Big Data How to Delete Duplicate Rows in SQL?

delete duplicate rows in sql

How to Delete Duplicate Rows in SQL?

By: Nehal Somani

Last Updated: April 7th, 2026

Views: 2321

Read Time: 10:00 Minutes

1. What are Duplicate Rows in SQL?

2. Why Delete Duplicate Rows in SQL?

3. How to Find Duplicate Rows in SQL?

1. Finding Duplicate Rows in SQL

2. Finding Duplicate Columns

3. Finding Duplicate Values from a Temporary Table

4. How to Delete Duplicate Rows in SQL?

I. Using Group By & Having Clause

II. Using Common Table Expressions (CTE)

III. Using Rank Function

5. Wrapping Up

6. FAQs

Q1. What are Duplicate Records in SQL?

Q2. Are duplicate rows allowed in SQL?

Q3. Why am I getting duplicate rows in SQL?

Struggling with duplicate rows in your SQL database? These can potentially compromise data integrity, slow down queries, give inaccurate results and waste storage space. Cleansing duplicate records from our database is a major maintenance task for making sure of data accuracy and performance. The duplicate rows in a SQL table might lead to data inconsistencies and performance issues. This makes it important to identify and remove them successfully.

This article is all about how to delete duplicate rows in SQL. It is designed with the help of experienced professionals to explain every step in detail, from identification to deleting duplicate rows in SQL through different techniques. With this note, let's first understand what duplicate rows are in SQL.

What are Duplicate Rows in SQL?

Duplicate rows are basically records with identical values in one or more columns of SQL databases. This issue mostly occurs due to user errors, multiple imports or missing constraints like unique indexes or primary keys. These are typically deleted by using SQL queries. It involves identifying them using different functions like COUNT() or ROW_NUMBER() and then deleting them with different techniques.

The presence of duplicate rows in SQL can lead to the following issues -

Inaccurate data reporting

Storage waste

Decreased query performance

Related Article - SQL Server Interview Questions And Answers

Why Delete Duplicate Rows in SQL?

There are many reasons to delete duplicate rows in SQL. The ultimate goal of this task is to improve the performance of the database. This involves eliminating redundant data to solve problems related to data inaccuracy like performance issues, heavy storage use and slow query performance.

Now we will discuss how to delete duplicate rows in SQL. Deleting a duplicate row in SQL is a two-step process. First, we have to find the duplicate data and then delete it.

How to Find Duplicate Rows in SQL?

Companies nowadays use humongous datasets and it is complicated to find duplicate rows in them. It is something that needs to be done using different queries. SQL mostly stores the data in a tabular form and the same data can be anywhere in it.

There are two clauses, namely GROUP BY and HAVING. This combination of clauses can find duplicate rows. GROUP BY groups all the rows based on a special column. The HAVING filters the identical groups from them.

1. Finding Duplicate Rows in SQL

Let's take an example of a table to understand this process. Use the query given below to group rows by Name and Section. The condition HAVING COUNT(*) > 1 filters the groups that have more than one appearance in the table.

Example -

Name	Section
XYZ	A1
ABC	A2
XYZ	A1

Query -

SELECT Name, Section

FROM tbl

GROUP BY Name, Section

HAVING COUNT(*) > 1;

Output -

Name	Section	CNT
XYZ	A1	2

2. Finding Duplicate Columns

Duplicate data can also be available in columns. The query given below identifies identical names from columns. The condition HAVING COUNT(Name) > 1 finds the name from columns that appear more than once.

Example -

ID	Name
1	SQL
2	MySQL
3	SQL

Query -

SELECT Name, COUNT(*) AS CNT

FROM Person

GROUP BY Name

HAVING COUNT(Name) > 1;

Output -

Name	CNT
SQL	2

3. Finding Duplicate Values from a Temporary Table

What if you have a temporary table? The query given below uses a subquery that creates a temporary table. This table counts the repetition of each Name in the table. The outer query then finds the rows that have appeared more than once.

Query -

SELECT Name

FROM (

SELECT Name, COUNT(*) AS num

FROM Person

GROUP BY Name

) AS statistic

WHERE num > 1;

Output -

Name	CNT
SQL	2

Related Article - What is PL/SQL- Everything You Need To Know

How to Delete Duplicate Rows in SQL?

There are many ways to do it and the choice of a perfect one depends on the situation the individual is dealing with. Let's create a sample table and then apply different methods to it to understand it better.

Query-

CREATE TABLE Employee

(

[ID] INT IDENTITY(1,1),

[FirstName] VARCHAR(100),

[LastName] VARCHAR(100),

[Country] VARCHAR(100)

);

GO

INSERT INTO Employee ([FirstName], [LastName], [Country])

VALUES

('Virat', 'Kohli', 'India'),

('Virat', 'Kohli', 'India'),

('Arijit', 'Singh', 'India'),

('Sam', 'Curran', 'USA'),

('Sam', 'Curran', 'USA'),

('Sam', 'Curran', 'USA');

Output (Table):

ID	First Name	Last Name	Country
1	Virat	Kohli	India
2	Virat	Kohli	India
3	Arijit	Singh	India
4	Sam	Curran	USA
5	Sam	Curran	USA
6	Sam	Curran	USA

I. Using Group By & Having Clause

1. Start with identifying the duplicate rows by using the same methods we have discussed in the above section. This will give you all the duplicate rows with the number of their repetition. The output will be as follows -

ID	First Name	Last Name	Country	CNT
1	Virat	Kohli	India	2
4	Sam	Curran	USA	3

2. Now we need to remove only duplicate rows from our table. Use the SQL MAX function and the max ID of duplicate rows.

SELECT FirstName, LastName, Country, COUNT(*) AS CNT

FROM Employee

GROUP BY FirstName, LastName, Country

HAVING COUNT(*) > 1;

3. Now use SQL delete statements to remove the duplicate rows as shown below.

DELETE FROM Employee

WHERE ID NOT IN (

SELECT MAX(ID)

FROM Employee

GROUP BY FirstName, LastName, Country

);

Final Output -

ID	First Name	Last Name	Country
2	Virat	Kohli	India
6	Sam	Curran	USA

II. Using Common Table Expressions (CTE)

Using CTE is one of the best methods and is only applicable on SQL Server starting from 2005.

1. It uses the SQL ROW_NUMBER function and adds unique sequential numbers for each row.

WITH CTE AS (

SELECT ID, FirstName, LastName, Country,

ROW_NUMBER() OVER(PARTITION BY FirstName, LastName, Country ORDER BY ID) AS RowNum

FROM Employee

)

DELETE FROM CTE

WHERE RowNum > 1;

Output -

ID	First Name	Last Name	Country	Dupilcate Count
1	Arijit	Singh	India	1
2	Sam	Curran	USA	1
3	Sam	Curran	USA	2
4	Sam	Curran	USA	3
5	Virat	Kohli	India	1
6	Virat	Kohli	India	2

2. Now use the CTE function to remove the duplicate rows.

WITH CTE([FirstName],

[LastName],

[Country],

DuplicateCount)

AS (SELECT [FirstName],

[LastName],

[Country],

ROW_NUMBER() OVER(PARTITION BY [FirstName],

[LastName],

[Country]

ORDER BY ID) AS DuplicateCount

FROM [SampleDB].[dbo].[Employee])

DELETE FROM CTE

WHERE DuplicateCount > 1;

Final Output -

ID	First Name	Last Name	Country
1	Virat	Kohli	India
3	Arjit	Singh	India
4	Sam	Curran	USA

III. Using Rank Function

Use of SQL RANK gives a unique ID to each row, including the identical ones.

1. This query uses a PARTITION BY clause that prepares a data subset and gives a rank to each partition of it.

WITH Ranked AS (

SELECT ID, FirstName, LastName, Country,

ROW_NUMBER() OVER(PARTITION BY FirstName, LastName, Country ORDER BY ID) AS RowNum

FROM Employee

)

DELETE FROM Ranked

WHERE RowNum > 1;

Output -

ID	First Name	Last Name	Country	Rank
1	Virat	Kohli	India	1
2	Virat	kohli	India	2
3	Arijit	Singh	India	1
4	Sam	Curran	USA	1
5	Sam	Curran	USA	2
6	Sam	Curran	USA	3

2. Now use the following query and remove the rows that have more than one rank -

DELETE E

FROM [SampleDB].[dbo].[Employee] E

INNER JOIN

(

SELECT *,

RANK() OVER(PARTITION BY firstname,

lastname,

country

ORDER BY id) rank

FROM [SampleDB].[dbo].[Employee]

) T ON E.ID = t.ID

WHERE rank > 1;

Final Output -

ID	First Name	Last Name	Country
1	Virat	Kolhi	India
3	Arijit	Singh	India
4	Sam	Curran	USA

Wrapping Up

Data impurities like duplicate rows in SQL can lead to major performance issues for companies. This may affect all the tasks related to data, from decision-making to business intelligence. This is why it is important to find and delete duplicate rows in SQL before it's too late. There are many ways to establish data integrity and this article has already explained many of them.

FAQs

Q1. What are Duplicate Records in SQL?

Duplicate records in SQL are rows that repeat more than once. It is a type of data impurity that can reduce the performance of a company. These have to be deleted as soon as possible.

Q2. Are duplicate rows allowed in SQL?

The use of unique constraints in a table can prevent duplicate rows. Many experts use this best practice in their work. There are still some chances where duplicate rows can occur such as human error, application bug or uncleaned data.

Q3. Why am I getting duplicate rows in SQL?

This often occurs when data is refreshed in upstream systems or ETL jobs. It can also arise due to incorrect joins, duplicate data in source tables, etc.

About the Author

Nehal Somani

About the Author

Nehal Somani

Nehal Somani is a technology writer specializing in Machine Learning, Artificial Intelligence, Deep Learning, and Robotic Process Automation. She simplifies complex concepts into clear, practical insights with an engaging style, helping beginners and professionals build knowledge, explore innovations, and stay updated in the fast-evolving tech landscape.

Drop Us a Query

Fields marked * are mandatory

Name

Email

Phone Number

Comments

Recent Post

Tableau Certifications Guide
July 3rd, 2026
Top Artificial Intelligence Interview Questions
July 3rd, 2026
Top Mendix Interview Questions and Answers
July 3rd, 2026
13 Best ChatGPT Alternatives for 2026
July 3rd, 2026
Python vs. R: What's the Difference?
July 3rd, 2026

Recommended Reads

splunk careers

Splunk Careers and Opportunities in Big Data

what is splunk

What is Splunk?

MongoDB vs MySQL

MongoDB vs MySQL: Understanding Key Differences

how to install mongodb on windows

How to Install MongoDB on Windows?

Splunk Logs

Splunk Logs- A Guide For Beginners

What is Informatica

What is Informatica and How Does It Work?

Python vs JavaScript

Python vs JavaScript: A Comparison Guide

SQL Courses

View All

×

Your Shopping Cart

Your shopping cart is empty.

Checkout

Continue Shopping