Web Design Blog

This is where we store some of our Web Development thoughts, tips and tricks, just because we like to share.
spacer
Web Design Blog » MySQL Functions, Tips & Tricks »

MySQL- Remove Duplicate Rows

Tweet

I’ve been working with meduim sized tables (2.1million rows) in MySQL lately. One particular table had a lot of duplicate rows, which I needed to filter out. I’m quickly going to demonstrate how I did it. I’m sure there are many ways of doing this, but this method proved to be the easiest for me.

Example 1: Removing rows with a specific duplicate field:

As you can see, the field “postcode” has duplicated rows.
spacer

STEP 1: Copy table structure

I used the following code to duplicate the table structure of `postcodes` (if you use PhpMyAdmin, you can use the shortcuts):

1
2
3
4
5
CREATE TABLE `DB_NAME`.`postcodes2` (
`postcodenospace` varchar( 10 ) NOT NULL ,
`longitude` varchar( 15 ) NOT NULL ,
`latitude` varchar( 15 ) NOT NULL
)

STEP 2: run query:

1
2
3
4
5
6
7
8
9
10
INSERT INTO
postcodes2
(
postcode,
longitude,
latitude
)
SELECT postcode,longitude,latitude
FROM `postcodes` 
GROUP BY postcode

That’s it. All the rows with a unique postcode will get inserted into the table “postcodes2″.

The result

spacer

The key is in the SELECT query, where the postcode field is grouped together. That effectively puts all the duplicate postcodes together, generating a result of rows with unique postcodes.

Example 2: Removing duplicate rows:

STEP 1: Copy table structure

Use the code above to copy the table structure.

This example is slightly different, look at the data:

spacer

The previous example simply filtered the data by duplicate postcodes. As you can see, the highlighted row has different longitude and latitude values. So if you want to filter the data by unique rows, you need the following query:

STEP 2: Run the query

1
2
3
4
5
6
7
8
9
10
INSERT INTO
postcodes2
(
postcode,
longitude,
latitude
)
SELECT postcode,longitude,latitude
FROM `postcodes` 
GROUP BY concat(postcode,longitude,latitude)

The result

spacer

Again, the key is the SELECT query. This time the query groups all the fields together (using the CONCAT function), to return rows with completely unique values.

spacer
22 Nov 2009 / 5 Comments / MySQL Functions, Tips & Tricks / by Maruf

Related Articles

5 Comments

  1. spacer
    Kevin Pajak
    06/02/2010
    1

    Good article! One point of clarification to your readers though: you will want to be sure that the name of the column you are entering data into matches the name of the column as you set it up when you were creating your table. Ex: You created the table thusly:
    CREATE TABLE `DB_NAME`.`postcodes2` (
    `postcodenospace` varchar( 10 ) NOT NULL ,
    `longitude` varchar( 15 ) NOT NULL ,
    `latitude` varchar( 15 ) NOT NULL
    )

    So when you INSERT data it would be

    INSERT INTO
    postcodes2
    (
    postcodenospace,
    longitude,
    latitude
    )

    Nice work – thanks again for posting spacer .

  2. spacer
    kandar
    06/02/2010
    2

    Great! my question is how to prevent it happen again in the future?

  3. spacer
    Maruf
    06/02/2010
    3

    @Kevin,
    Good point spacer

    @Kandar,
    I will write an article on this shortly. Not sure I can explain how to do it without a live example.

    Kind regards.

  4. spacer
    kandar
    08/02/2010
    4

    thanks for your response. I’ll be wait for that article.

  5. spacer
    cursedhymn
    11/11/2010
    5

    <3 ty so much for this post. This post helped me figure out how to knock out the query I was having issues with.

Leave a Reply

spacer spacer
spacer
Privacy Policy  |   Terms & Conditions For Use Of This Website  |   Web Design Terms & Conditions
© 2012 BrightCherry :)
gipoco.com is neither affiliated with the authors of this page nor responsible for its contents. This is a safe-cache copy of the original web site.