Pages

Sunday, June 23, 2013

How to Get Rid of Duplicates in MySQL

If your MySQL table contains hundreds or thousands of records, you don't want to manually look through each one to try to identify duplicate records. MySQL does not have a single query that you may use to remove duplicate entries in a table, but you may use a combination of queries and execute them together to eliminate your doubles. Even if your table uses a primary key to uniquely identify records, you may remove duplicates by not selecting the primary key field in your queries.

Instructions

    1

    Log in to your MySQL database and access the query tool.

    2

    Enter the following query:

    CREATE TEMPORARY TABLE temp SELECT DISTINCT col1,col2,col3 FROM table_name GROUP BY col1,col2,col3 HAVING COUNT(*)>1;

    This creates a temporary table that checks the table supplied by "table_name" and when it finds a duplicate record, it inserts that duplicate into the temporary table, but only once. For example, if "table_name" has three entries, "Jim," "Bob" and "Bob," the "temp" table will only have one record, "Bob." If "col1" acts as a primary key field in the table, do not include "col1" in the query and simply select the two other columns.

    3

    Enter the following query:

    DELETE FROM table_name WHERE (col1,col2,col3) IN (SELECT col1,col2,col3 FROM temp);

    This query checks each record in "table_name" and deletes all records that have duplicates, including the original. For example, if the table has three entries, "Jim," "Bob" and "Bob," the query will delete both "Bob" records, leaving only "Jim" in "table_name." Again, remove references to "col1" in the query if "col1" is the primary key field.

    4

    Enter the following query:

    INSERT INTO table_name(col1,col2,col3) SELECT col1,col2,col3 FROM temp;

    This query essentially copies all of the records from the temporary table and inserts them back into "table_name." If at this point "table_name" only has "Jim" and "temp" only has "Bob," after the query, "table_name" will have "Jim" and "Bob" records while the temporary table gets deleted. Remember to remove "col1" references if it is a primary key.

    5

    Execute all of the queries.

0 comments:

Post a Comment