« HAPPY BIRTHDAY!! | Main | My Wiki is Wikied »
March 22, 2005
The Perl saga continues
I spent today working on the Perl script that I mentioned in my post from yesterday. As I previously mentioned, alot of stuff continues to come back to me as I continue to work on my scripts. But I have come across things that I've never done before, so Google has been my friend. It's amazing the things that you can find with a simple Google search.
For example, I was having an issue with getting duplicates of the same data. By doing a Google search on "delete duplicate data perl", I was able to come up with a good site that gave me all of the information that I needed to continue with the project. Seems that the easiest way to do this simple task is to create a temporary table and copy the existing table into it while deleting duplicate data. Here's an example:
CREATE TEMPORARY TABLE temptable AS SELECT DISTINCT * FROM realtable;
This causes SQL to create a temporary table called "temptable" and to be filled with the data from the table called "realtable". Now, since I had this data transfered over, I needed to delete the contents of "realtable" by doing the following:
DELETE FROM realtable;
Now, that command deletes all of the rows from the existing "realtable" without deleting any of the columns or other important information, like indexes and primary keys. I was then able to re-insert the data from the temp table with the following command:
INSERT INTO realtable SELECT * FROM temptable;
Now, "realtable" has all of the original data that it originally contained, minus any of the duplicate data. I have hit one minor issue, however, which should be pretty simple to fix but I'm too tired to fuck with it. In "realtable", I have a column called ID that is a non-null primary key integer that auto populates when data is entered into the table. The problem stems from this auto integer. When the data is copied from "realtable" into "temptable", it's copying the ID column as well.
So what's the big deal?
Well, if there are duplicate rows, they're going to have different ID numbers. The "SELECT DISTINCT" literally looks for rows that are exactly alike. Since these rows have different ID numbers, "SELECT DISTINCT" considers them to be unique and looks over them. The fix should be simple however ... I should be able to leave out the ID column when transfering the data and everything should be all good. I'm just too lazy to do it tonight.
...
Yeah, I'm really going to have to write up a SQL how-to one of these days ...
03/23/2005 - Update: I've fixed the issue I had discussed. The problem was exactly what I had mentioned in this post: the table was including too many columns. The fix was pretty simple. Instead of using the wildcard, I created a new variable with all of the proper fields and plugged in the variable to the existing temp and accounting SQL commands.
~out...
Posted by ed at March 22, 2005 09:28 PM
Comments
Post a comment
Thanks for signing in, . Now you can comment. (sign out)
(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)