View Full Version : SQL Query - A Hard One
mlbfan
01-19-2002, 06:41 AM
I am certain this query is easy for somebody here, but it not something I've ever tried so I could use some direction. Here are the details.
I have a table called Person. It has roughly 16,000 unique records. I am going to add a new field in it called FinalYear.
I have another table called Batting. It has more than 80,000 records. Each persons batting statistics are kept here.
So there is only one ripkenca01 (Cal Ripken) in person, but there are lots of them in batting as its his user ID, then the YEAR he played, then the team he played with, then his batting line for that year.
I emphasized YEAR because therein lies the problem. I want a query that goes through batting, looks at each and every PersonID (that is the name of the field with the ripkenca01 item), picks the final year (which would be the one closest to the current year) that PersonID played, then finds that same PersonID record in the Person table and puts the result in the newly vacant FinalYear field.
Hard to me, but hopefully easy to somebody else...
Sean
I can't think of any efficient ways to do that. Seems like it would require looping through each of the Person ID's and selecting the MAX Year in the Batting table corresponding to it. You could save some time on the other end by placing the ID/Year combos into a VALUES() type of string that could be inserted all at once into the database once the loop is complete (instead of 16,000 separate inserts).
Another thought which might reduce processing time or might not: Insert 2001 (or 2002) for everyone's last year and then attack it rom the opposite direction, doing a MIN search and replacing the year for non current players. If you have a lot of active players in the database, this should reduce the amount of time.
Dan
mlbfan
01-19-2002, 01:47 PM
Efficient is a major issue, but I don't mind investing some time towards this as it will mean that I can add 1 new .php script and get more than 200 pages on my site.
How does something like this sound and would it work well?
Update FinalYear in Persons where Maximum Year="1999" in Batting
Each year would consist of 70-200 entries. If I could create a query like that then I would be pulling that number from the 81,000 in batting and updating only that number in Persons.
Does this sound better? Is is possible? Any ideas how to construct said query? I know I would need to switch the year manually, but as I mentioned above it would be a worthwhile tradeoff...
Sean
GregJ
01-19-2002, 02:53 PM
This might be done as a two step process. I've not tested yet so I don't know for sure. Some of the experts here could critique this idea for validity.
First select PersonID and max(Year) from Batting with a group by on PersonID and Year having Year < 'This-year -1' into a temporary table. Then using the temporary table to update the Person Table where FinalYear is empty.
It seems to me that if this is run only once a year, efficiency isn't of paramount importance, although we don't want to get out of hand.
-g
Wouldn't this be a one-time thing to get the final years into your existing/updated table? If it is a recurring query you are talking about, then efficiency becomes very important for that size of a table... Personally, I would do a one-time update and leave the FinalYear column as an open ended field that gets filled in down the road when someone retires (no entry means still active).
Dan
mlbfan
01-19-2002, 05:14 PM
I was able to make a temporary table with ONLY two fields:
PersonID
FinalYear
This new table is called temp
I crosschecked a set of the data and they were all correct. So part one is done. Now I can't figure out how to update 1 table with data from another table where a condition is being met. I've tried countless combinations and the documention on the Update command is sparse...
Note: temp2 is the name of the table with a blank field called FinalYear that needs the data in temp.FinalYear
Basically I'm trying to:
UPDATE temp2 SET FinalYear=temp.FinalYear WHERE temp2.PersonID is the same as temp.PersonID;
Obviously I need a little syntax assistance. Since I'm working with temp tables I've tried quite a few variations and none seem to be working. Any help would be most welcome...
Sean
I'm not sure a temp table is going to be of any help here. You still need to select each item from the temp table and add it to the persons table, which when combined with the creation of the temp table initially, seems no more efficient.
I don't think there's any way to do it for UPDATE, but you want something like:
INSERT INTO Person SELECT PersonID, FinalYear FROM Temp;
Obviously, that would not accomplish what you want (something like that would work for creating the Person table as you go, but not for adding to existing fields)...
Dan
jadero
01-20-2002, 12:38 PM
Originally posted by dank:
I'm not sure a temp table is going to be of any help here. You still need to select each item from the temp table and add it to the persons table, which when combined with the creation of the temp table initially, seems no more efficient.
I don't think there's any way to do it for UPDATE, but you want something like:
INSERT INTO Person SELECT PersonID, FinalYear FROM Temp;
Obviously, that would not accomplish what you want (something like that would work for creating the Person table as you go, but not for adding to existing fields)...
Dan
... and that hits the nail on the head. I'm not familiar enough with the syntax and allowable operations of MySQL compared to Access (but I'm getting there) to say for sure, but why not try this:
1. rename Person to PersonOLD
2. create table Person using structure of PersonOLD
3. run INSERT from Person-->Batting INTO Person
4. delete PersonOLD (after some kind of verification, of course!)
It might seem like a silly thing to do, but the solution is a pure 'set' solution (i.e. no record-walking loops), so it should be very efficient. Depending on what JOIN syntax MySQL allows, one of the following should work:
(No JOIN syntax)
INSERT INTO tblNewPlayer ( plyID, plyName, plyFinalYear )
SELECT tblPlayer.plyID, tblPlayer.plyName, Max(tblBat.Year) AS MaxOfYear
FROM tblPlayer, tblBat
WHERE (((tblBat.plyID)=[tblPlayer].[plyID]))
GROUP BY tblPlayer.plyID, tblPlayer.plyName
(JOIN syntax)
INSERT INTO tblNewPlayer ( plyID, plyName, plyFinalYear )
SELECT tblPlayer.plyID, tblPlayer.plyName, Max(tblBat.Year) AS MaxOfYear
FROM tblPlayer INNER JOIN tblBat ON tblPlayer.plyID = tblBat.plyID
GROUP BY tblPlayer.plyID, tblPlayer.plyName
Note that for speed and simplicity, I developed the query specs in Access, so there might be some clean-up required to make one of these work with MySQL.
jadero
01-20-2002, 12:43 PM
Forgot something in my last post. Depending on what you are actually trying to accomplish, there may not be a need to actually transfer the MAX(Year) to the Person table. Simply build a SELECT query similar to one of the INSERT queries I posted and use that as your record source. To be perfectly honest, I can't really think of any reason why you would need to break normalisation on this one.
Good Luck!
mlbfan
01-20-2002, 01:15 PM
Wow! I had assumed incorrectly a couple items. Since I've slowly come to enjoy working in mySQL I believed taking data from 1 table and putting it another would have been MUCH easier.
Its difficult to believe they don't have something as simple as:
Update temp2.FinalYear from temp.FinalYear group by PersonID;
I was guessing I could use that query just one time to get the desired result. Then once per year running something very similar:
Update temp2.FinalYear from temp.FinalYear group by PersonID where FinalYear="2002";
Does anybody know if this is possible?
-- Sean
Here's untested code that I think will work, although given the size of the database you're working with, you should still try to at least break it down into smaller steps (assuming you're not working on a dedicated server).
$year_array = array ();
$sql = "SELECT ID, MAX(Year) AS FinalYear FROM Batting WHERE GROUP BY ID ORDER BY ID ASC";
$result = mysql_query($sql);
if (!$result) {
echo("<p>Error Performing Query: ". mysql_error() ."</p>");
exit();
} else {
while($row = mysql_fetch_array($result)) {
$year_array[$row["ID"]] = $row["Year"];
}
}
for ($i=0; $i < count($year_array); $i++) {
if (isset($year_array[$i]) && ($year_array[$i] != "") {
$sql = "UPDATE Player SET FinalYear = '". $year_array[$i] ."' WHERE ID = '$i'";
$result = mysql_query($sql);
if (!$result) {
echo "<p>Error Performing Query: ". mysql_error() ."</p>";
exit();
} else {
echo "Player #$i updated.<br>";
}
}
}
UPDATE requires a WHERE clause to match the specific row being updated, and I don't see any way to make that function the way you want it to (i.e. joining tables).
Dan
vBulletin® v3.6.8, Copyright ©2000-2013, Jelsoft Enterprises Ltd.