View Full Version : Efficiency for Large Databases
outline
02-11-2000, 08:42 PM
OK...[nbsp][nbsp]A while back I asked for the best approach to building a database which uses several hundred thousands records (Oracle vs MySQL).[nbsp][nbsp]From everyone's response I chose MySQL.
Overall I have been very happy with it, however, we just went ahead and loaded junk content.[nbsp][nbsp]Their are about 12 different tables joined through a master keyword index table.[nbsp][nbsp]We loaded about 500,000 keywords into the database and since then I have noticed a significant speed problem.
Everything is being implemented through PHP and the efficiency is pretty decent - roughly O(N^2/3).[nbsp][nbsp]We are in a shared environment still.. but my question is this:
What can I do to greatly increase the speed?
If I have the client make the leap from a $200 / month server to a $800 / month dedicated server - will there be a significant increase in overall speed when the site is being accessed by many people?
If I compile my PHP scripts and run ZEND, will that increase the speed or is the bottleneck most likely MySQL?
Any other suggestions would be greatly appreciated...[nbsp][nbsp]I'm still trying to figure out how www.northernlight.com (http://www.northernlight.com) is so stinking fast!![nbsp][nbsp]I find that amazing.
ANY HELP APPRECIATED!!!!
------------------
Stephen
02-12-2000, 04:28 AM
i'm no guru on improving database performance, but i do know that when you're joining tables you need to be very careful about how you do the joins (e.g. the order of the joins) to get the best efficiency. haphazard joining probably won't reveal any problems until you try to scale things up. so be sure to consult an SQL reference manual for this particular aspect of the problem. you might even discover that with some tweaking you can make some decent gains.
of course, i'm just thinking out loud...
heath
02-12-2000, 08:55 AM
There is an entire section in the mySQL docs on tuning the database.
mySQL is the fastest database in the world for doing SELECTs, so I don't think that going with a different platform is going to help a whole lot in terms of speed.
Other things you may or may not have done:
- Make sure your queries are optimized - a good programming axiom is to never, ever use the "*" - either in regex or in SQL statements.[nbsp][nbsp]
- Make sure your database is properly indexed - basically any column that follows a "WHERE" in your queries ought to be indexed.
- Who is your provider?[nbsp][nbsp]They may not have mySQL configured as well as it could be.[nbsp][nbsp]I can tell you that my site pounds the daylights out of mySQL - doing about 15 queries per page and some days getting 8-10k pageviews.[nbsp][nbsp]Do the math - that's alot of requests to the database and I haven't noticed any speed problems.[nbsp][nbsp] About 10 of these queries per page are to thanks to phpADS - which does a select and an Insert for each adview.[nbsp][nbsp]The adviews table is regularly around 2.5-3 million rows.[nbsp][nbsp]Maybe I'm in denial - but my site **seems** quick to me...
- One thing that Northern Light (I would assume they and most sites of their ilk) is/are doing that anyone could do that uses a RDBMS to generate pages is to 'pseudo-cache' the most frequently requested pages or search requests.[nbsp][nbsp] Just have a perl or php script grind over the database every 15 mintues and write the code to static HTML - the result are pages that normally whack the database to serve the HTML are now static.
- Search the mySQL mailing list (I think at lists.mysql.com) and the php mailing list (comp-progressive.com)
- There is a new book on mySQL called "mySQL" (not the O'Reilly book) that is supposed to be pretty good.[nbsp][nbsp]I haven't read it yet, but it might have some helpful hints.
- Lastly, make sure you've got about 2 hours or so, and Go to slashdot.org and do a search for "mysql" or "php".[nbsp][nbsp]A ton of good info there.
Please keep us posted on what you find - I am interested to hear of your performance woes and experiences.
Heath
[This message has been edited by heath (edited 02-12-00@07:59 am)]
Justin
02-12-2000, 11:50 AM
Heath makes some excellent points - I want to stress the indexes, too, as you *will* notice a significant difference when your tables are properly indexed.
With joins, I have found that the best order is smallest to largest - for example, if we have three tables:
Vendors
Products
Orders
Where each Vendor can have multiple Products, and each Product can have multiple Orders, within a query I would call them in the above order:
</font><font face="Courier" size="3">
mysql> select V.Name, P.Type, O.Price
[nbsp][nbsp][nbsp][nbsp]-> from Vendors V, Products P, Orders O where
[nbsp][nbsp][nbsp][nbsp]-> V.ID = P.Vendor and
[nbsp][nbsp][nbsp][nbsp]-> P.ID = O.Product and
[nbsp][nbsp][nbsp][nbsp]-> O.Ticket = '000123';
</font><font face="Verdana, Arial" size="2">
Basically keeping the table names in that order, and placing the final conditionals at the end. I do not know if this is the best way, but in my own experience, this method seems to be the fastest (especially with 5+ table joins).
Hope this helps.
------------------
Justin Nelson
FutureQuest (http://www.FutureQuest.net/index.php) Support
heath
02-12-2000, 10:50 PM
Justin -
I never thought about joining tables smallest to biggest, but it makes sense.
After looking over the O'Reilly book last night, it suggests many other things that I didn't cover on how to make mySQL faster if its getting bogged down.
I'd suggest to whoever was having this problem to take a look there as well - I'd do a gazillion things before I went to a different RDBMS platform - that seems to be a last resort - especially considering you are not going to find many VIRTUAL hosts offering informix or Oracle - and if I remember, the original poster was trying to keep costs down - switching databases would do the exact opposite.
Heath
vBulletin® v3.6.8, Copyright ©2000-2009, Jelsoft Enterprises Ltd.