View Full Version : Site search facility


Alexander Bogomolny
10-16-2001, 10:01 AM
Hello:

What does it take to have a site search facility?

I could use CGI and grep, but mine is a big site. I heard of site indexing, but never actually did anything of the sort.

Thank you,
Alexander Bogomolny

hobbes
10-16-2001, 12:05 PM
If you have less than 500 pages, try Atomz's free index/search:
http://www.atomz.com/products/atomz_search/03_express_search.htm

Alexander Bogomolny
10-16-2001, 12:38 PM
Thank you.

I am looking into it. Pity my site is about 3,000 pages.

sheila
10-16-2001, 01:00 PM
Well, here is a search engine that is worth a look:
http://www.xav.com/scripts/search/

You can try out the installation on my site:
http://www.thinkspot.net/cgi-bin/search/search.pl

I don't get as much traffic as you, nor do I have anywhere near as many pages. However, if you purchase the full version of the script, you can define different zones, and I think you can reindex just parts of your site, or add a new page URL without having to re-index the full site. It has a lot of features, but they do caution you that it may be a bit heavy on the server. If your site gets LOTS of traffic, it may not be server friendly.

Alexander Bogomolny
10-16-2001, 01:41 PM
Sheila,

Many thanks. I can afford a $40 license. But where on their site can I find the warning about it being server unfriendly?

Alexander Bogomolny

sheila
10-16-2001, 02:24 PM
Originally posted by Alexander Bogomolny:
But where on their site can I find the warning about it being server unfriendly?
On this page:
http://www.xav.com/scripts/search/features.html

Scroll down to the section entitled:
Bad Features (Known Limitations and Problems)

and you will find this paragraph:


Memory and CPU Needs - this search engine was designed to provide a rich feature set, which requires more memory and processor power. There may be leaner search engines available - if this becomes an issue for you, look for a leaner engine on www.cgi-resources.com.


If you need to look for other search scripts, I would recommend checking out hotscripts.com as well as www.cgi-resources.com. They have lots of scripts and are very up to date.

Good Luck!

Alexander Bogomolny
10-16-2001, 03:05 PM
You must be a busy woman, Sheila. I very much appreciate your help.

I have installed the script and am learning it. So far it looks very thoughtfully and professionally designed.

Many thanks,
Alexander Bogomolny

janderk
10-17-2001, 08:16 AM
Originally posted by Alexander Bogomolny:
What does it take to have a site search facility?
Take a look at PerlFect (http://perlfect.com/freescripts/search/). It's very fast and free.

Jan Derk

koala
10-17-2001, 08:12 PM
Namo have a product called Deep Search (http://www.namo.com/products/deepsearch/) which looks good, though I haven't tried it.

It may be worth a look :)

Simon

Alexander Bogomolny
10-17-2001, 10:10 PM
Many thanks to all who offered help. As Sheila was the first and I was serious about getting the search facility, I have installed and tested the one she pointed to. I think that for $40 I got a real bargain.

I have about 2500 pages at my site. The indexing took less than 1 minute, and the search is almost instantaneous. It's far more than I expected.

All the best,
Alexander Bogomolny

Z28SS
10-17-2001, 11:23 PM
On a dedicated server, I'd use either Alkaline or ht://Dig, but on a virtual server, you can go with KSearch:

KSearch (http://www.kscripts.com/scripts.shtml)

It's FREE and probably the highest rated perl search script on all the major cgi web sites. It's breeze-city to set up as well.

Jeff
10-18-2001, 12:10 AM
Requirements:
Free space on your server of at least the total size of all files you want to index (for all options)
(For Unix/Linux users) If you want to index PDF files, you must install Xpdf from http://www.foolabs.com/xpdf/

I notice a lot of search scripts seem to have a much higher requirement for free space than I had expected. Why do they require so much?

sheila
10-18-2001, 12:19 AM
Originally posted by Jeff:

I notice a lot of search scripts seem to have a much higher requirement for free space than I had expected. Why do they require so much?
I don't know. But the last time I brought up this topic (there must be a thread here on AOTA about 3 - 4 months old, with posts in it where I'm asking about search engine script recommendations), Paul or Rich or somebody saw me ask about KSearch and strongly suggested (based on the required disk space) that I NOT go there. So, I didn't.

Don't know about PerlFect Search. I saw that one, too. I really think that the FDE search engine is quite nice. Haven't tried out any others. It doesn't require that much extra space.

Justin
10-18-2001, 12:47 AM
Most search scripts tend to index the entire page, which effectively defeats the purpose of indexing in the first place...

I used to offer a free search script, but unfortunately I don't know where the zip file might be... but it took less than ten percent of the total file space (we had indexed all of Aota's UBB HTML files as a stress test). I indexed only text (stripped the HTML), and stripped out common words from the index.

I don't know of any that do this now, and I have no idea where I might find a copy of that script... so this probably doesn't help much... I will see if I can locate it if anyone's interested.
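
For readers who want to see the idea in code: the following is a minimal Python sketch of the approach described above (index only the text, strip the HTML, drop common words). It is not Justin's script, which was not posted. The names `STOP_WORDS`, `TextExtractor`, `page_words`, `build_index` and `search` are all made up for illustration, and the stop-word list is a tiny placeholder.

```python
from html.parser import HTMLParser
import re

# Hypothetical stop-word list; a real script would ship a longer, configurable one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


class TextExtractor(HTMLParser):
    """Collects only the text nodes of a page, dropping all tags.

    Note: script and style bodies also arrive as text here; a fuller
    version would skip them.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_words(html):
    """Return the set of indexable words on a page (lowercased, no stop words)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = {}
    for url, html in pages.items():
        for word in page_words(html):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the URLs that contain every non-stop word of the query."""
    terms = [w for w in re.findall(r"[a-z0-9]+", query.lower())
             if w not in STOP_WORDS]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Because each word is stored once per page and the markup and common words are never stored at all, an index like this tends to be much smaller than the pages it covers. How much smaller depends on the site.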

sheila
10-18-2001, 01:36 AM
Originally posted by Justin:
Most search scripts tend to index the entire page, which effectively defeats the purpose of indexing in the first place...

I used to offer a free search script, but unfortunately I don't know where the zip file might be... but it took less than ten percent of the total file space (we had indexed all of Aota's UBB HTML files as a stress test). I indexed only text (stripped the HTML), and stripped out common words from the index.

This appears to be essentially how the FDE search engine works. I have about 19 Meg of files in my www subdirectory. About 1 - 2 Meg is graphics. The entire contents of the search folder in my cgi bin is under 2 Meg.

I looked at the index file. It appears that for each page, they save the URL, the title, and keywords from the content, all HTML stripped. They also have a user-configurable list of common words that are skipped in the index file.
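
A rough Python sketch of a one-line-per-page index of the kind described above (URL, title, keywords, HTML stripped, common words skipped). This is a guess at the general shape only. The tab-separated layout, the `COMMON_WORDS` list and the function names are assumptions, not the actual FDE file format.

```python
import re

# Hypothetical, user-editable list of words to leave out of the index.
COMMON_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def make_record(url, html):
    """Build one tab-separated index line: URL, title, keywords."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    # Fall back to the URL when the page has no <title>.
    title = " ".join(match.group(1).split()) if match else url
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
    words = re.findall(r"[a-z0-9]+", text.lower())
    # dict.fromkeys removes repeats while keeping first-seen order.
    keywords = list(dict.fromkeys(w for w in words if w not in COMMON_WORDS))
    return "\t".join([url, title, " ".join(keywords)])


def search_records(records, term):
    """Return (url, title) for every record whose keywords include the term."""
    term = term.lower()
    hits = []
    for line in records:
        url, title, keywords = line.split("\t")
        if term in keywords.split():
            hits.append((url, title))
    return hits
```

A search here only has to read the compact index lines, never the original pages, which fits with the index folder being a small fraction of the site's size.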

janderk
10-18-2001, 03:47 AM
Originally posted by Jeff:
I notice a lot of search scripts seem to have a much higher requirement for free space than I had expected. Why do they require so much?
Several scripts allow you to show the context of matches found, just like Google does. While this is a nice feature it requires all pages to be completely cached. Luckily perlfect switches this feature off by default, thus greatly reducing the index file size.

I would recommend against using ksearch. It is a perlfect spinoff with several nice extensions. However, these additional features make it dog slow for larger web sites. Perlfect searches are almost instantaneous, even for sites with thousands of pages.

Jan Derk
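
To illustrate why showing match context costs disk space, here is a small Python sketch. To print the words around a hit, the script needs the full page text on hand, so it has to keep a cached copy of every page next to the index. The functions `snippet` and `search_with_context` and the `cache` dictionary are hypothetical and are not taken from PerlFect or KSearch.

```python
def snippet(page_text, term, width=30):
    """Return the text around the first match of term, or None if absent.

    Assumes plain ASCII text, so lowercasing does not change string length.
    """
    pos = page_text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - width)
    end = min(len(page_text), pos + len(term) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(page_text) else ""
    return prefix + page_text[start:end] + suffix


def search_with_context(cache, term):
    """cache maps URL -> full stripped page text stored alongside the index."""
    results = {}
    for url, text in cache.items():
        excerpt = snippet(text, term)
        if excerpt is not None:
            results[url] = excerpt
    return results
```

With the context feature switched off, the `cache` of full page texts is not needed at all, and only the much smaller word index has to be stored.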

Jeff
10-18-2001, 07:46 AM
Just tried the FDE search - WOW!

Very very nice interface and options, even in the free version!

Z28SS
10-18-2001, 06:38 PM
I guess I'll have to give the Fluid Dynamics script a look-see. I've used their Apache Guardian script since version minus 1 or something, and it's very handy.

If the Fluid Dynamics search is that good, then KSearch is going to get put out to pasture. <moo />

Thanks for the tip!