FutureQuest, Inc.
Old 02-01-2002, 10:22 AM   Postid: 61272
DB
Registered User

Join Date: Feb 1999
Location: Miami, FL
Posts: 314
E-mail Filter Manager - Beta Testers Needed

The "Sheila Effect" has struck again.

After reading the threads about e-mail filters and Python, I decided to take a hack at it myself. Mind you, I'm a Python newbie; a few days ago I hadn't tried anything more complex than "Hello World." But after playing around with some of Sheila's scripts, I was able to put together a rudimentary Python filter. Since I'm more comfortable writing in PHP, I next created a web browser interface to set up the Python filter, thus making it easy for someone with no programming experience to begin using e-mail filters.

The program, dubbed "E-mail Filter Manager," can do some basic things at this stage...

Allowed E-mail List (aka a "white list"). Any e-mail address entered in this list will pass through the remaining filters, no questions asked.

Banned E-mail Addresses. Standard feature.

Banned Domains. Unless the address appears on the Allowed list, anything from a banned domain gets filtered.

Banned File Extensions. Based on Sheila's code, checks for banned file extensions in attachments and multi-parts.

Banned Words. Searches message body for banned words and phrases... does not bite on word fragments (i.e., "sextant" is not a hit for "sex"... but "SeX!!!" would be filtered).

Banned Subjects. Same as banned words, only for subject lines.
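
For anyone curious how the whole-word rule for Banned Words and Banned Subjects might work, here is a minimal sketch in present-day Python 3. It is not the code EFM generates: the names normalize and find_banned are made up for illustration, and the approach (turn punctuation into spaces, then look for the space-padded word) is only one assumed way to get the "sextant" / "SeX!!!" behavior described above.

```python
import string

# Map every ASCII punctuation character to a space so "SeX!!!" becomes "sex".
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace, and pad with spaces."""
    words = text.lower().translate(_PUNCT_TO_SPACE).split()
    return " " + " ".join(words) + " "


def find_banned(body, banned):
    """Return the first banned word or phrase found as whole words, else None."""
    padded_body = normalize(body)
    for entry in banned:
        needle = normalize(entry)
        # Skip entries that are empty once punctuation is removed.
        if needle.strip() and needle in padded_body:
            return entry
    return None
```

Because both the text and the banned entry are padded with spaces, "sex" only matches when it stands alone as a word, and multi-word phrases work the same way.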

A banned (i.e., filtered) e-mail can be handled three ways: normal bounce, blackhole, or bounce with message, the latter requiring a slightly different CNC set-up (all explained in the online help).

Throw in a few custom set-up options and that's where the proggie stands now. It doesn't do any logging or checks of external black lists; I'm a bit concerned that those kinds of features could be too resource intensive for accounts which receive a lot of mail. I also haven't done any advanced spam detection using the headers, but that sounds like an interesting next step.

I've tried to keep the Python script created by the PHP program as streamlined as possible: all the filter data is written into the script as arrays, and a new script is produced each time the PHP program is used for an update. The Python script has had most comments removed, but a commented version is available to anyone who would like to check/improve my newbie code.
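
To illustrate the "write the filter data into the script as arrays" idea, here is a rough Python 3 sketch of a generator. The real generator is the PHP program, so everything here (TEMPLATE, render_filter_script, the two list names) is hypothetical, and the embedded filter logic is a stand-in rather than EFM's actual checks.

```python
import pprint

# Source of the generated filter; the two placeholders are filled with list literals.
TEMPLATE = '''\
import os, sys

allowed = {allowed}
banned_domains = {banned_domains}

sender = os.environ.get("SENDER", "").lower()
domain = sender.rpartition("@")[2]
if sender not in allowed and domain in banned_domains:
    sys.exit(100)
'''


def render_filter_script(allowed, banned_domains):
    """Return Python source with the filter lists embedded as literals."""
    return TEMPLATE.format(
        allowed=pprint.pformat(sorted(a.lower() for a in allowed)),
        banned_domains=pprint.pformat(sorted(d.lower() for d in banned_domains)),
    )
```

Regenerating the whole script on every update means the filter never has to open a data file at delivery time, which seems to be the streamlining the post is after.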

I think I have the basics pretty well covered at this point. One of my main design goals was KISS, so I left out the kitchen sink and opted for user friendliness. No doubt someone could take up where I left off and create an advanced version deserving of a Ph.D. thesis, but after three days of hacking (and writing docs... I hate writing docs!), I'm ready for a little RnR. Of course, if some bugs are found I'll get back to work; I'm really hoping some people may find this program useful.

As mentioned in the subject, the program is in beta and I'd like to see it get a good shakedown before goin' public (meaning I don't want to be responsible for someone losing all their important e-mail). So... *turns on grovel mode* please check my work and let me know what you think. I'm open to suggestions but my limited experience with Python makes even the simple stuff a challenge (though now that I found IDLE I may play around some more on the PC and give FQ's servers a rest).

So, for anyone interested, the download page can be found here:
http://diamond-back.com/software/efm/

Side note: Today marks the end of exactly three years of hosting with FutureQuest, and in that time I have learned sooooooo much from reading these forums... from TeRRa, Justin, Sheila... well, it's a long list. So I'd like to thank all of you and offer this program as my contribution to the Stone Soup that makes this place so tasty.
__________________
--Tom aka DiamondBack
Old 02-01-2002, 10:42 AM   Postid: 61274
sheila
Site Owner
 
Join Date: Aug 1999
Location: Metro Los Angeles Area
Posts: 7,398
Tom:

This sounds really cool. I'd certainly like to take a look at your stuff. Maybe I'll have some time to tinker around with it this weekend.

Is there a place I can download it? Or do you want to email it?
__________________
sheila
http://www.thinkspot.net/sheilaruns/
Old 02-01-2002, 11:09 AM   Postid: 61275
DB
I was hoping you'd say that.

You can download the program from the page linked to in the first post (the actual download link is near the bottom of that page). The set-up instructions are in the "read_me_install.txt" file; the rest can be found in the online help (hope I wasn't too gushy in the Acknowledgments section *g*).
Old 02-01-2002, 10:35 PM   Postid: 61290
manish
Site Owner
 
Join Date: Oct 1999
Location: NJ
Posts: 316
Thanks a lot for the script. I had some suggestions:
1. chmod 777 directory == very very bad. I don't know too much about security but I know that it could be very very dangerous

2. Checking for addresses could be more robust if you use string.lower() instead of just looking for the address.

3. Instead of using
Code:
if fromdomain in banneddomains:
        sys.exit(exitcode)
a better way would be:
Code:
while i < numbanneddomains:
        if (string.find(fromdomain, banneddomains[i]) > -1)
                sys.exit(exitcode)
This will prevent subdomains as well....

4. Filename extensions need tolower as well....

5. A good feature to add would be to search for file names as well (not just extensions - I'm guessing you are sick of the .vbs files as well)
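
Suggestions 4 and 5 could look roughly like this in present-day Python 3, using the standard email package. This is a sketch under assumptions, not the attachment code in the script: the two banned sets and the function name banned_attachment are invented for the example.

```python
import email
import os

BANNED_EXTENSIONS = {".vbs", ".pif", ".scr"}  # hypothetical examples
BANNED_FILENAMES = {"readme.exe"}             # hypothetical example


def banned_attachment(raw_message):
    """Return the first banned attachment filename, or None if all are fine."""
    msg = email.message_from_string(raw_message)
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue
        # Lowercase once so ".VBS" and ".vbs" are treated the same.
        name = filename.strip().lower()
        extension = os.path.splitext(name)[1]
        if name in BANNED_FILENAMES or extension in BANNED_EXTENSIONS:
            return filename
    return None
```

Comparing the lowercased name against both sets covers whole file names and extensions in one pass over the parts.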

I haven't fully tested it but I'm not sure if Outlook files will go through the filename thing well. I'll test it later on and let you know.

But this looks like an awesome tool....

Manish

Last edited by manish : 02-01-2002 at 10:50 PM.
Old 02-01-2002, 11:34 PM   Postid: 61291
manish
There might be a problem with the attachment part. I'm going into an infinite loop somewhere. I'll check the code tomorrow. But, I believe there is a problem in that part of the code....

Have fun,
Manish
Old 02-02-2002, 02:58 AM   Postid: 61296
sheila
Quote:
Originally posted by manish:
Thanks a lot for the script. I had some suggestions:
1. chmod 777 directory == very very bad. I don't know too much about security but I know that it could be very very dangerous
I was just looking at the "readme" install instructions, and I notice that the 777 is on the directory where the index.php file goes. Is it somehow because of PHP safe mode that it requires 777 permissions?
Quote:
3. Instead of using
Code:
if fromdomain in banneddomains:
        sys.exit(exitcode)
a better way would be:
Code:
while i < numbanneddomains:
        if (string.find(fromdomain, banneddomains[i]) > -1)
                sys.exit(exitcode)

This will prevent subdomains as well....
About that code snippet: You aren't incrementing your counter, i, are you?

I'm running a very similar type of thing right now as a global filter on my domain, for certain spammer domains.

Here's the script I'm currently running:

Code:
#!/big/dom/xthinkspot/bin/python

import os, sys

if os.environ.has_key("SENDER"):
    SENDER = os.environ["SENDER"]
else:
    SENDER = ""

badsenders = ['spammy@thinkspot.net',
              'sales@doubleclick.net']

badsubstrings = ['vmadmin.com',
                 'quickinspirations.com',
                 'colonize.com',
                 'ombramarketing.com',
                 'DIRECTQLICK.COM',
                 'example.com',
                 'emsiweb.com',
                 'imustplay.com',
                 'penisshopstore.com']

if SENDER in badsenders:
    print "Invalid User"
    sys.exit(100)
for entry in badsubstrings:
    if SENDER.find(entry) > -1:
        print "Invalid User"
        sys.exit(100)
Notice that on the substring matching for subdomains, I'm not bothering with a loop counting variable, so no indexing on my list and no incrementing is necessary. Just a
Code:
for item in list:
type of loop. Simpler.

The above code bounces the messages because of sys.exit(100) and prints the "Invalid User" as an error message to the Qmail program. The bounced/failure messages contain that error message.
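
For readers on a current interpreter, the same idea can be sketched in Python 3 (the script above is Python 2 era: has_key and the print statement are gone in Python 3). This is an assumed port, not the script actually in use: the shortened lists, the function names, and the choice to lowercase both sides (so an entry like 'DIRECTQLICK.COM' still matches) are additions for illustration.

```python
import os

BAD_SENDERS = ["spammy@thinkspot.net"]
BAD_SUBSTRINGS = ["vmadmin.com", "DIRECTQLICK.COM"]


def sender_is_banned(sender):
    """Case-insensitive version of the exact-match and substring checks."""
    sender = sender.lower()
    if sender in (s.lower() for s in BAD_SENDERS):
        return True
    return any(entry.lower() in sender for entry in BAD_SUBSTRINGS)


def main(environ=os.environ):
    """Return the exit code for qmail: 100 to bounce, 0 to let delivery continue."""
    sender = environ.get("SENDER", "")
    if sender_is_banned(sender):
        # The printed text ends up in the bounce message, as described above.
        print("Invalid User")
        return 100
    return 0
```

In a real filter the script would finish with sys.exit(main()); it is left out here so the functions can be tried from an interactive session without exiting it.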
__________________
sheila
http://www.thinkspot.net/sheilaruns/
Old 02-02-2002, 03:05 AM   Postid: 61297
sheila
Quote:
I'm open to suggestions but my limited experience with Python makes even the simple stuff a challenge (though now that I found IDLE I may play around some more on the PC and give FQ's servers a rest).

You might also want to check out PythonWin from the Win32all Python extensions (by Mark Hammond).

Check out this link:
http://starship.python.net/crew/skippy/
__________________
sheila
http://www.thinkspot.net/sheilaruns/
Old 02-02-2002, 03:20 AM   Postid: 61298
 Arthur
Developer
 
 
Join Date: Nov 2000
Location: The Netherlands
Posts: 2,212
Very nice work Tom!

Sheila, do you know if there will be much performance impact if you'd use a regular expression to match bad words in the subject?
string is also an imported module, so if you replaced all the string operations with re operations, it shouldn't matter much, should it? This:
Code:
if num_bannedsubjects > 0:
    for subject in bannedsubjects:
        p = re.compile(r'\b(%s)\b' % subject)
        if (p.search(subjectline)):
            sys.exit(exitcode)
is much nicer than the kludge used now.

Tom, in banned_subjects, replace "found = 1" with "sys.exit(exitcode)" and remove the "if found"; there's no need to loop through the rest of the subject if a hit was already found.
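
On the performance question, one option (an assumption, nothing here was measured in this thread) is to compile all banned subjects into a single pattern once instead of compiling one pattern per subject. A Python 3 sketch, with hypothetical function names; re.escape is added so that characters like "." or "*" in a banned phrase are matched literally:

```python
import re


def compile_banned(subjects):
    """Build one case-insensitive, whole-word pattern from all banned subjects."""
    alternatives = "|".join(re.escape(s) for s in subjects if s)
    if not alternatives:
        return None
    return re.compile(r"\b(?:%s)\b" % alternatives, re.IGNORECASE)


def subject_is_banned(pattern, subjectline):
    """True if any banned subject occurs in subjectline as a whole word."""
    return pattern is not None and pattern.search(subjectline) is not None
```

One caveat: \b only marks a boundary next to a letter, digit, or underscore, so a banned entry that begins or ends with punctuation (say "$$$") would need different handling.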
Old 02-02-2002, 03:23 AM   Postid: 61299
sheila
Tom Wrote:
Quote:
but after three days of hacking (and writing docs... I hate writing docs!), I'm ready for a little RnR.

Holy cow, Tom! You sure did write docs. I've put up the script and I'm amazed.

I'd actually been working on something along these lines myself, although the web interface I was using was a Python CGI script (not PHP) and I had only gotten the barest of rudimentary stuff done on it so far. Yours is ever so much more polished looking than mine. However, mine was going to save the data for filters into data files (rather than writing it into the script) and was going to allow custom filtering on any header in the message, or the SMTP envelope recipient or SMTP envelope sender. I was also planning some other stuff too.

Anyhow, I'll look your stuff over. Perhaps I could contribute some of my ideas/code for the more advanced stuff? Well, it's a possibility.

(By the way, the filtering on file extension code that you referred to previously was written by manish, not me.)

Will let you know any comments I might have after I play with the script a bit.
__________________
sheila
http://www.thinkspot.net/sheilaruns/
Old 02-02-2002, 03:58 AM   Postid: 61300
DB
Thank you Manish, I greatly appreciate your suggestions. Here's where I stand at the moment...

1. "chmod 777 directory." Yes, very bad but so far lower permissions have resulted in denials when writing a new file. The files written from PHP are listed as "somebody/somegrp" and appear to require full permissions to the new directory. I noticed that once a filter file is written, I could lower the directory permissions to a safer 755 and still update the file, but writing a new file was prohibited at that level. Also, in past experiments I haven't had much success using PHP to change directory permissions (Safe-mode issue?), only file permissions. A work-around would be to set full permissions first, then lower them after the filter file has been created, and remember to raise them again prior to creating a new filter.

Sheila, I had similar results with the directory where the PHP script lives: it wants 777 to create the data file, but can update it under 755. Another work-around would be to start off with user-uploaded data and filter files; then only the files would need full permissions (so PHP could write over them) and the directories could be lower (unless a new filter was to be created, then it's back to raising the directory permissions). At least I think that might work. During beta testing it would probably be best to go with full permissions: one less thing to go wrong.

2. "Checking for addresses could be more robust if you use string.lower()." If I'm reading the code correctly, this is being done now, though since my e-mail client sends all my addresses in lower case, I haven't tested it with upper. *adds test item to To-Do list*

3. Using string.find. That's not a bad idea, I'll try adding a line to increment the index counter and give it a try.

4. "Filename extensions need tolower as well...." I was thinking that as well, but that section of code still holds a few mysteries so I left it alone. I'm currently reading up on the "email and MIME handling package" and attempting to dump some of the data to temp files so I'm not working blind. If I can get a handle on how everything works (or doesn't work), I'll see what I can do. Meanwhile, suggestions or a walk through of that code would be appreciated.

5. "A good feature to add would be to search for file names..." Agreed, just need to finish doing my homework first.
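
For points 2 and 3, a possible shape for the domain check in present-day Python 3, reusing the fromdomain and banneddomains names from the thread. This is a sketch, not the script's code. It swaps in suffix matching for the string.find approach discussed above, on the assumption that a plain substring search would also catch unrelated domains such as "notspammer.example" when "spammer.example" is banned; the for loop also sidesteps the counter that needs incrementing.

```python
def domain_of(address):
    """Return the lowercased part after the last '@' (empty if there is none)."""
    return address.rpartition("@")[2].strip().lower() if "@" in address else ""


def domain_is_banned(fromdomain, banneddomains):
    """True if fromdomain equals a banned domain or is a subdomain of one."""
    fromdomain = fromdomain.lower().rstrip(".")
    for banned in banneddomains:
        banned = banned.lower().strip(".")
        # Match the domain itself, or anything ending in ".<banned domain>".
        if banned and (fromdomain == banned or fromdomain.endswith("." + banned)):
            return True
    return False
```

Lowercasing both sides means it no longer matters how the sending client capitalizes the address.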

Sheila, from what I gather reading your code snippet, it looks like a custom bounce message can be sent using the standard (i.e., exit 100) bounce... better still, it can be customized for each "banned" list. I was wondering if something like that could be done... I may do away with the "bounce /msg" setting and allow an optional bounce message to be associated with each filter group (much cooler... "Your e-mail has been banned"... "Your domain has been banned"... "Stop talking dirty to me! *g*").
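
A per-group bounce message could be wired up along these lines. This is a Python 3 sketch with made-up names (make_filters, first_bounce_message, bounce_if_banned); it assumes, as in Sheila's script, that whatever is printed before exiting with code 100 shows up in the bounce.

```python
import sys


def make_filters(banned_addresses, banned_domains):
    """Pair each filter group's test with its own bounce text.

    Both collections are assumed to be stored in lowercase.
    """
    return [
        ("Your e-mail address has been banned.",
         lambda sender: sender in banned_addresses),
        ("Your domain has been banned.",
         lambda sender: sender.rpartition("@")[2] in banned_domains),
    ]


def first_bounce_message(sender, filters):
    """Return the message of the first filter group that matches, else None."""
    sender = sender.lower()
    for message, matches in filters:
        if matches(sender):
            return message
    return None


def bounce_if_banned(sender, filters):
    """Print the group's message and exit 100 so the mail is bounced."""
    message = first_bounce_message(sender, filters)
    if message is not None:
        print(message)
        sys.exit(100)
```

Keeping the message next to its test makes it easy to add another group later (banned words, banned subjects) without touching the bounce logic.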

And thanks for the link... I'll check it out.
__________________
--Tom aka DiamondBack
All times are GMT -4.