
Preserved Topic: Spam filtering techniques (Page 1 of 1)

 
Jestah
Maniac (V) Mad Scientist

From: Long Island, NY
Insane since: Jun 2000

posted 10-14-2002 19:32

Spam filtering techniques

Six approaches to eliminating unwanted e-mail

David Mertz, Ph.D. (mertz@gnosis.cx)
Analyzer, Gnosis Software, Inc.
September 2002

The problem of unsolicited e-mail has been increasing for years, but help has arrived. In this article, David discusses and compares several broad approaches to the automatic elimination of unwanted e-mail while introducing and testing some popular tools that follow these approaches.
Unethical e-mail senders bear little or no cost for mass distribution of messages, yet normal e-mail users are forced to spend time and effort purging fraudulent and otherwise unwanted mail from their mailboxes. In this article, I describe ways that computer code can help eliminate unsolicited commercial e-mail, viruses, trojans, and worms, as well as frauds perpetrated electronically and other undesired and troublesome e-mail. In some sense, the final and best solution for eliminating spam will probably take place on a legal level. In the meantime, however, you can do some things from a code perspective that can serve as an interim solution to the problem, until (if ever) the laws begin to evolve at the same rate as public frustration.

Considering matters technically -- but also with common sense -- what is generally called "spam" is somewhat broader than the category "unsolicited commercial e-mail"; spam encompasses all the e-mail that we do not want and that is only very loosely directed at us. Such messages are not always commercial per se, and some push the limits of what it means to be solicited. For example, we do not want to get viruses (even from our unwary friends); nor do we generally want chain letters, even if they don't ask for money; nor proselytizing messages from strangers; nor outright attempts to defraud us. In any case, it is usually unambiguous whether a message is spam, and many, many people get the same such e-mails.

The problem with spam is that it tends to swamp desirable e-mail. In my own experience, a few years ago I occasionally received an inappropriate message, perhaps one or two each day. Every day of this month, in contrast, I received many times more spams than I did legitimate correspondences. On average, I probably get 10 spams for every appropriate e-mail. In some ways I am unusual -- as a public writer, I maintain a widely published e-mail address; moreover, I both welcome and receive frequent correspondence from strangers related to my published writing and to my software libraries. Unfortunately, a letter from a stranger -- with who-knows-which e-mail application, OS, native natural language, and so on -- is not immediately obvious in its purpose; and spammers try to slip their messages underneath such ambiguities. My seconds are valuable to me, especially when they are claimed many times during every hour of a day.

Hiding contact information
For some e-mail users, a reasonable, sufficient, and very simple approach to avoiding spam is simply to guard e-mail addresses closely. For these people, an e-mail address is something to be revealed only to selected, trusted parties. As extra precautions, an e-mail address can be chosen to avoid easily guessed names and dictionary words, and addresses can be disguised when posting to public areas. We have all seen e-mail addresses cutely encoded in forms like "<mertzHIDDEN@NOSPAM.gnosis.cx>" or "echo zregm@tabfvf.pk
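To illustrate the two disguises quoted just above, here is a minimal Python sketch. It assumes the truncated string "zregm@tabfvf.pk" is a ROT13 encoding (it decodes to the address given in the article byline) and that the "HIDDEN" and "NOSPAM." markers are meant to be deleted by a human reader. The function names are hypothetical and are not taken from the article.

```python
import codecs


def rot13_address(address: str) -> str:
    """Apply ROT13 to an address; applying it twice restores the original.

    Only letters are rotated, so '@' and '.' pass through unchanged.
    """
    return codecs.encode(address, "rot13")


def strip_markers(address: str, markers=("HIDDEN", "NOSPAM.")) -> str:
    """Remove decoy markers (assumed here) from a munged address."""
    for marker in markers:
        address = address.replace(marker, "")
    return address


if __name__ == "__main__":
    print(rot13_address("zregm@tabfvf.pk"))
    print(strip_markers("mertzHIDDEN@NOSPAM.gnosis.cx"))
```

Both disguises are trivial for a person (or a determined harvester) to undo, which is why they only count as an extra precaution.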

mas
Paranoid (IV) Inmate

From: the space between us
Insane since: Sep 2002

posted 10-14-2002 19:45



um...i didn't read it. and i won't

-THE SPACE-

CPrompt
Maniac (V) Inmate

From: there...no..there.....
Insane since: May 2001

posted 10-14-2002 19:56

OK, mas. Don't read it.

I am going to print this sucker out and go over it. I get a ton of spam and it is very annoying.

Later,
C:\


~Binary is best~

mas
Paranoid (IV) Inmate

From: the space between us
Insane since: Sep 2002

posted 10-14-2002 20:02
quote:
Summary
Given the testing methodology described earlier, let's look at the concrete testing results. While I do not present any quantitative data on speed, the chart is arranged in order of speed, from fastest to slowest. Trigrams are fast, Pyzor (network lookup) is slow. In evaluating techniques, as I stated, I consider false positives very bad, and false negatives only slightly bad. The quantities in each cell represent the number of correctly identified messages vs. incorrectly identified messages for each technique tested against each body of e-mail, good and spam.
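The chart that this summary refers to is not reproduced in the thread. As a rough sketch of how one such cell pair could be scored, the Python below tallies correct vs. incorrect counts for a good corpus and a spam corpus. The class name, the 10-to-1 penalty weights, and the example counts are all made up for illustration; they are not the article's figures.

```python
from dataclasses import dataclass


@dataclass
class FilterResult:
    good_correct: int    # legitimate messages passed through
    good_incorrect: int  # legitimate messages flagged as spam (false positives)
    spam_correct: int    # spam messages caught
    spam_incorrect: int  # spam messages missed (false negatives)

    def false_positive_rate(self) -> float:
        total = self.good_correct + self.good_incorrect
        return self.good_incorrect / total if total else 0.0

    def false_negative_rate(self) -> float:
        total = self.spam_correct + self.spam_incorrect
        return self.spam_incorrect / total if total else 0.0

    def weighted_cost(self, fp_weight: float = 10.0, fn_weight: float = 1.0) -> float:
        # Assumed weights: a false positive is "very bad", a false
        # negative only "slightly bad".
        return fp_weight * self.good_incorrect + fn_weight * self.spam_incorrect


if __name__ == "__main__":
    # Hypothetical counts, not results from the article.
    example = FilterResult(good_correct=990, good_incorrect=10,
                           spam_correct=800, spam_incorrect=200)
    print(example.false_positive_rate(), example.false_negative_rate())
    print(example.weighted_cost())
```

A lower weighted cost favors filters that rarely misfile legitimate mail, even if they let more spam through.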



good enough, though

sorry, jestah, i didn't want to be unfriendly

Emperor
Maniac (V) Mad Scientist with Finglongers

From: Cell 53, East Wing
Insane since: Jul 2001

posted 10-14-2002 20:11

Jestah: Thanks for that - not much more than has been discussed here before but a good article that brings everything together nonetheless.

Also see the FAQ:

How can I eliminate spam?

___________________
Emps

FAQs: Emperor

Veneficuz
Paranoid (IV) Inmate

From: A graveyard of dreams
Insane since: Mar 2001

posted 10-14-2002 21:57

Too tired to read all of it right now, but the little I read sounds interesting. I'll definitely read it tomorrow.

_________________________
Anyone who has lost track of time when using a computer knows the propensity to dream, the urge to make dreams come true and the tendency to miss lunch.
- copied from the wall of cell 408 -

WebShaman
Maniac (V) Mad Scientist

From: Happy Hunting Grounds...
Insane since: Mar 2001

posted 10-15-2002 07:23

Thanks for posting Jestah...I get spam-a-lot!

silence
Maniac (V) Inmate

From: soon to be "the land down under"
Insane since: Jan 2001

posted 10-16-2002 04:02

Spam's easy since I admin the Exchange server here. I just set up an alias for my real email. That way, Outlook easily filters out email sent to that address and puts it in arranged folders so I can go through and weed out the spam.

allgood2
Bipolar (III) Inmate

From: Sweden
Insane since: Apr 2000

posted 10-16-2002 12:29

I have tried McAfee SpamKiller and it has helped a lot in weeding out the spam mail I used to get in my box. As for me...spam never reaches my inbox with this program because it has a filter of known spam mail that weeds it out instantly. Have a look for yourself here and see what you think:
http://www.mcafee.com/myapps/msk/

Allgood

Lord_Fukutoku
Paranoid (IV) Inmate

From: West Texas
Insane since: Jul 2002

posted 10-16-2002 16:55

Wow, and I thought WS and Emps had some long-winded posts...

There was even some good stuff in there. Thanks Jestah.

________________________________________________________________
-- Jack of all trades, master of that which has my attention at
the moment.

Unoriginal Cell 693
