20031018

OK, here is a refinement of the previous figure. It shows "real mail" as #s and "spam" as *s. Obviously, I need a definition of spam for this to work. As a rough approximation, I'm counting a message as spam if it is the only message I have ever received from that address. That will misclassify some real mail from one-off correspondents, and it will miss spam where the spammer sends from the same address more than once. Still, a quick scan of which addresses this hits shows that it is pretty good, suggesting that whitelisting might be the way ahead.
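For the curious, the heuristic can be sketched in a few lines of Python. This is a minimal illustration and not the script behind the figure: the function name `classify_by_sender` and the sample headers are made up, and it assumes that comparing lower-cased bare addresses is good enough.

```python
from collections import Counter
from email.utils import parseaddr


def classify_by_sender(from_headers):
    """Label each message 'spam' if its sender address appears only once, else 'real'."""
    # Reduce each From header to a bare, lower-cased address
    # (assumption: case differences do not distinguish senders).
    addresses = [parseaddr(header)[1].lower() for header in from_headers]
    counts = Counter(addresses)
    return ["spam" if counts[addr] == 1 else "real" for addr in addresses]


# Hypothetical sample data, for illustration only.
headers = [
    "Alice <alice@example.org>",
    "alice@example.org",
    "Cheap Pills <x9@example.net>",
    "Bob <bob@example.com>",
    "BOB@example.com",
]
labels = classify_by_sender(headers)
```

The set of addresses with a count above one is effectively a whitelist built after the fact, which is why the result hints that whitelisting could work.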

1300                                                       

1250 *
1200 *
1150 *
1100 * # *
1050 * # *
1000 * # *
950 * * # *
900 * **# *
850 # **#* *
800 # **#* * *
750 # #*## ** **
700 # #### **** ***
650 # #### **** ***
600 * #* ####* ***** ***
550 * #* * ####* ***** **#
500 ** ** #* * #####* *#**#***#
450 *#* *#* #** *# *#####** *##*#*###
400 *## *#* ##***# ***######***####*###
350 ###*###*##***#* ****######**#####*###
300 ###*###*##**##* *#**######**#########
250 ##########*#### ##*########*#########
200* ## ############### #####################
150####** *############### #####################
100###### ################ #####################
50###### ################ #####################
0###### ################ #####################
MAMJJASONDJFMAMJJASONDJFMAMJJASONDJFMAMJJASONDJFMAMJJAS
1999 2000 2001 2002 2003
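
A chart like this can be produced roughly as follows. This is a sketch under assumptions, not the code that drew the figure: `render_chart` is a hypothetical name, the monthly counts are invented, and I'm assuming each count is rounded to the nearest 50 with real mail stacked below spam.

```python
def render_chart(months, step=50):
    """Render (real, spam) monthly counts as stacked text columns, top row first."""
    # Round each count to the nearest whole number of cells (halves round up).
    cells = [
        ((real + step // 2) // step, (spam + step // 2) // step)
        for real, spam in months
    ]
    height = max((r + s for r, s in cells), default=0)
    rows = []
    for level in range(height - 1, -1, -1):
        line = ""
        for r, s in cells:
            if level < r:
                line += "#"      # real mail fills the bottom of the column
            elif level < r + s:
                line += "*"      # spam is stacked on top of real mail
            else:
                line += " "
        # Each row is labelled with the lower bound of its cell.
        rows.append(f"{level * step:>5} {line}")
    return rows


# Hypothetical counts for three months: (real, spam).
rows = render_chart([(120, 20), (100, 60), (90, 130)])
print("\n".join(rows))
```

Note that the 20 spam messages in the first sample month round down to nothing, which is the kind of distortion rounding introduces.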


Now, this shows that I get a pretty good ratio of real mail to spam, by this measure. The diagram is a little misleading because of the rounding, but it's not too bad. For the last couple of months, though, the spam rate has been creeping towards 50%, having jumped to 550 spam messages in September from an average of about 100 a month last year. Of course, most of these seem to be coming from one of my accounts that is being forwarded and filtered into a folder, so maybe I should just switch that forwarding off.

Next up, an analysis of who is sending me real mail. I bet you can't wait.
