r1 - 01 Apr 2004 - 20:30:00 - LeoGalambos

Anti SPAM block

In org.egothor.robot.apps.Oracul you can apply an anti-SPAM table using the parameter -spam:

org.egothor.robot.apps.Oracul index/ -spam my_table -linksrank linksrank/

This applies the values from linksrank/ to index/. In addition, my_table is read, and its rules may modify those values. The format of the table is as follows:

# this is a comment
# all documents from this server have their value set to 0
# (spaces are required)
domain www.badboyz.example.com:80 = 0
# all documents from this server have their value decreased by 2
domain www.badboyz.example.com:80 - 2
# or you can also increase the values
domain www.goodboys.example.com:80 + 2
#
# the same can be applied to specific URLs
url http://www.badboyz.example.com:80/stupidpage.html = 0
url http://www.goodboys.example.com:80/greatresource.html + 5
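
To make the line format concrete, here is a minimal sketch of how one line of such a table could be split into its four space-separated fields. It is not Egothor's actual parser: the class SpamRuleParser, the nested Rule type, and the choice to treat amounts as integers and to ignore malformed lines silently are all assumptions made for illustration.

```java
import java.util.Optional;

/** Hypothetical sketch of a table-line parser; not taken from the Egothor sources. */
final class SpamRuleParser {

    /** One parsed rule: kind is "domain" or "url", op is '=', '-' or '+'. */
    static final class Rule {
        final String kind;
        final String target;
        final char op;
        final int amount; // assumption: amounts are integers, as in the examples above

        Rule(String kind, String target, char op, int amount) {
            this.kind = kind;
            this.target = target;
            this.op = op;
            this.amount = amount;
        }
    }

    /** Parses one line; returns empty for comments, blank lines, and malformed lines. */
    static Optional<Rule> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Optional.empty();
        }
        // Fields are separated by spaces: <kind> <target> <op> <amount>
        String[] f = trimmed.split("\\s+");
        if (f.length != 4 || f[2].length() != 1 || "=-+".indexOf(f[2].charAt(0)) < 0) {
            return Optional.empty();
        }
        if (!f[0].equals("domain") && !f[0].equals("url")) {
            return Optional.empty();
        }
        try {
            return Optional.of(new Rule(f[0], f[1], f[2].charAt(0), Integer.parseInt(f[3])));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```

With this sketch, SpamRuleParser.parse("domain www.badboyz.example.com:80 - 2") yields a rule with kind "domain", operator '-' and amount 2, while comment lines yield an empty result.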

If a domain rule matches, then url rules are not scanned.

In case of a collision, the last rule is applied. In the example, only www.badboyz.example.com:80 - 2 is applied; www.badboyz.example.com:80 = 0 is skipped.

The table is implemented using java.util.HashMap, so it can be as large as you like; no performance bottleneck should be encountered. Note that the table is only applied when the document metadata are of type indexer.html2.HTMLMetadata.
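
The lookup rules described above (domain rules take precedence over url rules, and the last rule wins on a collision) can be sketched with two HashMaps. This is an illustration under stated assumptions, not the code in Oracul: the class SpamTableSketch and its methods are hypothetical, rank values are assumed to be doubles, url rules are assumed to match the full URL string exactly, and a URL without an explicit port is assumed to use port 80.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the table lookup; not taken from the Egothor sources. */
final class SpamTableSketch {

    /** One adjustment: op is '=', '-' or '+'. */
    static final class Adjustment {
        final char op;
        final double amount;

        Adjustment(char op, double amount) {
            this.op = op;
            this.amount = amount;
        }

        double applyTo(double value) {
            switch (op) {
                case '=': return amount;
                case '-': return value - amount;
                case '+': return value + amount;
                default: throw new IllegalStateException("unknown operator: " + op);
            }
        }
    }

    private final Map<String, Adjustment> domainRules = new HashMap<>();
    private final Map<String, Adjustment> urlRules = new HashMap<>();

    // put() overwrites any earlier entry for the same key, so the last rule wins.
    void addDomainRule(String hostAndPort, char op, double amount) {
        domainRules.put(hostAndPort, new Adjustment(op, amount));
    }

    void addUrlRule(String url, char op, double amount) {
        urlRules.put(url, new Adjustment(op, amount));
    }

    /** Returns the adjusted value for a document URL, or the value unchanged if no rule matches. */
    double adjust(String url, double value) {
        URI uri = URI.create(url);
        int port = uri.getPort() == -1 ? 80 : uri.getPort(); // assumption: default port 80
        Adjustment rule = domainRules.get(uri.getHost() + ":" + port);
        if (rule == null) {
            // url rules are consulted only when no domain rule matches
            rule = urlRules.get(url);
        }
        return rule == null ? value : rule.applyTo(value);
    }
}
```

Loading the two www.badboyz.example.com:80 domain rules from the example in order leaves only the "- 2" rule in the map, so a document on that server with a value of 10 ends up at 8, and the url rule for stupidpage.html is never consulted.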
