{% macro rule_head() -%} {# #} {%- endmacro %} {% macro rule_tabset(feed_uid, tabs) -%}
Rule | Type | Text | Filtered articles | Delete | |
---|---|---|---|---|---|
{{ uid }} | {{ rtype }} | {{ text }} | {% if feed_uid %}show | {% else %}show | {% endif %}delete |
Rule | Expires | Text | Delete | Update |
---|---|---|---|---|
The filtering rules are Python expressions that are evaluated as booleans. The following variables are always available:
Variable | Description |
---|---|
feed_title | Title of the feed |
title | Title of the article |
title_lc | Title of the article (all in lower case) |
title_words | Set of lower-cased and diacritic-stripped words in the title |
content | Contents of the article |
content_lc | Contents of the article (all in lower case) |
content_words | Set of lower-cased and diacritic-stripped words in the content |
union_lc | Title and contents of the article combined (all in lower case) |
union_words | Set of lower-cased and diacritic-stripped words in both title and contents |
link | Article URL (before dereferencing) |
category | If present, set of categories for the article |
author | Author of the article |
In addition, the convenience functions title_any, content_any, union_any, title_any_lc, content_any_lc, union_any_lc, title_any_words, content_any_words and union_any_words are here to simplify rules. They take a list of strings, passed as separate arguments as in content_any_lc('football', 'cricket'), and search in the corresponding title|content(|_lc|_words) variable (the union_* variants match in either the title or the contents). If any of the strings in the list matches, the function returns True.
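The behaviour described above can be sketched in plain Python. This is only an illustration, not the actual implementation: the factory functions any_substring and any_word, the helpers strip_diacritics and words_of, and the word-splitting rule are all assumptions made for the example.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Drop combining marks so that accented letters compare as plain ones."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def words_of(text):
    """Set of lower-cased, diacritic-stripped words (assumed tokenization)."""
    return set(re.findall(r"\w+", strip_diacritics(text.lower())))


def any_substring(haystack):
    """Hypothetical factory: helper is True if any argument occurs in haystack."""
    return lambda *needles: any(n in haystack for n in needles)


def any_word(word_set):
    """Hypothetical factory: helper is True if any argument is a whole word."""
    return lambda *needles: any(n in word_set for n in needles)


# Sample article, invented for the example.
title = "Get it done ASAP"
content = "SAP announced a café for developers."

title_any = any_substring(title)
content_any_lc = any_substring(content.lower())
title_any_words = any_word(words_of(title))
union_any_words = any_word(words_of(title) | words_of(content))
```

With this sample article, title_any('ASAP') matches but title_any_words('sap') does not, because the *_words variants compare whole words rather than substrings.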
The function link_already(url) checks if the URL passed as its argument is that of an article that was already loaded. This is useful to filter out duplicates or echoes from aggregated feeds like Digg or Slashdot, but it also slows down feed processing. You can use the function link_extract(link_text, content) to extract a link from the content (the text of the link must match exactly).
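To make the roles of the two functions concrete, here is a rough stand-in for each. Both bodies are assumptions written for illustration: the real link_already presumably consults the article database rather than the in-memory set seen_links used here, and the return value of link_extract when nothing matches (None in this sketch) is a guess.

```python
from html.parser import HTMLParser


class _LinkFinder(HTMLParser):
    """Collects the href of the first <a> whose text equals link_text exactly."""

    def __init__(self, link_text):
        super().__init__()
        self.link_text = link_text
        self.href = None   # href of the <a> element currently open, if any
        self.text = []     # text fragments seen inside that element
        self.found = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            if self.found is None and "".join(self.text) == self.link_text:
                self.found = self.href
            self.href = None


def link_extract(link_text, content):
    """Stand-in: return the URL of the link labelled link_text, else None."""
    finder = _LinkFinder(link_text)
    finder.feed(content)
    finder.close()
    return finder.found


# Stand-in for the store of already-loaded article URLs (assumption).
seen_links = {"http://example.com/story"}


def link_already(url):
    """Stand-in: True if an article with this URL was already loaded."""
    return url in seen_links


content = ('<p>Summary. <a href="http://example.com/story">Read more</a> '
           '<a href="http://example.com/other">read more</a></p>')
is_duplicate = link_already(link_extract('Read more', content))
```

A rule built this way would read link_already(link_extract('Read more', content)), filtering an aggregator entry whose underlying story was already loaded from another feed.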
Other variables may be available on a feed-by-feed basis, also depending on which feed standard is used (e.g. Atom vs. RSS). Check the feed details page for the feed you are interested in for more details.
If a variable does not exist, evaluating the expression raises an exception and the article is not filtered out. Note, however, that a Python logical OR short-circuits: if the first term evaluates to true, the second term is never evaluated, and the article is filtered out even if the second term refers to a variable that does not exist.
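The effect of term order can be reproduced with a small experiment. The function is_filtered below is a hypothetical stand-in for the rule evaluator, assuming it uses eval and treats any exception as "not filtered"; no_such_variable is deliberately undefined.

```python
def is_filtered(rule, variables):
    """Hypothetical evaluator: True if the rule matches, False on any error."""
    try:
        # Emptying __builtins__ keeps the example tidy; it is not a sandbox.
        return bool(eval(rule, {"__builtins__": {}}, variables))
    except Exception:
        return False


article = {"feed_title": "Salon", "title": "Letters to the editor"}

# First term is true, so the missing variable is never looked up.
matched_first = is_filtered(
    "'Letters' in title or 'x' in no_such_variable", article)

# Missing variable comes first: NameError, so the article is not filtered.
missing_first = is_filtered(
    "'x' in no_such_variable or 'Letters' in title", article)
```

Under these assumptions, putting the terms that rely on optional variables last makes a rule more robust across feeds.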
You can add comments by starting a line with the character #, and use line breaks like whitespace for legibility.
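As an illustration, a multi-line rule with a comment might look like the rule_text below. The normalize_rule function is an assumption about how such text could be turned into a single Python expression (dropping comment lines and joining the rest with spaces); the actual preprocessing may differ.

```python
def normalize_rule(text):
    """Hypothetical preprocessing: drop '#' lines, join the rest with spaces."""
    kept = [line.strip() for line in text.splitlines()
            if not line.startswith("#")]
    return " ".join(line for line in kept if line)


rule_text = """\
# Salon columns I never read
'Salon' in feed_title
and ('King Kaufman' in title
     or 'Letters' in title)
"""

expression = normalize_rule(rule_text)

# Sample article values, invented for the example.
matched = eval(expression, {"__builtins__": {}},
               {"feed_title": "Salon", "title": "Letters"})
```

Joining the lines matters because a bare Python expression may not contain a line break outside parentheses.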
The following examples should be self-explanatory:
'Salon' in feed_title and ('King Kaufman' in title or 'Letters' in title)
'SAP' in title.split()
or almost equivalently:
'sap' in title_words
'Guardian Unlimited' in feed_title and (content.startswith('Sport:') or 'football' in content_lc or 'cricket' in content_lc)
which is equivalent to:
'Guardian Unlimited' in feed_title and (content.startswith('Sport:') or content_any_lc('football', 'cricket'))
Filter articles referring to SAP, but as a word (i.e. do not filter out 'ASAP'):
union_any_words('sap')