In this project, the concentrator operation is a simple aggregator. My other projects use an advanced version of this module to analyze and act on the aggregated data, so I kept the original module name.
$ java -jar egor.jar concentrator -help
usage: egor.core.Concentrator [-a <ID>] [-f <FORMAT_STRING>] [-fc <FILENAME>] [-fm <LENGTH>] [-help] [-i <FILENAME>] [-r <FILENAME>] [-ri] [-u <URL>]
 -a,--user-agent <ID>                         user-agent identification
 -f,--format <FORMAT_STRING>                  TITLE, DESC, LINK, TAGS variables and any text
 -fc,--formatted-category-aliases <FILENAME>  category aliases table
 -fm,--formatted-max-len <LENGTH>             maximum length of one item output after formatting, default: 450
 -ft,--formatted-max-tags <COUNT>             maximum number of tags extracted from RSS categories, default: 12
 -help                                        print this message
 -i,--rss-index <FILENAME>                    file with RSS URLs
 -r,--refs-db <FILENAME>                      output file with references to extracted media attachments
 -ri,--extract-img-src                        extract media attachments from img src of item description
 -u,--rss-url <URL>                           input RSS URL
This operation reads several RSS feeds and produces a single output. The output includes a textual representation of the RSS items in a format suitable for Mastodon import.
The output can be prepared on any server with fast internet connectivity; access to a Mastodon instance is not required.
Some servers do not return valid RSS unless the User-Agent HTTP header carries a specific or well-known value. Local government offices often use this trick. To work around it, pass a browser-like value to --user-agent, for example "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0".
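In HTTP terms, the option only changes one request header. The following is a minimal sketch of that idea using the JDK's java.net.http API; the class name UserAgentSketch, the method feedRequest, and the example.org URL are hypothetical, and this is not necessarily how egor builds its requests.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class UserAgentSketch {
    // Browser-like identification string quoted in the text above.
    static final String BROWSER_UA =
        "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0";

    // Builds (but does not send) a GET request with an explicit User-Agent header.
    static HttpRequest feedRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical feed URL, used only for illustration.
        HttpRequest request = feedRequest("https://example.org/feed.rss", BROWSER_UA);
        System.out.println(request.headers().firstValue("User-Agent").orElse("(none)"));
    }
}
```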
The --format string specifies what your statuses will look like. You can use any text together with the special variables TITLE, DESC, LINK, and TAGS, which are replaced by values from the RSS feeds. Double white spaces and control characters are replaced by a single white space.
If --format is specified, all --formatted-* parameters are applied as well.
If --format is missing, the downloaded RSS feeds are aggregated and pretty-printed.
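The substitution and whitespace clean-up described for the format string might look roughly like the sketch below. It assumes the variables appear in the format string as bare uppercase words, which is a guess based on the help text; the class FormatSketch and its format method are hypothetical and do not mirror egor's actual code. Length limiting (--formatted-max-len) is left out.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FormatSketch {
    // Assumption: variables are written as bare words in the format string.
    private static final Pattern VARIABLE = Pattern.compile("\\b(TITLE|DESC|LINK|TAGS)\\b");
    // Any run of white space and/or control characters.
    private static final Pattern BLANKS = Pattern.compile("[\\s\\p{Cntrl}]+");

    static String format(String template, Map<String, String> values) {
        // Single pass, so text inserted for one variable is never re-scanned for another.
        String filled = VARIABLE.matcher(template).replaceAll(
            m -> Matcher.quoteReplacement(values.getOrDefault(m.group(1), "")));
        // Collapse each run of white space or control characters into one space.
        return BLANKS.matcher(filled).replaceAll(" ");
    }

    public static void main(String[] args) {
        Map<String, String> item = Map.of(
            "TITLE", "Hello",
            "DESC", "line one\nline two",
            "LINK", "https://example.org/1",
            "TAGS", "#News #Hacker");
        System.out.println(format("TITLE: DESC  (LINK) TAGS", item));
    }
}
```

Replacing all variables in one regular-expression pass avoids a subtle problem with chained String.replace calls: a title that happens to contain the word LINK would otherwise be rewritten a second time.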
The --formatted-category-aliases file defines the table of tag (category) aliases. Each line is one group of aliases, and every tag in the group is replaced by the first tag on the line. The lookup is case-insensitive.
Example:
Hacker HackerOne hackerNews hackMeUp
News RT CNN CBS
SpiesTalking BIS VZ UZSI GRU CIA NSA
Categories like hAckErnews, hackMeup, hackMeUp, hacker, and HAcKER are all transformed to #Hacker. Categories like RT, rt, Cnn, CNN, etc. are all transformed to #News. Finally, the agency labels (case-insensitive) are rewritten to #SpiesTalking.
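The alias lookup can be pictured as a case-insensitive map from every tag in a group to the first tag of that group. The sketch below uses the example groups, written one per line. It assumes tags are separated by white space and that categories with no alias pass through unchanged; both are assumptions, and AliasTableSketch with its parse and resolve methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class AliasTableSketch {
    // Maps each lower-cased tag of a group to the first tag on its line.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> table = new HashMap<>();
        for (String line : lines) {
            String[] tags = line.trim().split("\\s+");
            if (tags[0].isEmpty()) {
                continue; // skip blank lines
            }
            for (String tag : tags) {
                // The first group that mentions a tag wins (assumption).
                table.putIfAbsent(tag.toLowerCase(Locale.ROOT), tags[0]);
            }
        }
        return table;
    }

    // Case-insensitive lookup; unknown categories are returned as-is (assumption).
    static String resolve(Map<String, String> table, String category) {
        return table.getOrDefault(category.toLowerCase(Locale.ROOT), category);
    }

    public static void main(String[] args) {
        Map<String, String> table = parse(List.of(
            "Hacker HackerOne hackerNews hackMeUp",
            "News RT CNN CBS",
            "SpiesTalking BIS VZ UZSI GRU CIA NSA"));
        System.out.println("#" + resolve(table, "hAckErnews"));
        System.out.println("#" + resolve(table, "Cnn"));
        System.out.println("#" + resolve(table, "nsa"));
    }
}
```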
The --formatted-max-tags value sets the maximum number of unique category names accepted from an input RSS feed for a single item (status). The default is 12.
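One way to read this limit is "keep the first N distinct categories of an item, in feed order". The sketch below implements exactly that reading. Whether egor compares names before or after alias replacement, and whether it preserves order, is not stated here, so treat those details as assumptions; TagLimitSketch and limit are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TagLimitSketch {
    // Keeps at most maxTags distinct categories, preserving their original order.
    static List<String> limit(List<String> categories, int maxTags) {
        Set<String> unique = new LinkedHashSet<>();
        for (String category : categories) {
            if (unique.size() >= maxTags) {
                break; // limit reached, ignore the rest
            }
            unique.add(category); // duplicates are silently dropped
        }
        return List.copyOf(unique);
    }

    public static void main(String[] args) {
        System.out.println(limit(List.of("News", "Hacker", "News", "Linux", "Java"), 3));
    }
}
```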
The --rss-index file lists the RSS feed URLs, one URL per line, each optionally followed by extra tags.
The --refs-db file will contain the remote references (media attachments) that were discovered in the RSS feeds and that are also referenced from the stream printed on the standard output.