News: FeedMe 0.9 is out. Big changes: FeedMe now uses a real HTML parser -- which means that it now comes as two files, and depends on the python-lxml package. The advantages: rewriting of URLs is much more reliable, and it's now possible to download images (set skip_images to False in feedme.conf).
Ever want to download RSS from news sites or blogs to your PDA?
You will need Python, plus Python's feedparser module (on Ubuntu or Arch, that's the package python-feedparser) and lxml (package python-lxml).
So I wrote FeedMe. It's very minimal but so far it seems to do the job.
FeedMe is sort of an RSS version of Sitescooper. It produces either HTML or Plucker format.
FeedMe can optionally convert each page to plain ASCII -- useful if you're producing output for a Palm PDA. For this option, set ascii="yes" in feedme.conf and install my ununicode module somewhere in your Python path.
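To give a rough idea of what converting a page to plain ASCII involves, here is a minimal sketch using only Python's standard unicodedata module. It is not the ununicode module and makes no claim about how that module works; the function name to_ascii is made up for illustration.

```python
import unicodedata


def to_ascii(text):
    """Return an ASCII-only approximation of text (illustrative helper).

    Accented letters are decomposed into a base letter plus combining
    marks, and anything that still isn't ASCII afterwards is dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("café"))
```

Note that this simple approach silently drops characters that have no ASCII decomposition, such as curly quotes, so it is only a crude stand-in for a dedicated conversion module.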
There's no documentation yet, but the sample feedme.conf configuration file should be vaguely self-explanatory. Install it in $XDG_CONFIG_HOME/feedme/feedme.conf or ~/.config/feedme/feedme.conf.
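If you're unsure which of those two locations applies on your system, something like the following sketch will tell you. It assumes the usual XDG convention (use $XDG_CONFIG_HOME when it is set, otherwise fall back to ~/.config); the function name config_path is hypothetical and this is not code taken from FeedMe itself.

```python
import os


def config_path(environ=None):
    """Return the expected location of feedme.conf (illustrative helper).

    Assumes the XDG convention: $XDG_CONFIG_HOME if set and non-empty,
    otherwise ~/.config.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config"
    )
    return os.path.join(base, "feedme", "feedme.conf")


if __name__ == "__main__":
    print(config_path())
```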
Supported formats since 0.7 include plucker and epub; to get them, set formats = epub (or plucker, or plucker,epub to get both) in feedme.conf. For HTML only and no additional formats, set formats = none.
Downloaded HTML will be put in ~/feeds/ (which must exist; you can specify a different location in feedme.conf).
FeedMe can then convert the HTML into one of three formats: epub, plucker, or fb2. You'll need to have appropriate conversion tools installed on your system: plucker for plucker format, calibre's ebook-convert for the other two. You can specify more than one format, separated by commas, in feedme.conf; or set formats = none if you only need the downloaded HTML.
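As an illustration of how a comma-separated formats value can be interpreted, here is a small sketch. The function name parse_formats is made up, and FeedMe's own parsing may well differ; the sketch only shows that "none" means no conversions and that several names can be combined.

```python
def parse_formats(value):
    """Split a comma-separated formats setting into a list of format names.

    Illustrative only: "none" (or an empty value) yields an empty list,
    meaning HTML only with no additional conversions.
    """
    names = [part.strip().lower() for part in value.split(",")]
    return [name for name in names if name and name != "none"]


if __name__ == "__main__":
    print(parse_formats("plucker,epub"))
    print(parse_formats("none"))
```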
FeedMe's configuration file is ~/.config/feedme/feedme.conf.
FeedMe's cache is ~/.cache/feedme/feedme.dat. This file should remain relatively small if you have a sane number of feeds, but it doesn't hurt to keep an eye on it.
~/feeds is where it stores the downloaded HTML. Stories are downloaded as sitename/number.html, e.g. ~/feeds/BBC_World_News/2.html. These stories are cleaned out every save_days (set in your feedme.conf).
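If you want to see which stories are old enough to be cleaned out, a sketch along these lines lists them without deleting anything. It assumes that a story's age is judged by its file modification time, which may not be how FeedMe decides; the function name expired_stories is hypothetical.

```python
import os
import time


def expired_stories(feed_dir, save_days, now=None):
    """List files under feed_dir last modified more than save_days ago.

    Illustrative only: uses file modification times, which is an
    assumption about how story age is measured. Nothing is deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - save_days * 86400  # 86400 seconds per day
    expired = []
    for root, _dirs, files in os.walk(feed_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                expired.append(path)
    return sorted(expired)


if __name__ == "__main__":
    for path in expired_stories(os.path.expanduser("~/feeds"), save_days=7):
        print(path)
```

Because the function only reports paths, you can check its output before removing anything by hand.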
If you save to formats beyond plain HTML, there may be other directories used for the converted files; for example, plucker files are created in a directory of their own. That directory is never cleaned out by feedme, so you'll have to prune it yourself. When I used plucker as my feed reader, I had an alias that removed the previous day's plucks just before I ran feedme.
FeedMe's license is GPLv2 or (at your option) later. Thanks to Carla Schroder for the name suggestion!
FeedMe is now maintained on GitHub.