I have 100 websites that expose their RSS feeds in different locations. Each location lists several RSS feed links pointing at different feeds. It's nearly identical to the BBC RSS feeds page: http://www.bbc.com/news/10628494
Site 1: domain1.com/rss
Site 2: domain2.com/enviroments/rss
Is there any way to extract the links to each feed's XML from these pages?
Something similar to "Automatically Extracting feed links (atom, rss,etc) from webpages", but I would like to give only the site, so that I get all possible RSS feeds for that particular site.
I want a list of all the RSS feeds from the 100 websites, so that I can monitor them on a dashboard. Note that the feeds are a mix of Atom and RSS.
What I have done: I have looked into Apache Nutch and the parse-feed plugin. Scrapy was the next option, but I am still not sure this is what I am looking for.
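
To make the goal concrete, here is a minimal sketch of the kind of extraction I mean, using only the Python standard library. It assumes the listing page has already been downloaded as a string. The names `FeedLinkParser` and `extract_feed_links`, and the URL suffixes used to guess which anchors are feeds, are my own assumptions and not part of Nutch or Scrapy.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used for feed autodiscovery <link> tags.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# Hypothetical heuristic: URL path endings that probably point at a feed.
FEED_SUFFIXES = ("rss.xml", "atom.xml", "/rss", "/feed")


class FeedLinkParser(HTMLParser):
    """Collects candidate RSS/Atom feed URLs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <a href>), so normalise them.
        attr = {k: (v or "") for k, v in attrs}
        href = attr.get("href", "").strip()
        if not href:
            return
        is_feed = False
        if tag == "link":
            # Autodiscovery: <link type="application/rss+xml" href="...">
            is_feed = attr.get("type", "").lower() in FEED_TYPES
        elif tag == "a":
            # Plain anchors, as on a "feeds" listing page: guess by suffix.
            path = href.lower().split("?")[0].rstrip("/")
            is_feed = path.endswith(FEED_SUFFIXES)
        if is_feed:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.feeds:
                self.feeds.append(url)


def extract_feed_links(html, base_url):
    """Return absolute feed URLs found in `html`, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

The idea would be to fetch each of the 100 listing pages (for example with `urllib.request.urlopen`), pass the HTML and the page URL to `extract_feed_links`, and merge the results into one list for the dashboard. The suffix heuristic is only a guess and would likely need tuning per site.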