@rowanmanning/feed-parser
A well-tested and resilient Node.js parser for RSS and Atom feeds.
Introduction
This is a Node.js-based feed parser for RSS and Atom feeds. The project has the following aims:
- Run automated tests against real-world feeds. It's currently tested against ~40 feeds via Sample Feeds. This ensures that we support real feeds rather than just the specifications.
- Related to the point above, be as lenient as possible with feed parsing.
- Keep up to date with the latest Node.js versions, including dropping support for end-of-life versions.
- Maintain compatibility with the great parts of node-feedparser, e.g. resolving relative URLs.
Requirements
This library requires the following to run:
- Node.js 18+
Usage
Install with npm:
npm install @rowanmanning/feed-parser
Load the library into your code:
const { parseFeed } = require('@rowanmanning/feed-parser');
// or
import { parseFeed } from '@rowanmanning/feed-parser';
You can use the parseFeed function to parse an RSS or Atom feed supplied as a string. The return value is an object representation of the feed:
const feed = parseFeed('<channel> etc. </channel>');
console.log(feed.title);
The parser will attempt to parse even invalid feeds. If no data can be pulled out, it throws an error with a code property set to INVALID_FEED.
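One way to handle that error is sketched below. The tryParseFeed helper is hypothetical and not part of this library. It assumes only what is described above: the parser throws an error whose code property is INVALID_FEED. The parser is passed in as an argument so the helper can be tried without the package installed.

```javascript
// Hypothetical helper (not part of the library): returns the parsed feed,
// or null when the parser reports that no data could be pulled out.
// `parse` is expected to be the library's parseFeed function.
function tryParseFeed(parse, xml) {
	try {
		return parse(xml);
	} catch (error) {
		// Only swallow the documented INVALID_FEED error
		if (error && error.code === 'INVALID_FEED') {
			return null;
		}
		// Anything else is unexpected, so let it propagate
		throw error;
	}
}

// Usage with the real library:
// const { parseFeed } = require('@rowanmanning/feed-parser');
// const feed = tryParseFeed(parseFeed, '<not-a-feed/>');
// if (feed === null) { console.log('Nothing could be parsed'); }
```

Rethrowing errors that do not carry the INVALID_FEED code keeps unrelated bugs from being silently treated as a bad feed.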
This library does not fetch feeds from a URL, but you can do so relatively easily with fetch:
const response = await fetch('https://github.com/rowanmanning/feed-parser/releases.atom');
const feed = parseFeed(await response.text());
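The snippet above does not check whether the request succeeded. A slightly more defensive version might look like the sketch below. The fetchAndParseFeed helper and its parameters are hypothetical and not part of this library. It relies on the global fetch available in Node.js 18+ and takes the parser as an argument.

```javascript
// Hypothetical helper (not part of the library): downloads a feed and hands
// the response body to a parser. `parse` is expected to be the library's
// parseFeed function. `fetchImpl` defaults to the global fetch in Node.js 18+.
async function fetchAndParseFeed(url, parse, fetchImpl = globalThis.fetch) {
	const response = await fetchImpl(url);

	// fetch only rejects on network failures, so HTTP errors need an explicit check
	if (!response.ok) {
		throw new Error(`Request for ${url} failed with status ${response.status}`);
	}

	return parse(await response.text());
}

// Usage with the real library (inside an async function or an ES module):
// const { parseFeed } = require('@rowanmanning/feed-parser');
// const feed = await fetchAndParseFeed(
// 	'https://github.com/rowanmanning/feed-parser/releases.atom',
// 	parseFeed
// );
```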
Parsed feed
The feed object returned by parseFeed has the following properties.
Feed
Represents an RSS or Atom feed.
<table> <tr> <th>Property</th> <th>Type</th> <th>Notes</th> </tr> <tr> <td><code>authors</code></td> <td><code><a href="#feedauthor">FeedAuthor</a>[]</code></td> <td>The feed authors. Always an array but sometimes empty if no authors are found.</td> </tr> <tr> <td><code>categories</code></td> <td><code><a href="#feedcategory">FeedCategory</a>[]</code></td> <td>The feed categories. Always an array but sometimes empty if no categories are found.</td> </tr> <tr> <td><code>copyright</code></td> <td><code>string | null</code></td> <td>The feed's copyright notice.</td> </tr> <tr> <td><code>description</code></td> <td><code>string | null</code></td> <td>A short description of the feed.</td> </tr> <tr> <td><code>generator</code></td> <td><code><a href="#feedgenerator">FeedGenerator</a> | null</code></td> <td>The software that generated the feed.</td> </tr> <tr> <td><code>image</code></td> <td><code><a href="#feedimage">FeedImage</a> | null</code></td> <td>An image representing the feed.</td> </tr> <tr> <td><code>items</code></td> <td><code><a href="#feeditem">FeedItem</a>[]</code></td> <td>The content items in the feed. 
Always an array but sometimes empty if no items are found.</td> </tr> <tr> <td><code>language</code></td> <td><code>string | null</code></td> <td>The language the feed is written in.</td> </tr> <tr> <td><code>meta</code></td> <td><code><a href="#feedmeta">FeedMeta</a></code></td> <td>Meta information about the format of the feed.</td> </tr> <tr> <td><code>published</code></td> <td><code>Date | null</code></td> <td>The date the feed was last published.</td> </tr> <tr> <td><code>self</code></td> <td><code>string | null</code></td> <td>A URL pointing to the feed itself.</td> </tr> <tr> <td><code>title</code></td> <td><code>string | null</code></td> <td>The name of the feed.</td> </tr> <tr> <td><code>updated</code></td> <td><code>Date | null</code></td> <td>The date the feed was last updated.</td> </tr> <tr> <td><code>url</code></td> <td><code>string | null</code></td> <td>A URL pointing to the HTML web page that this feed is for.</td> </tr> </table>
FeedAuthor
Represents the author of a Feed or FeedItem.
FeedCategory
Represents the content category of a Feed or FeedItem.
FeedGenerator
Represents software that generated a Feed.
FeedImage
Represents an image for a Feed.
FeedItem
Represents an RSS item or Atom entry in a Feed.
FeedItemMedia
Represents a piece of media attached to a FeedItem.
FeedMeta
Represents meta information about a Feed.
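As an illustration of working with a parsed Feed, the sketch below builds a one-line summary. The summarizeFeed helper is hypothetical and not part of this library. It uses only properties from the Feed table earlier in this section, and assumes that title and updated may be null while items and authors are always arrays.

```javascript
// Hypothetical helper (not part of the library): builds a one-line summary
// from a parsed Feed object using only properties listed in the Feed table.
function summarizeFeed(feed) {
	// title is `string | null`, so fall back to a placeholder
	const title = feed.title ?? '(untitled feed)';

	// updated is `Date | null`, so guard before formatting
	const updated = feed.updated ? feed.updated.toISOString() : 'unknown';

	// items and authors are always arrays, so length is safe to read
	return `${title}: ${feed.items.length} item(s), ${feed.authors.length} author(s), last updated ${updated}`;
}

// Usage with the real library:
// const { parseFeed } = require('@rowanmanning/feed-parser');
// console.log(summarizeFeed(parseFeed(xmlString)));
```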
Supported feed formats
Standards
Feeds that adhere to the following standards are supported and most properties will be parsed:
- Atom 1.0
- Atom 0.3 (no spec available but an example is here)
- RSS 2.0
- RSS 0.9
- RDF Site Summary 1.0
The following XML namespaces are also parsed, and more data will be parsed out for RSS feeds that implement these:
- DublinCore (e.g. dc:creator)
- iTunes Podcast RSS feed (e.g. itunes:author)
Leniency
Feeds in the real world rarely comply strictly with the standards and can sometimes be invalid XML. We try to be as lenient as possible, only throwing errors if no data can be pulled out of the feed. We test against a suite of real-world feeds.
Contributing
The contributing guide is available here. All contributors must follow this library's code of conduct.
License
Licensed under the MIT license.<br/> Copyright © 2022, Rowan Manning
Credit
This library takes inspiration from the following:
- Feedparser from Dan MacTough, which I've been using for years.