Crawl microformats2 data for h-entry and h-feeds

strugee/node-crawl-mf2

crawl-mf2

Crawl a microformats2 site to find things like canonical URLs for h-entries.

Note: this module does not really handle pages with more than one top-level microformats2 node.

Installation

npm install crawl-mf2

Example

Start a crawl and log canonical h-entry URLs found on https://strugee.net/blog/:

var crawl = require('crawl-mf2');

var crawler = crawl('https://strugee.net/blog/');

crawler.on('h-entry', function(url, mf2node) {
	console.log(url);
});

API

The module exports a single function, crawlMf2, which takes a single argument, the base URL to crawl from.

It returns an EventEmitter.

Events

'error'

Emitted when an error occurs. Currently this means either the microformats2 parser failed or an HTTP error occurred.

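Note: Node.js treats 'error' events specially. If no listener is registered for 'error', the EventEmitter throws the error instead, which will crash the process unless it is caught.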
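
Node's EventEmitter throws when an 'error' event is emitted with no listener attached. The following is a minimal sketch of that behaviour, using a plain EventEmitter as a stand-in for the crawler so that it runs without network access. The error messages are made up for illustration and are not the ones crawl-mf2 produces.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the emitter returned by crawl-mf2 (assumption: it behaves
// like any other EventEmitter with respect to 'error').
const emitter = new EventEmitter();

let thrown = null;
try {
	// No 'error' listener is registered yet, so emit() throws the error.
	emitter.emit('error', new Error('HTTP request failed'));
} catch (err) {
	thrown = err;
}

const handled = [];
emitter.on('error', function(err) {
	handled.push(err.message);
});

// With a listener attached, the error is delivered to the handler instead.
emitter.emit('error', new Error('mf2 parse failed'));
```

In practice this means attaching an 'error' listener to the crawler right after creating it.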

'urlDisco'

  • String: the URL that was discovered

Emitted when a new URL is discovered, including the initial base URL.

'mf2Parse'

Emitted when a URL is parsed for microformats2 markup.

'h-feed'

Emitted when an h-feed page is discovered.

'h-entry'

Emitted when an h-entry page is discovered.
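
As a rough sketch of how these events might be consumed, the helper below gathers the distinct URLs reported through 'h-entry'. The name collectEntryUrls is hypothetical and not part of crawl-mf2. The listener signature (url, mf2node) is taken from the example above, and a plain EventEmitter stands in for the crawler so the snippet runs offline.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: records every distinct URL reported via 'h-entry'.
function collectEntryUrls(crawler) {
	const urls = new Set();
	crawler.on('h-entry', function(url, mf2node) {
		urls.add(url);
	});
	// Attach an 'error' listener so a failed request or parse does not throw.
	crawler.on('error', function(err) {
		console.error('crawl error:', err.message);
	});
	return urls;
}

// Stand-in emitter; with the real module this would be the value returned
// by require('crawl-mf2')(baseUrl).
const fake = new EventEmitter();
const found = collectEntryUrls(fake);

fake.emit('h-entry', 'https://example.com/blog/a', {});
fake.emit('h-entry', 'https://example.com/blog/a', {});
fake.emit('h-entry', 'https://example.com/blog/b', {});
```

A Set is used so that a URL reported more than once is kept only once. The README does not document an event that signals the end of a crawl, so the sketch does not assume one.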

License

LGPL 3.0+

Author

AJ Jordan <alex@strugee.net>
