XML-XMLParserHTML
XMLParserHTML provides SAX and DOM parsers in Pharo for HTML that convert possibly malformed HTML into well-formed XML.
Installation
Metacello new
	baseline: 'XMLParserHTML';
	repository: 'github://pharo-contributions/XML-XMLParserHTML/src';
	load.
Usage
A simple example of how to use the XML parser for HTML:
...
results in the following XML output:
...
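As a further illustration, here is a minimal sketch of parsing a malformed fragment with the DOM parser. The input string is a made-up example (not the one elided above), and the exact shape of the repaired document may vary between versions, so no output is shown.

```smalltalk
"Sketch only: the HTML string below is a hypothetical input."
| html doc |
html := '<p>Hello <b>world'.	"both tags are left unclosed"

"XMLHTMLParser is the DOM parser; parse: answers an XML document object."
doc := XMLHTMLParser parse: html.

"Print the repaired, well-formed document as XML text."
Transcript show: doc printString; cr.

"Query the resulting DOM like any other XML document."
(doc allElementsNamed: 'p') do: [ :each |
	Transcript show: each contentString; cr ].
```

Run it in a Playground with the Transcript open to inspect the result.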
Scraping
This library, together with XPath, lets you do web scraping from the comfort of the Pharo toolset.
You can learn more about how to do it by reading the Scraping with XPath booklet.
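A rough sketch of the idea, assuming the separate XPath package is also loaded (it is not installed by the baseline above) and that it provides the xpath: message on parsed nodes. The HTML string and its links are invented for illustration.

```smalltalk
"Sketch only: requires the XPath package; the HTML below is hypothetical."
| doc links |
doc := XMLHTMLParser parse:
	'<ul><li><a href="/a">First</a><li><a href="/b">Second</a></ul>'.

"Select every anchor element anywhere in the document."
links := doc xpath: '//a'.

"Print the href attribute of each anchor found."
links do: [ :each |
	Transcript show: (each attributeAt: 'href'); cr ].
```

For a real page, the string passed to parse: would be the downloaded HTML instead of a literal.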
LICENSE
History
This project was migrated from http://smalltalkhub.com/#!/~PharoExtras/XMLParserHTML