The table below lists htmlparser2's direct dependencies. Data on all dependencies, including transitive ones, is available via CSV download.

Name | Version | Size | License | Type | Vulnerabilities |
---|---|---|---|---|---|
domelementtype | 2.3.0 | 2.84 kB | BSD-2-Clause | prod | |
domhandler | 5.0.3 | 12.29 kB | BSD-2-Clause | prod | |
domutils | 3.1.0 | 23.28 kB | BSD-2-Clause | prod | |
entities | 4.5.0 | 75.55 kB | BSD-2-Clause | prod | |

Htmlparser2 is a fast and forgiving HTML/XML parser. This JavaScript library is designed for high-speed parsing of HTML and XML documents. It is regarded as the fastest HTML parser and takes shortcuts to achieve this speed, so it may not strictly comply with the HTML spec; when strict compliance matters, a package like parse5 can be used instead. It is useful for processing HTML/XML documents in Node.js applications, where it lets you analyze, manipulate, or traverse the structure of the content.
Htmlparser2 is easy to use and has a clear API. First install it from npm with the command `npm install htmlparser2`. After installation, you can import the package into your JavaScript file and create a Parser instance like this:
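import * as htmlparser2 from "htmlparser2";

const parser = new htmlparser2.Parser({
    // Called for every opening tag, with its attributes as an object.
    onopentag(name, attributes) {
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    // Called for text content between tags.
    ontext(text) {
        console.log("-->", text);
    },
    // Called for every closing tag.
    onclosetag(tagname) {
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});

// Feed the parser some input, then signal that the input is complete.
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</script>",
);
parser.end();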
The Parser instance accepts event handlers such as `onopentag`, `ontext`, and `onclosetag`, which run your custom code when the corresponding events occur. You can also use Htmlparser2 with streams, which is efficient for handling large inputs. Here's an example of using Htmlparser2 with streams:
import fs from "node:fs";
import { WritableStream } from "htmlparser2/lib/WritableStream";

const parserStream = new WritableStream({
    ontext(text) {
        console.log("Streaming:", text);
    },
});

// Pipe the file into the parser; "finish" fires once all input is consumed.
const htmlStream = fs.createReadStream("./my-file.html");
htmlStream.pipe(parserStream).on("finish", () => console.log("done"));
The documentation for Htmlparser2 is available in the official GitHub repository, in both the README file and the wiki. The README provides a general overview and use cases, while the wiki covers the parser's events and options in depth. The package is part of a larger "Ecosystem" of related packages, which you can explore for more advanced use cases. You can also try out Htmlparser2 interactively in the AST Explorer.