public class DemoHTMLParser extends java.lang.Object implements HTMLParser
| Modifier and Type | Class and Description |
|---|---|
| `static class` | `DemoHTMLParser.Parser` The actual parser to read HTML documents |
| Constructor and Description |
|---|
| `DemoHTMLParser()` |
| Modifier and Type | Method and Description |
|---|---|
| `DocData` | `parse(DocData docData, java.lang.String name, java.util.Date date, org.xml.sax.InputSource source, TrecContentSource trecSrc)` |
| `DocData` | `parse(DocData docData, java.lang.String name, java.util.Date date, java.io.Reader reader, TrecContentSource trecSrc)` Parse the input Reader and return DocData. |
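
The two `parse` overloads listed above differ only in the kind of input they accept: a `java.io.Reader` or an `org.xml.sax.InputSource`. The following is a minimal sketch of preparing both kinds of input using JDK classes only. `ParseInputDemo`, `toInputSource`, and `readAll` are hypothetical names introduced for illustration and are not part of Lucene. The calls to `DemoHTMLParser` appear only in comments, because they need the Lucene benchmark module on the classpath plus a `DocData` and a `TrecContentSource` that are assumed to be created elsewhere.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import org.xml.sax.InputSource;

/** Hypothetical helper (not part of Lucene): prepares the two input kinds the parse overloads accept. */
final class ParseInputDemo {

    /** Wraps a character stream in a SAX InputSource using the JDK constructor. */
    static InputSource toInputSource(Reader reader) {
        return new InputSource(reader);
    }

    /** Drains a Reader into a String so the wrapped content can be inspected. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><title>Demo</title></head><body>Hello</body></html>";

        // Input suitable for the Reader overload.
        Reader reader = new StringReader(html);

        // Input suitable for the InputSource overload.
        InputSource source = toInputSource(new StringReader(html));

        // Assumption: with the Lucene benchmark module available, and docData and
        // trecSrc created elsewhere, the calls would take this shape:
        //   new DemoHTMLParser().parse(docData, "demo", null, reader, trecSrc);
        //   new DemoHTMLParser().parse(docData, "demo", null, source, trecSrc);

        // Both inputs carry the same characters.
        System.out.println(readAll(reader).equals(readAll(source.getCharacterStream())));
    }
}
```

Passing `null` for `date` in the commented calls follows the parameter description, which says the parser then attempts to set the date from the parsed data.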
`public DocData parse(DocData docData, java.lang.String name, java.util.Date date, java.io.Reader reader, TrecContentSource trecSrc) throws java.io.IOException`

Specified by: `parse` in interface `HTMLParser`

Parameters:
- `docData` - result reused
- `name` - name of the result doc data.
- `date` - date of the result doc data. If null, attempt to set by parsed data.
- `reader` - reader of html text to parse.
- `trecSrc` - the `TrecContentSource` used to parse dates.

Throws:
- `java.io.IOException` - If there is a low-level I/O error.

`public DocData parse(DocData docData, java.lang.String name, java.util.Date date, org.xml.sax.InputSource source, TrecContentSource trecSrc) throws java.io.IOException, org.xml.sax.SAXException`

Throws:
- `java.io.IOException`
- `org.xml.sax.SAXException`

Copyright © 2000–2025 The Apache Software Foundation. All rights reserved.