Class ScriptParseState
java.lang.Object
  org.apache.manifoldcf.connectorcommon.fuzzyml.CharacterReceiver
    org.apache.manifoldcf.connectorcommon.fuzzyml.SingleCharacterReceiver
      org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
        org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
          org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState

Direct Known Subclasses:
MetaParseState
public class ScriptParseState
extends org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState

This class interprets the tag stream generated by the HTMLParseState class and causes script sections to be skipped.
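The skipping behaviour described above can be pictured as a two-state machine. The following is a minimal, self-contained sketch and not the ManifoldCF implementation: the class `ScriptSkipSketch`, its methods, the numeric constant values, and the case-insensitive match on the tag name `script` are all assumptions made for illustration. It only mirrors the idea behind the two documented states (normal and in-script).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that illustrates script skipping; not part of ManifoldCF. */
public class ScriptSkipSketch {
    // Named after the two documented states; the numeric values are assumed.
    public static final int NORMAL = 0;
    public static final int INSCRIPT = 1;

    private int state = NORMAL;
    // Tags that would be passed on to a consumer interested in non-script content.
    private final List<String> forwarded = new ArrayList<>();

    /** Handles a start tag: enters the in-script state on "script", otherwise forwards it. */
    public void noteTag(String tagName) {
        if (state == INSCRIPT) {
            return; // everything inside a script section is ignored
        }
        if (tagName.equalsIgnoreCase("script")) {
            state = INSCRIPT;
            return;
        }
        forwarded.add("<" + tagName + ">");
    }

    /** Handles an end tag: leaves the in-script state on "script", otherwise forwards it. */
    public void noteEndTag(String tagName) {
        if (state == INSCRIPT) {
            if (tagName.equalsIgnoreCase("script")) {
                state = NORMAL;
            }
            return;
        }
        forwarded.add("</" + tagName + ">");
    }

    public boolean inScript() {
        return state == INSCRIPT;
    }

    public List<String> forwarded() {
        return forwarded;
    }

    public static void main(String[] args) {
        ScriptSkipSketch s = new ScriptSkipSketch();
        s.noteTag("p");
        s.noteTag("script");
        s.noteTag("a");        // seen while in a script section, so not forwarded
        s.noteEndTag("script");
        s.noteEndTag("p");
        System.out.println(s.forwarded()); // prints [<p>, </p>]
    }
}
```

In the real class, the non-script tags appear to be handed to noteNonscriptTag and noteNonscriptEndTag, which subclasses such as MetaParseState can presumably override. The sketch collects them in a list instead so that it runs without the library.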
Field Summary
Fields
Modifier and Type       Field                        Description
protected int           scriptParseState
protected static int    SCRIPTPARSESTATE_INSCRIPT
protected static int    SCRIPTPARSESTATE_NORMAL

Fields inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
accumBuffer, ampBuffer, bTagDepth, currentAttrList, currentAttrName, currentAttrNameBuffer, currentState, currentTagName, currentTagNameBuffer, currentValueBuffer, inAmpersand, mapLookup, TAGPARSESTATE_IN_ATTR_LOOKING_FOR_VALUE, TAGPARSESTATE_IN_ATTR_NAME, TAGPARSESTATE_IN_ATTR_VALUE, TAGPARSESTATE_IN_BANG_TOKEN, TAGPARSESTATE_IN_BRACKET_TOKEN, TAGPARSESTATE_IN_CDATA_BODY, TAGPARSESTATE_IN_COMMENT, TAGPARSESTATE_IN_DOUBLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_END_TAG_NAME, TAGPARSESTATE_IN_QTAG_ATTR_LOOKING_FOR_VALUE, TAGPARSESTATE_IN_QTAG_ATTR_NAME, TAGPARSESTATE_IN_QTAG_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_DOUBLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_NAME, TAGPARSESTATE_IN_QTAG_SAW_QUESTION, TAGPARSESTATE_IN_QTAG_SINGLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_QTAG_UNQUOTED_ATTR_VALUE, TAGPARSESTATE_IN_SINGLE_QUOTES_ATTR_VALUE, TAGPARSESTATE_IN_TAG_NAME, TAGPARSESTATE_IN_TAG_SAW_SLASH, TAGPARSESTATE_IN_UNQUOTED_ATTR_VALUE, TAGPARSESTATE_IN_UNQUOTED_ATTR_VALUE_SAW_SLASH, TAGPARSESTATE_NEED_FINAL_BRACKET, TAGPARSESTATE_NORMAL, TAGPARSESTATE_SAWCOMMENTDASH, TAGPARSESTATE_SAWDASH, TAGPARSESTATE_SAWEXCLAMATION, TAGPARSESTATE_SAWLEFTANGLE, TAGPARSESTATE_SAWRIGHTBRACKET, TAGPARSESTATE_SAWSECONDCOMMENTDASH, TAGPARSESTATE_SAWSECONDRIGHTBRACKET

Constructor Summary
Constructors
Constructor           Description
ScriptParseState()

Method Summary
Modifier and Type    Method                                                                                                       Description
protected boolean    acceptNewTag()
protected boolean    noteNonscriptEndTag(java.lang.String tagName)
protected boolean    noteNonscriptTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes)
protected boolean    noteTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes)
protected boolean    noteTagEnd(java.lang.String tagName)

Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
noteBTag, noteBTagToken, noteEndBTag, noteEndEscaped, noteEndTag, noteEscaped, noteEscapedCharacter, noteQTag, noteTag
-
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
attributeDecode, dealWithCharacter, dumpValues, isPunctuation, isWhitespace, mapChunk, newBuffer, noteNormalCharacter, outputAmpBuffer
-
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.SingleCharacterReceiver
dealWithCharacters, dealWithRemainder
Field Detail
-
SCRIPTPARSESTATE_NORMAL
protected static final int SCRIPTPARSESTATE_NORMAL
- See Also:
- Constant Field Values
-
SCRIPTPARSESTATE_INSCRIPT
protected static final int SCRIPTPARSESTATE_INSCRIPT
- See Also:
- Constant Field Values
-
scriptParseState
protected int scriptParseState
-
-
Method Detail
-
noteTag
protected boolean noteTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Overrides:
noteTag in class org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
acceptNewTag
protected boolean acceptNewTag()
- Overrides:
acceptNewTag in class org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState
-
noteTagEnd
protected boolean noteTagEnd(java.lang.String tagName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Overrides:
noteTagEnd in class org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteNonscriptTag
protected boolean noteNonscriptTag(java.lang.String tagName, java.util.Map<java.lang.String,java.lang.String> attributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteNonscriptEndTag
protected boolean noteNonscriptEndTag(java.lang.String tagName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException