Class RobotsManager.RobotsData
- java.lang.Object
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
-
- Enclosing class:
- RobotsManager
protected static class RobotsManager.RobotsData extends java.lang.Object
This is a cached data item.
-
-
Field Summary
Fields
- protected long expiration
- protected java.util.ArrayList records
-
Constructor Summary
Constructors
- RobotsData(java.io.InputStream is, long expiration, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities): Constructor.
-
Method Summary
Methods
- long getExpirationTime(): Get expiration
- boolean isFetchAllowed(java.lang.String userAgent, java.lang.String pathString): Check if fetch is allowed
- protected void parseRobotsTxt(java.io.BufferedReader r, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities): Parse the robots.txt file using a reader.
-
-
-
Constructor Detail
-
RobotsData
public RobotsData(java.io.InputStream is, long expiration, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) throws java.io.IOException, org.apache.manifoldcf.core.interfaces.ManifoldCFException
Constructor.
Throws:
- java.io.IOException
- org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
-
Method Detail
-
isFetchAllowed
public boolean isFetchAllowed(java.lang.String userAgent, java.lang.String pathString)
Check if fetch is allowed
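This page does not describe the matching rules behind isFetchAllowed, so the following is only a minimal sketch of how such a check is commonly done, not ManifoldCF's implementation. It assumes that a record naming the crawler's agent token takes precedence over the "*" record, and that the longest matching path prefix decides. The names FetchCheckSketch, Record, and Rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-in for parsed robots.txt data; not ManifoldCF code.
class FetchCheckSketch {
    // One rule line: a path prefix and whether it allows or disallows fetching.
    static final class Rule {
        final String prefix;
        final boolean allow;
        Rule(String prefix, boolean allow) { this.prefix = prefix; this.allow = allow; }
    }

    // One record: the agent token it applies to, plus its rules.
    static final class Record {
        final String agent;
        final List<Rule> rules = new ArrayList<>();
        Record(String agent) { this.agent = agent.toLowerCase(Locale.ROOT); }
        Record add(String prefix, boolean allow) { rules.add(new Rule(prefix, allow)); return this; }
    }

    final List<Record> records = new ArrayList<>();

    boolean isFetchAllowed(String userAgent, String pathString) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        Record chosen = null;
        for (Record r : records) {
            // Assumption: a record naming this agent beats the wildcard record.
            if (!r.agent.equals("*") && ua.contains(r.agent)) { chosen = r; break; }
            if (r.agent.equals("*") && chosen == null) chosen = r;
        }
        if (chosen == null) return true; // no applicable record: nothing is restricted

        Rule best = null;
        for (Rule rule : chosen.rules) {
            if (rule.prefix.isEmpty()) continue; // an empty Disallow restricts nothing
            // Assumption: the longest matching prefix decides the outcome.
            if (pathString.startsWith(rule.prefix)
                    && (best == null || rule.prefix.length() > best.prefix.length())) {
                best = rule;
            }
        }
        return best == null || best.allow;
    }
}
```

Under these assumptions, a path that matches no rule is allowed, which mirrors the usual robots.txt default.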
-
getExpirationTime
public long getExpirationTime()
Get expiration
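Because RobotsData is a cached data item, a caller would typically compare the expiration against the current time before reusing an entry. The sketch below illustrates that pattern under two loud assumptions that this page does not confirm: that the expiration is an absolute timestamp in milliseconds, and that an expired entry is simply dropped. ExpiringCacheSketch and Entry are hypothetical names, not part of ManifoldCF.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-host cache; only the expiration check mirrors the API above.
class ExpiringCacheSketch {
    static final class Entry {
        final String payload;
        final long expiration; // assumed: absolute time in milliseconds
        Entry(String payload, long expiration) {
            this.payload = payload;
            this.expiration = expiration;
        }
        long getExpirationTime() { return expiration; }
    }

    private final Map<String, Entry> byHost = new HashMap<>();

    void put(String hostName, Entry entry) { byHost.put(hostName, entry); }

    // Returns the cached entry, or null if it is missing or has expired at 'now'.
    Entry lookup(String hostName, long now) {
        Entry entry = byHost.get(hostName);
        if (entry == null) return null;
        if (now >= entry.getExpirationTime()) {
            byHost.remove(hostName); // expired: force a fresh fetch next time
            return null;
        }
        return entry;
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside lookup, keeps the check easy to test.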
-
parseRobotsTxt
protected void parseRobotsTxt(java.io.BufferedReader r, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) throws java.io.IOException, org.apache.manifoldcf.core.interfaces.ManifoldCFException
Parse the robots.txt file using a reader. Is NOT expected to close the stream.
Throws:
- java.io.IOException
- org.apache.manifoldcf.core.interfaces.ManifoldCFException
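The note that the method is not expected to close the stream implies that whoever opened the reader also closes it. The following sketch shows that ownership pattern with a deliberately simplified line parser. It is not ManifoldCF's parser: RobotsParseSketch, parse, and parseText are hypothetical names, and the "field: value" handling with "#" comments is an assumption based on the common robots.txt layout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of "the parser reads, the caller closes".
class RobotsParseSketch {
    // Collects {name, value} pairs from "field: value" lines. Does not close r.
    static List<String[]> parse(BufferedReader r) throws IOException {
        List<String[]> fields = new ArrayList<>();
        String line;
        while ((line = r.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) line = line.substring(0, hash); // drop trailing comment
            int colon = line.indexOf(':');
            if (colon < 0) continue; // blank or malformed line: skip it
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            fields.add(new String[] { name, value });
        }
        return fields;
    }

    // The caller opened the reader, so the caller closes it (try-with-resources).
    static List<String[]> parseText(String text) {
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            return parse(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the close in the caller means the same parse routine can be reused on a stream that must stay open afterward.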
-
-