Class ThrottledFetcher.ThrottledConnection
- java.lang.Object
  - org.apache.manifoldcf.crawler.connectors.rss.ThrottledFetcher.ThrottledConnection
- All Implemented Interfaces: IThrottledConnection
- Enclosing class: ThrottledFetcher
protected static class ThrottledFetcher.ThrottledConnection extends java.lang.Object implements IThrottledConnection
This class represents an established connection to a URL.
-
-
Field Summary
Fields (Modifier and Type, Field - Description):
protected ThrottledFetcher.AbortChecker abortChecker - Abort checker
protected org.apache.http.conn.HttpClientConnectionManager connectionManager - The client connection manager
protected org.apache.manifoldcf.connectorcommon.interfaces.IConnectionThrottler connectionThrottler - The throttling object we use to track connections
protected int connectionTimeoutMilliseconds - Connection timeout in milliseconds
protected org.apache.http.client.methods.HttpRequestBase executeMethod - The method object
protected long fetchCounter - The current bytes in the current fetch
protected org.apache.manifoldcf.connectorcommon.interfaces.IFetchThrottler fetchThrottler - The throttling object we use to track fetches
protected java.lang.String fetchType - The kind of fetch we are doing
protected org.apache.http.client.HttpClient httpClient - The httpclient
protected ThrottledFetcher.ExecuteMethodThread methodThread - The thread that is actually doing the work
protected java.lang.String myUrl - The current URL being fetched
protected java.lang.String serverName - The server fqdn
protected long startFetchTime - The start-fetch time
protected int statusCode - The status code fetched, if any
protected boolean threadStarted - Set if thread has been started
protected java.lang.Throwable throwable - The error trace, if any
Fields inherited from interface org.apache.manifoldcf.crawler.connectors.rss.IThrottledConnection
_rcsid, FETCH_BAD_URI, FETCH_CIRCULAR_REDIRECT, FETCH_IO_ERROR, FETCH_NOT_TRIED, FETCH_SEQUENCE_ERROR, FETCH_UNKNOWN_ERROR, STATUS_NOCHANGE, STATUS_OK, STATUS_PAGEERROR, STATUS_SITEERROR
-
-
Constructor Summary
Constructors (Constructor - Description):
ThrottledConnection(java.lang.String serverName, org.apache.manifoldcf.connectorcommon.interfaces.IConnectionThrottler connectionThrottler, int connectionTimeoutMilliseconds, int connectionLimit, java.lang.String proxyHost, int proxyPort, java.lang.String proxyAuthDomain, java.lang.String proxyAuthUsername, java.lang.String proxyAuthPassword, org.apache.manifoldcf.crawler.interfaces.IAbortActivity activities) - Constructor.
-
Method Summary
Methods (Modifier and Type, Method - Description):
void beginFetch(java.lang.String fetchType) - Begin the fetch process.
void close() - Close the connection.
void doneFetch(org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) - Done with the fetch.
int executeFetch(java.lang.String protocol, int port, java.lang.String urlPath, java.lang.String userAgent, java.lang.String from, java.lang.String lastETag, java.lang.String lastModified) - Execute the fetch and get the return code.
java.io.InputStream getResponseBodyStream() - Get the response input stream.
int getResponseCode() - Get the http response code.
java.lang.String getResponseHeader(java.lang.String headerName) - Get a specified response header, if it exists.
void logFetchCount(int count) - Log the fetch of a number of bytes.
-
-
-
Field Detail
-
serverName
protected final java.lang.String serverName
The server fqdn
-
connectionThrottler
protected final org.apache.manifoldcf.connectorcommon.interfaces.IConnectionThrottler connectionThrottler
The throttling object we use to track connections
-
fetchThrottler
protected final org.apache.manifoldcf.connectorcommon.interfaces.IFetchThrottler fetchThrottler
The throttling object we use to track fetches
-
connectionTimeoutMilliseconds
protected final int connectionTimeoutMilliseconds
Connection timeout in milliseconds
-
connectionManager
protected final org.apache.http.conn.HttpClientConnectionManager connectionManager
The client connection manager
-
httpClient
protected final org.apache.http.client.HttpClient httpClient
The httpclient
-
executeMethod
protected org.apache.http.client.methods.HttpRequestBase executeMethod
The method object
-
startFetchTime
protected long startFetchTime
The start-fetch time
-
throwable
protected java.lang.Throwable throwable
The error trace, if any
-
myUrl
protected java.lang.String myUrl
The current URL being fetched
-
statusCode
protected int statusCode
The status code fetched, if any
-
fetchType
protected java.lang.String fetchType
The kind of fetch we are doing
-
fetchCounter
protected long fetchCounter
The current bytes in the current fetch
-
methodThread
protected ThrottledFetcher.ExecuteMethodThread methodThread
The thread that is actually doing the work
-
threadStarted
protected boolean threadStarted
Set if thread has been started
-
abortChecker
protected final ThrottledFetcher.AbortChecker abortChecker
Abort checker
-
-
Constructor Detail
-
ThrottledConnection
public ThrottledConnection(java.lang.String serverName, org.apache.manifoldcf.connectorcommon.interfaces.IConnectionThrottler connectionThrottler, int connectionTimeoutMilliseconds, int connectionLimit, java.lang.String proxyHost, int proxyPort, java.lang.String proxyAuthDomain, java.lang.String proxyAuthUsername, java.lang.String proxyAuthPassword, org.apache.manifoldcf.crawler.interfaces.IAbortActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Constructor.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
-
Method Detail
-
beginFetch
public void beginFetch(java.lang.String fetchType) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Begin the fetch process.
- Specified by:
beginFetch in interface IThrottledConnection
- Parameters:
fetchType - is a short string describing the kind of fetch being requested. This is used solely for logging purposes.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
logFetchCount
public void logFetchCount(int count)
Log the fetch of a number of bytes.
-
executeFetch
public int executeFetch(java.lang.String protocol, int port, java.lang.String urlPath, java.lang.String userAgent, java.lang.String from, java.lang.String lastETag, java.lang.String lastModified) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Execute the fetch and get the return code. This method uses the standard logging mechanism to keep track of the fetch attempt. It also signals the following three conditions: ServiceInterruption (if a dynamic error occurs), OK, or a static error code (for a condition where retry is not likely to be helpful). The actual HTTP error code is NOT returned by this method.
- Specified by:
executeFetch in interface IThrottledConnection
- Parameters:
protocol - is the protocol to use to perform the access, e.g. "http"
port - is the port to use to perform the access, where -1 means "use the default"
urlPath - is the path part of the url, e.g. "/robots.txt"
userAgent - is the value of the userAgent header to use.
from - is the value of the from header to use.
lastETag - is the requested lastETag header value.
lastModified - is the requested lastModified header value.
- Returns:
the status code: success, static error, or dynamic error.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
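The protocol, port, and urlPath parameters of executeFetch correspond to pieces of a URL, with -1 standing for "use the default" port. As a minimal sketch of how a caller might derive those three values, the following uses java.net.URI, whose getPort() also returns -1 when no port is given. FetchTarget is a hypothetical helper, not part of ManifoldCF, and dropping the query string is an assumption: the documentation above does not say whether urlPath may carry one.

```java
import java.net.URI;

// Hypothetical helper (not part of ManifoldCF): splits a URL into the
// protocol / port / urlPath triple that executeFetch expects.
final class FetchTarget {
    final String protocol;
    final int port;        // -1 means "use the default", as in executeFetch
    final String urlPath;

    private FetchTarget(String protocol, int port, String urlPath) {
        this.protocol = protocol;
        this.port = port;
        this.urlPath = urlPath;
    }

    static FetchTarget parse(String url) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path is treated as the server root
        }
        // URI.getPort() returns -1 when the URL names no explicit port.
        // Assumption: the query string, if any, is deliberately ignored here.
        return new FetchTarget(uri.getScheme(), uri.getPort(), path);
    }

    public static void main(String[] args) {
        FetchTarget t = FetchTarget.parse("http://example.com/robots.txt");
        System.out.println(t.protocol + " " + t.port + " " + t.urlPath);
    }
}
```

Because both conventions use -1 for "no explicit port", the parsed value can be passed through unchanged rather than being resolved to 80 or 443 by the caller.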
-
getResponseCode
public int getResponseCode() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Get the http response code.
- Specified by:
getResponseCode in interface IThrottledConnection
- Returns:
the response code. This is either an HTTP response code, or one of the codes above.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
getResponseBodyStream
public java.io.InputStream getResponseBodyStream() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Get the response input stream. It is the responsibility of the caller to close this stream when done.
- Specified by:
getResponseBodyStream in interface IThrottledConnection
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
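Since the caller is responsible for closing the stream returned by getResponseBodyStream, a try-with-resources block is one straightforward way to guarantee that. The sketch below works on any java.io.InputStream; BodyReader and drainAndClose are hypothetical names introduced for illustration, and wrapping IOException in UncheckedIOException is a choice made here, not something the connector prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper (not part of ManifoldCF): reads a response body to the
// end and always closes it, as the getResponseBodyStream contract requires.
final class BodyReader {
    private BodyReader() {}

    // Returns the number of bytes read from the stream.
    static long drainAndClose(InputStream body) {
        // try-with-resources closes the stream on both normal and
        // exceptional exit from the block.
        try (InputStream in = body) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would pass the stream obtained from the connection directly to drainAndClose, or replace the counting loop with whatever parsing the response body needs.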
-
getResponseHeader
public java.lang.String getResponseHeader(java.lang.String headerName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Get a specified response header, if it exists.
- Specified by:
getResponseHeader in interface IThrottledConnection
- Parameters:
headerName - is the name of the header.
- Returns:
the header value, or null if it doesn't exist.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
doneFetch
public void doneFetch(org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Done with the fetch. Call this when the fetch has been completed. A log entry will be generated describing what was done.
- Specified by:
doneFetch in interface IThrottledConnection
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
close
public void close() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Close the connection. Call this to end this server connection.
- Specified by:
close in interface IThrottledConnection
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
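Read together, the method descriptions above suggest a call order of beginFetch, executeFetch, doneFetch, and finally close. The sketch below illustrates that ordering with nested try/finally blocks so that doneFetch and close still run when the fetch throws. It does not use the real API: FetchSession is a hypothetical, simplified stand-in for IThrottledConnection (same method names, but no checked exceptions and no IProcessActivity argument), RecordingSession is a fake that only records calls, and the return value 0 is an arbitrary placeholder rather than a real status constant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for IThrottledConnection.
interface FetchSession {
    void beginFetch(String fetchType);
    int executeFetch(String protocol, int port, String urlPath);
    void doneFetch();
    void close();
}

// Fake session that records the order in which its methods are called.
final class RecordingSession implements FetchSession {
    final List<String> calls = new ArrayList<>();
    private final boolean failOnExecute;

    RecordingSession(boolean failOnExecute) {
        this.failOnExecute = failOnExecute;
    }

    public void beginFetch(String fetchType) { calls.add("beginFetch"); }

    public int executeFetch(String protocol, int port, String urlPath) {
        calls.add("executeFetch");
        if (failOnExecute) {
            throw new IllegalStateException("simulated fetch failure");
        }
        return 0; // placeholder status, not a real STATUS_ constant
    }

    public void doneFetch() { calls.add("doneFetch"); }

    public void close() { calls.add("close"); }
}

final class FetchLifecycle {
    private FetchLifecycle() {}

    // Runs one fetch; doneFetch and close are reached even if executeFetch throws.
    static int fetchOnce(FetchSession session, String protocol, int port, String urlPath) {
        try {
            session.beginFetch("example");
            try {
                return session.executeFetch(protocol, port, urlPath);
            } finally {
                session.doneFetch(); // the fetch is over, successful or not
            }
        } finally {
            session.close(); // end the server connection
        }
    }

    public static void main(String[] args) {
        RecordingSession session = new RecordingSession(false);
        fetchOnce(session, "http", -1, "/robots.txt");
        System.out.println(session.calls);
    }
}
```

Placing beginFetch inside the outer try but outside the inner one means a failure in beginFetch closes the connection without logging a fetch that never started; whether the real connector makes the same choice is not stated in this page.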
-
-