Posted on 01-Feb-2009 at 09:21 pm by [Chetroesu]
I have often needed to extract the content of an HTML page, either to index that content or to implement a feature such as full-text search. I did this with the Apache Commons HttpClient component.
A simple example of using HttpClient
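import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public String getContentForUrl(String url) throws Exception {
    HttpClient client = new HttpClient();
    // A plain GET is enough to retrieve the content of a page.
    GetMethod method = new GetMethod();
    method.setURI(new org.apache.commons.httpclient.URI(url, true));
    // Retry up to 3 times, but not once the request has been fully sent.
    method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
            new DefaultHttpMethodRetryHandler(3, false));
    try {
        client.executeMethod(method);
        return method.getResponseBodyAsString();
    } finally {
        // Always hand the connection back to the connection manager.
        method.releaseConnection();
    }
}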
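The string returned by getContentForUrl is raw HTML, while the indexing and full-text search use cases mentioned above usually need plain text. The following is only a minimal sketch of that step, assuming that regex-based stripping is good enough for the pages involved. The class HtmlText and its method toPlainText are hypothetical names, not part of HttpClient, and only a handful of entities are decoded. A real HTML parser would be more robust on malformed markup.

```java
import java.util.regex.Pattern;

// Hypothetical helper: reduces an HTML string to plain text for indexing.
final class HtmlText {
    // <script> and <style> elements are dropped together with their bodies.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Any remaining tag, including comments and tags spanning several lines.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String toPlainText(String html) {
        if (html == null) {
            return "";
        }
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        // Tags become spaces so that adjacent blocks do not merge into one word.
        text = TAG.matcher(text).replaceAll(" ");
        // Decode a few common entities; "&amp;" goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }
}
```

A typical call would be HtmlText.toPlainText(getContentForUrl(url)). Because every tag is replaced by a space, a word interrupted by inline markup is split in two, which is usually acceptable for indexing but not for display.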
An example of using HttpClient through a proxy server
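import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public String getContentForUrl(String url) throws Exception {
    HttpClient client = new HttpClient();
    GetMethod method = new GetMethod();
    method.setURI(new org.apache.commons.httpclient.URI(url, true));
    method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
            new DefaultHttpMethodRetryHandler(3, false));

    // Placeholders: replace with the real proxy host, port and credentials.
    String proxyHost = "proxyHost";
    int proxyPort = 8080; // setProxy and AuthScope expect the port as an int
    boolean useProxy = true; // is a proxy used?
    if (useProxy) {
        client.getHostConfiguration().setProxy(proxyHost, proxyPort);
        AuthScope authScope = new AuthScope(proxyHost, proxyPort);
        UsernamePasswordCredentials proxyCredentials = new UsernamePasswordCredentials("", "");
        client.getState().setProxyCredentials(authScope, proxyCredentials);
        // Send the credentials up front instead of waiting for a challenge.
        client.getParams().setAuthenticationPreemptive(true);
    }
    try {
        client.executeMethod(method);
        return method.getResponseBodyAsString();
    } finally {
        method.releaseConnection();
    }
}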
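In the proxy example the question "is a proxy used?" is hard-coded, and the proxy port has to reach HttpClient as an int. One possible way to make both configurable is sketched below. The class ProxySettings is a hypothetical helper, not part of HttpClient. Reading the standard JDK properties http.proxyHost and http.proxyPort, and falling back to port 80 when no port is given, are assumptions of this sketch.

```java
// Hypothetical holder for proxy configuration; not part of HttpClient.
final class ProxySettings {
    private final String host; // null means "no proxy configured"
    private final int port;

    private ProxySettings(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Builds settings from textual configuration values.
    static ProxySettings parse(String host, String portText) {
        if (host == null || host.trim().isEmpty()) {
            return new ProxySettings(null, -1);
        }
        int port = 80; // assumed default when no port is configured
        if (portText != null && !portText.trim().isEmpty()) {
            try {
                port = Integer.parseInt(portText.trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Proxy port is not a number: " + portText, e);
            }
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Proxy port out of range: " + port);
        }
        return new ProxySettings(host.trim(), port);
    }

    // Assumption: the proxy is configured through the usual JDK system properties.
    static ProxySettings fromSystemProperties() {
        return parse(System.getProperty("http.proxyHost"), System.getProperty("http.proxyPort"));
    }

    boolean isEnabled() {
        return host != null;
    }

    String getHost() {
        return host;
    }

    int getPort() {
        return port;
    }
}
```

With such a helper, the proxy block would run only when settings.isEnabled() returns true, and settings.getHost() and settings.getPort() would be passed to both setProxy and the AuthScope constructor.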