Chandrashekhar Mullaparthi edited this page Feb 17, 2014 · 16 revisions
Module ibrowse

The ibrowse application implements an HTTP 1.1 client in Erlang.

Copyright © 2005-2012 Chandrashekhar Mullaparthi

Version: 4.0.0

Behaviours: gen_server.

Authors: Chandrashekhar Mullaparthi (chandrashekhar dot mullaparthi at gmail dot com).

The ibrowse application implements an HTTP 1.1 client in Erlang. This module implements the API of the HTTP client. There is one named process called ‘ibrowse’ which assists in load balancing and maintaining configuration. There is one load balancing process per unique webserver. There is one process to handle one TCP connection to a webserver (implemented in the module ibrowse_http_client). Multiple connections to a webserver are set up based on the settings for each webserver. The ibrowse process also determines which connection to pipeline a certain request on. The functions to call are send_req/3, send_req/4, send_req/5, send_req/6.

Here are a few sample invocations.

ibrowse:send_req("http://intranet/messenger/", [], get).

ibrowse:send_req("http://www.google.com/", [], get, [], [{proxy_user, "XXXXX"}, {proxy_password, "XXXXX"}, {proxy_host, "proxy"}, {proxy_port, 8080}], 1000).

ibrowse:send_req("http://www.erlang.org/download/otp_src_R10B-3.tar.gz", [], get, [], [{proxy_user, "XXXXX"}, {proxy_password, "XXXXX"}, {proxy_host, "proxy"}, {proxy_port, 8080}, {save_response_to_file, true}], 1000).

ibrowse:send_req("http://www.erlang.org", [], head).

ibrowse:send_req("http://www.sun.com", [], options).

ibrowse:send_req("http://www.bbc.co.uk", [], trace).

ibrowse:send_req("http://www.google.com", [], get, [], [{stream_to, self()}]).

Function Index

all_trace_off/0 Turn Off ALL tracing.
code_change/3
get_config_value/1 Internal export.
get_config_value/2 Internal export.
handle_call/3
handle_cast/2
handle_info/2
init/1
rescan_config/0 Clear current configuration for ibrowse and load from the file ibrowse.conf in the IBROWSE_EBIN/../priv directory.
rescan_config/1
send_req/3 This is the basic function to send an HTTP request.
send_req/4 Same as send_req/3.
send_req/5 Same as send_req/4.
send_req/6 Same as send_req/5.
send_req_direct/4 Same as send_req/3 except that the first argument is the PID returned by spawn_worker_process/2 or spawn_link_worker_process/2.
send_req_direct/5 Same as send_req/4 except that the first argument is the PID returned by spawn_worker_process/2 or spawn_link_worker_process/2.
send_req_direct/6 Same as send_req/5 except that the first argument is the PID returned by spawn_worker_process/2 or spawn_link_worker_process/2.
send_req_direct/7 Same as send_req/6 except that the first argument is the PID returned by spawn_worker_process/2 or spawn_link_worker_process/2.
set_dest/3 Deprecated.
set_max_pipeline_size/3 Set the maximum pipeline size for each connection to a specific Host:Port.
set_max_sessions/3 Set the maximum number of connections allowed to a specific Host:Port.
show_dest_status/0 Shows some internal information about load balancing.
show_dest_status/2 Shows some internal information about load balancing to a specified Host:Port.
spawn_link_worker_process/1 Same as spawn_worker_process/1 except that the calling process is linked to the worker process which is spawned.
spawn_link_worker_process/2 Same as spawn_worker_process/2 except that the calling process is linked to the worker process which is spawned.
spawn_worker_process/1 Creates an HTTP client process to the specified Host:Port which is not part of the load balancing pool.
spawn_worker_process/2 Same as spawn_worker_process/1 but takes as input a Host and Port instead of a URL.
start/0 Starts the ibrowse process without linking.
start_link/0 Starts the ibrowse process linked to the calling process.
stop/0 Stop the ibrowse process.
stop_worker_process/1 Terminate a worker process spawned using spawn_worker_process/2 or spawn_link_worker_process/2.
stream_close/1 Tell ibrowse to close the connection associated with the specified stream.
stream_next/1 Tell ibrowse to stream the next chunk of data to the caller.
terminate/2
trace_off/0 Turn tracing off for the ibrowse process.
trace_off/2 Turn tracing OFF for all connections to the specified HTTP server.
trace_on/0 Turn tracing on for the ibrowse process.
trace_on/2 Turn tracing on for all connections to the specified HTTP server.

Function Details

all_trace_off() -> ok

Turn Off ALL tracing.

code_change(OldVsn, State, Extra) -> any()

get_config_value(Key) -> any()

Internal export

get_config_value(Key, DefVal) -> any()

Internal export

handle_call(Request, From, State) -> any()

handle_cast(Msg, State) -> any()

handle_info(Info, State) -> any()

init(X1) -> any()

rescan_config() -> any()

Clear current configuration for ibrowse and load from the file
ibrowse.conf in the IBROWSE_EBIN/../priv directory. Current
configuration is cleared only if the ibrowse.conf file is readable
using file:consult/1.

rescan_config(File) -> any()

send_req(Url::string(), Headers::headerList(), Method::method()) -> response()

This is the basic function to send an HTTP request. A default timeout value of 30 seconds applies to the request. To specify exact values for timeout handling, see the options inactivity_timeout and connect_timeout, or use send_req/6.
The Status return value indicates the HTTP status code returned by the webserver.
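
As a rough illustration of consuming the result of send_req/3, the sketch below maps a response onto a smaller term. The module name ibrowse_get_example and its function names are hypothetical. It assumes that ibrowse has already been started, that a successful response is the 4-tuple {ok, Status, Headers, Body}, and that Status arrives as a string such as "200".

```erlang
-module(ibrowse_get_example).
-export([fetch/1, classify/1]).

%% Issue a GET with no extra headers (assumes ibrowse is running).
fetch(Url) ->
    classify(ibrowse:send_req(Url, [], get)).

%% Reduce a response to a smaller result term.
%% Assumption: Status is a string such as "200".
classify({ok, "200", _Headers, Body}) -> {ok, Body};
classify({ok, Status, _Headers, _Body}) -> {http_error, Status};
classify({error, Reason}) -> {error, Reason}.
```

Keeping classify/1 separate from the network call makes the response handling easy to exercise without a webserver.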

send_req(Url, Headers, Method::method(), Body::body()) -> response()

Same as send_req/3.
If a list is specified for the body, it has to be a flat list. The body can also be a fun/0 or a fun/1.
If fun/0, the connection handling process will repeatedly call the fun until it returns an error or eof.

Fun() = {ok, Data} | eof

If fun/1, the connection handling process will repeatedly call the fun with the supplied state until it returns an error or eof.

Fun(State) = {ok, Data} | {ok, Data, NewState} | eof
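
The fun/1 contract above can be sketched with a producer whose state is the list of chunks still to send. The module body_fun_example is hypothetical, and drain/2 is only a local stand-in for the connection handling process so the contract can be tried without a server. It is not ibrowse code.

```erlang
-module(body_fun_example).
-export([next_chunk/1, drain/2]).

%% fun/1-style producer: State is the list of chunks not yet sent.
next_chunk([]) -> eof;
next_chunk([Chunk | Rest]) -> {ok, Chunk, Rest}.

%% Local stand-in for the connection process: call Fun until it
%% returns eof and collect the data in the order it was produced.
drain(Fun, State) -> drain(Fun, State, []).

drain(Fun, State, Acc) ->
    case Fun(State) of
        eof -> lists:reverse(Acc);
        {ok, Data} -> drain(Fun, State, [Data | Acc]);
        {ok, Data, NewState} -> drain(Fun, NewState, [Data | Acc])
    end.
```

How the initial state is handed to ibrowse is not stated here. A plausible form, offered as an assumption, is a {Fun, InitialState} tuple passed as the Body argument.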

send_req(Url::string(), Headers::headerList(), Method::method(), Body::body(), Options::optionList()) -> response()

Same as send_req/4.
For a description of SSL Options, look in the ssl manpage. If the
HTTP Version to use is not specified, the default is 1.1.


  • The host_header option is useful in the case where ibrowse is connecting to a component such as stunnel which then sets up a secure connection to a webserver. In this case, the URL supplied to ibrowse must have the stunnel host/port details, but that won’t make sense to the destination webserver. This option can then be used to specify what should go in the Host header in the request.
  • The stream_to option can be used to have the HTTP response streamed to a process as messages as data arrives on the socket. If the calling process wishes to control the rate at which data is received from the server, the option {stream_to, {process(), once}} can be specified. The calling process will have to invoke ibrowse:stream_next(Request_id) to receive the next packet.
  • When both the options save_response_to_file and stream_to are specified, the former takes precedence.
  • For the save_response_to_file option, the response body is saved to file only if the status code is in the 200-299 range. If not, the response body is returned as a string.
  • Whenever an error occurs in the processing of a request, ibrowse will return as much information as it has, such as HTTP Status Code and HTTP Headers. When this happens, the response is of the form {error, {Reason, {stat_code, StatusCode}, HTTP_headers}}
  • The inactivity_timeout option is useful when dealing with large response bodies and/or slow links. In these cases, it might be hard to estimate how long a request will take to complete. In such cases, the client might want to time out if no data has been received on the link for a certain time interval.
    This value is also used to close connections which are not in use for the specified timeout value.
  • The connect_timeout option specifies how long the client process should wait for connection establishment. This is useful in scenarios where connections to servers are usually set up very fast, but responses might take much longer compared to connection setup. In such cases, it is better for the calling process to time out faster if there is a problem (DNS lookup delays/failures, network routing issues, etc). The total timeout value specified for the request will still be enforced. To illustrate using an example: ibrowse:send_req("http://www.example.com/cgi-bin/request", [], get, [], [{connect_timeout, 100}], 1000). In the above invocation, if the connection isn't established within 100 milliseconds, the request will fail with {error, conn_failed}.
    If connection setup succeeds, the total time allowed for the request to complete will be 1000 milliseconds minus the time taken for connection setup.
  • The socket_options option can be used to set specific options on the socket. The {active, true | false | once} and {packet_type, Packet_type} options will be filtered out by ibrowse.
  • The headers_as_is option is to enable the caller to send headers exactly as specified in the request without ibrowse adding some of its own. Required for some picky servers apparently.
  • The give_raw_headers option is to enable the caller to get access to the raw status line and raw unparsed headers. Not quite sure why someone would want this, but one of my users asked for it, so here it is.
  • The preserve_chunked_encoding option enables the caller to receive the raw data stream when the Transfer-Encoding of the server response is Chunked.
  • The return_raw_request option enables the caller to get the exact request which was sent by ibrowse to the server, along with the response. When this option is used, the response for synchronous requests is a 5-tuple instead of the usual 4-tuple. For asynchronous requests, the calling process gets a message {ibrowse_async_raw_req, Raw_req}.
  • The basic_auth option works whether you supply it in the ‘Header’ or ‘Options’ parameter. The recommended usage is to supply it as part of ‘Options’.
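
The {stream_to, {process(), once}} flow described in the list above might look roughly like the following sketch. The module stream_once_example is hypothetical. The message shapes {ibrowse_async_headers, ReqId, Status, Headers}, {ibrowse_async_response, ReqId, Data}, and {ibrowse_async_response_end, ReqId} are assumptions not spelled out on this page, as are the {ibrowse_req_id, ReqId} return value and the need to call stream_next/1 once right after the request is issued.

```erlang
-module(stream_once_example).
-export([fetch/1, step/2]).

%% Start a streamed GET and pull messages one at a time.
fetch(Url) ->
    Options = [{stream_to, {self(), once}}],
    case ibrowse:send_req(Url, [], get, [], Options) of
        {ibrowse_req_id, ReqId} ->
            ibrowse:stream_next(ReqId),   % assumed to be needed up front
            loop(ReqId, []);
        {error, Reason} ->
            {error, Reason};
        Other ->
            {unexpected, Other}
    end.

%% Receive one message for this request, then ask for the next.
loop(ReqId, Acc) ->
    receive
        Msg when is_tuple(Msg), tuple_size(Msg) >= 2,
                 element(2, Msg) =:= ReqId ->
            case step(Msg, Acc) of
                {more, NewAcc} ->
                    ibrowse:stream_next(ReqId),
                    loop(ReqId, NewAcc);
                {stop, Result} ->
                    Result
            end
    after 30000 ->
        {error, timeout}
    end.

%% Pure handling of a single stream message (assumed shapes).
step({ibrowse_async_headers, _Id, _Status, _Hdrs}, Acc) ->
    {more, Acc};
step({ibrowse_async_response, _Id, {error, Reason}}, _Acc) ->
    {stop, {error, Reason}};
step({ibrowse_async_response, _Id, Data}, Acc) ->
    {more, [Data | Acc]};
step({ibrowse_async_response_end, _Id}, Acc) ->
    {stop, {ok, lists:reverse(Acc)}}.
```

Requesting the next packet only after the previous one has been handled is what gives the caller control over the receive rate.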

send_req(Url, Headers::headerList(), Method::method(), Body::body(), Options::optionList(), Timeout) -> response()

  • Timeout = integer() | infinity

Same as send_req/5.
All timeout values are in milliseconds.
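
To make the interaction between the timeout settings concrete, here is a minimal sketch that combines connect_timeout and inactivity_timeout with an overall Timeout. The module timeout_example and the chosen values are hypothetical. The only error term taken from this page is {error, conn_failed}, and every other failure is passed through unchanged.

```erlang
-module(timeout_example).
-export([fetch/1, describe/1]).

%% 100 ms to connect, 500 ms of allowed silence, 1000 ms overall.
fetch(Url) ->
    Options = [{connect_timeout, 100}, {inactivity_timeout, 500}],
    describe(ibrowse:send_req(Url, [], get, [], Options, 1000)).

%% Separate connection failures from other outcomes.
describe({ok, Status, _Headers, _Body}) -> {completed, Status};
describe({error, conn_failed}) -> connect_failed;
describe({error, Reason}) -> {failed, Reason}.
```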

send_req_direct(Conn_pid, Url, Headers, Method) -> any()

Same as send_req/3 except that the first argument is the PID
returned by spawn_worker_process/2 or spawn_link_worker_process/2.

send_req_direct(Conn_pid, Url, Headers, Method, Body) -> any()

Same as send_req/4 except that the first argument is the PID
returned by spawn_worker_process/2 or spawn_link_worker_process/2.

send_req_direct(Conn_pid, Url, Headers, Method, Body, Options) -> any()

Same as send_req/5 except that the first argument is the PID
returned by spawn_worker_process/2 or spawn_link_worker_process/2.

send_req_direct(Conn_pid, Url, Headers, Method, Body, Options, Timeout) -> any()

Same as send_req/6 except that the first argument is the PID
returned by spawn_worker_process/2 or spawn_link_worker_process/2.

set_dest(Host, Port, T) -> any()

Deprecated. Use set_max_sessions/3 and set_max_pipeline_size/3
for achieving the same effect.

set_max_pipeline_size(Host::string(), Port::integer(), Max::integer()) -> ok

Set the maximum pipeline size for each connection to a specific Host:Port.

set_max_sessions(Host::string(), Port::integer(), Max::integer()) -> ok

Set the maximum number of connections allowed to a specific Host:Port.
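
The two limits can be set together for one destination, as in the sketch below. The module dest_limits_example is hypothetical. The max_in_flight/2 helper rests on an assumption: that every connection can be filled to its pipeline limit, so the product of the two settings is an upper bound on outstanding requests to that Host:Port.

```erlang
-module(dest_limits_example).
-export([configure/4, max_in_flight/2]).

%% Apply both limits to Host:Port (assumes ibrowse is running).
configure(Host, Port, MaxSessions, MaxPipeline) ->
    ok = ibrowse:set_max_sessions(Host, Port, MaxSessions),
    ok = ibrowse:set_max_pipeline_size(Host, Port, MaxPipeline),
    {ok, max_in_flight(MaxSessions, MaxPipeline)}.

%% Assumed upper bound on requests outstanding at the same time.
max_in_flight(MaxSessions, MaxPipeline) ->
    MaxSessions * MaxPipeline.
```

For example, dest_limits_example:configure("www.example.com", 80, 10, 5) would allow at most 10 connections with up to 5 pipelined requests each.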

show_dest_status() -> any()

Shows some internal information about load balancing. Info
about workers spawned using spawn_worker_process/2 or
spawn_link_worker_process/2 is not included.

show_dest_status(Host, Port) -> any()

Shows some internal information about load balancing to a
specified Host:Port. Info about workers spawned using
spawn_worker_process/2 or spawn_link_worker_process/2 is not
included.

spawn_link_worker_process(Url::string()) -> {ok, pid()}

Same as spawn_worker_process/1 except that the calling process
is linked to the worker process which is spawned.

spawn_link_worker_process(Host::string(), Port::integer()) -> {ok, pid()}

Same as spawn_worker_process/2 except that the calling process
is linked to the worker process which is spawned.

spawn_worker_process(Url::string()) -> {ok, pid()}

Creates an HTTP client process to the specified Host:Port which is not part of the load balancing pool. This is useful in cases where some requests to a webserver might take a long time whereas some might take a very short time. To avoid getting these quick requests stuck in the pipeline behind time-consuming requests, use this function to get a handle to a connection process.
Note: Calling this function only creates a worker process. No connection is set up. The connection attempt is made only when the first request is sent via any of the send_req_direct/4,5,6,7 functions.
Note: It is the responsibility of the calling process to control pipeline size on such connections.
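
One way to use a dedicated worker is to wrap its lifetime around the requests that need it, as in this sketch. The module worker_example and the function with_worker/2 are hypothetical. It assumes ibrowse is running and that stopping the worker after use is the desired behaviour.

```erlang
-module(worker_example).
-export([with_worker/2]).

%% Spawn a worker outside the load balancing pool, run Fun(ConnPid),
%% and stop the worker afterwards even if Fun raises.
with_worker(Url, Fun) ->
    {ok, ConnPid} = ibrowse:spawn_worker_process(Url),
    try
        Fun(ConnPid)
    after
        ibrowse:stop_worker_process(ConnPid)
    end.
```

A caller could then issue a slow request on its own connection, for example with_worker(Url, fun(Pid) -> ibrowse:send_req_direct(Pid, Url, [], get) end), so that it does not delay quick requests queued in the shared pool.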

spawn_worker_process(Host::string(), Port::integer()) -> {ok, pid()}

Same as spawn_worker_process/1 but takes as input a Host and Port
instead of a URL.

start() -> any()

Starts the ibrowse process without linking. Useful when testing using the shell.

start_link() -> {ok, pid()}

Starts the ibrowse process linked to the calling process. Usually invoked by the supervisor ibrowse_sup.

stop() -> any()

Stop the ibrowse process. Useful when testing using the shell.

stop_worker_process(Conn_pid::pid()) -> ok

Terminate a worker process spawned using spawn_worker_process/2 or spawn_link_worker_process/2. Requests in progress will get the error response:

{error, closing_on_request}

stream_close(Req_id::req_id()) -> ok | {error, unknown_req_id}

Tell ibrowse to close the connection associated with the specified stream. Should be used in conjunction with the stream_to option. Note that all requests in progress on the connection which is serving this Req_id will be aborted, and an error returned.

stream_next(Req_id::req_id()) -> ok | {error, unknown_req_id}

Tell ibrowse to stream the next chunk of data to the caller. Should be used in conjunction with the stream_to option.

terminate(Reason, State) -> any()

trace_off() -> any()

Turn tracing off for the ibrowse process.

trace_off(Host, Port) -> ok

Turn tracing OFF for all connections to the specified HTTP server.

trace_on() -> any()

Turn tracing on for the ibrowse process.

trace_on(Host, Port) -> ok

  • Host = string()
  • Port = integer()

Turn tracing on for all connections to the specified HTTP server. Host is whatever is specified as the domain name in the URL.




Generated by EDoc, Nov 10 2010, 06:04:33.