2000-05-22 21:35:35 +04:00
                                  _   _ ____  _
                              ___| | | |  _ \| |
                             / __| | | | |_) | |
                            | (__| |_| |  _ <| |___
                             \___|\___/|_| \_\_____|
                                        INTERNALS

The project is split in two: the library and the client. The client part
uses the library, but the library is designed to allow other applications to
use it.

Thus, the largest amount of code and complexity is in the library part.

CVS
===

All changes to the sources are committed to the CVS repository as soon as
they're somewhat verified to work. Changes shall be committed as
independently as possible so that individual changes can be more easily
spotted and tracked afterwards.

Tagging shall be used extensively, and by the time we release new archives we
should tag the sources with a name similar to the released version number.

Windows vs Unix
===============

There are a few differences in how to program curl the unix way compared to
the Windows way. The four most notable details are:

1. Different function names for close(), read(), write()
2. Windows requires a couple of init calls for the socket stuff
3. The file descriptors for network communication and file operations are
   not easily interchangeable as in unix
4. When writing data to stdout, Windows makes end-of-lines the DOS way, thus
   destroying binary data, although you do want that conversion if it is
   text coming through... (sigh)

In curl, (1) is handled with defines and macros, so that the source looks the
same at all places except for the header file that defines them.

(2) must be done by the application that uses libcurl; in curl that means
src/main.c has some code #ifdef'ed to do just that.

(3) is simply avoided by not trying any funny tricks on file descriptors.

(4) is handled by setting stdout to binary mode under Windows.

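As a rough illustration of points (1), (2) and (4), the sketch below shows how
one header-style block of macros can hide the platform differences from the
rest of the code. The names sclose() and platform_init() are made up for this
example and are not claimed to match the curl sources; WSAStartup(),
closesocket() and _setmode() are the regular Windows calls.

```c
#include <stdio.h>

#ifdef _WIN32
#include <winsock2.h>
#include <io.h>
#include <fcntl.h>
#define sclose(fd) closesocket(fd)   /* (1) socket close, Windows name */
#else
#include <unistd.h>
#define sclose(fd) close(fd)         /* (1) socket close, unix name */
#endif

/* Hypothetical helper that an application would call once at startup. */
static int platform_init(void)
{
#ifdef _WIN32
  WSADATA data;
  if(WSAStartup(MAKEWORD(2, 0), &data) != 0)  /* (2) socket init call */
    return -1;
  _setmode(_fileno(stdout), _O_BINARY);       /* (4) binary stdout */
#endif
  return 0;
}
```

Code that calls sclose() reads the same on both platforms; only the block of
defines at the top differs.
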
Inside the source code, I do make an effort to avoid '#ifdef WIN32'. All
conditionals that deal with features *should* instead be in the format
'#ifdef HAVE_THAT_WEIRD_FUNCTION'. Since Windows can't run configure scripts,
I maintain two config-win32.h files (one in / and one in src/) that are
supposed to look exactly as a config.h file would have looked on a Windows
machine!

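A minimal sketch of that feature-test style follows. HAVE_STRCASECMP is
defined by hand here as a stand-in for what config.h (or config-win32.h)
would provide, and demo_strequal() is a hypothetical name picked for the
example.

```c
#include <ctype.h>

/* Stand-in for config.h; defined by hand for this sketch. */
#define HAVE_STRCASECMP 1

#ifdef HAVE_STRCASECMP
#include <strings.h>
#endif

/* Returns 1 if the two strings are equal ignoring case, otherwise 0. */
static int demo_strequal(const char *a, const char *b)
{
#ifdef HAVE_STRCASECMP
  return strcasecmp(a, b) == 0;
#else
  /* portable fallback for systems without strcasecmp() */
  while(*a && *b) {
    if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b;
#endif
}
```

The conditional asks whether a function exists, not which platform is being
built, so a new platform only needs a correct config file.
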
Library
=======

As described elsewhere, libcurl is meant to get two different "layers" of
interfaces. At the present point only the high-level, the "easy", interface
has been fully implemented and documented. We assume the easy interface in
this description; the low-level interface will be documented when fully
implemented.

There are plenty of entry points to the library, namely each publicly defined
function that libcurl offers to applications. All of those functions are
rather small and easy to follow. All the ones prefixed with 'curl_easy' are
put in the lib/easy.c file.

curl_easy_init() allocates an internal struct and makes some initializations.
The returned handle does not reveal internals.

curl_easy_setopt() takes three arguments: the handle, followed by the option
passed as a pair of parameter-ID and parameter-value. The list of options is
documented in the man page.

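The opaque-handle idea and the "ID plus value" option pair can be sketched in
a few lines. Everything named demo_* or DEMO_* below is hypothetical and only
mimics the shape of the easy interface; it is not how lib/easy.c is written.

```c
#include <stdarg.h>
#include <stdlib.h>

typedef void DEMO;                 /* opaque handle seen by applications */

enum { DEMO_OPT_URL, DEMO_OPT_VERBOSE };

struct demo_internal {             /* only known inside the library */
  const char *url;
  long verbose;
};

/* Allocates and zero-initializes the internal struct. */
DEMO *demo_init(void)
{
  return calloc(1, sizeof(struct demo_internal));
}

/* Three arguments: handle, parameter-ID, parameter-value. */
int demo_setopt(DEMO *handle, int id, ...)
{
  struct demo_internal *data = handle;
  va_list ap;
  int rc = 0;

  va_start(ap, id);
  switch(id) {
  case DEMO_OPT_URL:
    data->url = va_arg(ap, const char *);
    break;
  case DEMO_OPT_VERBOSE:
    data->verbose = va_arg(ap, long);
    break;
  default:
    rc = -1;                       /* unknown option */
  }
  va_end(ap);
  return rc;
}

/* Frees what demo_init() allocated. */
void demo_cleanup(DEMO *handle)
{
  free(handle);
}
```

Because the value travels through a variable argument list, the caller has to
pass exactly the type the option expects, for example 1L rather than 1 for a
long.
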
curl_easy_perform() does a whole lot of things:

The function analyzes the URL, gets the different components and connects to
the remote host. This may involve using a proxy and/or using SSL. The
GetHost() function in lib/hostip.c is used for looking up host names.

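To give an idea of what "getting the different components" of a URL means,
here is a small sketch built on sscanf(). The struct and the split_url()
function are invented for the example, and it deliberately ignores user
names, passwords and IPv6 literals, so it is far simpler than what a real
parser has to handle.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct url_parts {
  char scheme[16];
  char host[256];
  long port;          /* 0 means "use the protocol default" */
  char path[1024];
};

/* Returns 0 on success, -1 if the string does not look like a URL. */
static int split_url(const char *url, struct url_parts *out)
{
  char *colon;
  int n = sscanf(url, "%15[^:]://%255[^/]%1023s",
                 out->scheme, out->host, out->path);
  if(n < 2)
    return -1;
  if(n == 2)
    strcpy(out->path, "/");       /* no path given */

  out->port = 0;
  colon = strchr(out->host, ':');
  if(colon) {
    *colon = '\0';                /* cut "host:port" at the colon */
    out->port = strtol(colon + 1, NULL, 10);
  }
  return 0;
}
```

With the scheme known, the caller can pick the protocol function, and the
host and port are what the connect step needs.
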
When connected, the proper function is called. The functions are named after
the protocols they handle: ftp(), http(), dict(), etc. They all reside in
their respective files (ftp.c, http.c and dict.c).

The protocol-specific functions deal with protocol-specific negotiations and
setup. They have access to the sendf() function (from lib/sendf.c) to send
printf-style formatted data to the remote host, and when they're ready to
make the actual file transfer they call the Transfer() function (in
lib/download.c) to do the transfer. All printf()-style functions use the
supplied clones in lib/mprintf.c.

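The idea behind a printf-style send function can be sketched as follows:
format into a buffer, then write the whole buffer to the descriptor. The name
demo_sendf() and the fixed 1024-byte buffer are assumptions made for this
example; the real sendf() uses the mprintf clones mentioned above and is not
reproduced here.

```c
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Formats the message and writes all of it to 'fd'.
   Returns the number of bytes sent, or -1 on error. */
static int demo_sendf(int fd, const char *fmt, ...)
{
  char buf[1024];
  va_list ap;
  int len;
  size_t sent = 0;

  va_start(ap, fmt);
  len = vsnprintf(buf, sizeof(buf), fmt, ap);
  va_end(ap);
  if(len < 0 || (size_t)len >= sizeof(buf))
    return -1;                    /* formatting failed or did not fit */

  while(sent < (size_t)len) {     /* write() may send less than asked */
    ssize_t n = write(fd, buf + sent, (size_t)len - sent);
    if(n < 0)
      return -1;
    sent += (size_t)n;
  }
  return len;
}
```

A protocol function could then issue a command with a single call such as
demo_sendf(fd, "USER %s\r\n", user).
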
While transferring, the progress functions in lib/progress.c are called at a
frequent interval (or at the user's choice, a specified callback might get
called). The speedcheck functions in lib/speedcheck.c are also used to verify
that the transfer is as fast as required.

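One way such a speed check could work is sketched below, assuming the rule
"give up if the speed stays under a limit for a given number of seconds".
The struct, the field names and speed_ok() are hypothetical and are not
taken from lib/speedcheck.c.

```c
struct speed_state {
  long limit_bps;      /* lowest acceptable speed, bytes per second */
  long max_slow_secs;  /* how long we tolerate being under the limit */
  long slow_since;     /* when the slow period began, -1 if not slow */
};

/* Called periodically with the current time and measured speed.
   Returns 1 while the transfer is acceptable, 0 when it should abort. */
static int speed_ok(struct speed_state *s, long now_sec, long current_bps)
{
  if(current_bps >= s->limit_bps) {
    s->slow_since = -1;           /* fast enough again, reset */
    return 1;
  }
  if(s->slow_since < 0)
    s->slow_since = now_sec;      /* slow period starts now */
  return (now_sec - s->slow_since) < s->max_slow_secs;
}
```

The state has to survive between calls, which is why it lives in a struct
owned by the caller rather than in the function.
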
When completed, curl_easy_cleanup() should be called to free up used
resources.

HTTP(S)

HTTP offers a lot and is the protocol in curl that uses the most lines of
code. There is a special file (lib/formdata.c) that offers all the multipart
post functions.

Base64 functions for user+password stuff are in lib/base64.c and all
functions for parsing and sending cookies are found in lib/cookie.c.

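For readers who haven't seen why base64 is needed for "user+password stuff":
HTTP Basic authentication sends "user:password" base64-encoded. The sketch
below is a plain textbook encoder with made-up function names, not the code
in lib/base64.c.

```c
#include <stddef.h>
#include <string.h>

static const char b64_table[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes 'len' bytes into 'out' as a zero-terminated string.
   Returns the encoded length, or -1 if 'out' is too small. */
static int b64_encode(const unsigned char *in, size_t len,
                      char *out, size_t outsize)
{
  size_t i, o = 0;
  if(outsize < 4 * ((len + 2) / 3) + 1)
    return -1;
  for(i = 0; i < len; i += 3) {
    /* pack up to three bytes into a 24-bit group */
    unsigned long v = (unsigned long)in[i] << 16;
    if(i + 1 < len)
      v |= (unsigned long)in[i + 1] << 8;
    if(i + 2 < len)
      v |= in[i + 2];
    out[o++] = b64_table[(v >> 18) & 63];
    out[o++] = b64_table[(v >> 12) & 63];
    out[o++] = (i + 1 < len) ? b64_table[(v >> 6) & 63] : '=';
    out[o++] = (i + 2 < len) ? b64_table[v & 63] : '=';
  }
  out[o] = '\0';
  return (int)o;
}

/* Encodes a "user:password" string for an Authorization header. */
static int basic_auth_token(const char *userpwd, char *out, size_t outsize)
{
  return b64_encode((const unsigned char *)userpwd, strlen(userpwd),
                    out, outsize);
}
```

The resulting token is what follows "Authorization: Basic " in the request.
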
HTTPS uses almost exactly the same procedure as HTTP, with only two
exceptions: the connect procedure is different and the function used to read
from or write to the socket is different, although the latter fact is hidden
in the source by the use of curl_read() for reading and curl_write() for
writing data to the remote server.

FTP

The if2ip() function can be used for getting the IP address of a specified
network interface, and it resides in lib/if2ip.c. It is only used for the FTP
PORT command.

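The reason the local IP address is needed becomes clear when you look at what
a PORT command contains: the four address bytes and the port split into two
bytes, all comma-separated. The sketch below builds such a line from a
dotted-quad string; format_port_cmd() is a name invented for the example.

```c
#include <stdio.h>
#include <string.h>

/* Writes "PORT h1,h2,h3,h4,p1,p2\r\n" into buf.
   Returns 0 on success, -1 on bad input or too small a buffer. */
static int format_port_cmd(const char *ip, unsigned int port,
                           char *buf, size_t size)
{
  unsigned int a, b, c, d;
  int len;

  /* only digits and dots are acceptable in the address */
  if(strspn(ip, "0123456789.") != strlen(ip))
    return -1;
  if(sscanf(ip, "%3u.%3u.%3u.%3u", &a, &b, &c, &d) != 4)
    return -1;
  if(a > 255 || b > 255 || c > 255 || d > 255 || port > 65535)
    return -1;

  /* the port goes out as high byte, low byte */
  len = snprintf(buf, size, "PORT %u,%u,%u,%u,%u,%u\r\n",
                 a, b, c, d, port >> 8, port & 0xff);
  return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

For address 192.168.0.10 and port 5001 the command becomes
"PORT 192,168,0,10,19,137", since 19 * 256 + 137 = 5001.
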
TELNET

Telnet is implemented in lib/telnet.c.

FILE

The file:// protocol is dealt with in lib/file.c.

LDAP

Everything LDAP is in lib/ldap.c.

GENERAL

URL encoding and decoding, called escaping and unescaping in the source code,
is found in lib/escape.c.

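As an illustration of what "escaping" does, the sketch below percent-encodes
every byte that is not a letter, a digit or one of "-._~". The function name
demo_escape() and the exact set of characters left untouched are choices made
for this example, not a description of lib/escape.c.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc()ed, percent-encoded copy of 'in'; caller frees it. */
static char *demo_escape(const char *in)
{
  size_t len = strlen(in);
  char *out = malloc(len * 3 + 1);  /* worst case: every byte becomes %XX */
  char *p = out;

  if(!out)
    return NULL;
  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;
    if(isalnum(c) || strchr("-._~", c))
      *p++ = (char)c;               /* safe character, copy as is */
    else {
      sprintf(p, "%%%02X", c);      /* anything else turns into %XX */
      p += 3;
    }
  }
  *p = '\0';
  return out;
}
```

Unescaping is the reverse walk: each "%XX" triplet is turned back into the
single byte it stands for.
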
While transferring data in Transfer() a few functions might get used.
curl_getdate() in lib/getdate.c is for HTTP date comparisons (and more).

lib/getenv.c offers curl_getenv(), which is for reading environment variables
in a neat platform-independent way. That's used in the client, but also in
lib/url.c when checking the proxy environment variables.

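A proxy lookup through the environment might look roughly like the sketch
below. It assumes variables named "<scheme>_proxy" with "all_proxy" as a
catch-all, uses plain getenv() instead of curl_getenv(), and the function
name proxy_from_env() is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc()ed copy of the proxy setting for 'scheme', or NULL
   if no proxy is configured. The caller frees the result. */
static char *proxy_from_env(const char *scheme)
{
  char name[32];
  const char *value;
  char *copy;

  if(snprintf(name, sizeof(name), "%s_proxy", scheme) >= (int)sizeof(name))
    return NULL;                    /* scheme name unreasonably long */
  value = getenv(name);
  if(!value || !*value)
    value = getenv("all_proxy");    /* assumed catch-all variable */
  if(!value || !*value)
    return NULL;

  /* copy it, since the environment may change under us later */
  copy = malloc(strlen(value) + 1);
  if(copy)
    strcpy(copy, value);
  return copy;
}
```

Returning a private copy keeps the caller independent of how each platform
stores its environment, which is the kind of detail a wrapper is there to
hide.
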
lib/netrc.c holds the .netrc parser.

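A .netrc file is a sequence of whitespace-separated tokens such as
"machine <host> login <user> password <secret>". The sketch below looks up
one host in such text held in memory. It is a simplified stand-in with
invented names that only understands those three keywords, and it is not the
parser in lib/netrc.c.

```c
#include <stdio.h>
#include <string.h>

struct netrc_entry {
  char login[64];
  char password[64];
};

/* Finds 'host' in .netrc-formatted 'text'. Returns 0 if found, else -1. */
static int netrc_lookup(const char *text, const char *host,
                        struct netrc_entry *out)
{
  static const char sep[] = " \t\r\n";
  char buf[1024];
  char *tok;
  int in_host = 0;                  /* inside the wanted machine entry */
  int found = 0;

  if(strlen(text) >= sizeof(buf))
    return -1;
  strcpy(buf, text);                /* strtok() modifies its input */
  out->login[0] = out->password[0] = '\0';

  for(tok = strtok(buf, sep); tok; tok = strtok(NULL, sep)) {
    if(!strcmp(tok, "machine")) {
      if(in_host)
        break;                      /* our entry ended */
      tok = strtok(NULL, sep);
      if(!tok)
        break;
      in_host = !strcmp(tok, host);
      found |= in_host;
    }
    else if(in_host && (!strcmp(tok, "login") || !strcmp(tok, "password"))) {
      char *dest = (tok[0] == 'l') ? out->login : out->password;
      tok = strtok(NULL, sep);
      if(!tok)
        break;
      snprintf(dest, sizeof(out->login), "%s", tok);
    }
  }
  return found ? 0 : -1;
}
```

A caller would read the file into a buffer first and then ask for the host
taken from the URL.
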
lib/timeval.c features replacement functions for systems that don't have
gettimeofday().

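What such a replacement can look like is sketched here, under the assumption
that a one-second resolution from time() is an acceptable fallback.
HAVE_GETTIMEOFDAY is set by hand in place of config.h, and demo_tvdiff() is a
hypothetical helper added to show the typical use.

```c
#include <time.h>

/* Stand-in for config.h; defined by hand for this sketch. */
#define HAVE_GETTIMEOFDAY 1

#ifdef HAVE_GETTIMEOFDAY
#include <sys/time.h>
#else
struct timeval {
  long tv_sec;
  long tv_usec;
};
/* Replacement with one-second resolution, built on time(). */
static int gettimeofday(struct timeval *tv, void *tz)
{
  (void)tz;
  tv->tv_sec = (long)time(NULL);
  tv->tv_usec = 0;
  return 0;
}
#endif

/* Milliseconds elapsed from 'older' to 'newer'. */
static long demo_tvdiff(struct timeval newer, struct timeval older)
{
  return (long)(newer.tv_sec - older.tv_sec) * 1000 +
         (long)(newer.tv_usec - older.tv_usec) / 1000;
}
```

The rest of the code calls gettimeofday() unconditionally and never needs to
know which variant it got.
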
A function named curl_version() that returns the full curl version string is
found in lib/version.c.

Client
======

main() resides in src/main.c together with most of the client code.
src/hugehelp.c is automatically generated by the mkhelp.pl perl script to
display the complete "manual", and the src/urlglob.c file holds the functions
used for the multiple-URL support.

The client mostly messes around to set up its config struct properly, then it
calls the curl_easy_*() functions of the library, and when it gets back
control after the curl_easy_perform() it cleans up the library, checks status
and exits.

When the operation is done, the ourWriteOut() function in src/writeout.c may
be called to report about the operation. That function uses the
curl_easy_getinfo() function to extract useful information from the curl
session.

Test Suite
==========

During November 2000, a test suite evolved. It is placed in its own
subdirectory directly off the root in the curl archive tree, and it contains
a bunch of scripts and a lot of test case data.

The main test script is runtests.pl, which invokes the two servers
httpserver.pl and ftpserver.pl before all the test cases are performed. The
test suite currently only runs on unix-like platforms.

You'll find a complete description of the test case data files in the README
file in the test directory.