Mirror of https://github.com/mozilla/gecko-dev.git
Web Sniffer

by Erik van der Poel <erik@netscape.com>

originally created in 1998


Introduction

This is a set of tools to work with the protocols underlying the Web.


Description of Tools

view.cgi

This is an HTML form that allows the user to enter a URL. The CGI then
fetches the object associated with that URL and presents it to the
user in a colorful way. For example, HTTP headers are shown, HTML
documents are parsed and colored, and non-ASCII characters are shown in
hex. Links are turned into live links that can be clicked to see the
source of the linked URL, allowing the user to "browse" source.
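As an illustration of the hex display of non-ASCII characters, here is a
minimal sketch. It is not the code from view.c: the function name
escape_non_ascii and the <XX> output format are assumptions made only for
this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not taken from view.c): copy `in` to `out`,
 * writing every byte >= 0x80 as <XX>, where XX is the byte in hex.
 * Returns the number of characters written, not counting the final
 * NUL, or -1 if `out` is too small.
 */
int escape_non_ascii(const unsigned char *in, size_t in_len,
                     char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t used = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    for (i = 0; i < in_len; i++) {
        unsigned char c = in[i];
        /* 1 char for ASCII, 4 for "<XX>", always leaving room for NUL */
        size_t need = (c < 0x80) ? 1 : 4;

        if (used + need >= out_size)
            return -1;
        if (c < 0x80) {
            out[used++] = (char) c;
        } else {
            out[used++] = '<';
            out[used++] = hex[c >> 4];
            out[used++] = hex[c & 0x0F];
            out[used++] = '>';
        }
    }
    out[used] = '\0';
    return (int) used;
}
```

Given the five bytes 'c', 'a', 'f', 0xC3, 0xA9, this sketch writes
caf<C3><A9> to the output buffer.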
robot

Originally written to see how many documents actually specify a charset
at the HTTP and HTML levels, this tool has developed into a more general
robot that collects various statistics, including HTML tag statistics and
DNS lookup timing. This robot does not adhere to the standard robot
rules, so please exercise caution if you use it.
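To make the HTTP side of the charset check concrete, the sketch below
looks for a charset parameter in a Content-Type header value. It is only
an assumption about how such a check could be done, not the code in
robot.c or mime.c; the names find_charset and has_prefix_nocase are
hypothetical, and the sketch does not cope with a ';' inside a quoted
parameter value.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `s` starts with `prefix`, ignoring ASCII case. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char) *s) != tolower((unsigned char) *prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/*
 * Hypothetical helper: given a Content-Type value such as
 * "text/html; charset=ISO-8859-1", copy the charset name into `out`.
 * Returns 1 if a non-empty charset was found, 0 otherwise.
 */
int find_charset(const char *content_type, char *out, size_t out_size)
{
    const char *p = content_type;
    size_t n = 0;

    if (out_size == 0)
        return 0;
    while ((p = strchr(p, ';')) != NULL) {
        p++;                                  /* skip the ';' */
        while (*p == ' ' || *p == '\t')       /* skip whitespace */
            p++;
        if (!has_prefix_nocase(p, "charset="))
            continue;                         /* some other parameter */
        p += strlen("charset=");
        if (*p == '"')                        /* optional opening quote */
            p++;
        while (*p && *p != '"' && *p != ';' && *p != ' ' && *p != '\t'
               && n + 1 < out_size)
            out[n++] = *p++;
        out[n] = '\0';
        return n > 0;
    }
    return 0;
}
```

A robot could count the responses for which find_charset returns 1 and
compare that with the total number of documents fetched.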
proxy

This is an HTTP proxy that sits between the user's browser and another
HTTP proxy. It captures all of the HTTP traffic between the browser and
the Internet, and presents it to the user in the same colorful way as
view.cgi.
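A browser that talks to a proxy sends the full URL in the request line,
e.g. "GET http://www.example.com/ HTTP/1.0". As a rough sketch of the
first step such a proxy might take before displaying or forwarding the
request, the code below splits a request line into its three parts. The
struct and function names are hypothetical and the buffer sizes are
arbitrary; this is not the code from proxy.c.

```c
#include <stdio.h>

/* Hypothetical container for the three parts of a request line. */
struct request_line {
    char method[16];
    char url[1024];
    char version[16];
};

/*
 * Split a line such as "GET http://www.example.com/ HTTP/1.0\r\n".
 * The field widths in the format leave room for the terminating NUL.
 * Returns 1 if all three parts were found, 0 otherwise.
 */
int parse_request_line(const char *line, struct request_line *req)
{
    return sscanf(line, "%15s %1023s %15s",
                  req->method, req->url, req->version) == 3;
}
```

Since %s stops at whitespace, a trailing CR LF is not copied into the
version field.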
grab

Allows the user to "grab" a whole Web site, or everything under a
particular directory. This is useful if you want to grab a set of
related HTML files, e.g. the whole CSS2 spec.
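One way to decide whether a discovered link is "under a particular
directory" is a plain prefix comparison against the directory part of
the starting URL. The sketch below assumes exactly that; the function
name is hypothetical, it is not the code from grab.c, and it expects the
starting URL to contain a path (for a bare "http://host" the last '/'
belongs to the scheme, so everything would match).

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if `candidate` lies under the directory
 * of `start`, where the directory is everything in `start` up to and
 * including its last '/'. Assumes both URLs are absolute and written
 * the same way (same case, no "..", no escapes to normalize).
 */
int is_under_directory(const char *start, const char *candidate)
{
    const char *slash = strrchr(start, '/');
    size_t dir_len;

    if (slash == NULL)
        return 0;
    dir_len = (size_t) (slash - start) + 1;
    return strncmp(start, candidate, dir_len) == 0;
}
```

With the starting URL http://www.example.com/spec/cover.html, the link
http://www.example.com/spec/fonts.html would be followed and
http://www.example.com/other/index.html would be skipped.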
link

Allows the user to recursively check for bad links in a Web site or
under a particular directory.
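What counts as a "bad" link is not spelled out here. The sketch below
assumes that a link is bad when the HTTP status line cannot be parsed or
carries a 4xx or 5xx status code. The function names are hypothetical and
this is not the code from link.c.

```c
#include <stdio.h>

/*
 * Hypothetical helper: extract the status code from a status line such
 * as "HTTP/1.0 404 Not Found". Returns -1 if the line does not parse.
 */
int http_status_code(const char *status_line)
{
    int major, minor, code;

    if (sscanf(status_line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}

/*
 * Assumption for this sketch: unparseable responses and 4xx/5xx codes
 * are "bad"; redirects (3xx) are not treated as bad here.
 */
int is_bad_link(const char *status_line)
{
    int code = http_status_code(status_line);

    return code < 0 || code >= 400;
}
```

A recursive checker would also need to remember which URLs it has
already visited, so that pages linking to each other are not fetched
over and over.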


Description of Files

addurl.c, addurl.h: adds URLs to a list
cgiview.c, cgiview.html: the view.cgi tool
dns.c: experimental DNS toy
doRun: used with robot
file.c, file.h: support for file: URLs
ftp.c: experimental FTP toy
grab.c: the "grab" tool
hash.c, hash.h: incomplete hash table routines
html.c, html.h: HTML parser
http.c, http.h: simple HTTP implementation
io.c, io.h: I/O routines
link.c: the "link" tool
main.h: very simple callbacks, could be more object-oriented
Makefile: the Solaris Makefile
mime.c, mime.h: MIME Content-Type parser
mutex.h: for threading in the robot
net.c, net.h: low-level Internet APIs
pop.c: experimental POP toy
proxy.c: the "proxy" tool
robot.c: the "robot" tool
run: used with robot
TODO: notes to myself
url.c, url.h: implementation of absolute and relative URLs
utils.c, utils.h: some little utility routines
view.c, view.h: presents stuff to the user


Description of Code

The code is extremely quick-and-dirty. It could be a lot more elegant,
e.g. C++, object-oriented, extensible, etc.

The point of this exercise was not to design and write a program well,
but to create some useful tools and to learn about Internet protocols.