From e9e256356fb36d9c789248679a505ce30a18beb9 Mon Sep 17 00:00:00 2001
From: "kestes%walrus.com" <kestes%walrus.com>
Date: Mon, 31 Dec 2001 20:02:05 +0000
Subject: [PATCH] new documentation files.

---
 webtools/tinderbox2/Overview | 153 +++++++++++++++++++
 webtools/tinderbox2/Policies | 277 +++++++++++++++++++++++++++++++++++
 2 files changed, 430 insertions(+)
 create mode 100644 webtools/tinderbox2/Overview
 create mode 100644 webtools/tinderbox2/Policies

diff --git a/webtools/tinderbox2/Overview b/webtools/tinderbox2/Overview
new file mode 100644
index 000000000000..95ab06765b16
--- /dev/null
+++ b/webtools/tinderbox2/Overview
@@ -0,0 +1,153 @@
+
+
+Overview of the Tinderbox System
+--------------------------------
+
+Tinderbox is an information display system. It runs on a machine with
+a webserver and will periodically write static HTML files to the disk
+so that the webserver can serve these documents. Tinderbox is run out
+of cron every five minutes.  It gathers up information from various
+databases including: CVS Logs, Bonsai, and Perforce.  It will also
+process mail which is sent to it. Mail is sent from Bug Ticketing
+software and Build/Test Machines.  All this information is combined to
+produce the HTML pages.
+
+
+Since no two companies will structure their development processes the
+same way, the tinderbox code has to be highly configurable to account
+for most possible uses.  There is a main configuration file which
+allows most of the major user configurable variables to be set.
+Novice users can expect to edit only this file and get a working
+tinderbox system.  Additionally each library has been broken into two
+parts, the code and the library specific configuration.  The latter is
+expected to need modifications in some installations.  I have put all
+the library configurations into one directory to make it easy to find
+the parts of tinderbox which are easy to modify.  Each configuration
+library can be thought of as a table which might need to be edited or
+extended for use at your company.  I have provided a working system
+but the defaults may not suit your needs.  These tables can be easily
+changed in small ways by simply looking at the file and making obvious
+changes.  I have also allowed for the possibility of making complex
+changes that only a competent perl programmer could define.  Changes
+are not made to the files which I have provided.  Rather the changes
+are made to copies of the files which are stored in a local
+configuration directory.  This ensures that you can easily version the
+Tinderbox code as it is provided to you from the official distribution
+and you can separately version the local configurations which you
+make.  It is also easy to see the local configurations since you have
+both the original and the modified code on the same server and can
+difference the two.  As an example you might need to change the
+BuildStatus.  I assume that you have the following possible build
+outcomes: Build in progress, Build failed, Build succeeded but tests
+failed, Build and all tests were successful.  You may have additional
+outcomes to specify which kind of tests failed (unit test failed, not
+enough unit test coverage, performance tests failed).  Similarly you
+may have unusual requirements for how the filesystem should be laid
+out.  I suggest that you read through the files to see how they are
+laid out and what types of changes are possible.
+
+
+The build machines are not considered part of the tinderbox server.
+They are clients just like Bug Ticketing systems and Version control
+systems are clients.  Build machines mail their build logs to the
+server in a special format.  This format specifies that name/value
+pairs must appear at the top of the mail message followed by the
+complete build log.  Scripts for setting up a tinderbox build client
+can be found in the clientbin directory but you may have other build
+needs and may use any build methods you choose.
+
+The central concept of the Tinderbox system is the notion of a 'Tree'.
+When several different groups are working out of the same version
+control system often the files are partitioned into separate modules
+with each group working on one or more disjoint modules.  Over time
+the developers need to branch their code because several different
+versions of the files are under development at the same time.  A tree
+is a module/branch pair.  This corresponds to a set of files which can
+be checked out and built.  Tinderbox makes one page for each tree and
+displays what work is being done on that tree.  CVS has a notion of
+branches and of modules but not of trees.  It is not possible to give
+a branch/module pair a name.  The tinderbox TreeData provides the
+mappings between treenames and branch/module pairs.  Tinderbox
+displays the updates to bug tickets on the appropriate tree page.
+This requires an easy mapping between bug tickets and trees. One
+example of a complex function to determine tree name would be if each
+of the product types listed in the bug tracking database
+refers to one development project, except for a particular
+feature/platform of one particular project which is being developed by
+a separate group of developers.  So the version control notion of
+trees (a set of modules on a branch) may not have a direct map into
+the bug tracking database at all times.  In large projects it is
+sometimes convenient to have a tree called 'ALL' which is used to
+display all checkins performed on any trees and all bug tickets worked
+on by any programmers.  It is not possible to build or test the 'ALL'
+tree and neither the version control nor bug ticketing system knows of
+its existence.
+
+
+The Bug Tracking code was intended to be as general as possible.
+Most bug ticketing systems send mail when tickets change state.  The
+mail is often of the same form: a set of name/value pairs with the
+separator being the string ": ".  Tinderbox will parse mail of this
+form and display the interesting fields on the appropriate tree page.
+The configuration of this module involves specifying which bug ticket
+names are interesting and should be displayed.  Also you will need to
+specify how to map a bug ticket into a tree.  This could be very
+simple if each bug ticket has a field which represents the tree it is
+applicable to (in this case tree could equal project) or can be very
+complex if the tree must be computed by the values of a set of fields.
+Also tinderbox keeps track of which bugs are "reopened" and displays
+them in a different column.  The idea is that some bugs are moving
+backwards and creating duplicate work.  These tickets are particularly
+troublesome and should be watched specially.  So all possible ticket
+statuses are partitioned into "progress" or "slippage" categories.  You
+will need to specify what status values are possible for your ticket
+system and you will also need to specify the set of columns which you
+would like to see on the status page.
+
+The heart of the tinderbox system is the 'status table'.  This is an
+HTML table which graphically shows the changes made to the
+development databases.  It will show what is going on in the version
+control system, the bug tracking system, the build system, automatic
+regression tests and provide a notice board for developers to inform
+each other of current news.  By placing all this information in the
+same table it is possible to correlate and cross check how different
+types of changes affected each other and what was going on with the
+whole project at different times in the day. The rows of the table
+represent time with the most current events at the top of the page.
+There are different sets of columns for each database which needs to
+be displayed.  The sets of columns are managed by independent modules.
+There is one module for each version control system and each bug
+tracking system which tinderbox knows how to interface with.  It is
+easy to port the system to new databases by just adding a new module
+using the same style as the existing modules.  Modules never share or
+peek at each other's data; all combining of data is done by the humans
+who stare at the table and interpret what is going on.  The main
+tinderbox system does not know how many columns the final table will
+have. It only knows about a list of table modules.  Each module in the
+list is called in turn to generate the complete row, then the entire
+row is displayed. The user must configure tinderbox with the list of
+modules which are important to their own environment.  There is no
+restriction on the number of modules which may be configured, though
+due to implementation details each module can only appear once in the
+table.  There are many pop-up windows embedded in the status table;
+these will provide an extra level of detail when a mouse is placed over
+the link.  By moving your mouse around the page you may effectively
+drill down into an item of interest and learn more about it without
+leaving the page.  Most of the links will click through to the
+appropriate database.  Thus if you need more data about an item you
+can click on the link and query the database directly.
+
+Besides the status table there is one other feature of the status
+page.  The page displays some information which is not correlated
+through time and with other data.  This information is called status
+table headers.  The main headers are the message of the day (MOTD),
+and the Tree State, though there are a few other headers of mainly
+historical interest.  The important issue with the headers is that
+they are not optional.  Tinderbox can render a table with as few or
+as many columns in the status table as you wish, but each of the
+headers has a particular place on the status page and needs to be
+rendered in a particular way (font size, font type, etc).  Thus the
+tinderbox server must know where each header must go and how to
+specify the appropriate HTML context for this header.  Users may set
+null defaults for headers that they do not need but it is much harder
+for a user to add new headers to the code in a modular fashion.
+
diff --git a/webtools/tinderbox2/Policies b/webtools/tinderbox2/Policies
new file mode 100644
index 000000000000..ae5c5f3cf9a4
--- /dev/null
+++ b/webtools/tinderbox2/Policies
@@ -0,0 +1,277 @@
+Preparations you will need to make and 
+policies you will need to set:
+-----------------------------------
+
+
+To install tinderbox you will need some information about your
+existing computer systems and some idea about what your goals are.
+Here is a list of questions to help get you started; some of these
+ideas may not be appropriate for your environment.
+
+
+The webserver will serve the tinderbox pages.  
+Webserver configuration is a bit of an art and you will need to
+understand the policies which are used to administer your webserver.
+
+*) You will need to decide the directory where tinderbox should write
+the static HTML pages.  This will depend on how your webserver is
+configured. The default location is based on the RedHat 7.1
+(apache-1.3.19-5) installation and is: /var/www/html/tinderbox2.  You
+will also need to know which URL browsers will need to use to find
+this directory. Since tinderbox generates static web pages, it is
+possible to run tinderbox and not run a web server.  One way this
+could be done is if you have a network file system and all users have
+browsers which can read from the HTML directories.  In this case all
+URLs should begin with "file:/" instead of the usual "http://".
+
+*) Project level administration is done via cgi scripts.  These
+scripts allow administrators to set the message of the day, and the
+state of the tree (open, closed, restricted).  Also all users can post
+notices to the web pages via a cgi script.  CGI programs are often
+restricted to a portion of the file system which is disjoint from the
+HTML files. You will need to figure out where the CGI programs will
+go.  Tinderbox takes its defaults from RedHat 7.1 and uses:
+/var/www/cgi-bin/tinderbox2. You will also need to know which URL
+browsers will need to use to find this directory.
+
+*) CGI scripts will run as an unauthenticated user on your system.
+You will need to decide which user will run the tinderbox CGI scripts.
+The same user id must be used for running the scripts as for tinderbox
+mail delivery.  The Tinderbox Configuration files will define this
+user id and as a security precaution check that it is running as the
+required id.  It is suggested that this id not be a privileged id
+(higher ids are better; please make this number greater than 10, and
+greater than 100 is recommended).  Smaller ids are often assumed to
+have more privileges on a Unix box than larger ids.  It is not a good
+idea for an unauthenticated user to have any privileges so a large id
+is recommended. It is also recommended that you not use the id 'nobody'
+as this id is overused and it would be better to partition the
+unauthenticated user into separate ids in case of security problems.
+RedHat runs all its CGI scripts as the user 'apache'; this is an
+acceptable user.  I would prefer to have a separate user to run the
+tinderbox CGI scripts but this would require recompiling apache to
+enable suEXEC, and it is more effort than most groups can afford.
+
+*) Tinderbox Files. There are other tinderbox files which need to be
+placed on the webserver.  These include libraries and non-cgi
+programs. You will need to decide where to place these files.  Most
+users put them in /home/tinderbox2.
+
+*) Tinderbox Data. Tinderbox stores its data in the file system.  For
+security it is often a good idea to keep this data out of the HTML and
+CGI directories so that malicious users can not directly access this
+data.  The compressed build logs can grow quite large, so it is
+recommended to put the data on a file system with room.  The default
+is to put them in the directory /home/tinderbox2/data.
+
+
+Mail
+----
+
+*) Many of the tinderbox modules (Bug Ticket, Build, CVS) receive
+their data via mail.  The mail system on your web server machine must
+be configured to deliver the mail into the tinderbox mail processing
+programs.  You should spend some time understanding how your mail
+delivery system can be configured to allow user mail to be delivered
+into a program and how to set the user id under which this delivery
+occurs.  If you do not wish to configure your mail delivery program
+then you can use fetchmail to pull the mail out of a mail box and push
+it into the programs on a periodic basis.  See the install page for
+details on what I have learned about mailing systems.
+
+
+Production Version Control
+--------------------------
+
+One of the biggest responsibilities which a "buildmaster" has is the
+requirement that all code should be reproducible.  That is that at
+any point in the future, even more than one year later, the current
+binaries should be able to be rebuilt byte for byte from sources.
+This requirement can be broken down as follows:
+
+1) The build machine must be reproducible.  
+
+We must be able to get back the same build machine we had at any point
+in the past.  This means that all OS libraries, all header files, all
+compilers, all build tools (make, grep, sed) must have some mechanism
+to roll back.  It is common to use a backup of the build machine to
+reconstruct it.  Most OS will give you a list of the software packages
+which are installed on the machine and their version numbers.  I like
+to keep the list of software packages which are installed on the
+machine checked into version control.  This allows me to compare the
+state of the build machine at any two points in time.  I have tools to
+recreate the build-machine from just a list of packages with version
+numbers. It is considered a best practice to limit the amount of
+software which is available on the build machine.  A build machine
+with too much installed will only make it difficult to reproduce older
+builds should the need arise.  I recommend not installing any
+web servers or graphical window managers on your build machine.  It
+should be clear that the build machine should not be the same machine
+where the tinderbox server runs.
+
+2) The build process must be reproducible.  That is all the steps
+which are used to create the application must be reproducible.
+
+*) Build Interface: We must be able to run exactly the same build
+process in the future including: all commands with command line
+arguments, all environmental variables.  I recommend that the entire
+build process be viewed as something outside of the build master's
+control.  Developers are responsible for ensuring that there is a
+simple build master interface to construct all the software products
+which go into a build.  Typically there is a makefile in a standard
+place where the buildmaster can run something like "make all; make
+install;" and be guaranteed that this will build the product.  The
+build interface should be viewed as part of the build machine, like
+the OS, and as something which is changed only rarely.  It is hard
+enough to track all the parts of the build process
+which we expect to change, we should not need to track complex build
+procedures.  The build procedures should have a standard interface.
+By keeping the build instructions in one makefile which is checked
+into the same version control system as the sources it is easy to
+recreate any previous build even if the commands used to build the
+software fluctuate rapidly between releases.  There must be a simple
+interface to construct the software which will hide all the complexity
+of the actual construction.  
+
+*) Build Environment: The makefile will code all the build commands
+and all the environmental variables (PATH, UMASK, LD_LIBRARY_PATH,
+CLASSPATH) needed to build the software though it may rely on some
+well defined command line arguments (PREFIX, CCFLAGS, JAVA_LIBS) to
+parameterize these.  These command line arguments should not
+change between versions of the software but should be a fixed set of
+build parameters.  The parameters may be needed to specify where some
+files are found on the build machine (Ideally the build machine is set
+up the same as developers' machines so these directories can be
+hard-coded into the makefiles but often there is a need for some
+directories to be specified at build time) or where files are to be
+created/installed on the build machine (typically a subdirectory of
+/var/tmp but there may be several builds running at once and each will
+need a different directory) or what kind of build is being created.
+Each part of the build which needs a particular environmental variable
+set or a special header file in some path should have tests which
+ensure that the build environment is valid.  I keep my build scripts
+installed on the build machine and they are always started by running
+"/etc/rc.d/init.d/build start".  This ensures that I am not relying on
+any build environmental variables which are set by logging into the
+build account and are thus not tracked and versioned.
+
+*) Environmental safety issues: 
+
+If the build environment can not be used to build the software then a
+human readable error message should be generated.  My makefiles often
+run various checks on the environmental variables before they
+construct the code.  They check that all required environmental
+variables are set, that the required libraries are found, that
+directories which must be disjoint (build and install directories) do
+not overlap. This test suite becomes a build regression test and as I
+discover additional possible build problems I add new tests to the
+makefile.  I make it a habit to explicitly set all environmental
+variables so that there is no doubt as to their expected values.  It
+is important for the QA group to only use Builds which were created by
+an automated process so that we are sure that there are no
+undocumented steps in either the test builds or the released build.
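+
+As a rough sketch of such checks, assuming GNU make and hypothetical
+names (BUILD_DIR, INSTALL_DIR and the example header path are
+illustrations, not part of tinderbox), the top of a makefile might
+contain something like:
+
+```makefile
+# Hypothetical names; replace with the variables your build requires.
+REQUIRED_VARS  := PREFIX BUILD_DIR INSTALL_DIR
+REQUIRED_FILES := /usr/include/stdio.h
+
+# Stop with a readable message if a required variable is unset or empty.
+$(foreach v,$(REQUIRED_VARS),$(if $(strip $($(v))),,$(error $(v) is not set)))
+
+# Stop if a required file (header, library) is missing.
+$(foreach f,$(REQUIRED_FILES),$(if $(wildcard $(f)),,$(error $(f) not found)))
+
+# The build and install directories must not be equal or nested.
+ifneq ($(filter $(BUILD_DIR) $(BUILD_DIR)/%,$(INSTALL_DIR)),)
+$(error INSTALL_DIR $(INSTALL_DIR) overlaps BUILD_DIR $(BUILD_DIR))
+endif
+ifneq ($(filter $(INSTALL_DIR) $(INSTALL_DIR)/%,$(BUILD_DIR)),)
+$(error BUILD_DIR $(BUILD_DIR) overlaps INSTALL_DIR $(INSTALL_DIR))
+endif
+```
+
+The comparison is purely textual, so this sketch assumes paths without
+spaces, trailing slashes or symbolic links.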
+
+3) Track the Build numbers.  Given a clean install of your product you
+should have all the information necessary to reproduce the executable
+from sources.  If a customer shows you the application binaries you
+must be able to get the source code which built the application,
+reconstruct the build machine which created the application and
+possibly rerun the build exactly the same way as the application was
+created before.  This may include making some minor source code changes
+before the build is run. I like to keep a file which contains:
+
+	The product release name
+
+	The sources 'as of date'. (I always checkout my sources using
+		cvs -D 'date time' so that exactly the same sources
+		can be recovered knowing only the 'date time' which
+		was used to check them out. I am sure a similar trick
+		could be used with a perforce 'change set number'.)
+
+	The branch name.
+
+	The module name.
+
+This can be stored as a file in the product (encrypted if necessary)
+or may be stored in some secure build master database where the data
+can be looked up by release name.  My preference is to keep all data
+necessary to reproduce a build in the build output and delivered as
+part of the product.  This means that I can generate as many builds as
+I want automatically and not need to keep track of any of them.  When
+the QA team deems that a certain build is 'important', by making a
+particular build the official released copy, then I can take a look at
+its contents and tag/branch the code at the sources which I used to
+build it.
+
+4) Build Prefix: It is a good idea to familiarize yourself with the
+makefile conventions regarding the make variable PREFIX.  It is
+easiest to understand if you think about what RedHat does when they
+build their distribution of RPMs but this will apply in many
+different systems including the Andrew File System (AFS) and most
+packaging systems. This variable is used during the build process
+"make all PREFIX=/home/apache" to tell the package where it will be
+installed (examples include /usr, /usr/local, /home/apache).  I
+suggest reading a few RedHat Spec files to see how this works in
+practice.  The application may need to hard-code this value into its
+object code.  When the application is installed it must not be
+installed into its proper place on the build machine.  The package we
+are constructing could cause the build machine to stop working
+correctly if it is a buggy version of a system library or major OS
+application.  Instead the makefile will install the package ("make
+install PREFIX=/var/tmp/build-root/home/apache") into some other
+directory with a similar tree structure to its final destination.  The
+packaging system will then move the files into the correct place
+during an installation step on the target machine.  The installation
+step only moves files and sets permissions.  The makefile is not
+supposed to use the installation directories to hard code values into
+the application since the application will never be run from this
+installation directory.  The hard part of the build including any
+PREFIX magic is in the build section.  Notice the clear separation
+between the build machine and the target machine, between installation
+on the build machine and installation on the target machine, and
+between construction of the application binaries and installation of
+the application binaries.
+This is one of the reasons why building an application on a build
+machine is different from the way in which developers build their code
+on their personal development machines.  This PREFIX issue will arise
+when you try and build the Tinderbox system and also when you
+construct the makefiles for your own application.  Since the build
+machine is not the target machine it can not be assumed that files
+will always be in the same places on both (for example perl).
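+
+The two uses of PREFIX described above can be sketched in a small
+makefile.  This is only an illustration, assuming GNU make and a
+hypothetical one-file program (app.c and the bin/ layout are not part
+of tinderbox):
+
+```makefile
+# Hypothetical one-file program; app.c and the bin/ layout are assumptions.
+# Recipes follow ';' so the sketch does not depend on tab characters.
+PREFIX ?= /usr/local
+
+all: app
+
+# Build step: PREFIX is the final location on the target machine and
+# is compiled into the program.
+app: app.c ; $(CC) $(CFLAGS) -DPREFIX='"$(PREFIX)"' -o $@ app.c
+
+# Install step: only copies files and sets permissions.  PREFIX is now
+# the staging directory on the build machine.
+install: app ; mkdir -p $(PREFIX)/bin && install -m 755 app $(PREFIX)/bin/app
+
+.PHONY: all install
+```
+
+Because app is already up to date after "make all PREFIX=/home/apache",
+running "make install PREFIX=/var/tmp/build-root/home/apache" does not
+recompile it, so the value given at build time stays in the binary.
+Many GNU-style makefiles use a separate DESTDIR variable for the
+staging directory instead of overriding PREFIX at install time.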
+
+5) Application Architecture: 
+
+*) The build process should mimic the architecture of the code.  It should
+be a final test that the code was written to the same specifications
+to which it was designed.  It is a common problem for code to turn into
+spaghetti with each piece of code using functions and creating
+dependencies on every other piece of code.  For example it is probably
+a mistake for code in the database abstraction layer to be implemented
+in terms of code in the HTML generation layer.  These two libraries
+should probably be independent of each other, though they both might
+depend on a common string library.  The code architecture should limit
+the dependency graph between code modules.  The BuildMaster must
+enforce the restrictions on information flow between components. Thus
+no libraries should be in the path unless the architecture allows this
+module to depend on those libraries.
+
+*) The architecture must not have circular dependencies.  Circular
+dependencies not only make upgrading individual libraries difficult
+but also make testing components nearly impossible.  That is, it should
+be possible to build some set of libraries L0 which depend on no
+libraries and then build some other set of libraries L1 which depend
+only on L0 libraries then build L2 which depend only on the L0 and L1
+libraries.  This "build chain" will prevent circular dependencies and
+help keep your code testable and the dependencies understandable.
+More information about why this is a good practice is available in
+"Large-Scale C++ Software Design" (Addison-Wesley Professional
+Computing Series) by John Lakos.
+
+*) I enforce the convention that developers are not allowed to overload
+standard system libraries.  I always put standard libraries in the
+path before any library our company develops.  I build the application
+in stages to ensure that parts of the application which are not
+intended to depend on other code will not have other header files on
+the build machine at the time that they are constructed.  Build
+dependencies between modules which are expected are explicitly
+controlled with build scripts and version numbers.
+
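+The L0/L1/L2 build chain described above can be written down directly
+as make prerequisites.  The following is a minimal sketch, assuming
+GNU make; the library names are hypothetical and the recipe is only a
+placeholder for the real compile steps:
+
+```makefile
+# Hypothetical library names, grouped by layer.
+L0 := libstring
+L1 := libdb libhtml
+L2 := app
+
+# Each layer may depend only on the layers below it.
+$(L1): $(L0)
+$(L2): $(L0) $(L1)
+
+# Placeholder recipe; a real build would compile each library here.
+$(L0) $(L1) $(L2): ; @echo building $@
+
+.PHONY: $(L0) $(L1) $(L2)
+```
+
+Running "make app" builds the lower layers before the layers which
+depend on them.  If someone later adds a prerequisite which points
+back up the chain (for example "libstring: libhtml"), GNU make warns
+about the circular dependency and drops it, which makes the violation
+visible in the build log.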