From e9e256356fb36d9c789248679a505ce30a18beb9 Mon Sep 17 00:00:00 2001 From: "kestes%walrus.com" Date: Mon, 31 Dec 2001 20:02:05 +0000 Subject: [PATCH] new documentation files. --- webtools/tinderbox2/Overview | 153 +++++++++++++++++++ webtools/tinderbox2/Policies | 277 +++++++++++++++++++++++++++++++++++ 2 files changed, 430 insertions(+) create mode 100644 webtools/tinderbox2/Overview create mode 100644 webtools/tinderbox2/Policies diff --git a/webtools/tinderbox2/Overview b/webtools/tinderbox2/Overview new file mode 100644 index 000000000000..95ab06765b16 --- /dev/null +++ b/webtools/tinderbox2/Overview @@ -0,0 +1,153 @@ + + +Overview of the Tinderbox System +-------------------------------- + +Tinderbox is an information display system. It runs on a machine with +a webserver and will periodically write static HTML files to the disk +so that the webserver can serve these documents. Tinderbox is run out +of cron every five minutes. It gathers up information from various +databases including: CVS Logs, Bonsai, and Perforce. It will also +process mail which is sent to it. Mail is sent from Bug Ticketing +software and Build/Test Machines. All this information is combined to +produce the HTML pages. + + +Since no two companies will structure their development processes the +same way, the tinderbox code has to be highly configurable to account +for most possible uses. There is a main configuration file which +allows most of the major user configurable variables to be set. +Novice users can expect to edit only this file and get a working +tinderbox system. Additionally each library has been broken into two +parts. One part is the library specific configurations. This file is +expected to need modifications in some installations. I have put all +the library configurations into one directory to make it easy to find +the parts of tinderbox which are easy to modify. 
Each configuration +library can be thought of as a table which might need to be edited or +extended for use at your company. I have provided a working system +but the defaults may not suit your needs. These tables can be easily +changed in small ways by simply looking at the file and making obvious +changes. I have also allowed for the possibility of making complex +changes that only a competent Perl programmer could define. Changes +are not made to the files which I have provided. Rather the changes +are made to copies of the files which are stored in a local +configuration directory. This ensures that you can easily version the +Tinderbox code as it is provided to you from the official distribution +and you can separately version the local configurations which you +make. It is also easy to see the local configurations since you have +both the original and the modified code on the same server and can +diff the two. As an example you might need to change the +BuildStatus. I assume that you have the following possible build +outcomes (Build in progress, Build failed, Build succeeded but tests +failed, Build and all tests were successful). You may have additional +outcomes to specify which kind of tests failed (unit test failed, not +enough unit test coverage, performance tests failed). Similarly you may have unusual requirements for how the filesystem should be laid out. I provide a + I suggest that you read through the files to see +how they are laid out and what types of changes are possible. + + +The build machines are not considered part of the tinderbox server. +They are clients just like Bug Ticketing systems and Version control +systems are clients. Build machines mail their build logs to the +server in a special format. This format specifies that name/value +pairs must appear at the top of the mail message followed by the +complete build log. 
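As an illustration only, the layout described above (name/value pairs at the top, then the complete build log) could be split apart as in the Python sketch below. The exact mail format is not given here, so the ": " separator, the blank line ending the header, and the field names "tree" and "status" are all assumptions made for this example, and parse_build_mail is a hypothetical helper, not part of tinderbox.

```python
def parse_build_mail(body):
    """Split a build mail body into (fields, log).

    Assumes "name: value" lines at the top, ended by the first line
    that does not contain ": " (typically a blank line).
    """
    lines = body.splitlines()
    fields = {}
    pos = 0
    while pos < len(lines):
        name, sep, value = lines[pos].partition(": ")
        if not sep:
            break  # first line that is not "name: value" ends the header
        fields[name.strip()] = value.strip()
        pos += 1
    # Skip one blank separator line, if present (an assumption of this sketch).
    if pos < len(lines) and not lines[pos].strip():
        pos += 1
    log = "\n".join(lines[pos:])
    return fields, log


if __name__ == "__main__":
    sample = "tree: Project-A\nstatus: build_failed\n\ncc -c main.c\nmain.c:3: error: oops\n"
    fields, log = parse_build_mail(sample)
    print(fields)
    print(log)
```

A log whose first line happens to contain ": " directly after the header would be misread as a field, which is why the sketch relies on a blank separator line.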
Scripts for setting up a tinderbox build client +can be found in the clientbin directory but you may have other build +needs and may use any build methods you choose. + +The central concept of the Tinderbox system is the notion of a 'Tree'. +When several different groups are working out of the same version +control system, the files are often partitioned into separate modules +with each group working on one or more disjoint modules. Over time +the developers need to branch their code because several different +versions of the files are under development at the same time. A tree +is a module/branch pair. This corresponds to a set of files which can +be checked out and built. Tinderbox makes one page for each tree and +displays what work is being done on that tree. CVS has a notion of +branches and of modules but not of trees. It is not possible to give +a branch/module pair a name. The tinderbox TreeData provides the +mappings between treenames and branch/module pairs. Tinderbox +displays the updates to bug tickets on the appropriate tree page. +This requires an easy mapping between bug tickets and trees. One +example of a complex function to determine tree name would be if each +of the product types listed in the bug tracking database +refers to one development project, except for a particular +feature/platform of one particular project which is being developed by +a separate group of developers. So the version control notion of +trees (a set of modules on a branch) may not have a direct map into +the bug tracking database at all times. In large projects it is +sometimes convenient to have a tree called 'ALL' which is used to +display all checkins performed on any trees and all bug tickets worked +on by any programmers. It is not possible to build or test the 'ALL' +tree and neither the version control nor bug ticketing system knows of +its existence. + + +The Bug Tracking code was intended to be as general as possible. 
+Most bug ticketing systems send mail when tickets change state. The +mail is often of the same form. It is a set of name/value pairs with the +separator being the string ": ". Tinderbox will parse mail of this +form and display the interesting fields on the appropriate tree page. +The configuration of this module involves specifying which bug ticket +names are interesting and should be displayed. Also you will need to +specify how to map a bug ticket into a tree. This could be very +simple if each bug ticket has a field which represents the tree it is +applicable to (in this case tree could equal project) or can be very +complex if the tree must be computed from the values of a set of fields. +Also tinderbox keeps track of which bugs are "reopened" and displays +them in a different column. The idea is that some bugs are moving +backwards and creating duplicate work. These tickets are particularly +troublesome and should be watched specially. So all possible ticket +statuses are partitioned into "progress" or "slippage" categories. You +will need to specify what status values are possible for your ticket +system and you will also need to specify the set of columns which you +would like to see on the status page. + +The heart of the tinderbox system is the 'status table'. This is an +HTML table which graphically shows the changes made to the +development databases. It will show what is going on in the version +control system, the bug tracking system, the build system, automatic +regression tests and provide a notice board for developers to inform +each other of current news. By placing all this information in the +same table it is possible to correlate and cross check how different +types of changes affected each other and what was going on with the +whole project at different times in the day. The rows of the table +represent time with the most current events at the top of the page. +There are different sets of columns for each database which needs to +be displayed. 
The sets of columns are managed by independent modules. +There is one module for each version control system and each bug +tracking system which tinderbox knows how to interface with. It is +easy to port the system to new databases by just adding a new module +using the same style as the existing modules. Modules never share or +peek at each other's data; all combining of data is done by the humans +who stare at the table and interpret what is going on. The main +tinderbox system does not know how many columns the final table will +have. It only knows about a list of table modules. Each module in the +list is called in turn to generate the complete row, then the entire +row is displayed. The user must configure tinderbox with the list of +modules which are important to their own environment. There is no +restriction on the number of modules which may be configured, though +due to implementation details each module can only appear once in the +table. There are many pop up windows embedded in the status table; +these will provide an extra level of detail when a mouse is placed over +the link. By moving your mouse around the page you may effectively +drill down into an item of interest and learn more about it without +leaving the page. Most of the links will click through to the +appropriate database. Thus if you need more data about an item you +can click on the link and query the database directly. + +Besides the status table there is one other feature of the status +page. The page displays some information which is not correlated +through time and with other data. This information is called status +table headers. The main headers are the message of the day (MOTD), +and the Tree State though there are a few other headers of mainly +historical interest. The important issue with the headers is that +they are not optional. 
Tinderbox can render a table with as few or +as many columns in the status table as you wish but each of the +headers has a particular place on the status page and needs to be +rendered in a particular way (font size, font type, etc.); thus the +tinderbox server must know where each header must go and how to +specify the appropriate HTML context for this header. Users may set +null defaults for headers that they do not need but it is much harder +for a user to add new headers to the code in a modular fashion. + diff --git a/webtools/tinderbox2/Policies b/webtools/tinderbox2/Policies new file mode 100644 index 000000000000..ae5c5f3cf9a4 --- /dev/null +++ b/webtools/tinderbox2/Policies @@ -0,0 +1,277 @@ +Preparations you will need to make and +policies you will need to set: +----------------------------------- + + +To install tinderbox you will need some information about your +existing computer systems and some idea about what your goals are. +Here is a list of questions to help get you started; some of these +ideas may not be appropriate for your environment. + + +The webserver will serve the tinderbox pages. +Webserver configuration is a bit of an art and you will need to +understand the policies which are used to administer your webserver. + +*) You will need to decide the directory where tinderbox should write +the static HTML pages. This will depend on how your webserver is +configured. The default location is based on the RedHat 7.1 +(apache-1.3.19-5) installation and is: /var/www/html/tinderbox2. You +will also need to know what URL browsers will need to use to find +this directory. Since tinderbox generates static web pages, it is +possible to run tinderbox and not run a web server. One way this +could be done is if you have a network file system and all users have +browsers which can read from the HTML directories. In this case all +URLs should begin with "file:/" instead of the usual "http://". 
+ +*) Project level administration is done via CGI scripts. These +scripts allow administrators to set the message of the day, and the +state of the tree (open, closed, restricted). Also all users can post +notices to the web pages via a CGI script. CGI programs are often +restricted to a portion of the file system which is disjoint from the +HTML files. You will need to figure out where the CGI programs will +go. Tinderbox takes its defaults from RedHat 7.1 and uses: +/var/www/cgi-bin/tinderbox2. You will also need to know what URL +browsers will need to use to find this directory. + +*) CGI scripts will run as an unauthenticated user on your system. +You will need to decide which user will run the tinderbox CGI scripts. +The same user id must be used for running the scripts as for tinderbox +mail delivery. The Tinderbox Configuration files will define this +user id and, as a security precaution, tinderbox checks that it is running as the +required id. It is suggested that this id not be a privileged id +(higher ids are better; please make this number greater than 10, and +greater than 100 is recommended). Smaller ids are often assumed to +have more privileges on a Unix box than larger ids. It is not a good +idea for an unauthenticated user to have any privileges so a large id +is recommended. It is also recommended that you not use the id 'nobody' +as this id is overused and it would be better to partition the +unauthenticated user into separate ids in case of security problems. +RedHat runs all its CGI scripts as the user 'apache'; this is an +acceptable user. I would prefer to have a separate user to run the +tinderbox CGI scripts but this would require recompiling apache to +enable suEXEC, and it is more effort than most groups can afford. + +*) Tinderbox Files. There are other tinderbox files which need to be +placed on the webserver. These include libraries and non-CGI +programs. You will need to decide where to place these files. 
Most +users put them in /home/tinderbox2. + +*) Tinderbox Data. Tinderbox stores its data in the file system. For +security it is often a good idea to keep this data out of the HTML and +CGI directories so that malicious users cannot directly access this +data. The compressed build logs can grow quite large, so it is +recommended to put the data on a file system with plenty of room. The default +is to put the data in the directory /home/tinderbox2/data. + + +Mail +---- + +*) Many of the tinderbox modules (Bug Ticket, Build, CVS) receive +their data via mail. The mail system on your web server machine must +be configured to deliver the mail into the tinderbox mail processing +programs. You should spend some time understanding how your mail +delivery system can be configured to allow user mail to be delivered +into a program and how to set the user id under which this delivery +occurs. If you do not wish to configure your mail delivery program +then you can use fetchmail to pull the mail out of a mailbox and push +it into the programs on a periodic basis. See the install page for +details on what I have learned about mailing systems. + + +Production Version Control +------------------------- + +One of the biggest responsibilities which a "buildmaster" has is the +requirement that all code should be reproducible. That is, at +any point in the future, even more than one year later, the current +binaries should be able to be rebuilt byte for byte from sources. +This requirement can be broken down as follows: + +1) The build machine must be reproducible. + +We must be able to get back the same build machine we had at any point +in the past. This means that all OS libraries, all header files, all +compilers, all build tools (make, grep, sed) must have some mechanism +to roll back. It is common to use a backup of the build machine to +reconstruct it. Most operating systems will give you a list of the software packages +which are installed on the machine and their version numbers. 
I like +to keep the list of software packages which are installed on the +machine checked into version control. This allows me to compare the +state of the build machine at any two points in time. I have tools to +recreate the build machine from just a list of packages with version +numbers. It is considered a best practice to limit the amount of +software which is available on the build machine. A build machine +with too much installed will only make it difficult to reproduce older +builds should the need arise. I recommend not installing any +web servers or graphical window managers on your build machine. It +should be clear that the build machine should not be the same machine +where the tinderbox server runs. + +2) The build process must be reproducible. That is, all the steps +which are used to create the application must be reproducible. + +*) Build Interface: We must be able to run exactly the same build +process in the future, including all commands with their command line +arguments and all environmental variables. I recommend that the entire +build process be viewed as something outside of the buildmaster's +control. Developers are responsible for ensuring that there is a +simple buildmaster interface to construct all the software products +which go into a build. Typically there is a makefile in a standard +place where the buildmaster can run something like "make all; make +install;" and be guaranteed that this will build the product. The +build interface should be viewed as something fixed: it + is part of the build machine, like the OS, and is changed only +rarely. It is hard enough to track all the parts of the build process +which we expect to change; we should not need to track complex build +procedures. The build procedures should have a standard interface. 
+By keeping the build instructions in one makefile which is checked +into the same version control system as the sources it is easy to +recreate any previous build even if the commands used to build the +software fluctuate rapidly between releases. There must be a simple +interface to construct the software which will hide all the complexity +of the actual construction. + +*) Build Environment: The makefile will encode all the build commands +and all the environmental variables (PATH, UMASK, LD_LIBRARY_PATH, +CLASSPATH) needed to build the software though it may rely on some +well defined command line arguments (PREFIX, CCFLAGS, JAVA_LIBS) to +parameterize these. These command line arguments should not +change between versions of the software but should be a fixed set of +build parameters. The parameters may be needed to specify where some +files are found on the build machine (ideally the build machine is set +up the same as developers' machines so these directories can be +hard-coded into the makefiles but often there is a need for some +directories to be specified at build time) or where files are to be +created/installed on the build machine (typically a subdirectory of +/var/tmp but there may be several builds running at once and each will +need a different directory) or what kind of build is being created. +Each part of the build which needs a particular environmental variable +set or a special header file in some path should have tests which +ensure that the build environment is valid. I keep my build scripts +installed on the build machine and they are always started by running +"/etc/rc.d/init.d/build start"; this ensures that I am not relying on any +build environmental variables which are set by logging into the build +account and are thus not tracked and versioned. + +*) Environmental safety issues: + +If the build environment cannot be used to build the software then a +human readable error message should be generated. 
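Checks of this kind (required variables set, build and install directories disjoint) could be sketched in Python as below. This is only an illustration under stated assumptions: check_build_environment is a hypothetical helper, the variable names in the usage lines are examples, and a real makefile would express the same tests in its own terms.

```python
import os


def check_build_environment(env, required_vars, build_dir, install_dir):
    """Return a list of human readable problems; an empty list means OK."""
    problems = []
    for name in required_vars:
        if not env.get(name):
            problems.append(f"required variable {name} is not set")
    build = os.path.abspath(build_dir)
    install = os.path.abspath(install_dir)
    # The two directories overlap when one is equal to or inside the other.
    if os.path.commonpath([build, install]) in (build, install):
        problems.append(
            f"build directory {build} and install directory {install} overlap"
        )
    return problems


if __name__ == "__main__":
    # Example values only; PREFIX and CCFLAGS are taken from the text above.
    for problem in check_build_environment(
        dict(os.environ), ["PREFIX", "CCFLAGS"],
        "/var/tmp/build", "/var/tmp/build/install",
    ):
        print("build environment error:", problem)
```

Returning the whole list rather than stopping at the first failure lets one run report every problem at once.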
My makefiles often +run various checks on the environmental variables before they +construct the code. They check that all required environmental +variables are set, that the required libraries are found, and that +directories which must be disjoint (build and install directories) do +not overlap. This test suite becomes a build regression test and as I +discover additional possible build problems I add new tests to the +makefile. I make it a habit to explicitly set all environmental +variables so that there is no doubt as to their expected values. It +is important for the QA group to only use builds which were created by +an automated process so that we are sure that there are no +undocumented steps in either the test builds or the released build. + +3) Track the Build numbers. Given a clean install of your product you +should have all the information necessary to reproduce the executable +from sources. If a customer shows you the application binaries you +must be able to get the source code which built the application, +reconstruct the build machine which created the application and +possibly rerun the build exactly the same way as the application was +created before; this may include making some minor source code changes +before the build is run. I like to keep a file which contains: + + The product release name + + The sources 'as of date'. (I always check out my sources using + cvs -D 'date time' so that exactly the same sources + can be recovered knowing only the 'date time' which + was used to check them out. I am sure a similar trick + could be used with a Perforce 'change set number'.) + + The branch name. + + The module name. + +This can be stored as a file in the product (encrypted if necessary) +or may be stored in some secure buildmaster database where the data +can be looked up by release name. My preference is to keep all data +necessary to reproduce a build in the build output and deliver it as +part of the product. 
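A record holding these four items could look like the Python sketch below. The field names (release, as_of, branch, module), the "name: value" layout, and both helper functions are assumptions made for this example, not an existing tinderbox file format. The command it rebuilds uses the standard cvs checkout options -D (date) and -r (branch tag).

```python
def format_build_info(release, as_of, branch, module):
    """Render the four build-identifying fields as 'name: value' lines."""
    record = [
        ("release", release),
        ("as_of", as_of),
        ("branch", branch),
        ("module", module),
    ]
    return "".join(f"{name}: {value}\n" for name, value in record)


def checkout_command(info_text):
    """Rebuild a 'cvs checkout' argument list from a stored record."""
    info = dict(line.split(": ", 1) for line in info_text.splitlines())
    return ["cvs", "checkout", "-D", info["as_of"],
            "-r", info["branch"], info["module"]]


if __name__ == "__main__":
    # All values here are made-up examples.
    text = format_build_info("2.0", "2001-12-31 20:02", "REL_2_BRANCH", "tinderbox2")
    print(text, end="")
    print(" ".join(checkout_command(text)))
```

A build from the main trunk has no branch tag, so a real tool would need to drop the -r option in that case.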
This means that I can generate as many builds as +I want automatically and not need to keep track of any of them. When +the QA team deems a certain build 'important', by making that +particular build the official released copy, I can take a look at +its contents and tag/branch the code at the sources which I used to +build it. + +4) Build Prefix: It is a good idea to familiarize yourself with the +makefile conventions regarding the make variable PREFIX. It is +easiest to understand if you think about what RedHat does when they +build their distribution of RPMs but this will apply in many +different systems including the Andrew File System (AFS) and most +packaging systems. This variable is used during the build process +("make all PREFIX=/home/apache") to tell the package where it will be +installed (examples include /usr, /usr/local, /home/apache). I +suggest reading a few RedHat Spec files to see how this works in +practice. The application may need to hard-code this value into its +object code. When the application is installed it must not be +installed into its proper place on the build machine. The package we +are constructing could cause the build machine to stop working +correctly if it is a buggy version of a system library or major OS +application. Instead the makefile will install the package ("make install +PREFIX=/var/tmp/build-root/home/apache") into some other +directory with a similar tree structure to its final destination. The +packaging system will then move the files into the correct place +during an installation step on the target machine. The installation +step only moves files and sets permissions. The makefile is not +supposed to use the installation directories to hard-code values into +the application since the application will never be run from this +installation directory. The hard part of the build including any +PREFIX magic is in the build section. 
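The relationship between the two PREFIX values in the examples above can be written down as a small Python sketch. The helper names staged_path and make_commands are hypothetical, and the build root /var/tmp/build-root is simply the one used in the example command. The sketch only computes the command lines and does not run make.

```python
import posixpath


def staged_path(build_root, prefix):
    """Map the final PREFIX to the matching directory under the build root."""
    if not prefix.startswith("/"):
        raise ValueError("PREFIX must be an absolute path")
    # Strip the leading "/" so the join keeps build_root as the parent.
    return posixpath.join(build_root, prefix.lstrip("/"))


def make_commands(prefix, build_root):
    """Return the build and staged-install command lines as argument lists."""
    return [
        ["make", "all", f"PREFIX={prefix}"],
        ["make", "install", f"PREFIX={staged_path(build_root, prefix)}"],
    ]


if __name__ == "__main__":
    for command in make_commands("/home/apache", "/var/tmp/build-root"):
        print(" ".join(command))
```

The build step sees the real destination and the install step sees only the staging tree, which keeps the build machine's own /home/apache (if any) untouched.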
Notice the clear separations: +build machine versus target machine, installation on the build +machine versus installation on the target machine, and construction of the +application binaries versus installation of the application binaries. +This is one of the reasons why building an application on a build +machine is different from the way in which developers build their code +on their personal development machines. This PREFIX issue will arise +when you try to build the Tinderbox system and also when you +construct the makefiles for your own application. Since the build +machine is not the target machine it cannot be assumed that files +will always be in the same places on both (for example Perl). + +5) Application Architecture: + +*) The build process should mimic the architecture of the code. It should +be a final test that the code was coded to the same specifications +to which it was designed. It is a common problem for code to turn into +spaghetti with each piece of code using functions and creating +dependencies on every other piece of code. For example it is probably +a mistake for code in the database abstraction layer to be implemented +in terms of code in the HTML generation layer. These two libraries +should probably be independent of each other, though they both might +depend on a common string library. The code architecture should limit +the dependency graph between code modules. The buildmaster must +enforce the restrictions on information flow between components. Thus +no libraries should be in the path unless the architecture allows this +module to depend on those libraries. + +*) The architecture must not have circular dependencies. Circular +dependencies not only make upgrading individual libraries difficult +but also make testing components nearly impossible. 
That is, it should +be possible to build some set of libraries L0 which depend on no +other libraries, then build some other set of libraries L1 which depend +only on L0 libraries, then build L2 which depend only on the L0 and L1 +libraries. This "build chain" will prevent circular dependencies and +help keep your code testable and the dependencies understandable. +More information about why this is a good practice is available in +"Large-Scale C++ Software Design" (Addison-Wesley Professional +Computing Series) by John Lakos. + +*) I enforce the convention that developers are not allowed to overload +standard system libraries. I always put standard libraries in the +path before any library our company develops. I build the application +in stages to ensure that parts of the application which are not +intended to depend on other code will not have other header files on +the build machine at the time that they are constructed. Expected build +dependencies between modules are explicitly +controlled with build scripts and version numbers. +
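The "build chain" of levels L0, L1, L2 described above can be computed from a declared dependency table, which also detects circular dependencies. The Python sketch below is an illustration only: build_levels is a hypothetical helper, and the library names in the usage lines (strings, db, html, app) are invented to echo the database/HTML/string-library example.

```python
def build_levels(deps):
    """Group libraries into build levels L0, L1, L2, ...

    deps maps each library name to the set of libraries it depends on.
    Raises ValueError if the remaining libraries form a cycle.
    """
    remaining = {lib: set(needed) for lib, needed in deps.items()}
    # Libraries that appear only as dependencies have no dependencies themselves.
    for needed in list(remaining.values()):
        for lib in needed:
            remaining.setdefault(lib, set())
    built = set()
    levels = []
    while remaining:
        # A library is ready once everything it depends on has been built.
        ready = sorted(lib for lib, needed in remaining.items() if needed <= built)
        if not ready:
            raise ValueError(
                "circular dependency among: " + ", ".join(sorted(remaining))
            )
        levels.append(ready)
        built.update(ready)
        for lib in ready:
            del remaining[lib]
    return levels


if __name__ == "__main__":
    example = {
        "strings": set(),
        "db": {"strings"},
        "html": {"strings"},
        "app": {"db", "html"},
    }
    for number, libs in enumerate(build_levels(example)):
        print(f"L{number}: {', '.join(libs)}")
```

Building strictly level by level, with only the earlier levels on the include and library paths, is one way to make an undeclared dependency fail at build time.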