Hacking on Clang

This document provides some hints for how to get started hacking on Clang for developers who are new to the Clang and/or LLVM codebases.

Developer Documentation

Both Clang and LLVM use doxygen to provide API documentation. Their respective web pages (generated nightly) are here:

For work on the LLVM IR generation, the LLVM assembly language reference manual is also useful.

Debugging

Inspecting data structures in a debugger:

Testing

[Note: The test running mechanism is currently under revision, so the following might change shortly.]

Testing on Unix-like Systems

Clang includes a basic regression suite in the tree which can be run with make test from the top-level clang directory, or just make in the test sub-directory. make VERBOSE=1 can be used to show more detail about what is being run.

The tests primarily consist of a test runner script that runs the compiler under test on individual test files, which are grouped in the directories under the test directory. Each test file begins with comments that tell the test runner which Clang compile options to use. Embedded comments can also tell the test runner other things, such as that an error is expected at the current line. Any output files produced by a test are placed in an Output directory that is created for the purpose.
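
As a rough illustration of how such comments can be read, here is a minimal Python sketch. It is not the actual test runner: the sample test file, the command line on its RUN line, the parse_test helper and the file name undeclared.c are all hypothetical, and the sketch only assumes that compile options appear on "// RUN:" comment lines and that expected diagnostics are written as "// expected-error {{...}}" on the line where they should occur.

```python
import re

# Hypothetical test file; the command on the RUN line is illustrative only.
SAMPLE_TEST = '''\
// RUN: clang -fsyntax-only -verify %s
int f(void) {
  return undeclared; // expected-error {{use of undeclared identifier}}
}
'''

RUN_RE = re.compile(r"//\s*RUN:\s*(.*)")
EXPECT_RE = re.compile(r"//\s*expected-(error|warning|note)\s*\{\{(.*?)\}\}")


def parse_test(text, path):
    """Return (run_lines, expectations) found in the test file text.

    run_lines holds each RUN command with %s replaced by the file path;
    expectations holds (line number, kind, message) tuples.
    """
    run_lines = []
    expectations = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = RUN_RE.search(line)
        if match:
            run_lines.append(match.group(1).strip().replace("%s", path))
            continue
        match = EXPECT_RE.search(line)
        if match:
            expectations.append((lineno, match.group(1), match.group(2)))
    return run_lines, expectations


run_lines, expectations = parse_test(SAMPLE_TEST, "undeclared.c")
print(run_lines)
print(expectations)
```

The sketch only extracts the annotations; running the commands and comparing the compiler's diagnostics against the expectations is left to the real runner.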

During the run of make test, the terminal output will display a line similar to the following:

followed by a line continually overwritten with the current test file being compiled, and an overall completion percentage.

After the make test run completes, the absence of any Failing Tests (count): message indicates that no tests failed unexpectedly. If any tests did fail, the Failing Tests (count): message will be followed by a list of the test source file paths that failed. For example:

  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
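
If the output of a run has been captured to a file or a string, the list of failures can be pulled out mechanically. The following Python sketch is one way to do it; the failing_tests helper is hypothetical, and it assumes the list is laid out as in the example above and ends at the first blank line or at the end of the output.

```python
import re

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
Failing Tests (3):
    /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
    /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
    /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
"""

HEADER_RE = re.compile(r"\s*Failing Tests \(\d+\):")


def failing_tests(output):
    """Return the test paths listed after a 'Failing Tests (count):' line."""
    failures = []
    in_list = False
    for line in output.splitlines():
        if HEADER_RE.match(line):
            in_list = True
            continue
        if in_list:
            path = line.strip()
            if not path:
                break  # assumed: a blank line ends the list
            failures.append(path)
    return failures


failures = failing_tests(SAMPLE_OUTPUT)
print(len(failures), "failing test(s)")
for path in failures:
    print(path)
```

An empty result corresponds to the case where no Failing Tests (count): message was printed.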
  

If you used the make VERBOSE=1 option, the terminal output will reflect the error messages from the compiler and test runner.

The regression suite can also be run with Valgrind by running make test VG=1 in the top-level clang directory.

For more intensive changes, running the LLVM Test Suite with clang is recommended. Currently the best way to do this is to override LLVMGCC, as in: make LLVMGCC="clang -std=gnu89" TEST=nightly report (make sure clang is in your PATH or use the full path).

Testing using Visual Studio on Windows

The cmake build tool is set up to create Visual Studio project files for running the tests, "clang-test" being the root. Unfortunately, the test runner scripts presently don't work on Windows. This will be fixed during the test runner revision in progress.

Note that both the current test runner and its upcoming revision are based on Python, which must be installed. Python is available at: http://www.python.org/download. Download the latest stable version (2.6.2 at the time of this writing).

The GnuWin32 tools are also necessary for running the tests. (Note that the grep from MSYS or Cygwin doesn't work with the tests because of embedded double-quotes in the search strings. The GNU grep does work in this case.) Get them from http://getgnuwin32.sourceforge.net.

Creating Patch Files

Unless you have checkin privileges, the preferred way to return changes to the Clang team is to send patch files to the cfe-commits mailing list, with an explanation of what the patch is for. If you have questions, or want a wider discussion of what you are doing (for example, because you are new to Clang development), you can also use the cfe-dev mailing list.

To create these patch files, change directory to the llvm/tools/clang root and run:

For example, to get the diffs for all of clang:

Or, to get the diffs for a single file:

Note that the paths embedded in the patch depend on where you run it, so changing directory to the llvm/tools/clang directory is recommended.

LLVM IR Generation

The LLVM IR generation part of clang handles conversion of the AST nodes output by the Sema module to the LLVM Intermediate Representation (IR). Historically, this was referred to as "codegen", and the Clang code for this lives in lib/CodeGen.

The output is most easily inspected using the -emit-llvm option to clang (possibly in conjunction with -o -). You can also use -emit-llvm-bc to write an LLVM bitcode file which can be processed by the suite of LLVM tools like llvm-dis, llvm-nm, etc. See the LLVM Command Guide for more information.
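
When scripting such inspections, it can help to build the command line in one place. The Python sketch below does only that. The emit_llvm_command helper and the file names are hypothetical, the flags are the ones named in the paragraph above, and the placement of the source file before -o is an assumption; flag spellings may differ between clang versions.

```python
def emit_llvm_command(source, bitcode=False, output="-", clang="clang"):
    """Build an argument list asking clang for LLVM IR or bitcode.

    With bitcode=False the -emit-llvm flag is used; an output of "-"
    writes the result to standard output. With bitcode=True the
    -emit-llvm-bc flag named in the text above is used instead.
    """
    flag = "-emit-llvm-bc" if bitcode else "-emit-llvm"
    return [clang, flag, source, "-o", output]


# IR to standard output, for reading in the terminal.
print(" ".join(emit_llvm_command("t.c")))
# Bitcode written to a file for tools such as llvm-dis or llvm-nm.
print(" ".join(emit_llvm_command("t.c", bitcode=True, output="t.bc")))
```

The returned list can be passed to subprocess.run once clang is known to be on the PATH.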