Hacking on Clang
This document provides some hints for how to get started hacking on Clang for developers who are new to the Clang and/or LLVM codebases.
Developer Documentation
Both Clang and LLVM use doxygen to provide API documentation. Their respective web pages (generated nightly) are here:
For work on the LLVM IR generation, the LLVM assembly language reference manual is also useful.
Debugging
Inspecting data structures in a debugger:
- Many LLVM and Clang data structures provide a dump() method which will print a description of the data structure to stderr.
- The QualType structure is used pervasively. This is a simple value class for wrapping types with qualifiers; you can use, for example, the isConstQualified() method to test for one of the qualifiers, and the getTypePtr() method to get the wrapped Type*, which you can then dump.
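To make the "value class wrapping a type with qualifiers" idea concrete, here is a minimal self-contained sketch. ToyType, ToyQualType, and inspect are hypothetical names invented for illustration; this is not Clang's actual implementation or storage layout, and only the method names isConstQualified(), getTypePtr(), and dump() come from the text above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for clang's Type (an assumption, not the real class).
struct ToyType {
  const char *Name;
  // Print a description of this object to stderr, in the spirit of dump().
  void dump() const { std::fprintf(stderr, "ToyType '%s'\n", Name); }
};

// Hypothetical stand-in for QualType: a small value class that pairs a
// pointer to the underlying type with a set of qualifier bits.
class ToyQualType {
public:
  enum Qualifier : unsigned { None = 0, Const = 1, Volatile = 2 };

  ToyQualType(const ToyType *T, unsigned Q) : Ty(T), Quals(Q) {}

  bool isConstQualified() const { return (Quals & Const) != 0; }
  bool isVolatileQualified() const { return (Quals & Volatile) != 0; }
  const ToyType *getTypePtr() const { return Ty; }

private:
  const ToyType *Ty;
  unsigned Quals;
};

// Query a qualifier, then unwrap the type and dump it to stderr.
// Returns whether the wrapped type was const-qualified.
inline bool inspect(ToyQualType QT) {
  assert(QT.getTypePtr() && "expected a non-null wrapped type");
  QT.getTypePtr()->dump();
  return QT.isConstQualified();
}
```

In a debugger session on Clang itself, the analogous steps would be to call getTypePtr() on the QualType value and then dump() on the result.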
Testing
Clang includes a basic regression suite in the tree which can be run with make test from the top-level clang directory, or just make in the test sub-directory. make report can be used after running the tests to summarize the results, and make VERBOSE=1 can be used to show more detail about what is being run.
For more intensive changes, running the LLVM Test Suite with clang is recommended. Currently the best way to do this is to override LLVMGCC, as in: make LLVMGCC="ccc -std=gnu89" TEST=nightly report (make sure ccc is in your PATH or use the full path).
LLVM IR Generation
The LLVM IR generation part of clang handles conversion of the AST nodes output by the Sema module to the LLVM Intermediate Representation (IR). Historically, this was referred to as "codegen", and the Clang code for this lives in lib/CodeGen.
The output is most easily inspected using the -emit-llvm option to clang (possibly in conjunction with -o -). You can also use -emit-llvm-bc to write an LLVM bitcode file which can be processed by the suite of LLVM tools like llvm-dis, llvm-nm, etc. See the LLVM Command Guide for more information.
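A small input file is often enough to see what the IR generation produces. The sketch below is a hypothetical example (the function name sum_of_squares is invented for illustration); passing a file like this to clang with the -emit-llvm option described above should show the loop lowered to basic blocks and branches. The exact driver invocation may vary between Clang versions.

```cpp
// Hypothetical test input for inspecting generated IR.
// extern "C" keeps the symbol name unmangled so the function is easy to
// find in the emitted output.
extern "C" int sum_of_squares(int n) {
  int total = 0;
  // A simple counted loop: expect a loop header, body, and exit block in IR.
  for (int i = 1; i <= n; ++i)
    total += i * i;
  return total;
}
```

Keeping the input this small makes it practical to compare the emitted IR before and after a change in lib/CodeGen.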