Getting Involved with the Clang Project

Once you have checked out and built clang and played around with it, you might be wondering what you can do to make it better and contribute to its development. Alternatively, maybe you just want to follow the development of the project to see it progress.

Follow what's going on

Clang is a subproject of the LLVM Project, but has its own mailing lists because the communities have people with different interests. The two clang lists are:

cfe-commits - This list is for patch submission/discussion.
cfe-dev - This list is for everything else clang-related (questions and answers, bug reports, etc.).

If you are interested in clang only, these two lists should be all you need. If you are interested in the LLVM optimizer and code generator, please consider signing up for llvmdev and llvm-commits as well.

The best way to talk with other developers on the project is through the cfe-dev mailing list. The clang mailing list is a very friendly place and we welcome newcomers. In addition to the cfe-dev list, a significant amount of design discussion takes place on the cfe-commits mailing list. All of these lists have archives, so you can browse through previous discussions or follow the list development on the web if you prefer.

Open Projects

Here are a few tasks that are available for newcomers to work on, depending on what your interests are. This list is provided to generate ideas; it is not intended to be comprehensive. Please ask on cfe-dev for more specifics or to verify that one of these isn't already completed. :)

Compile your favorite C/ObjC project with "clang -fsyntax-only": the clang type checker and verifier are quite close to complete (but not bug-free!) for C and Objective C. We appreciate all reports of code that is rejected by the front-end, and if you notice invalid code that is not rejected by clang, that is also very important to us. For make-based projects, the script attached to this post might help to get you started.
Compile your favorite C project with "clang -emit-llvm": The clang-to-LLVM converter is getting more mature, so you may be able to compile your project with it. If not, please let us know. Again, the attachment to this post might help you. Once the project compiles, it should run. If not, that's a bug :)
Work on code generation for Objective C: -emit-llvm support for Objective C is basically nonexistent at the time of this writing. This is a nice open project that can be tackled incrementally (one language feature at a time).
Debug Info Generation: -emit-llvm doesn't currently support emission of LLVM debug info (which the code generator turns into DWARF). Adding this should be straightforward if you follow the example of what llvm-gcc generates.
Continue work on C++ support: Implementing all of C++ is a very big job, but there are lots of little pieces that can be picked off and implemented. See the C++ status report page to find out what is missing and what is already at least partially supported.
Improve target support: The current target interfaces are heavily stubbed out and need to be implemented fully. See the FIXMEs in TargetInfo. Additionally, the actual target implementations (instances of TargetInfoImpl) also need to be completed. This includes defining builtin macros for Linux targets and similar target-specific details.
Implement 'builtin' headers: GCC provides a bunch of builtin headers, such as stdbool.h, iso646.h, float.h, limits.h, etc. It also provides a bunch of target-specific headers like altivec.h and xmmintrin.h. clang will eventually need to provide its own copies of these (and there is a lot of improvement that can be made to the GCC ones!) that are clean-room implemented to avoid GPL taint.
Implement a clang 'libgcc': As with the headers, clang (or another related subproject of llvm) will need to implement the features that libgcc provides. libgcc provides a bunch of routines the code generator uses for "fallback" when the chip doesn't support some operation (e.g. 64-bit divide on a 32-bit chip). It also provides software floating point support and many other things. I don't think that there is a specific licensing reason to reimplement libgcc, but there is a lot of room for improvement in it in many dimensions.
Implement a tool to generate code documentation: Clang's library-based design allows it to be used by a variety of tools that reason about source code. One great application of Clang would be to build an auto-documentation system like doxygen that generates code documentation from source code. The advantage of using Clang for such a tool is that the tool would use the same preprocessor/parser/ASTs as the compiler itself, giving it a very rich understanding of the code.
Use clang libraries to implement better versions of existing tools: Clang is built as a set of libraries, which means that it is possible to implement capabilities similar to other source language tools, improving them in various ways. Two examples are distcc and the delta testcase reduction tool. The former can be improved to scale better and be more efficient. The latter could also be faster and more efficient at reducing C-family programs if built on the clang preprocessor.
Use clang libraries to extend Ragel with a JIT: Ragel is a state machine compiler that lets you embed C code into state machines and generate C code. It would be relatively easy to turn this into a JIT compiler using LLVM.
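
The first task in the list above, checking a project with "clang -fsyntax-only", can be scripted roughly as follows. This is only a minimal sketch and not the script mentioned in that item: it assumes a clang binary is on your PATH, and the helper names (find_sources, build_command, check_tree) are made up for illustration. Most real projects will also need their include paths and defines passed through as extra arguments.

```python
import subprocess
import sys
from pathlib import Path

# File extensions treated as C / Objective-C sources (an assumption of this sketch).
SOURCE_SUFFIXES = {".c", ".m"}


def find_sources(root):
    """Return all C/ObjC source files under root, sorted for stable output."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


def build_command(source, extra_args=(), clang="clang"):
    """Build the argument list for a syntax-only check of a single file."""
    return [clang, "-fsyntax-only", *extra_args, str(source)]


def check_tree(root, extra_args=()):
    """Check every source file; return (file, diagnostics) for each rejected one."""
    rejected = []
    for source in find_sources(root):
        result = subprocess.run(
            build_command(source, extra_args),
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            rejected.append((source, result.stderr))
    return rejected


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_syntax.py <project-dir> [extra clang args...]
    for source, diagnostics in check_tree(sys.argv[1], sys.argv[2:]):
        print(f"REJECTED: {source}\n{diagnostics}")
```

Any file that shows up as rejected but compiles fine with your usual compiler is a good candidate for a report to cfe-dev.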

If you hit a bug with clang, it is very useful for us if you reduce the code that demonstrates the problem down to something small. There are many ways to do this; ask on cfe-dev for advice.
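
One simple way to reduce a test case is to delete chunks of lines repeatedly and keep each deletion only if the problem still reproduces. The sketch below illustrates that idea. It is a simplified greedy reducer, not the algorithm used by the delta tool, and the still_fails callback is a hypothetical hook: in practice you would implement it by writing the candidate lines to a file, running clang on it, and checking for the same diagnostic or crash.

```python
def reduce_lines(lines, still_fails):
    """Greedily drop chunks of lines while still_fails(candidate) stays true.

    lines: list of source lines that currently reproduce the problem.
    still_fails: callable taking a list of lines and returning True if the
                 problem still reproduces (hypothetical hook for this sketch).
    """
    if not still_fails(lines):
        raise ValueError("the starting input must reproduce the problem")

    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        start = 0
        while start < len(lines):
            candidate = lines[:start] + lines[start + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate   # deletion kept; retry at the same offset
            else:
                start += chunk      # deletion rejected; move past this chunk
        chunk //= 2                 # try finer-grained deletions next
    return lines


if __name__ == "__main__":
    # Toy predicate: the "bug" reproduces whenever both marker lines are present.
    program = ["int a;", "int b;", "BAD1", "int c;", "BAD2", "int d;"]
    print(reduce_lines(program, lambda ls: "BAD1" in ls and "BAD2" in ls))
```

In the toy run above, the reducer strips the four unrelated declarations and leaves only the two marker lines. Line-based reduction knows nothing about C syntax, so the predicate should compare against the original failure rather than accept any error at all.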