Getting Involved with the Clang Project
Once you have checked out and built clang and played around with it, you might be wondering what you can do to make it better and contribute to its development. Alternatively, maybe you just want to follow the development of the project to see it progress.
Follow what's going on
Clang is a subproject of the LLVM Project, but has its own mailing lists because the communities have people with different interests. The two clang lists are:
- cfe-commits - This list is for patch submission/discussion.
- cfe-dev - This list is for everything else clang related (questions and answers, bug reports, etc).
If you are interested in clang only, these two lists should be all you need. If you are interested in the LLVM optimizer and code generator, please consider signing up for llvmdev and llvm-commits as well.
The best way to talk with other developers on the project is through the cfe-dev mailing list. The clang mailing list is a very friendly place and we welcome newcomers. In addition to the cfe-dev list, a significant amount of design discussion takes place on the cfe-commits mailing list. All of these lists have archives, so you can browse through previous discussions or follow the lists on the web if you prefer.
Open Projects
Here are a few tasks that are available for newcomers to work on, depending on what your interests are. This list is provided to generate ideas; it is not intended to be comprehensive. Please ask on cfe-dev for more specifics or to verify that one of these isn't already completed. :)
- Compile your favorite C/ObjC project with "clang -fsyntax-only": The clang type checker and verifier are quite close to complete (but not bug free!) for C and Objective C. We appreciate all reports of code that is rejected by the front-end, and if you notice invalid code that is not rejected by clang, that is also very important to us. For make-based projects, the ccc script in clang's utils folder might help to get you started.
- Compile your favorite C project with "clang -emit-llvm": The clang to LLVM converter is getting more mature, so you may be able to compile it. If not, please let us know. Again, ccc might help you. Once it compiles it should run. If not, that's a bug :)
- Overflow detection: An interesting project would be to add a -ftrapv compilation mode that causes -emit-llvm to generate overflow tests for all signed integer arithmetic operators, and call abort if they overflow. Signed integer overflow is undefined in C and hard for people to reason about. LLVM IR also has intrinsics for generating arithmetic with overflow checks directly.
- Undefined behavior checking: similar to adding -ftrapv, codegen could insert runtime checks for many kinds of undefined behavior, such as reads of uninitialized variables, buffer overflows, and many others. This checking would be expensive, but the optimizers could eliminate many of the checks in some cases, and some groups of people would be very interested in testing their code in this mode. Because the inserted code is coming from clang, the "abort" message could be very detailed about exactly what went wrong.
- Continue work on C++ support: Implementing all of C++ is a very big job, but there are lots of little pieces that can be picked off and implemented. See the C++ status report page to find out what is missing and what is already at least partially supported.
- Improve target support: The current target interfaces are heavily stubbed out and need to be implemented fully. See the FIXME's in TargetInfo. Additionally, the actual target implementations (instances of TargetInfoImpl) also need to be completed.
- Implement a tool to generate code documentation: Clang's library-based design allows it to be used by a variety of tools that reason about source code. One great application of Clang would be to build an auto-documentation system like doxygen that generates code documentation from source code. The advantage of using Clang for such a tool is that the tool would use the same preprocessor/parser/ASTs as the compiler itself, giving it a very rich understanding of the code.
- Use clang libraries to implement better versions of existing tools: Clang is built as a set of libraries, which means that it is possible to implement capabilities similar to other source language tools, improving them in various ways. Two examples are distcc and the delta testcase reduction tool. The former can be improved to scale better and be more efficient. The latter could also be faster and more efficient at reducing C-family programs if built on the clang preprocessor.
- Use clang libraries to extend Ragel with a JIT: Ragel is a state machine compiler that lets you embed C code into state machines and generate C code. It would be relatively easy to turn this into a JIT compiler using LLVM.
- Self-testing using clang: There are several neat ways to improve the quality of clang by self-testing. Some examples:
  - Improve the reliability of AST printing and serialization by ensuring that the AST produced by clang on an input doesn't change when it is reparsed or unserialized.
  - Improve parser reliability and error generation by automatically or randomly changing the input and checking that clang doesn't crash and that it doesn't generate excessive errors for small input changes. Manipulating the input at both the text and token levels is likely to produce interesting test cases.
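To make the overflow detection idea from the list above more concrete, here is a minimal sketch in C of what a trapping signed addition might behave like at the source level. It is not how clang implements or would implement -ftrapv; the helper names add_overflows and trapping_add are hypothetical, and the sketch relies on __builtin_add_overflow, a Clang/GCC builtin that is not part of standard C and is assumed to be available in the compiler you use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: adds a and b and stores the (wrapped) sum in *out.
   Returns true if the mathematical result does not fit in an int. */
bool add_overflows(int a, int b, int *out) {
    assert(out != NULL);
    /* __builtin_add_overflow is a Clang/GCC extension, not standard C. */
    return __builtin_add_overflow(a, b, out);
}

/* Hypothetical helper: behaves like a trapping signed add. It prints a
   detailed message and aborts instead of invoking undefined behavior. */
int trapping_add(int a, int b) {
    int sum;
    if (add_overflows(a, b, &sum)) {
        fprintf(stderr, "signed overflow in %d + %d\n", a, b);
        abort();
    }
    return sum;
}
```

A caller would simply write trapping_add(x, y) where it would otherwise write x + y. A compiler mode would insert the equivalent check automatically for every signed arithmetic operator, which is what makes the project interesting.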
If you hit a bug with clang, it is very useful for us if you reduce the code that demonstrates the problem down to something small. There are many ways to do this; ask on cfe-dev for advice.
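As one illustration of what reducing a test case can look like, the sketch below greedily drops one line at a time and keeps a removal only while the problem still reproduces. This is a simplified stand-in, not the algorithm used by the delta tool, and every name in it is hypothetical. In particular, has_crash_marker only looks for the made-up text "crash_here"; a real predicate would rerun clang on the candidate input and check for the failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: returns true while the candidate lines
   still reproduce the problem. */
typedef bool (*still_fails_fn)(const char **lines, size_t count);

/* Stand-in predicate for illustration only: "fails" while some line
   contains the text "crash_here". */
bool has_crash_marker(const char **lines, size_t count) {
    for (size_t i = 0; i < count; i++)
        if (strstr(lines[i], "crash_here") != NULL)
            return true;
    return false;
}

/* Greedily removes one line at a time, keeping each removal only if the
   predicate still holds. Compacts lines in place and returns the new
   count. Repeats until a full pass removes nothing. */
size_t reduce_lines(const char **lines, size_t count, still_fails_fn still_fails) {
    assert(lines != NULL && still_fails != NULL);
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < count; ) {
            const char *removed = lines[i];
            size_t tail = (count - i - 1) * sizeof(lines[0]);
            /* Tentatively remove line i by shifting the tail left. */
            memmove(&lines[i], &lines[i + 1], tail);
            if (still_fails(lines, count - 1)) {
                count--;            /* keep the removal; retry the same index */
                changed = true;
            } else {
                /* Undo: shift the tail right and restore the line. */
                memmove(&lines[i + 1], &lines[i], tail);
                lines[i] = removed;
                i++;
            }
        }
    }
    return count;
}
```

Calling reduce_lines on an array of source lines with a suitable predicate leaves the surviving lines at the front of the array. Line-at-a-time removal is slow on large inputs and can produce candidates that no longer parse, so treat this as a starting point rather than a replacement for asking on cfe-dev.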