Add rules to specification for functions without prototypes. (#66)

Update the specification to add some draft rules for declarations of functions without prototypes. These are C functions declared without their parameter types. Informally:

- In checked scopes, it is an error to declare or use functions without prototypes.
- Outside of checked scopes, it is an error to pass a checked type to or return a checked type from a function without a prototype.

The change adds a new section to the interoperation chapter that contains examples and detailed rules. The draft rules need to be reviewed with other people working on Checked C.

The change removes functions without prototypes from the list of work to be addressed in Chapter 10.2. It also updates the section on checked scopes to say that functions without prototypes are not allowed there. Finally, it updates some examples with empty parameter lists to use void so that they properly describe the empty parameter lists.
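
For reviewers, here is a minimal sketch in plain C (not Checked C) of the difference between a prototype and a no-prototype declaration that motivates the `()` to `(void)` updates. The names `sum_ints` and `default_total` are hypothetical and do not appear in the specification. The no-prototype form is shown only in a comment, because a mismatched call through it has undefined behavior and because C23 treats an empty parameter list as `(void)`.

```c
#include <stddef.h>

/* Prototype: every call is checked against (const int *, size_t). */
int sum_ints(const int *values, size_t count);

/*
 * No-prototype form, shown only in a comment because calls through it are
 * not type checked (before C23):
 *
 *     int sum_ints();   // parameter types unspecified
 *     sum_ints(5);      // compiles, but behavior is undefined at run time
 */

int sum_ints(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* (void) states explicitly that the function takes no arguments. */
int default_total(void)
{
    static const int defaults[] = {1, 2, 3};
    return sum_ints(defaults, sizeof defaults / sizeof defaults[0]);
}
```

With the prototype in scope, a call such as `sum_ints(5)` is rejected at compile time, which is the kind of checking the draft rules rely on.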
2016-10-24 10:41:23 -07:00 · 2016-10-24 10:41:23 -07:00 · b8f22a7601
--- a/spec/bounds_safety/core-extensions.tex
+++ b/spec/bounds_safety/core-extensions.tex
@@ -789,6 +789,13 @@ called does not have to be declared as checked. The notion of whether a
 scope is checked or not checked is lexical and the function definition
 is a separate lexical scope.

+C allows declarations of functions without prototypes, where the types
+of the parameters of the functions are not specified.  These
+functions are dangerous to use because there can be mismatches
+between argument types and parameter types at function
+calls.  This can corrupt data or the call stack.  In checked scopes,
+the use or declaration of functions without prototypes is not allowed.
+
 As we add different notions of checking to Checked C, we will use the
 checked and unchecked keywords for all the different notions of
 checking. We may introduce additional keywords to control specific kinds
--- a/spec/bounds_safety/design-alternatives.tex
+++ b/spec/bounds_safety/design-alternatives.tex
@@ -119,7 +119,7 @@ Code takes the address of an array element and immediately does pointer
 arithmetic will still fail to type check, introducing a potential
 backward compatibility issue:
 \begin{verbatim}
-f()
+f(void)
 {
    int a[10];
    int *x = &a[0] + 5; // &a[0] has type ptr<T>.  Pointer arithmetic is not allowed
@@ -133,7 +133,7 @@ possibility, so this proposal still violates the principle of not
 changing the meaning of existing C code.

 \begin{verbatim}
-f()
+f(void)
 {
    int a[10];
    int *x = ((int *) &a[0]) + 5; // redundant but OK under old rule
@@ -179,12 +179,12 @@ f(int *arg, int len)
   ...
 }

-g() {
+g(void) {
   int x[10];
   f(x, 10);
 }

-h() {
+h(void) {
   int x[10];
   int *ptr = x;
   f(ptr, 10);
@@ -258,7 +258,7 @@ void swap(int *p, int *q) {
    *q = *tmp;
 }

-void f() {
+void f(void) {
    int arr[5] = {0, 1, 2, 3, 4};
    swap(&arr[0], &arr[5]);
 }
@@ -274,7 +274,7 @@ int sum(int *start, int count) {
   return total;
 }

-void f() {
+void f(void) {
    int arr[5] = {0, 1, 2, 3, 4};
    sum(&arr[3], 3);
 }
--- a/spec/bounds_safety/interoperation.tex
+++ b/spec/bounds_safety/interoperation.tex
@@ -292,9 +292,9 @@ array_ptr<int> pax : count(5) : (array_ptr<int>) &x;
 \end{verbatim}
 In this example, the result of \texttt{random()} has no bounds:
 \begin{verbatim}
-char *random();
+char *random(void);

-void f() {
+void f(void) {
    // fails to check: random() has no bounds
    array_ptr<char> sp : count(1) = random();
 }
@@ -453,9 +453,9 @@ array_ptr<int> pax : count(5) : &x;
 Implicit conversions of unchecked pointers with no bounds to checked pointers
 will also be rejected:
 \begin{verbatim}
-char *random();
+char *random(void);

-void f() {
+void f(void) {
    // fails to check: random() has no bounds
    array_ptr<char> sp : count(1) = random(); 
 }
@@ -846,7 +846,7 @@ that set the tag to 1.
   bounds((array_ptr<int>) ((size_t) x & ~0x3), \
          (array_ptr<int>) ((size_t) x & ~0x3) + 1) rel_align(char)

-array_ptr<int> create() 
+array_ptr<int> create(void) 
 where untagged_bounds(return_value)
 {
   array_ptr<int> x : bounds(x, x + 1) = malloc(sizeof(int));
@@ -870,4 +870,159 @@ where untagged_bounds(return_value);
  }
  return x;
 }
-\end{verbatim}
+\end{verbatim}
+
+\section{Restricted interoperation with functions without prototypes}
+C allows declarations of functions that do not specify the type of their parameters
+(no-prototype function declarations).  This provides backward compatibility between
+ANSI C from 1989 and earlier versions of C that did not check the types of
+arguments at calls.  Using functions declared this way is
+dangerous.  Arguments are passed based on their own types, so a call can be made
+where the types of the arguments do not match the types of the parameters of the function
+definition.  This could allow checking to be bypassed: checked pointers could be
+converted silently to unchecked pointers or vice versa.  Even worse, parameters could contain
+corrupted values or the stack could be corrupted.
+
+We recommend strongly that programmers do not declare functions without prototypes.  This
+feature exists for backward compatibility and is rarely used now.
+The GCC and clang C compilers have warning flags (for example, \texttt{-Wstrict-prototypes})
+that detect the declaration of functions without prototypes.
+
+In checked scopes, the declaration or use of functions without prototypes is an error.
+In unchecked scopes, forbidding the use of function
+declarations without prototypes would violate the design goal of providing backward compatibility.
+Instead, we restrict the usage of no-prototype functions to reduce the possibility of
+bounds checking being bypassed accidentally in unchecked scopes.
+
+Informally, we want to prevent values with checked types from being passed as arguments or
+returned from calls to no-prototype functions.  This requires some care to define because of
+structures, unions, and function pointers. We define the set of types $E$ that are an error to use
+with functions without prototypes by induction.  It includes:
+\begin{enumerate}
+\item Checked pointer and array types.
+\item Complete structure and union types with members that have types in $E$.
+\item Pointers to function types that have parameter or return types that are in $E$.
+\item Complete structure and union types with members with bounds declarations (these
+      are described in Chapter~\ref{chapter:structure-bounds}).
+\item Pointers to function types with bounds declarations.
+\end{enumerate}
+Clauses 4 and 5 handle the case of integer-typed values that have bounds declarations
+(note that bounds-safe interfaces on unchecked pointer types are not bounds declarations).
+
+We define the following rules for unchecked scopes:
+\begin{enumerate}
+\item It is an error to call a function that is
+declared to have no prototype and pass or return a value whose type is in $E$.
+\item A function declaration with no prototype is incompatible with a function declaration
+with a prototype that has parameter types or a return type in $E$ or that has bounds
+declarations.
+\end{enumerate}
+
+\subsection{Examples}
+The rules catch common errors but are not foolproof.  They catch passing a checked pointer
+to a function with no prototype:
+\begin{verbatim}
+int f();
+
+int g(ptr<int> a) {
+  f(a);  // error - passing a checked type to a function without a prototype
+}
+\end{verbatim}
+They also catch redeclaring a function with no prototype to have a checked parameter:
+\begin{verbatim}
+int f();
+
+struct S {
+  array_ptr<int> ap : count(len);
+  int len;
+};
+
+// Error - incompatible definition of f with a prototype.
+int f(struct S y) {
+  ...
+}
+\end{verbatim}
+By rule 2, the definition of \verb+f+ is incompatible with the initial
+declaration of \verb+f+, so this is an error.  It is an error even
+if \verb+S+ is an incomplete type at the time of
+a prototype declaration for \verb+f+:
+\begin{verbatim}
+int f();
+
+struct S;
+
+int f(struct S y);  // Declarations involving incomplete types are allowed.
+
+// Now define struct S.
+struct S {
+  array_ptr<int> ap : count(len);
+  int len;
+};
+
+// Error - incompatible definition of f with the initial declaration of f.
+int f(struct S y) {
+  ...
+}
+\end{verbatim}
+
+\subsection{Checking during compilation and linking}
+Checking can be bypassed by code that declares a function with no prototype in one 
+compilation unit and defines it in another compilation unit:
+\begin{verbatim}
+Compilation unit 1:
+
+int g(ptr<int> x) {
+  ...
+}
+
+Compilation unit 2:
+
+extern int g();
+void h(void) {
+  g(5);  // Incorrect call, but the compiler cannot detect it
+}
+\end{verbatim}
+The definition of \verb+g+ in compilation unit 1 is incompatible with
+the declaration in compilation unit 2, but there is no way for a compiler
+to detect this.
+
+The checking could be deferred to linking. The compiler could decorate the linker names of
+functions whose argument types or return types are in $E$ differently from the names 
+of functions whose argument types and return types are not in $E$.
+
+\subsection{Unchecked pointers to checked types}
+There is a limited way in which no-prototype functions can interoperate with checked types.
+The definition of $E$ allows unchecked
+pointers to checked pointers and arrays
+to be passed to or returned from no-prototype functions.  It also allows unchecked pointers to
+structures or unions that have checked members to be passed to or returned from
+no-prototype functions.  Finally, it allows unchecked pointers with bounds-safe interfaces to be
+passed to functions with no prototypes.  This is necessary so that bounds-safe interfaces
+can be added to existing code without breaking the code.   
+
+Here are some examples:
+\begin{verbatim}
+int g();
+
+int g(ptr<int> *x);
+
+int f();
+
+struct S {
+  array_ptr<int> ap : count(len);
+  int len;
+};
+
+int f(struct S *arg);
+\end{verbatim}
+
+There are three reasons to allow unchecked pointers that point to checked data to
+be passed to functions without prototypes.  First, the unchecked pointer types 
+indicate a lack of checking, so it is already clear from the types of the variables being
+used that there is some lack of checking.  Second, we believe that this
+will support incremental conversion of code to use the Checked C extension.  Finally,
+it would be difficult to enforce that an unchecked pointer does not point
+to a checked type. An unchecked pointer could point to an incomplete
+structure or union type. A compilation unit might never define the type. The type
+could even be unresolved during linking of a library if none of the library compilation units
+define the type.
--- a/spec/bounds_safety/open-issues.tex
+++ b/spec/bounds_safety/open-issues.tex
@@ -27,10 +27,6 @@ the statement.  Also add wording to allow a bundle block to do this.
 \item
  Decide what to do about null terminated arrays. Do we have special rules
  for them?
-\item
-  Old-style function declarations where argument list length or
-  parameter/argument types could be mismatched at compile time, leading
-  to undefined behavior.
 \item
  Variable arguments
 \item
--- a/spec/bounds_safety/pointers-to-pointers.tex
+++ b/spec/bounds_safety/pointers-to-pointers.tex
@@ -123,7 +123,7 @@ void create(ptr<array_ptr<char>> pbuf where *pbuf: count(*len), ptr<int> plen)
 \end{verbatim}
 A caller would take the addresses of local variables to use this function:
 \begin{verbatim}
-void f()
+void f(void)
 {
    int len;
    array_ptr<char> buf : count(len) = NULL;
@@ -261,7 +261,7 @@ struct S {
    int array_ptr<char> chars : len;
 }

-void f() 
+void f(void) 
 {
   S *s = malloc(sizeof(S)) where s : _uninit_data;
   s->chars = NULL where s : _init_data;
--- a/spec/bounds_safety/variable-bounds.tex
+++ b/spec/bounds_safety/variable-bounds.tex
@@ -265,7 +265,7 @@ Externally-scoped variables can have bounds as well:
 int buflen = 0;
 array_ptr<int> buf : count(buflen) = NULL;

-int sum()
+int sum(void)
 {
    int result = 0;
    for (int i = 0; i < buflen; i++) {
@@ -333,7 +333,7 @@ elements and then points to an array with 10 elements; the bounds are
 adjusted accordingly.

 \begin{verbatim}
-void f() 
+void f(void) 
 {
   int x[5];
   int y[10];
@@ -501,7 +501,7 @@ void update_size(int i)

 extern array_ptr<int> ap : count(size);

-void go()
+void go(void)
 {
    update_size(INT_MAX);
    ap[100] = 0xbad;
@@ -1305,7 +1305,7 @@ that allowed for the buffer to be reallocated:
 int buflen = 0;
 array_ptr<int> buf : count(buflen) = NULL;

-int sum()
+int sum(void)
 {
   int result = 0;
   for (int i = 0; i < buflen; i++) {