ANSI C

From NetHackWiki
Revision as of 23:48, 16 November 2014 by Ray Chason (talk | contribs) (USE_VARARGS uses va_arg; USE_STDARG does not)

In computer programming, ANSI C (or ISO C, or C89) is a specification for the C language and an update to the original K&R version of C. Programs written in ANSI C have access to a few extra features, some of them borrowed from C++; the most visible difference between old C and ANSI C is in the declaration of function parameters. These days, ANSI C is routine and C programmers almost always use it.

However, NetHack is a very old program dating from before ANSI C's first spec in 1989. Today's version can take advantage of certain ANSI C features, and code for this is in tradstdc.h. Is NetHack written in ANSI C? Yes and no, depending on what tradstdc.h decides to do.

The void type

In C, the void type indicates a function that does not return a value. The original C did not have a void type; programmers often declared functions to return int and discarded the value. (This is why compiling doesn't fail if you forget to return a value from a non-void function.)

It became common to #define void int to cosmetically declare a void function. (The preprocessor would change every void to int and the C compiler would have no concept of void.) Later, many C compiler vendors started including the void keyword. C++ had a void keyword. So ANSI decided to include the void type in ANSI C.

Another common convention was to define a function without an explicit return type when no return was intended. The compiler would supply a return type of int. Older code is not consistent in this usage, and compilers did nothing to enforce it, but NetHack through 2.3e mostly adheres to it. (Implicit int is not permitted in C99, but most compilers accept it with a warning.)

If you find a void-free compiler to build NetHack with, then the procedure is to uncomment the #define NOVOID line at config.h#line239 so that tradstdc.h#line23 defines void.
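The preprocessor trick described above can be sketched as follows. This is only an illustration: SKETCH_NOVOID and bump are hypothetical names invented for this example, and the actual lines in config.h and tradstdc.h may differ.

```c
/* SKETCH_NOVOID is a hypothetical switch for this example only.
 * Leave it undefined on any compiler that understands void. */
/* #define SKETCH_NOVOID */

#ifdef SKETCH_NOVOID
#define void int /* the preprocessor rewrites every void to int */
#endif

/* Declared void for the reader; a void-free compiler would see int. */
static void
bump(int *counter)
{
    ++*counter;
}
```

With the switch left undefined, bump is an ordinary void function; with it defined, an old compiler would treat bump as returning an int that every caller ignores.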

void pointers, null pointers

In ANSI C, the data type pointer-to-void, written void *, can be used to hold the value of any other type of pointer, without requiring a "cast", or explicit type conversion. NetHack defines genericptr_t for this purpose.

NetHack also does without the macro NULL (it is not a keyword), which ANSI C defines as a null pointer constant: a pointer value that cannot be dereferenced and that compares equal to zero. The pre-ANSI equivalent is a constant zero value cast to a pointer type: (char *)0, (genericptr_t)0 etc.
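A short sketch of both points, assuming an ANSI C compiler. The typedef generic_ptr is a hypothetical stand-in for a type like genericptr_t, and the function names are invented for this example.

```c
#include <stddef.h>

/* generic_ptr is a hypothetical stand-in for a type like genericptr_t. */
typedef void *generic_ptr;

/* Store an object pointer in a generic pointer, then recover it. */
static int
roundtrip(int *ip)
{
    generic_ptr gp = ip; /* no cast needed going to void * */
    int *back = gp;      /* nor coming back, in C (C++ requires a cast) */

    return back == ip;
}

/* Compare the pre-ANSI null pointer spelling with the NULL macro. */
static int
null_forms_agree(void)
{
    char *old_style = (char *) 0; /* constant zero cast to a pointer type */
    char *ansi_style = NULL;      /* macro from <stddef.h> */

    return old_style == ansi_style;
}
```

Both functions return 1 on a conforming compiler: the pointer survives the trip through void * unchanged, and the two null pointer spellings compare equal.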

Enumerated types

NetHack uses lists of symbolic constants to identify members of many sets: object and monster classes are probably the most frequent examples. Current practice in ANSI C is often to use enumerations instead. In either case, each element of the list is distinguished by a unique numeric value, but enumerations have a distinct advantage for developers: the debugger shows the program symbol for the value, i.e. a human-readable word, when stepping through the program. Symbolic constants, on the other hand, are preprocessor macros, and appear in the compiled code only as numbers.
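The two approaches can be compared side by side in a small sketch. The class names here are hypothetical and are not NetHack's actual lists.

```c
/* Hypothetical class names for illustration; not NetHack's actual lists. */
#define SKETCH_WEAPON 1 /* macro: replaced by 1 before compilation */
#define SKETCH_ARMOR 2

enum sketch_class {
    SK_WEAPON = 1, /* enumerator: typically kept in debug information */
    SK_ARMOR,      /* 2, numbered automatically */
    SK_RING        /* 3 */
};

/* Map an enumerator to a printable name. */
static const char *
class_name(enum sketch_class c)
{
    switch (c) {
    case SK_WEAPON:
        return "weapon";
    case SK_ARMOR:
        return "armor";
    case SK_RING:
        return "ring";
    }
    return "unknown";
}
```

The numeric values are the same either way (SK_ARMOR and SKETCH_ARMOR are both 2), so the choice mainly affects what the compiler and debugger can tell you.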

Function declarations in ANSI C

Prototypes

An old style declaration gives only the return type:

char *xname();

while a prototype gives the types of the parameters:

char *xname(struct obj *optr);

The name of the parameter, optr in the example above, is optional, and NetHack usually omits it.

The NDECL, FDECL and VDECL macros create either prototypes or old-style declarations. NDECL is used if there are no parameters, FDECL for a fixed parameter list of at least one parameter, and VDECL for a variable parameter list. Separate macros are used to support an overlaid build on MS-DOS (which is not officially supported, and barely practical today).
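A simplified sketch of how such macros might be written. SKETCH_PROTOTYPES, SK_NDECL, SK_FDECL, answer and add are hypothetical names for this example; the definitions in tradstdc.h, and the condition that selects between them, may differ.

```c
/* SKETCH_PROTOTYPES is a hypothetical switch; NetHack's own test differs. */
#define SKETCH_PROTOTYPES

#ifdef SKETCH_PROTOTYPES
#define SK_NDECL(f) f(void) /* no parameters */
#define SK_FDECL(f, p) f p  /* fixed parameter list */
#else
#define SK_NDECL(f) f()     /* old style: return type only */
#define SK_FDECL(f, p) f()
#endif

/* The parameter list is wrapped in its own parentheses so that it
 * reaches the macro as a single argument. */
static int SK_NDECL(answer);
static int SK_FDECL(add, (int, int));

static int
answer(void)
{
    return 42;
}

static int
add(int a, int b)
{
    return a + b;
}
```

With the switch defined, the two declarations expand to static int answer(void); and static int add (int, int);. Without it, both would expand to old-style declarations with empty parentheses.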

Default promotions

With the functions defined in the old style, integer types smaller than int are promoted to int or unsigned, preserving their signedness. If a prototype is present, on most compilers the parameters must match the promoted types. A few older compilers will match a prototype to an unpromoted parameter type.

global.h defines seven types to be used in prototypes where a parameter has a type that is subject to default promotion. The types are CHAR_P, SCHAR_P, UCHAR_P, XCHAR_P, SHORT_P, BOOLEAN_P and ALIGNTYP_P, and they correspond to char, schar, uchar, xchar, short, boolean and aligntyp. The X11 interface defines a DIMENSION_P type, corresponding to Dimension. (Not all of those types are ANSI C keywords: some are defined in library or other headers.)
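The idea behind these types can be sketched as below. SK_CHAR_P and both functions are hypothetical names for this example. NetHack pairs such types with old-style definitions; this sketch uses a prototype-style definition instead so that it compiles on current compilers, so treat it as an illustration of the widening only.

```c
/* Hypothetical stand-in for a widened parameter type such as CHAR_P. */
#define SK_CHAR_P int

/* The prototype names the promoted type... */
static int same_letter(SK_CHAR_P, SK_CHAR_P);

/* ...and the definition narrows each argument back to the real type. */
static int
same_letter(SK_CHAR_P a_arg, SK_CHAR_P b_arg)
{
    char a = (char) a_arg;
    char b = (char) b_arg;

    return a == b;
}

/* Integer promotion itself: a char operand is widened to int in arithmetic. */
static int
promoted_size_matches_int(void)
{
    char c = 'x';

    return sizeof(c + 0) == sizeof(int);
}
```

A call such as same_letter('a', 'b') passes int values either way, which is why a prototype written with the promoted type agrees with what an old-style caller would have pushed.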

Variable parameter lists

The first C compilers used barely-portable hacks to support variadic functions such as pline in pline.c. The symbol USE_OLDARGS enables these in NetHack. Here is an abridgement of pline from NetHack 2.3e:

pline(line,arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9)
char *line,*arg1,*arg2,*arg3,*arg4,*arg5,*arg6,*arg7,*arg8,*arg9;
{
    char pbuf[BUFSZ];
    sprintf(pbuf,line,arg1,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9);
    /* do stuff with pbuf */
}

Later pre-ANSI compilers provided a header, varargs.h, to support variadic functions. USE_VARARGS enables this system in NetHack, and it looks like this:

#include <varargs.h>
void
pline(va_alist)
va_dcl /* no semicolon */
{
    char pbuf[BUFSZ];
    va_list the_args;
    char *format;

    va_start(the_args);
    format = va_arg(the_args, char *);
    vsprintf(pbuf, format, the_args);
    va_end(the_args);
    /* do stuff with pbuf */
}

The macro va_arg extracts an argument from the list. NetHack calls it via the macro VA_INIT, defined in tradstdc.h, if varargs.h is in use (but not if stdarg.h is in use). The fixed arguments could also have been named explicitly in the function header.

This usage could not be made compatible with prototypes, and so ANSI C uses a different system. USE_STDARG enables it in NetHack:

#include <stdarg.h>
void
pline(const char *format, ...)
{
    char pbuf[BUFSZ];
    va_list the_args;

    va_start(the_args, format); /* use the last parameter before the ... */
    vsprintf(pbuf, format, the_args);
    va_end(the_args);
    /* do stuff with pbuf */
}
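
As a further sketch of the stdarg.h system, the two functions below are hypothetical and are not taken from NetHack. The first formats into a caller-supplied buffer with vsnprintf, which is a C99 function and so assumes a C99 or later library; the second walks the variable arguments directly with va_arg.

```c
#include <stdarg.h>
#include <stdio.h>

#define SK_BUFSZ 256 /* hypothetical buffer size for this sketch */

/* Format a message into buf (at most size bytes) and return buf. */
static char *
format_message(char *buf, size_t size, const char *format, ...)
{
    va_list the_args;

    va_start(the_args, format);
    vsnprintf(buf, size, format, the_args); /* C99: output is bounded by size */
    va_end(the_args);
    return buf;
}

/* Walk the variable arguments directly: add up count int values. */
static int
sum_ints(int count, ...)
{
    va_list the_args;
    int total = 0;

    va_start(the_args, count);
    while (count-- > 0)
        total += va_arg(the_args, int);
    va_end(the_args);
    return total;
}
```

A caller might declare char buf[SK_BUFSZ]; and then call format_message(buf, sizeof buf, "You hit the %s.", "newt"). Unlike vsprintf, vsnprintf truncates the output instead of writing past the end of the buffer.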

The source code for the variadic functions in NetHack is ugly, to say the least.

Note for the ambitious newbie

If you are considering a project of your own, be it a roguelike game or some other application, consider using a more modern language than C. There are many platform-independent, high-level alternatives, such as Java, Python or Perl, to mention a few. Such an application would be far easier to debug and maintain than its counterpart written in C. If you feel an urge to squeeze a bit more power out of the machine for your advanced ANSI graphics and its pixel-shading algorithms, at least consider using C++. If you decide, despite every sane thought, that C is the language you want to use, use a recent version of the standard, C99 or later. The later versions contain many corrections and improvements and will cause you less trouble.

If you choose to develop your code in C or C++, and are using something like gcc to compile, use the options -Wall -ansi -pedantic. To target the 1999 version of the standard (ISO C99) instead, replace -ansi with -std=c99.
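To check which standard a given set of options actually selects, a program can inspect the predefined macros __STDC__ and __STDC_VERSION__. The function below is a hypothetical helper written for this example.

```c
/* Report which C standard the compiler claims to implement. */
static const char *
c_standard_name(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return "C99 or later";
#elif defined(__STDC__)
    return "C89/C90"; /* includes the 1995 amendment, if any */
#else
    return "pre-ANSI C";
#endif
}
```

The result depends on the compiler and its options, so the same source may report different standards under -ansi and -std=c99.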

Vanilla NetHack continues to use C because of inertia (a 150,000+ line program is non-trivial to translate) and because of its stated goal: to get the game working on as many different types of hardware and under as many different operating systems as is practical.