Coding Conventions


User and contributor documentation

This page summarises the coding conventions used in CoCoALib. This document is useful primarily to contributors, but some users may find it handy too. As the name suggests, these are merely guidelines; they are not hard and fast rules. Nevertheless, you should violate these guidelines only if you have genuinely good cause. We would also be happy to receive notification about parts of CoCoALib which do not adhere to the guidelines.

We expect these guidelines to evolve slowly with time as experience grows.

Before presenting the guidelines I mention some useful books. The first is practically a sine qua non for the C++ library: The C++ Standard Library by Josuttis which contains essential documentation for the C++ library. Unless you already have quite a lot of experience in C++, you should read the excellent books by Scott Meyers: Effective C++ (the new version), and Effective STL. Another book offering useful guidance is C++ Coding Standards by Alexandrescu and Sutter; it is a good starting point for setting coding standards.

Names of CoCoA types, functions, variables

All code and "global" variables must be inside the namespace CoCoA (or in an anonymous namespace); the only exception is code which is not regarded as an integral part of CoCoA (e.g. the C++ interface to the GMP big integer package).

There are numerous conventions for how to name classes/types, functions, variables, and other identifiers appearing in a large package. It is important to establish a convention and apply it rigorously (plus some common sense); doing so will facilitate maintenance, development and use of the code. (The first three rules follow the convention implicit in NTL.)

single word names are all lower-case (e.g. ring);
multiple word names have the first letter of each word capitalized, and the words are juxtaposed rather than separated by underscores (e.g. PolyRing);
acronyms should be all upper-case (e.g. PPM);
names of functions returning a boolean start with Is (Are if argument is a list/vector);
names of functions returning a bool3 start with Is and end with 3 (Are if argument is a list/vector);
variables of type pointer (or functions returning a pointer) have names ending with Ptr;
data members' names start with my (or Iam/Ihave if they are boolean);
a class static member has a name beginning with our;
enums are called BlahMarker if they have a single value (e.g. BigInt::CopyFromMPZMarker) and BlahFlag if they have more;
abbreviations should be used consistently (see below);
Abstract base classes and derived abstract classes normally have names ending in Base; in contrast, a derived concrete class normally has a name ending in Impl. Constructors for abstract classes should probably be protected rather than public.
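The naming rules above can be illustrated with a small self-contained sketch. Every name in it (ShapeBase, SquareImpl, IsUnitSquare, NumCreated and so on) is hypothetical and chosen only to show the conventions; none of them is part of CoCoALib.

```cpp
#include <cstddef>

namespace CoCoA   // hypothetical names; only the naming pattern matters
{

  // Abstract base class: name ends in Base, ctor is protected.
  class ShapeBase
  {
  public:
    virtual ~ShapeBase() {}
    virtual double area() const = 0;                       // single word: lower-case
    static std::size_t NumCreated() { return ourCount; }   // multiple words: capitalized, juxtaposed
  protected:
    ShapeBase() { ++ourCount; }
  private:
    static std::size_t ourCount;   // class static member: name begins with "our"
  };
  std::size_t ShapeBase::ourCount = 0;

  // Concrete derived class: name ends in Impl.
  class SquareImpl: public ShapeBase
  {
  public:
    explicit SquareImpl(double side): mySide(side), IamUnit(side == 1.0) {}
    double area() const { return mySide * mySide; }
    bool IsUnit() const { return IamUnit; }
  private:
    double mySide;   // data member: name starts with "my"
    bool IamUnit;    // boolean data member: name starts with "Iam"
  };

  // Function returning a boolean: name starts with "Is".
  inline bool IsUnitSquare(const SquareImpl& S) { return S.IsUnit(); }

} // end of namespace CoCoA
```

A pointer variable would then be declared as, for instance, const CoCoA::ShapeBase* ShapePtr = &S; so that its name ends with Ptr.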

It is best to choose a name for your function which differs from the names of functions in the C++ standard library; otherwise it can become necessary to use fully qualified names (e.g. std::set and CoCoA::set), which is terribly tedious. (Personally, I think this is a C++ design fault.)

If you are overloading a C++ operator then write the keyword operator attached to the operator symbol (with no intervening space). See ring.H for some examples.
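As a minimal sketch of this spelling, here is a hypothetical class point (not a CoCoALib type) with two overloaded operators; note that operator+ and operator== are written without a space between the keyword and the symbol.

```cpp
namespace CoCoA   // hypothetical example type, not part of CoCoALib
{

  class point
  {
  public:
    point(long x, long y): myX(x), myY(y) {}
    long x() const { return myX; }
    long y() const { return myY; }
  private:
    long myX;
    long myY;
  };

  // Write "operator+" and "operator==", not "operator +" or "operator ==".
  inline point operator+(const point& P, const point& Q)
  {
    return point(P.x() + Q.x(), P.y() + Q.y());
  }

  inline bool operator==(const point& P, const point& Q)
  {
    return P.x() == Q.x() && P.y() == Q.y();
  }

} // end of namespace CoCoA
```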

Order in function arguments

When a function has more than one argument we follow the first applicable of the following rules:

1. the non-const references are the first args, e.g.
- myAdd(a,b,c) as in a=b+c,
- IsIndetPosPower(long& index, BigInt& exp, pp)
2. the ring/PPMonoid is the first arg, e.g.
- ideal(ring, vector<RingElem>)
3. the main actor is the first arg and the with respect to args follow, e.g.
- deriv(f, x)
4. optional args go last, e.g.
- NewPolyRing(CoeffRing, NumIndets),
- NewPolyRing(CoeffRing, NumIndets, ordering)
5. the arguments follow the order of the common use or sentence, e.g.
- div(a,b) for a/b,
- IndetPower(P, long i, long/BigInt exp) for x[i]^exp,
- IsDivisible(a,b) for a is divisible by b,
- IsContained(a,b) for a is contained in b
6. strongly related functions behave as if they were overloading (--> optional args go last) (??? is this ever used apart from NewMatrixOrdering(long NumIndets, long GradingDim, ConstMatrixView OrderMatrix) ???)
7. the more structured objects go first, e.g. ... (??? is this ever used ???)
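The following sketch applies some of the rules above to plain long and std::vector arguments. The functions AssignSum, EvalPoly and ScaledSum are hypothetical, and this IsDivisible is only a stand-in working on long values; the point is the argument order, not the functions themselves.

```cpp
#include <cstddef>
#include <vector>

namespace CoCoA   // hypothetical functions illustrating argument order only
{

  // Non-const reference first: the call reads like  lhs = x + y.
  inline void AssignSum(long& lhs, long x, long y) { lhs = x + y; }

  // Main actor first, "with respect to" arg second: evaluate the polynomial
  // with coefficients coeffs (constant term first) at the point x.
  inline long EvalPoly(const std::vector<long>& coeffs, long x)
  {
    long value = 0;
    for (std::size_t i = coeffs.size(); i > 0; --i)
      value = value * x + coeffs[i-1];   // Horner's rule
    return value;
  }

  // Optional arg goes last.
  inline long ScaledSum(long a, long b, long scale = 1) { return scale * (a + b); }

  // Order of the sentence: "a is divisible by b".
  inline bool IsDivisible(long a, long b) { return b != 0 && a % b == 0; }

} // end of namespace CoCoA
```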

IMPORTANT: we are trying to define a small set of good rules which is easy to apply and, above all, respects common sense. If you meet a function in CoCoALib which does not follow these rules, let us know: we will fix it, or fix the rules, or call it an interesting exception ;-)

Explanation notes, exceptions, and more examples

We do not think there is any function for which rules 1 and 2 collide
The main actor is the object which is worked on to produce the return value (usually of the same type); the with respect to args are strictly constant, e.g.
- deriv(f, x)
- NF(poly, ideal)
Rule 1 takes precedence over rule 4, e.g.
- IsIndetPosPower(index, exp, pp) and IsIndetPosPower(pp)
Rule 2 takes precedence over rule 4, e.g.
- ideal(gens) and ideal(ring, gens)
we should probably change:
- NewMatrixOrdering(NumIndets, GradingDim, M) into NewMatrixOrdering(M, GradingDim)

Abbreviations

The overall idea is that a given concept in a class or function name always has the same name: either always the full name, or always the same abbreviation. Moreover, a given abbreviation should have a unique meaning.

Here is a list of common abbreviations:

col -- column
ctor -- constructor
deg -- degree (exceptions: degree in class names)
div -- divide
dim -- dimension
elem -- element
mat -- matrix (exceptions: matrix in class names)
mul -- multiply
pos -- positive (or should it be positive? what about IsPositive(BigInt N)?)

Here is a list of names that are written in full

assign
one -- not 1
zero -- not 0

Contributor documentation

Guidelines from Alexandrescu and Sutter

Here I paraphrase some of the suggestions from their book, picking out the ones I think are less obvious and are most likely to be relevant to CoCoALib.

Write correct, clean and simple code at first; optimize later.
Keep track of object ownership; document any "unusual" behaviour.
Keep implementation details hidden (e.g. make data members private)
Use const as much as you reasonably can.
Use prefix ++ and -- (unless you specifically do want the postfix behaviour)
Each class should have a single clearly defined purpose; keep it simple!
Guideline: member fns should be either virtual or public not both.
Exception cleanliness: dtors, deallocate and swap should never throw.
Use explicit to avoid making unintentional "implicit type conversions"
Avoid using commands in header files.
Use CoCoA_ERROR for sanity checks on args to public fns, and CoCoA_ASSERT for internal fns.
Use std::vector unless some other container is obviously better.
Avoid casting; if you must, use a C++ style cast (e.g. static_cast)
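A few of the guidelines above (explicit ctors, const, prefix ++, private data, std::vector, C++ style casts) can be seen together in the following sketch. The class counter and the function average are hypothetical and serve only as illustration.

```cpp
#include <vector>

namespace CoCoA   // hypothetical class and function, for illustration only
{

  class counter
  {
  public:
    explicit counter(long start): myValue(start) {}      // explicit: no silent long -> counter conversion
    long value() const { return myValue; }               // const: does not modify the object
    counter& operator++() { ++myValue; return *this; }   // prefix increment
  private:
    long myValue;   // implementation detail kept private
  };

  // std::vector as the default container, passed by const reference as it is only read.
  // Returns 0 for an empty vector.
  inline double average(const std::vector<long>& v)
  {
    if (v.empty()) return 0.0;
    long sum = 0;
    for (std::vector<long>::const_iterator it = v.begin(); it != v.end(); ++it)
      sum += *it;
    return static_cast<double>(sum) / static_cast<double>(v.size());   // C++ style cast
  }

} // end of namespace CoCoA
```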

Use of #define

Excluding the read once trick for header files, #define should be avoided (even in experimental code). C++ is rich enough that normally there is a cleaner alternative to a #define: for instance, inline functions, a static const object, or a typedef -- in any case, one should avoid polluting the global namespace.

If you must define a preprocessor symbol, its name should begin with the prefix CoCoA_, and the remaining letters should all be capital.
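The three alternatives mentioned above might look as follows. The names MaxIndets, square and IndetIndex are hypothetical, as are the macros shown in the comments.

```cpp
#include <cstddef>

namespace CoCoA   // hypothetical names; each replaces a macro shown in the comment
{

  // Instead of:  #define MAX_INDETS 100
  static const std::size_t MaxIndets = 100;

  // Instead of:  #define SQUARE(x) ((x)*(x))
  // The inline function is type-checked and evaluates its argument exactly once.
  inline long square(long x) { return x * x; }

  // Instead of:  #define INDET_INDEX long
  typedef long IndetIndex;

} // end of namespace CoCoA
```

Unlike the macros, all three names live inside the namespace and so do not pollute the global namespace.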

Header Files

The read once trick uses preprocessor symbols starting with CoCoA_ and then finishing with the file name (retaining the capitalization of the file name but with slashes replaced by underscores). The include path passed to the compiler specifies the directory above the one containing the CoCoALib header files, so to include one of the CoCoALib header files you must prefix CoCoA/ to the name of the file -- this avoids problems of ambiguity which could arise if two includable files have the same name. This idea was inspired by NTL.
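For a hypothetical header file called MyTools.H the read once trick would look like the sketch below. Both file names and the function DoubleOf are invented for the example; the two files are shown in a single block, so the #include line of the implementation file appears only as a comment.

```cpp
// ---- hypothetical header file: CoCoA/MyTools.H ----
#ifndef CoCoA_MyTools_H
#define CoCoA_MyTools_H

namespace CoCoA
{
  long DoubleOf(long n);   // declaration only
}

#endif

// ---- hypothetical implementation file: MyTools.C ----
// #include "CoCoA/MyTools.H"     <-- note the CoCoA/ prefix

namespace CoCoA
{
  long DoubleOf(long n) { return 2 * n; }
}
```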

Include only the header files you really need -- this is trickier to determine than you might imagine. The reasons for minimising includes are two-fold: to speed compilation, and to indicate to the reader which external concepts you genuinely need. In header files it often suffices simply to write a forward declaration of a class instead of including the header file defining that class. In implementation files the definition you want may already be included indirectly; in such cases it is enough to write a comment about the indirectly included definitions you will be using.

In header files I add a commented-out using command immediately after including a system header to say which symbols are actually used in the header file. In the implementation file I write a using command for each system symbol used in the file; these commands appear right after the #include directive which imported the symbol.
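A sketch of this practice, with a hypothetical function JoinNames, might be as follows; the header part and the implementation part are shown one after the other in a single block.

```cpp
// ---- in the (hypothetical) header file ----
#include <string>
// using std::string;
#include <vector>
// using std::vector;

namespace CoCoA
{
  // Names are fully qualified in the header.
  std::string JoinNames(const std::vector<std::string>& names);
}

// ---- in the (hypothetical) implementation file ----
#include <string>
using std::string;
#include <vector>
using std::vector;

namespace CoCoA
{
  // Concatenates the names, separated by commas.
  string JoinNames(const vector<string>& names)
  {
    string result;
    for (vector<string>::size_type i = 0; i < names.size(); ++i)
    {
      if (i > 0) result += ",";
      result += names[i];
    }
    return result;
  }
}
```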

Curly brackets and indentation

Sutter claims curly bracket positioning doesn't matter: he's wrong! Matching curly brackets should be either vertically or horizontally aligned. Indentation should be small (e.g. two positions for each level of nesting); have a look at code already in CoCoALib to see the preferred style. Avoid using tabs for indentation as these do not have a universal interpretation.

The else keyword indents the same as its matching if.
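The layout below is one possibility which satisfies the rules just stated (matching brackets aligned, two positions per nesting level, no tabs, else aligned with its if); it is an assumption about the preferred style, so do check it against code already in CoCoALib. The functions SignOf and AbsOf are hypothetical.

```cpp
namespace CoCoA   // hypothetical functions, shown only for their layout
{

  long SignOf(long n)
  {
    if (n > 0)
    {
      return 1;
    }
    else if (n < 0)   // "else" indents the same as its matching "if"
    {
      return -1;
    }
    else
    {
      return 0;
    }
  }

  // Matching brackets may also be horizontally aligned (i.e. on the same line).
  long AbsOf(long n) { return (n < 0) ? -n : n; }

} // end of namespace CoCoA
```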

Inline Functions

Use inline sparingly. inline is useful in two circumstances: for a short function which is called very many times (at least several million), or for an extremely short function (e.g. a field accessor in a class). The first case may make the program faster; the second may make it shorter. You can use a profiler (e.g. gprof) to count how often a function is called.

There are two potential disadvantages to inline functions: they may force implementation details to be publicly visible, and they may cause code bloat.
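As a small sketch of the second circumstance, here is a hypothetical class interval whose field accessors are inline while a longer member function is not.

```cpp
namespace CoCoA   // hypothetical class, for illustration only
{

  class interval
  {
  public:
    interval(long lo, long hi): myLower(lo), myUpper(hi) {}
    long lower() const;                // field accessors: extremely short, so inline
    long upper() const;
    long NumMultiples(long k) const;   // longer function: not inline
  private:
    long myLower;
    long myUpper;
  };

  inline long interval::lower() const { return myLower; }
  inline long interval::upper() const { return myUpper; }

  // Counts the multiples of k lying in the interval; assumes k is non-zero.
  long interval::NumMultiples(long k) const
  {
    long count = 0;
    for (long n = myLower; n <= myUpper; ++n)
      if (n % k == 0) ++count;
    return count;
  }

} // end of namespace CoCoA
```

Note that the inline accessors oblige the header file to reveal how the data members are used, which is the first disadvantage mentioned above.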

Exception Safety

Exception Safety is an expression invented/promulgated by Sutter to mean that a procedure behaves well when an exception is thrown during its execution. All the main functions and procedures in CoCoALib should be fully exception safe: either they complete their computations and return normally, or they leave all arguments essentially unchanged, and return exceptionally. A more relaxed approach is acceptable for functions/procedures which a normal library user would not call directly (e.g. non-public member functions): it suffices that no memory is leaked (or other resources lost). Code which is not fully exception-safe should be clearly marked as such.

Consult one of Sutter's (irritating) books for more details.
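One common way to obtain full exception safety is to do all the work on a copy and then swap the result into place. The sketch below uses a hypothetical function AppendNonNegative; it is an illustration of the idea and not code taken from CoCoALib.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace CoCoA   // hypothetical function, for illustration only
{

  // Appends the entries of src to dest.  If any entry of src is negative,
  // throws std::invalid_argument and leaves dest unchanged.
  void AppendNonNegative(std::vector<long>& dest, const std::vector<long>& src)
  {
    std::vector<long> tmp(dest);   // work on a copy
    for (std::size_t i = 0; i < src.size(); ++i)
    {
      if (src[i] < 0)
        throw std::invalid_argument("AppendNonNegative: negative entry");
      tmp.push_back(src[i]);
    }
    dest.swap(tmp);   // commit the result: swap does not throw
  }

} // end of namespace CoCoA
```

The price is an extra copy of dest; where that is too costly, a weaker guarantee may be preferable provided it is clearly documented.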

Dumb/Raw Pointers

If you're using dumb/raw pointers, improve your design!

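Dumb/raw pointers should be used only as a last resort; prefer C++ references or std::auto_ptr<...> if you can (with C++11 or later use std::unique_ptr<...> instead, since std::auto_ptr was deprecated in C++11 and removed in C++17). Note that it is especially hard to write exception safe code which contains dumb/raw pointers.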
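The sketch below shows why an owning smart pointer helps with exception safety. It uses std::unique_ptr (which needs C++11 or later) in place of the std::auto_ptr mentioned in the text; the struct buffer and the function CheckedSize are hypothetical.

```cpp
#include <memory>
#include <stdexcept>

namespace CoCoA   // hypothetical names, for illustration only
{

  // A resource which counts how many instances are currently alive.
  struct buffer
  {
    explicit buffer(long n): mySize(n) { ++ourLiveCount; }
    ~buffer() { --ourLiveCount; }
    long mySize;
    static long ourLiveCount;
  };
  long buffer::ourLiveCount = 0;

  // The smart pointer releases the buffer on every exit path, including the
  // exceptional one; with a raw pointer the throw below would leak the buffer.
  long CheckedSize(long n)
  {
    std::unique_ptr<buffer> BufPtr(new buffer(n));
    if (n < 0)
      throw std::invalid_argument("CheckedSize: negative size");
    return BufPtr->mySize;
  }

} // end of namespace CoCoA
```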

Preprocessor Symbols for Controlling Debugging

During development it will be useful to have functions perform sanity checks on their arguments. For general use, these checks could readily produce a significant performance hit.

Compilation without setting any preprocessor variables should produce fast code (i.e. without non-vital checks). Instead there is a preprocessor symbol (CoCoA_DEBUG) which can be set to turn on extra sanity checks. Currently if CoCoA_DEBUG has value zero, all non-vital checks are disabled; any non-zero value enables all additional checks.

There is a macro CoCoA_ASSERT(...) which will check that its argument yields true when CoCoA_DEBUG is set; if CoCoA_DEBUG is not set it does nothing (not even evaluating its argument). This macro is useful for conducting extra sanity checks during debugging; it should not be used for checks that must always be performed (e.g. in the final optimized compilation).
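To make the behaviour concrete, here is a sketch of how a macro of this kind could be written. CoCoA_DEMO_DEBUG and CoCoA_DEMO_ASSERT are hypothetical stand-ins: this is not the actual definition of CoCoA_ASSERT, and the exception type thrown is an assumption made for the example.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for CoCoA_DEBUG and CoCoA_ASSERT.
#if defined(CoCoA_DEMO_DEBUG) && CoCoA_DEMO_DEBUG != 0
  #define CoCoA_DEMO_ASSERT(cond) \
    do { if (!(cond)) throw std::logic_error("assertion failed: " #cond); } while (false)
#else
  // Checks disabled: the argument is not even evaluated.
  #define CoCoA_DEMO_ASSERT(cond) static_cast<void>(0)
#endif

namespace CoCoA
{

  // Returns how many times the asserted condition was evaluated:
  // 1 when the checks are enabled, 0 when they are disabled.
  inline int NumEvaluations()
  {
    int count = 0;
    CoCoA_DEMO_ASSERT(++count > 0);
    return count;
  }

} // end of namespace CoCoA
```

Since the argument is not evaluated when the checks are disabled, it must never contain side effects which the surrounding code relies upon.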

There is currently no official preprocessor symbol for (de)activating the gathering of statistics.

NB I wish to avoid having a plethora of symbols for switching debugging on and off in different sections of the code, though I do accept that we may need more than just one or two symbols.

Errors and Exceptions

During development

Conditions we want to verify solely during development (i.e. when compiling with -DCoCoA_DEBUG) can be checked simply by using the macro CoCoA_ASSERT with argument the condition. Should the condition be false, a CoCoA::ErrorInfo object is thrown -- this will cause an abort if not caught. The error message indicates the file and line number of the failing assertion. If the compilation option -DCoCoA_DEBUG is not enabled then the macro does nothing whatsoever. An example of its use is:

    CoCoA_ASSERT(0 <= index && index < length);

Always

A different mechanism is to be used for conditions which must be checked even after development is completed.

What should happen when one tries to divide by zero? Or even asks for an exact division between elements that do not have an exact quotient (in the given ring)?

Answer: call the macro CoCoA_ERROR(err_type, location) where err_type should be one of the error codes listed in error.H and location is a string saying where the error was detected (e.g. the name of the function which discovered it). Here is an example:

    if (trouble)
      CoCoA_ERROR(ERR::DivByZero, "applying partial ring homomorphism");

The macro CoCoA_ERROR never returns: it will throw a CoCoA::ErrorInfo object. See the example programs for the recommended way of catching and handling exceptions so that an informative message can be printed out. See error.txt for advice on debugging when an unexpected CoCoA error is thrown.
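The general pattern of throwing where the problem is detected, and catching where a useful message can be produced, is sketched below without using CoCoALib itself. DemoErrorInfo is a hypothetical stand-in for CoCoA::ErrorInfo, and ExactQuotient and DescribeQuotient are invented for the example; consult the example programs for the real way to handle CoCoA errors.

```cpp
#include <string>

namespace CoCoA   // hypothetical stand-ins; the real ErrorInfo is not reproduced here
{

  struct DemoErrorInfo
  {
    DemoErrorInfo(const std::string& msg, const std::string& where):
        myMessage(msg), myLocation(where) {}
    std::string myMessage;
    std::string myLocation;
  };

  // These checks must always be performed, so they throw instead of asserting.
  inline long ExactQuotient(long a, long b)
  {
    if (b == 0) throw DemoErrorInfo("division by zero", "ExactQuotient");
    if (a % b != 0) throw DemoErrorInfo("inexact division", "ExactQuotient");
    return a / b;
  }

  // Catch at a level where an informative message can be produced.
  inline std::string DescribeQuotient(long a, long b)
  {
    try
    {
      ExactQuotient(a, b);
      return "ok";
    }
    catch (const DemoErrorInfo& err)
    {
      return err.myMessage + " in " + err.myLocation;
    }
  }

} // end of namespace CoCoA
```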

Functions Returning Complex Values

C++ tends to copy the return value of a function; this is undesirable if the value is potentially large and complex. An obvious alternative is to supply as argument a reference into which the result will be placed. If you choose to return the value via a reference argument then make the reference argument the first one.

    myAdd(rawlhs, rawx, rawy);  // stands for:  lhs = x + y
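A self-contained sketch of the same idea, using std::vector<long> in place of a CoCoALib type, is given below. The function AssignSum is hypothetical; it writes the result into its first argument and so reuses any storage which lhs already owns.

```cpp
#include <cstddef>
#include <vector>

namespace CoCoA   // hypothetical function, for illustration only
{

  // Componentwise  lhs = x + y.  Assumes x and y have the same length.
  // Also works when lhs is the same object as x or y.
  inline void AssignSum(std::vector<long>& lhs,
                        const std::vector<long>& x,
                        const std::vector<long>& y)
  {
    const std::size_t n = x.size();
    lhs.resize(n);   // no reallocation if lhs already has sufficient capacity
    for (std::size_t i = 0; i < n; ++i)
      lhs[i] = x[i] + y[i];
  }

} // end of namespace CoCoA
```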

Spacing and Operators

All binary operators should have one space before and one space after the operator name (unless both arguments are particularly short and simple). Unary operators should not be separated from their arguments by any spaces. Avoid spaces between function names and the immediately following bracket.

  expr1 + expr2;
  !expr;
  UsefulFunction(args);