Indigo Cpp Style guide

Iurii Solntsev edited this page Feb 15, 2021 · 6 revisions

1 Overview

General guide for writing C++ code for the Indigo toolkit.

This guide is based on the Google C++ Style Guide and adopts most of its approaches.

2 Header Files

2.1 The #define Guard

All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be __file_name_h__.

To guarantee uniqueness, the symbol should be based on the full name of the header file, with underscores as separators. For example, the file molecule/molecule_cool_component.h should have the following guard:

    #ifndef __molecule_cool_component_h__
    #define __molecule_cool_component_h__
    ...
    #endif  // __molecule_cool_component_h__

2.2 The #pragma once Guard

When creating a new header file, you can use the modern guard instead:

    #pragma once

This guard is supported by the vast majority of C and C++ compilers; however, it is still not part of the language standard.

Its benefit is that it cannot be accidentally broken by the same macro being defined in a different or external header.

It also makes the code shorter.

Both guards can be used together; however, when the #pragma once guard is used, the #define guard is redundant.

2.3 Forward Declarations

Use forward declarations where possible.

A "forward declaration" is a declaration of a class, function, or template without an associated definition. Code that follows a forward declaration compiles as long as it uses the declared item only by pointer or reference; the full definition is required once the item is used by value or its members are accessed.

  • Forward declarations can save compile time, as #includes force the compiler to open more files and process more input.
  • Forward declarations can save on unnecessary recompilation. #includes can force your code to be recompiled more often, due to unrelated changes in the header.

Example:

reaction_auto_loader.h:
    #include "base_cpp/array.h"
    #include "molecule/molecule_stereocenter_options.h"

    namespace indigo {

    class Scanner;
    class BaseReaction;
    class Reaction;
    class QueryReaction;

    class DLLEXPORT ReactionAutoLoader
    {
    public:
       ReactionAutoLoader (Scanner &scanner);
       ReactionAutoLoader (const Array<char> &arr);
       ReactionAutoLoader (const char *);
reaction_auto_loader.cpp:
    #include "reaction/reaction_auto_loader.h"
    ...
    #include "gzip/gzip_scanner.h"
    #include "reaction/reaction.h"
    #include "reaction/query_reaction.h"
    ...
    // use Scanner, Reaction, QueryReaction...
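
The same pattern can be shown in a single self-contained file. This is a minimal sketch: Car and Engine are hypothetical classes, not part of Indigo, and the comments mark where each piece would normally live.

```cpp
// Hypothetical single-file sketch; in a real project each part would live
// in its own file, as marked below.

// --- car.h ---
class Engine;  // Forward declaration: no #include "engine.h" needed here.

class Car {
public:
    explicit Car(Engine& engine) : _engine(engine) {}
    int power() const;  // Declared only; the body needs the full Engine type.

private:
    Engine& _engine;  // References and pointers work with an incomplete type.
};

// --- engine.h ---
class Engine {
public:
    explicit Engine(int horsepower) : _horsepower(horsepower) {}
    int horsepower() const { return _horsepower; }

private:
    int _horsepower;
};

// --- car.cpp (includes both headers) ---
int Car::power() const {
    return _engine.horsepower();
}
```

With this layout, a change in engine.h forces recompilation of car.cpp only, not of every file that includes car.h.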
 

2.4 Inline Functions

Define functions inline only when they are small, say, 10 lines or less.

You can declare functions in a way that allows the compiler to expand them inline rather than calling them through the usual function call mechanism.

Inlining a function can generate more efficient object code, as long as the inlined function is small. Feel free to inline accessors and mutators, and other short, performance-critical functions.
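
A minimal sketch of functions that are good candidates for inlining. The Counter class and its members are hypothetical and serve only as an illustration.

```cpp
// Hypothetical class used only to illustrate inline accessors and mutators.
class Counter {
public:
    // A short accessor defined in the class body is implicitly inline.
    int value() const { return _value; }

    // A short mutator, also implicitly inline.
    void setValue(int value) { _value = value; }

    // Declared here, defined below with an explicit inline specifier.
    void increment();

private:
    int _value = 0;
};

// A member function defined in a header outside the class body needs
// `inline`, otherwise every .cpp file that includes the header would
// emit its own definition and linking would fail.
inline void Counter::increment() {
    ++_value;
}
```

Whether a call is actually expanded in place remains the compiler's decision; the inline keyword mainly permits the definition to appear in several translation units.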

2.5 Names and Order of Includes

Include headers in the following order: the header corresponding to the current .cpp file, C system headers, C++ standard library headers, other libraries' headers, your project's headers.

Include headers as descendants of the project's source directory, using a relative directory prefix with Unix-style slashes wherever possible, for example:

    #include "indigo_internal.h"
    #include "base_cpp/output.h"
    #include "base_cpp/profiling.h"
    #include "base_cpp/temporary_thread_obj.h"
    #include "molecule/molecule_fingerprint.h"

When including headers from other projects or third-party libraries, include them as descendants of the corresponding project's source or include directory and add the corresponding directory to your project's include paths. For example:

    #include <tinyxml.h>
    #include <rapidjson/document.h>
    #include "molecule/molecule_3d_constraints.h"
    #include "molecule/molecule_arom_match.h"

This assumes that the following paths are added to the compiler's include directories (just as an example):

  • "~/Indigo/third_party/tinyxml/include"
  • "~/Indigo/third_party/rapidjson"
  • "~/Indigo"

Use double quotes ("header_name.h") to refer to a header file in your own or a neighboring project.

Use angle brackets (<library.h>) to refer to a header from a third-party library or from the standard set of headers.

2.6 Precompiled Headers

Do not include precompiled headers explicitly, since such code can hardly be reused in client projects without modification. Overriding the precompiled header with the client's own can drastically change the functionality and performance of the original source code.

However, you can speed up compilation with the precompilation features of the compiler in use, without modifying the code.

3 Scoping

3.1 Namespaces

With few exceptions, place all code in the indigo namespace.

    #ifndef __molecule_h__
    #define __molecule_h__

    #include "molecule/base_molecule.h"

    namespace indigo {

    class DLLEXPORT Molecule : public BaseMolecule
    ...

4 Classes

Classes are the fundamental unit of code in Indigo. Every object or function should be part of a class, except for extern "C" exported functions.

4.1 Non-copyable by default

By default, all classes should be non-copyable. This prevents a large number of errors. If you want a type to be copyable, it should be a simple structure (with fields of simple types only), and struct must be used instead of class. To make a class non-copyable, you can inherit from the base class NonCopyable. For example:

    #include "base_cpp/non_copyable.h"

    ...
    class DLLEXPORT Graph : public NonCopyable
    {
    ...
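
The effect of such a base class can be sketched in standard C++11 as follows. This is an illustration only: NonCopyableSketch, GraphSketch and PointSketch are hypothetical names, and the actual NonCopyable class in base_cpp/non_copyable.h may be implemented differently.

```cpp
#include <type_traits>

// Hypothetical stand-in for a non-copyable base class.
class NonCopyableSketch {
public:
    NonCopyableSketch(const NonCopyableSketch&) = delete;
    NonCopyableSketch& operator=(const NonCopyableSketch&) = delete;

protected:
    // Protected so that the base can only be used through derived classes.
    NonCopyableSketch() = default;
    ~NonCopyableSketch() = default;
};

// Deriving from the base deletes the implicit copy operations of the class.
class GraphSketch : public NonCopyableSketch {
public:
    GraphSketch() = default;
    int vertexCount() const { return _vertex_count; }

private:
    int _vertex_count = 0;
};

// A simple data carrier stays a copyable struct.
struct PointSketch {
    float x;
    float y;
};

static_assert(!std::is_copy_constructible<GraphSketch>::value, "class is not copyable");
static_assert(!std::is_copy_assignable<GraphSketch>::value, "class is not assignable");
static_assert(std::is_copy_constructible<PointSketch>::value, "struct is copyable");
```

An accidental copy such as `GraphSketch copy = graph;` then fails at compile time instead of silently duplicating the object.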

4.2 Delegating and Inheriting Constructors

Use delegating and inheriting constructors when they reduce code duplication.

Delegating and inheriting constructors are two different features, both introduced in C++11, for reducing code duplication in constructors. Delegating constructors allow one of a class's constructors to forward work to one of the class's other constructors, using a special variant of the initialization list syntax. For example:

    X::X(const string& name) : name_(name) {
       ...
    }

    X::X() : X("") {}

Use delegating and inheriting constructors when they reduce boilerplate and improve readability. Be cautious about inheriting constructors when your derived class has new member variables. Inheriting constructors may still be appropriate in that case if you can use in-class member initialization for the derived class' member variables.
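
Inheriting constructors combined with in-class member initialization can be sketched like this. BaseItem and CountedItem are hypothetical classes chosen for the illustration.

```cpp
#include <string>

class BaseItem {
public:
    explicit BaseItem(const std::string& name) : _name(name) {}
    BaseItem() : BaseItem("unnamed") {}  // Delegating constructor.
    virtual ~BaseItem() = default;

    const std::string& name() const { return _name; }

private:
    std::string _name;
};

class CountedItem : public BaseItem {
public:
    // Inheriting constructors: CountedItem(const std::string&) becomes
    // available without being written out again.
    using BaseItem::BaseItem;

    int count() const { return _count; }

private:
    // The in-class initializer keeps the new member well defined even
    // though the inherited constructor knows nothing about it.
    int _count = 0;
};
```

Without the in-class initializer, _count would be left uninitialized by the inherited constructor, which is the risk the paragraph above warns about.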

4.3 Structs vs. Classes

Use a struct only for passive objects that carry data; everything else is a class.

4.4 Inheritance

Composition is often more appropriate than inheritance. When using inheritance, make it public.

When a sub-class inherits from a base class, it includes the definitions of all the data and operations that the parent base class defines. In practice, inheritance is used in two major ways in C++: implementation inheritance, in which actual code is inherited by the child, and interface inheritance, in which only method names are inherited.

All inheritance should be public. If you want to do private inheritance, you should be including an instance of the base class as a member instead.

Make your destructor virtual if necessary. If your class has virtual methods, its destructor should be virtual.
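
A minimal sketch of public interface inheritance with a virtual destructor, next to composition as the alternative to private inheritance. Shape, Square and Canvas are hypothetical classes.

```cpp
#include <memory>

// Interface: has virtual methods, so its destructor is virtual too.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// Public inheritance: a Square can be used wherever a Shape is expected.
class Square : public Shape {
public:
    explicit Square(double side) : _side(side) {}
    double area() const override { return _side * _side; }

private:
    double _side;
};

// Composition instead of private inheritance: Canvas holds a Square as a
// member and exposes only the operations it needs.
class Canvas {
public:
    explicit Canvas(double side) : _background(side) {}
    double backgroundArea() const { return _background.area(); }

private:
    Square _background;
};
```

Because the destructor of Shape is virtual, deleting a Square through a Shape pointer (for example one owned by std::unique_ptr<Shape>) runs the destructor of Square as well.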

4.5 Operator Overloading

Overload operators judiciously. Do not create user-defined literals.

Define overloaded operators only if their meaning is obvious, unsurprising, and consistent with the corresponding built-in operators. For example, use | as a bitwise- or logical-or, not as a shell-style pipe.
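
As an illustration, the following sketch overloads | and == so that they keep the meaning of the built-in operators. BitMask is a hypothetical class, not an Indigo type.

```cpp
// Hypothetical wrapper around a set of bit flags.
class BitMask {
public:
    explicit BitMask(unsigned bits) : _bits(bits) {}

    // Same meaning as the built-in operator: bitwise OR of both operands.
    BitMask operator|(const BitMask& other) const {
        return BitMask(_bits | other._bits);
    }

    // Same meaning as the built-in operator: value equality.
    bool operator==(const BitMask& other) const {
        return _bits == other._bits;
    }

    unsigned bits() const { return _bits; }

private:
    unsigned _bits;
};
```

A reader who sees `a | b` for two masks gets exactly what the built-in operator would suggest, which is the property the rule asks for.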

5 Functions

5.1 Parameter Ordering

When defining a function, parameter order is: inputs, then outputs.

Prefer to pass non-trivial input parameters by const reference rather than by value. It produces more efficient code.
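
A short sketch of this ordering, with inputs first and the output last. LineSplitter and its split function are hypothetical and use standard containers rather than Indigo's own classes.

```cpp
#include <string>
#include <vector>

class LineSplitter {
public:
    // Inputs first: `text` is non-trivial and passed by const reference,
    // `delimiter` is trivial and passed by value. The output comes last.
    static void split(const std::string& text, char delimiter, std::vector<std::string>& parts) {
        parts.clear();
        std::string current;
        for (char c : text) {
            if (c == delimiter) {
                parts.push_back(current);
                current.clear();
            } else {
                current += c;
            }
        }
        parts.push_back(current);  // The last part has no trailing delimiter.
    }
};
```

Passing `text` by const reference avoids copying the whole string on every call, while the caller can still see from the signature that it is not modified.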

5.2 Write Short Functions

Prefer small and focused functions.

We recognize that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.

6 Indigo specific components

6.1 DECL_ERROR and IMPL_ERROR macros

DECL_ERROR is used to define an exception for a user class. IMPL_ERROR is the second part, which should be placed in the .cpp file.

my_class.h:
    class MyClass
    {
    public:
    ...
    DECL_ERROR;
my_class.cpp:
    IMPL_ERROR(MyClass, "custom error prefix");

    MyClass::MyClass() {
    ...

6.2 QS_DEF macro

The QS_DEF macro creates a quasi-static (thread-local static) variable.

It is useful if your code creates the variable many times and you cannot move it out of the loop scope (for example, when it is declared inside a function).

    void MyClass::myCoolFunction () {
       QS_DEF(Array<char>, curline);
       ...
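
The idea behind a quasi-static variable can be illustrated in standard C++11 with thread_local. This is only an analogy, assuming that "quasi-static" means one instance per thread that outlives the call; it does not show how QS_DEF is implemented, and LineJoiner is a hypothetical class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class LineJoiner {
public:
    // Returns the words joined by single spaces.
    static std::string join(const std::vector<std::string>& words) {
        // One buffer per thread, constructed on first use and kept between
        // calls, so its memory can be reused instead of reallocated.
        thread_local std::string buffer;
        buffer.clear();  // Reused storage must be reset explicitly.

        for (std::size_t i = 0; i < words.size(); ++i) {
            if (i > 0) {
                buffer += ' ';
            }
            buffer += words[i];
        }
        return buffer;  // The caller receives a copy of the buffer.
    }
};
```

The explicit clear() matters: the variable keeps its contents from the previous call, so code using this pattern has to reset it before use.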

6.3 TL_CP_DECL, CP_DECL and others

The TL_CP_DECL macro does the same as QS_DEF, but it is applied to class fields.

In this case, the corresponding macros must be used in both the header and the .cpp file. See the example below.

my_class.h:
    class MyClass
    {
    public:
    ...
    CP_DECL;
    TL_CP_DECL(Array<int>, _offsets);
my_class.cpp:
    CP_DEF(MyClass);

    MyClass::MyClass () :
        CP_INIT, TL_CP_GET(_offsets)

6.4 Indigo utility classes

Indigo contains a collection of useful classes in common/base_cpp and common/base_c. Here is a list of the most commonly used ones:

  • Array can be used for simple types, for example Array<char> or Array<int>.
  • ObjArray is used when your type is not copyable and its class has constructors with one or no parameters, for example ObjArray<Object>.
  • PtrArray is used when you just want to keep pointers to your objects.
  • AutoPtr is a lightweight structure similar to auto_ptr.
  • Scanner is the basic input stream class.
  • Output is the basic output stream class.
  • RedBlackMap is the basic map structure.
  • Others

7 C++11 features

7.1 auto

Use auto to avoid type names that are just clutter. Continue to use manifest type declarations when it helps readability, and never use auto for anything but local variables.

In C++11, a variable whose type is given as auto will be given a type that matches that of the expression used to initialize it. You can use auto either to initialize a variable by copying, or to bind a reference.

    vector<string> v;
    ...
    auto s1 = v[0];  // Makes a copy of v[0].
    const auto& s2 = v[0];  // s2 is a reference to v[0].

7.2 Iterators

Use iterators, and create iterable types for your classes if needed. An example of iterating over vertices:

    for (auto i : graph.vertices()) {
       const Vertex &v = graph.getVertex(i);
       ...
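
A minimal sketch of what makes a type usable in a range-based for loop: begin() and end() returning an iterator that supports *, ++ and !=. IndexRange is a hypothetical class; the type actually returned by graph.vertices() may look different.

```cpp
// Hypothetical iterable range of integer indices [begin, end).
class IndexRange {
public:
    class Iterator {
    public:
        explicit Iterator(int index) : _index(index) {}

        int operator*() const { return _index; }
        Iterator& operator++() {
            ++_index;
            return *this;
        }
        bool operator!=(const Iterator& other) const {
            return _index != other._index;
        }

    private:
        int _index;
    };

    IndexRange(int begin, int end) : _begin(begin), _end(end) {}

    // The range-based for loop looks up these two member functions.
    Iterator begin() const { return Iterator(_begin); }
    Iterator end() const { return Iterator(_end); }

private:
    int _begin;
    int _end;
};
```

With this in place, `for (auto i : IndexRange(0, 4))` visits the indices 0, 1, 2 and 3.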

8 Naming

8.1 File Names

Filenames should be all lowercase and can include underscores (_) or dashes (-). Follow the convention that your project uses. If there is no consistent local pattern to follow, prefer "_".

Examples of acceptable file names:

    my_useful_class.h
    my-useful-class.cpp
    myusefulclass.cpp
    myusefulclass_test.cpp

Header files should have the .h extension. Source files should have the .cpp extension. Source files that are designed to be included as pieces of code in other source files can have the .inc extension.

Using other extensions, such as .hpp, .hxx, or .cxx, is not recommended. Violations are acceptable for third-party code.

8.2 Namespace Names

Namespace names are all lower-case, with words separated by underscores. Top-level namespace names are based on the project name. Avoid collisions between nested namespaces and well-known top-level namespaces.

The top-level namespace is "indigo".

    namespace indigo {
    namespace my_namespace {
    ...
    } // namespace my_namespace 
    } // namespace indigo

Remember not to put a semicolon after a namespace's closing brace.

8.3 Type Names

Type names start with a capital letter and have a capital letter for each new word, with no underscores: MyExcitingClass, MyExcitingEnum.

The names of all types — classes, structs, typedefs, enums, and type template parameters — have the same naming convention. Type names should start with a capital letter and have a capital letter for each new word. No underscores. For example:

    // classes and structs
    class UrlTable { ...
    class UrlTableTester { ...
    struct UrlTableProperties { ...

    // typedefs
    typedef hash_map<UrlTableProperties *, string> PropertiesMap;

    // enums
    enum UrlTableErrors { ...

8.4 Variable Names

The names of variables and data members are all lowercase, with underscores between words. Data members of classes (but not structs) additionally have leading underscores. For instance: a_local_variable, a_struct_data_member, _a_class_data_member.

For example:

    string table_name;  // OK - uses underscore.
    string tablename;   // OK - all lowercase.

    string tableName;   // Bad - mixed case.

8.5 Class Data Members

Data members of classes, both static and non-static, are named like ordinary nonmember variables, but with a leading underscore, except for public data members.

    class TableInfo {
       ...
    protected:
       string _table_name;  // OK - leading underscore.
       string _tablename;   // OK.
    private:
       static Pool<TableInfo>* _pool;  // OK.
    };

8.6 Struct Data Members

Data members of structs, both static and non-static, are named like ordinary nonmember variables. They do not have the leading underscores that data members in classes have.

    struct UrlTableProperties {
      string name;
      int num_entries;
      static Pool<UrlTableProperties>* pool;
    };

8.7 Global Variables

There are no special requirements for global variables, which should be rare in any case, but if you use one, consider prefixing it with g_ or some other marker to easily distinguish it from local variables.

8.8 Constant Names

Variables declared constexpr or const, and whose value is fixed for the duration of the program, are named UPPER_CASE. For example:

    const int DAYS_IN_WEEK = 7;

All such variables with static storage duration (i.e. static and global) should be named this way.

8.9 Function Names

Regular public functions should be named in lowerCamelCase. Private functions should additionally have a leading underscore.

    class MyClass {
       public:
       ...
       bool isEmpty() const { return _num_entries == 0; }
 
       private:
       int _num_entries;
       void _calculateEntries();
    };

8.10 Macro Names

Macros should be named in UPPER_UNDERSCORED style:

    #define ROUND(x) ...
    #define PI_ROUNDED 3.0

9 Formatting

Coding style and formatting are pretty arbitrary, but a project is much easier to follow if everyone uses the same style. Individuals may not agree with every aspect of the formatting rules, and some of the rules may take some getting used to, but it is important that all project contributors follow the style rules so that they can all read and understand everyone's code easily.

9.1 Line Length

Each line of text in your code should be at most 160 characters long.

We recognize that this rule is controversial, but so much existing code already adheres to it, and we feel that consistency is important.

9.2 Spaces vs. Tabs

Use only spaces, and indent 4 spaces at a time.

We use spaces for indentation. Do not use tabs in your code. You should set your editor to emit spaces when you hit the tab key.

9.3 Open and close parenthesis

We don't have any special format for parentheses. They can be placed on the same line or on the next line; the same line is more widely used, though.

Return type on the same line as function name, parameters on the same line if they fit. Wrap parameter lists which do not fit on a single line as you would wrap arguments in a function call.

Functions look like this:

    ReturnType ClassName::functionName(Type par_name1, Type par_name2) {
        doSomething();
        ...
    }

If you have 3 parameters or more, or the parameters don't fit on one line, start each parameter on a new line with the corresponding indent (aligned to the first parameter):

    ReturnType ClassName::reallyLongFunctionName(Type par_name1,
                                                 Type par_name2,
                                                 Type par_name3) {
        doSomething();
        ...
    }

or

    ReturnType ClassName::reallyLongFunctionName(
        ReallyLongTemplate<ReallyLongType> par_name1,
        ReallyLongTemplate<ReallyLongType> par_name2,
        ReallyLongTemplate<ReallyLongType> par_name3) {
        doSomething();
        ...
    }

9.4 Namespace Formatting

The contents of namespaces are not indented.

Namespaces do not add an extra level of indentation. For example, use:

    namespace indigo {

    void foo() {  // Correct.  No extra indentation within namespace.
    ...
    }

    }  // namespace indigo

9.5 Class Format

Sections go in public, protected and private order; the access specifiers are not indented.

The basic format for a class declaration is:

    class MyClass : public OtherClass {
    public:      // Note: no indentation!
        MyClass();  // Regular 4 space indent.
        explicit MyClass(int var);
        ~MyClass() {}

        void someFunction();
        void someFunctionThatDoesNothing() {
        }


    private:
        bool _someInternalFunction();

        int _some_var;
        int _some_other_var;
    };

Things to note:

  • Any base class name should be on the same line as the subclass name unless it doesn't fit.
  • The public:, protected:, and private: keywords should not be indented.
  • The public section should be first, followed by the protected and finally the private section.

9.6 General spaces rules

Use spaces and empty lines to make your code more understandable.

    if (b) {          // Space after the keyword in conditions and loops.

    } else {          // Spaces around else.

    }

    while (test) {}   // There is usually no space inside parentheses.
    switch (i) {
    for (int i = 0; i < 5; ++i) {

    // For loops always have a space after the semicolon.
    for (; i < 5; ++i) {
        ...

    // Range-based for loops always have a space before and after the colon.
    for (auto x : counts) {
        ...
    }
    switch (i) {
        case 1:         // No space before colon in a switch case.
            ...
        case 2: break;  // Use a space after a colon if there's code after it.