This document provides some information to help you contribute to CTranslate2.
We use GitHub issues for reproducible bugs in the code. A good bug report should contain all the information needed to reproduce the bug. Before opening a new issue, make sure to:
- use the GitHub issue search for existing and fixed bugs;
- check if the issue has been fixed in a more recent version;
- isolate the problem to give as much context as possible.
If you have questions on how to use the project or have trouble getting started with it, consider using our forum instead and tagging your topic with the ctranslate2 tag.
Do you think a feature is missing or would be a great addition to the project? Please open a GitHub issue to describe it.
If you want to contribute code but are unsure what to do:
- search for TODO comments in the code: these are small development tasks that should be addressed at some point;
- look for GitHub issues marked with the help wanted label: these are developments that we find particularly suited for community contributions.
If you are planning to make a large change to the existing code, consider asking first on the forum to confirm that it is welcome.
See Install from sources.
To enable the C++ tests, configure the project with cmake -DBUILD_TESTS=ON. The binary tests/ctranslate2_test runs all tests using Google Test. It expects the path to the test data as an argument:
./tests/ctranslate2_test ../tests/data
The Python tests can be run with pytest:
cd python
pip install -r tests/requirements.txt
pytest tests/
The code should also be checked with black (automatic formatting), isort (import ordering), and flake8 (code checking):
black .
isort .
flake8 .
You should make sure that new changes do not negatively impact the general performance. The translation client has some options to measure the performance.
The command line option --log_throughput reports the tokens generated per second on the standard error output. This is the recommended metric to compare different runs (higher is better).
The command line option --log_profiling reports an execution profile on the standard error output. It prints a list of selected functions in the format:
2.51% 80.38% 87.27% beam_search 557.00ms
where the columns mean:
- Percent of time spent in the function
- Percent of time spent in the function and its callees
- Percent of time printed so far
- Name of the function
- Time spent in the function (in milliseconds)
The list is sorted by the last column (time spent in the function), from largest to smallest.
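When comparing profiles between two runs, it can help to load these lines into a structure. The following is a minimal sketch that parses one line in the format shown above. The names ProfileEntry and parse_profile_line are hypothetical and not part of CTranslate2, and the sketch assumes that function names contain no whitespace.

```python
import re
from typing import NamedTuple, Optional


class ProfileEntry(NamedTuple):
    """One line of the profile, with fields in the documented column order."""

    self_percent: float        # percent of time spent in the function
    total_percent: float       # percent of time in the function and its callees
    cumulative_percent: float  # percent of time printed so far
    name: str                  # name of the function
    self_ms: float             # time spent in the function, in milliseconds


# Assumed layout: three percentages, a name without spaces, a time in ms.
_PROFILE_LINE = re.compile(
    r"^\s*([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\S+)\s+([\d.]+)ms\s*$"
)


def parse_profile_line(line: str) -> Optional[ProfileEntry]:
    """Return the parsed entry, or None if the line does not match the format."""
    match = _PROFILE_LINE.match(line)
    if match is None:
        return None
    self_pct, total_pct, cumulative_pct, name, self_ms = match.groups()
    return ProfileEntry(
        float(self_pct), float(total_pct), float(cumulative_pct), name, float(self_ms)
    )
```

Lines that do not match (for example other log messages on the standard error output) return None, so they can be filtered out when reading a captured log.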
CTranslate2 uses row-major storages, usually encapsulated in the StorageView class. This class acts like a tensor representation but without the mathematical semantics. It is a convenience wrapper to view a buffer of data in a particular shape, and provides methods to resize, reshape, and copy data. The underlying storage has a type (e.g. float) and a location (e.g. GPU #1), which are both resolved at runtime.
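To illustrate what row-major means for a shaped view over a flat buffer, here is a minimal sketch in plain Python. It does not use the StorageView API; the helper names row_major_strides and flat_index are hypothetical and only show how a multi-dimensional index maps to an offset in the buffer.

```python
def row_major_strides(shape):
    """Return the stride (in elements) of each dimension for a row-major layout.

    The last dimension is contiguous, so its stride is 1. Each earlier
    dimension strides over all the elements of the dimensions after it.
    """
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def flat_index(shape, indices):
    """Map a multi-dimensional index to an offset in the flat buffer."""
    if len(indices) != len(shape):
        raise ValueError("expected one index per dimension")
    for index, dim in zip(indices, shape):
        if not 0 <= index < dim:
            raise IndexError("index out of range")
    return sum(i * s for i, s in zip(indices, row_major_strides(shape)))
```

With this layout, reshaping a buffer only changes the shape metadata, since the order of the elements in memory stays the same.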
To maximize performance, the implementation avoids new allocations when possible:
- no reallocation occurs when resizing the storage to a smaller size
- caching allocators are used to reuse previously allocated buffers
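The two points above can be sketched as follows. This is an illustration in plain Python under simplifying assumptions, not the actual implementation: the CachingAllocator and Storage classes are hypothetical, buffers are plain bytearrays, and the cache only reuses buffers of exactly the requested size.

```python
class CachingAllocator:
    """Keeps freed buffers and hands them out again instead of allocating."""

    def __init__(self):
        self._free = {}            # buffer size -> list of cached buffers
        self.num_allocations = 0   # number of real allocations performed

    def allocate(self, size):
        cached = self._free.get(size)
        if cached:
            return cached.pop()    # reuse a previously allocated buffer
        self.num_allocations += 1
        return bytearray(size)

    def free(self, buffer):
        # Do not release the memory: keep the buffer for a later request.
        self._free.setdefault(len(buffer), []).append(buffer)


class Storage:
    """A resizable storage that only reallocates when it has to grow."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._buffer = None
        self.size = 0

    @property
    def capacity(self):
        return 0 if self._buffer is None else len(self._buffer)

    def resize(self, size):
        if size > self.capacity:
            new_buffer = self._allocator.allocate(size)
            if self._buffer is not None:
                # Preserve the existing content, then return the old buffer.
                new_buffer[: self.size] = self._buffer[: self.size]
                self._allocator.free(self._buffer)
            self._buffer = new_buffer
        # Shrinking (or growing within capacity) only updates the logical size.
        self.size = size

    def release(self):
        if self._buffer is not None:
            self._allocator.free(self._buffer)
            self._buffer = None
        self.size = 0
```

In this sketch, resizing a storage from 100 to 10 elements keeps the 100-element buffer, and a second storage created after the first one is released gets the same buffer back from the cache.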
The code is organized into the following abstraction levels, from lowest to highest:
- primitives: low-level compute functions, specialized depending on the data type and target device
- ops: neural network operations (e.g. Softmax, Gemm, etc.)
- layers: stateful neural network layers like Dense, LayerNorm, etc.
- models: collection of neural network layers to achieve a certain task (e.g. Transformer for NMT)
- translators: high-level class using a model to implement the text translation logic
- translators pool: pool of parallel translators sharing the same model
Ops define the basic neural network operations. Whenever possible, they follow the ONNX specification.
Their implementation typically requires multiple source files:
include/ctranslate2/ops/my_op.h # Op interface
src/ops/my_op.cc # Input checks and dispatch based on device and type.
src/ops/my_op_cpu.cc # CPU-specific implementation
src/ops/my_op_gpu.cu # CUDA-specific implementation
In particular, no compilation flags should be used in the header file to make it easy to use the project as a library.
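The split between a generic entry point and device-specific implementations can be pictured with a small dispatch table. The sketch below is written in plain Python for brevity and is not how the C++ code is organized in detail: the register decorator, the relu function, and the device and type strings are all hypothetical names chosen for illustration.

```python
# Maps (device, dtype) to the specialized implementation.
_KERNELS = {}


def register(device, dtype):
    """Register a device- and type-specific implementation (hypothetical helper)."""

    def decorator(fn):
        _KERNELS[(device, dtype)] = fn
        return fn

    return decorator


# Plays the role of the CPU-specific source file.
@register("cpu", "float32")
def _relu_cpu_float32(values):
    return [max(v, 0.0) for v in values]


# Plays the role of the generic source file: input checks, then dispatch.
def relu(values, device="cpu", dtype="float32"):
    if not isinstance(values, list):
        raise TypeError("values must be a list")
    kernel = _KERNELS.get((device, dtype))
    if kernel is None:
        raise NotImplementedError(
            "no implementation for device=%s, dtype=%s" % (device, dtype)
        )
    return kernel(values)
```

The caller only sees relu, which mirrors the idea that the public interface stays free of device-specific details while the specialized code lives in separate files.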