Home
Tianqi Chen edited this page Apr 6, 2014
This is the documentation for mshadow: A Lightweight CPU/GPU Matrix/Tensor Template Library in C++/CUDA.
- Read Tutorial to get started and understand the data structures and basic usage.
- [Expression API](Expression API) gives a list of expressions supported in mshadow.
- Read the rest of this page for other information.
- Example code is in the example folder.
- Basically, all files ending in `-inl.hpp` or `-inl.cuh` are implementations, and can be ignored if you only use mshadow.
- The files ending in `.h` are heavily commented in doxygen format, and can be used to generate the corresponding documentation.
- List of useful headers for different types of functions. Read the comments directly, or the doxygen-generated documents; an online version of the doxygen-generated documents is available Here.
- mshadow/tensor.h: core header, contains all data structures and core functions
- mshadow/tensor_random.h: random number generator
- mshadow/tensor_io.h: utilities for saving and loading tensors to and from binary files
- mshadow/tensor_container.h: a potentially useful implementation that handles memory allocation like STL
- Doxygen works well for the normal API headers listed above, except for the expression-related code. These files also come with detailed comments, but they may not be straightforward to understand; read [Expression API](Expression API) for documentation of expressions.
- mshadow/tensor_expr.h: definition of the expression template
- mshadow/tensor_expr_ext.h: extensions of the expression template
- Use mshadow: mshadow is a pure template library; add `#include "mshadow/tensor.h"` to your code to use the lib.
- Example Makefile: check out example/neuralnet/Makefile to see how mshadow is compiled.
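The example/neuralnet/Makefile is the authoritative reference; as a rough, hypothetical sketch (the target name, source file, include path, and linked libraries below are placeholders to adapt to your setup), a build rule might look like:

```makefile
# Hypothetical sketch, not the actual example Makefile.
# Assumes the mshadow/ directory sits next to your sources and
# that CUDA and MKL are installed; adjust LDFLAGS to your setup.
CXX   = g++
CFLAGS  = -O3 -I.                # -I must point at the dir containing mshadow/
LDFLAGS = -lcudart -lcublas -lcurand -lmkl_rt

myapp: myapp.cpp
	$(CXX) $(CFLAGS) -o $@ $< $(LDFLAGS)
```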
- Package dependency: mshadow needs MKL or another CBLAS implementation to do matrix multiplication. The package dependencies can be customized via macros in mshadow/tensor_base.h, for example:
- To compile with CBLAS (e.g. ATLAS), add -DMSHADOW_USE_MKL=0 -DMSHADOW_USE_CBLAS=1 to CFLAGS
- To compile without CUDA and only use CPU mode, add -DMSHADOW_USE_CUDA=0 to CFLAGS
- To use only element-wise operations and remove the dependency on all packages, add -DMSHADOW_STAND_ALONE=1 to CFLAGS
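In a Makefile, these configurations correspond to CFLAGS additions like the following (a sketch; pick whichever matches your setup, with the macro names taken from mshadow/tensor_base.h):

```makefile
# Use CBLAS/ATLAS instead of MKL for matrix multiplication:
CFLAGS += -DMSHADOW_USE_MKL=0 -DMSHADOW_USE_CBLAS=1
# CPU-only build, no CUDA toolkit required:
CFLAGS += -DMSHADOW_USE_CUDA=0
# Element-wise operations only, no BLAS or CUDA dependency at all:
CFLAGS += -DMSHADOW_STAND_ALONE=1
```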
- SSE support: mshadow uses SSE2 optimizations for simple element-wise operations. However, due to the incompatibility between nvcc and emmintrin.h, SSE is only supported when not compiling with nvcc. To make use of SSE with device-invariant code:
- Write an implementation as a template, say `Learner<xpu>`, in learner.hpp
- Create learner.cu and learner.cpp, include learner.hpp in both files, and use some factory function to return `Learner<cpu>` in the cpp file and `Learner<gpu>` in the cu file.
- Compile the cpp file with g++ and the cu file with nvcc, then link everything together.
- Example: the CXXNET Project (check out cxxnet_nnet .cpp .cu) uses this approach to create device-invariant code with mshadow.
- Note that not using SSE does not affect the performance of most projects: if a project's major computation is matrix multiplication, that work is handled by the optimized MKL routines; and it is irrelevant if you only care about performance on GPU.
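The factory pattern described above can be sketched in a single self-contained file, with no mshadow or CUDA needed to follow the idea. Here the `cpu`/`gpu` tag structs, the `ILearner` interface, and the factory names are simplified stand-ins, not mshadow's actual types; in the real layout the template lives in learner.hpp, the cpu factory body in learner.cpp (built with g++), and the gpu factory body in learner.cu (built with nvcc):

```cpp
#include <memory>
#include <string>

// Device tag types, standing in for mshadow's cpu/gpu device markers.
struct cpu { static const char *kName; };
struct gpu { static const char *kName; };
const char *cpu::kName = "cpu";
const char *gpu::kName = "gpu";

// Device-agnostic interface, so callers never name the device type directly.
class ILearner {
 public:
  virtual ~ILearner() = default;
  virtual std::string Device() const = 0;
};

// The shared implementation template; in the real layout this would sit in
// learner.hpp and be included by both learner.cpp and learner.cu.
template <typename xpu>
class Learner : public ILearner {
 public:
  std::string Device() const override { return xpu::kName; }
};

// Factory functions: learner.cpp would define the cpu one, learner.cu the
// gpu one, and the two object files are linked into the final binary.
std::unique_ptr<ILearner> CreateCPULearner() {
  return std::unique_ptr<ILearner>(new Learner<cpu>());
}
std::unique_ptr<ILearner> CreateGPULearner() {
  return std::unique_ptr<ILearner>(new Learner<gpu>());
}
```

The point of the split is that code touching emmintrin.h only ever passes through g++, while nvcc only sees the `Learner<gpu>` instantiation.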