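Parallel Neural Network Framework: Purine
Purine is a parallel neural network framework from the LV lab in Singapore. It supports building a variety of parallel architectures and achieves roughly linear speedup when parameters are updated synchronously across multiple machines and multiple GPUs. With 12 Titan GPUs, GoogLeNet can be trained in 20 hours.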
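To make "synchronous parameter updates" concrete, here is a minimal single-process sketch of one data-parallel SGD step. It assumes each replica has already computed a gradient on its own shard of the mini-batch. The function name `sync_sgd_step` and the plain `std::vector` storage are illustrative assumptions, not Purine's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One synchronous data-parallel SGD step (illustrative, not Purine's code).
// grads[k] is the gradient computed by replica k on its own shard.
// All replicas are averaged before the single shared update is applied,
// so every replica sees the same weights in the next iteration.
inline void sync_sgd_step(std::vector<float>& weights,
                          const std::vector<std::vector<float> >& grads,
                          float lr) {
  assert(!grads.empty());
  for (std::size_t i = 0; i < weights.size(); ++i) {
    float sum = 0.f;
    for (std::size_t k = 0; k < grads.size(); ++k) {
      assert(grads[k].size() == weights.size());
      sum += grads[k][i];
    }
    weights[i] -= lr * sum / static_cast<float>(grads.size());
  }
}
```

In a real multi-GPU setting the averaging would be a communication step between devices; the sketch only shows the arithmetic of the update.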
-
common
Common code used across the project, including abstractions of CUDA and of the uv event loop.
-
caffeine
Code taken from Caffe, mainly math functions and some macros from Caffe's common.hpp.
-
catch
Contains the header file of the CATCH testing system, the unit test framework used in Purine. There is not much unit testing in this code. Since the core math functions are based on cuDNN and Caffe, this should not be a problem. (During development I did file a bug report against cuDNN; it is now fixed in cuDNN v2 rc3.)
-
dispatch
Contains the definitions of graph, node, op, blob, etc. A blob wraps a tensor, and an op wraps an operation. Unlike Purine version 1, there is no standalone dispatcher; the dispatching code lives inside blob, op and graph. A graph is constructed by connecting blobs and ops. The resulting Graph is self-dispatching: it is executed by calling graph.run().
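The idea of a graph that dispatches itself can be sketched as follows. This is a single-threaded toy, assuming that a blob notifies the ops that read it and that an op runs once all of its inputs are ready. Every name here (`Blob`, `Op`, `Graph`, `mark_ready`, `input_ready`, `add_blob`, `add_op`) is hypothetical and chosen for illustration; it is not Purine's actual API, and it does not model asynchronous execution across devices.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct Op;

// A Blob wraps a tensor (here a flat float buffer) and knows which ops read it.
struct Blob {
  std::vector<float> data;
  std::vector<Op*> consumers;
  bool has_producer = false;  // true if some op writes this blob
  void mark_ready();          // dispatching step: notify every consumer
};

// An Op wraps an operation together with the blobs it reads and writes.
struct Op {
  typedef std::function<void(const std::vector<Blob*>&,
                             const std::vector<Blob*>&)> Fn;
  Fn fn;
  std::vector<Blob*> inputs, outputs;
  std::size_t pending = 0;  // number of inputs that are not ready yet

  void input_ready() {
    assert(pending > 0);
    if (--pending == 0) {  // all inputs ready: compute, then propagate
      fn(inputs, outputs);
      for (Blob* b : outputs) b->mark_ready();
    }
  }
};

inline void Blob::mark_ready() {
  for (Op* op : consumers) op->input_ready();
}

// The graph only owns blobs and ops; the scheduling logic lives inside them.
class Graph {
 public:
  Blob* add_blob(std::vector<float> init = std::vector<float>()) {
    blobs_.push_back(std::unique_ptr<Blob>(new Blob()));
    blobs_.back()->data = std::move(init);
    return blobs_.back().get();
  }
  Op* add_op(Op::Fn fn, std::vector<Blob*> in, std::vector<Blob*> out) {
    assert(!in.empty());
    ops_.push_back(std::unique_ptr<Op>(new Op()));
    Op* op = ops_.back().get();
    op->fn = std::move(fn);
    op->inputs = std::move(in);
    op->outputs = std::move(out);
    for (Blob* b : op->inputs) b->consumers.push_back(op);
    for (Blob* b : op->outputs) b->has_producer = true;
    return op;
  }
  // Start execution by marking every source blob (no producer) as ready.
  void run() {
    for (auto& op : ops_) op->pending = op->inputs.size();
    for (auto& b : blobs_)
      if (!b->has_producer) b->mark_ready();
  }

 private:
  std::vector<std::unique_ptr<Blob> > blobs_;
  std::vector<std::unique_ptr<Op> > ops_;
};

// Example operations: out = in0 + in1 (elementwise) and out = 2 * in0.
inline void add_fn(const std::vector<Blob*>& in, const std::vector<Blob*>& out) {
  const std::vector<float>& a = in[0]->data;
  const std::vector<float>& b = in[1]->data;
  assert(a.size() == b.size());
  out[0]->data.resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[0]->data[i] = a[i] + b[i];
}
inline void double_fn(const std::vector<Blob*>& in, const std::vector<Blob*>& out) {
  out[0]->data = in[0]->data;
  for (float& v : out[0]->data) v *= 2.f;
}

// Computes y = 2 * (a + b) by connecting blobs and ops, then calling run().
inline std::vector<float> run_example() {
  Graph g;
  Blob* a = g.add_blob({1.f, 2.f});
  Blob* b = g.add_blob({3.f, 4.f});
  Blob* sum = g.add_blob();
  Blob* y = g.add_blob();
  g.add_op(add_fn, {a, b}, {sum});
  g.add_op(double_fn, {sum}, {y});
  g.run();
  return y->data;
}
```

Note that `run()` contains no scheduling loop over ops: it only marks the source blobs as ready, and the rest of the execution order follows from the connections.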
-
composite
Contains predefined composite graphs, which can be used to construct larger graphs. For example, every layer in Caffe can be defined as a graph in Purine, and a network can be constructed by further connecting these predefined graphs.
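A rough sketch of this composition idea, assuming the simplest case where graphs are chained one after another. The types `Graph`, `OpGraph`, `Sequential` and the helper `make_scale_relu` are hypothetical names for illustration only; Purine's composite graphs are general graphs rather than linear chains.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

typedef std::vector<float> Tensor;  // stand-in for a real tensor type

// Hypothetical base class: anything that maps an input tensor to an output.
struct Graph {
  virtual ~Graph() {}
  virtual Tensor run(const Tensor& in) const = 0;
};

// Leaf graph wrapping a single operation.
struct OpGraph : Graph {
  std::function<Tensor(const Tensor&)> fn;
  explicit OpGraph(std::function<Tensor(const Tensor&)> f) : fn(std::move(f)) {}
  Tensor run(const Tensor& in) const override { return fn(in); }
};

// Composite graph: sub-graphs connected output-to-input, in order.
// Because a Sequential is itself a Graph, composites can be nested.
struct Sequential : Graph {
  std::vector<std::shared_ptr<Graph> > parts;
  Sequential& add(std::shared_ptr<Graph> g) {
    parts.push_back(std::move(g));
    return *this;
  }
  Tensor run(const Tensor& in) const override {
    Tensor x = in;
    for (const auto& p : parts) x = p->run(x);
    return x;
  }
};

// A predefined "layer" built from two primitive ops: scale by w, then ReLU.
inline std::shared_ptr<Graph> make_scale_relu(float w) {
  std::shared_ptr<Sequential> layer = std::make_shared<Sequential>();
  layer->add(std::make_shared<OpGraph>([w](const Tensor& in) -> Tensor {
    Tensor out(in);
    for (float& v : out) v *= w;
    return out;
  }));
  layer->add(std::make_shared<OpGraph>([](const Tensor& in) -> Tensor {
    Tensor out(in);
    for (float& v : out) v = v > 0.f ? v : 0.f;
    return out;
  }));
  return layer;
}
```

A network is then just another composite, for example `Sequential net; net.add(make_scale_relu(2.f)).add(make_scale_relu(0.5f));`, which can itself be used as a building block of a larger graph.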
-
operations
Contains operations and the tensor type. In this version, a tensor is 4-dimensional (this could be changed to an ndarray). An operation takes input tensors and generates output tensors. The inputs and outputs of an operation are stored in a std::vector. Operations can take parameters; for example, the parameters of convolution include padding size, stride, etc. The operations folder contains a number of predefined operations.
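The shape of such an operation can be sketched as below: a 4-dimensional tensor, an operation base class that keeps its inputs and outputs in `std::vector`, and a convolution that takes padding and stride as parameters. The class names, the NCHW layout, the single `pad`/`stride` pair and the naive CPU loop are all assumptions made for illustration; the real operations build on cuDNN and Caffe math functions. The loop computes cross-correlation, which is what deep learning frameworks conventionally call convolution.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 4-D tensor in NCHW layout (not Purine's actual class).
struct Tensor {
  int n, c, h, w;
  std::vector<float> data;
  Tensor(int n_, int c_, int h_, int w_, float fill = 0.f)
      : n(n_), c(c_), h(h_), w(w_),
        data(static_cast<std::size_t>(n_) * c_ * h_ * w_, fill) {}
  std::size_t index(int i, int j, int y, int x) const {
    return ((static_cast<std::size_t>(i) * c + j) * h + y) * w + x;
  }
  float& at(int i, int j, int y, int x) { return data[index(i, j, y, x)]; }
  float at(int i, int j, int y, int x) const { return data[index(i, j, y, x)]; }
};

// Operation base class: inputs and outputs are kept in std::vector.
struct Operation {
  std::vector<Tensor*> inputs, outputs;
  Operation(std::vector<Tensor*> in, std::vector<Tensor*> out)
      : inputs(std::move(in)), outputs(std::move(out)) {}
  virtual ~Operation() {}
  virtual void compute() = 0;
};

// Parameters of the convolution (same value used for height and width).
struct ConvParam {
  int pad;
  int stride;
};

// Output extent along one spatial dimension.
inline int conv_out_size(int in, int kernel, int pad, int stride) {
  return (in + 2 * pad - kernel) / stride + 1;
}

// inputs = {bottom, weight}, outputs = {top}.
// weight has shape (out_channels, in_channels, kernel_h, kernel_w).
struct Conv : Operation {
  ConvParam param;
  Conv(std::vector<Tensor*> in, std::vector<Tensor*> out, ConvParam p)
      : Operation(std::move(in), std::move(out)), param(p) {}

  void compute() override {
    const Tensor& x = *inputs[0];
    const Tensor& k = *inputs[1];
    Tensor& y = *outputs[0];
    assert(k.c == x.c && y.n == x.n && y.c == k.n);
    assert(y.h == conv_out_size(x.h, k.h, param.pad, param.stride));
    assert(y.w == conv_out_size(x.w, k.w, param.pad, param.stride));
    for (int n = 0; n < y.n; ++n)
      for (int oc = 0; oc < y.c; ++oc)
        for (int oy = 0; oy < y.h; ++oy)
          for (int ox = 0; ox < y.w; ++ox) {
            float acc = 0.f;
            for (int ic = 0; ic < x.c; ++ic)
              for (int ky = 0; ky < k.h; ++ky)
                for (int kx = 0; kx < k.w; ++kx) {
                  int iy = oy * param.stride - param.pad + ky;
                  int ix = ox * param.stride - param.pad + kx;
                  // Positions that fall in the zero padding contribute nothing.
                  if (iy < 0 || iy >= x.h || ix < 0 || ix >= x.w) continue;
                  acc += x.at(n, ic, iy, ix) * k.at(oc, ic, ky, kx);
                }
            y.at(n, oc, oy, ox) = acc;
          }
  }
};
```

The caller allocates the output tensor with the size given by `conv_out_size`, passes pointers to the inputs and outputs, and then calls `compute()`.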
-
tests
Unit tests of the project.