fb-caffe-exts: the Caffe toolbox developed by Facebook

Posted by jopen 9 years ago | 21K views | Machine Learning, fb-caffe-exts

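[Open source: fb-caffe-exts, the Caffe toolbox developed by Facebook] predictor: a C++ wrapper for the common pattern of sharing weights across multiple threads. torch2caffe: converts Torch model files into Caffe models. conversions: command-line tools for converting Caffe networks.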

predictor/

A simple C++ library that wraps the common pattern of running a caffe::Net in multiple threads while sharing weights. It also provides a slightly more convenient usage API for the inference case.

#include "caffe/predictor/Predictor.h"

// In your setup phase
predictor_ = folly::make_unique<caffe::fb::Predictor>(FLAGS_prototxt_path,
                                                      FLAGS_weights_path);

// When calling in a worker thread
static thread_local caffe::Blob<float> input_blob;
input_blob.set_cpu_data(input_data); // avoid the copy.
const auto& output_blobs = predictor_->forward({&input_blob});
return output_blobs[FLAGS_output_layer_name];
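
The pattern the library wraps (read-only weights shared by all threads, scratch buffers private to each thread) can be illustrated without Caffe. The following is only a minimal sketch of that general idea, not the Predictor implementation: Weights, ToyPredictor and run_in_threads are hypothetical names invented for this example, and the "network" is just a dot product.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a network's read-only parameters.
struct Weights {
  std::vector<float> w;
};

// Hypothetical stand-in for a predictor: weights are shared between threads,
// while intermediate storage is private to each calling thread.
class ToyPredictor {
 public:
  explicit ToyPredictor(std::shared_ptr<const Weights> weights)
      : weights_(std::move(weights)) {}

  // Safe to call concurrently: it only reads weights_ and only writes to a
  // thread-local scratch buffer.
  float forward(const std::vector<float>& input) const {
    assert(input.size() >= weights_->w.size());
    static thread_local std::vector<float> scratch;  // one buffer per thread
    scratch.resize(weights_->w.size());
    for (std::size_t i = 0; i < scratch.size(); ++i) {
      scratch[i] = weights_->w[i] * input[i];
    }
    return std::accumulate(scratch.begin(), scratch.end(), 0.0f);
  }

 private:
  std::shared_ptr<const Weights> weights_;
};

// Runs one forward pass per input, each on its own thread.
std::vector<float> run_in_threads(
    const ToyPredictor& predictor,
    const std::vector<std::vector<float>>& inputs) {
  std::vector<float> outputs(inputs.size());
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < inputs.size(); ++t) {
    // Each thread writes to a distinct element of outputs, so no lock is needed.
    workers.emplace_back(
        [&, t] { outputs[t] = predictor.forward(inputs[t]); });
  }
  for (auto& worker : workers) worker.join();
  return outputs;
}
```

Because the weights are held through a pointer to const, every thread reads the same copy, and the thread_local buffer plays the role that the per-thread input_blob plays in the snippet above.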

Of note is predictor/Optimize.{h,cpp}, which optimizes memory usage by automatically reusing the intermediate activations when this is safe. This reduces the amount of memory required for intermediate activations by around 50% for AlexNet-style models, and around 75% for GoogLeNet-style models.

We can plot each set of activations in the topological ordering of the network, with a unique color for each reused activation buffer, with the height of the blob proportional to the size of the buffer.

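For example, in an AlexNet-like model, the allocation looks like:

[Figure: activation buffer allocation for an AlexNet-like model]

A corresponding allocation for GoogLeNet looks like:

[Figure: activation buffer allocation for GoogLeNet]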

The idea is essentially linear scan register allocation. We

  • compute a set of “live ranges” for each caffe::SyncedMemory (due to sharing, we can’t do this at a caffe::Blob level)
  • compute a set of live intervals, and schedule each caffe::SyncedMemory in a non-overlapping fashion onto each live interval
  • allocate a canonical caffe::SyncedMemory buffer for each live interval
  • update the blob internal pointers to point to the canonical buffer
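
The steps above can be sketched in a few lines of standalone C++. This is a simplified illustration of linear-scan style buffer assignment under stated assumptions, not the code in predictor/Optimize.cpp: LiveRange, Assignment and assign_buffers are hypothetical names, live ranges are assumed to be already computed as layer indices in topological order, and a simple first-fit rule stands in for whatever scheduling the library actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical description of one activation buffer's lifetime.
struct LiveRange {
  int first_use;     // index of the first layer (topological order) touching it
  int last_use;      // index of the last layer touching it
  std::size_t size;  // bytes required
};

struct Assignment {
  std::vector<int> buffer_of;             // canonical buffer index per live range
  std::vector<std::size_t> buffer_sizes;  // size of each canonical buffer
};

// Greedy first-fit assignment: a canonical buffer is reused only when its
// previous occupant's last use is strictly before the new range's first use.
Assignment assign_buffers(const std::vector<LiveRange>& ranges) {
  // Visit live ranges in order of first use.
  std::vector<std::size_t> order(ranges.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return ranges[a].first_use < ranges[b].first_use;
                   });

  Assignment result;
  result.buffer_of.assign(ranges.size(), -1);
  std::vector<int> busy_until;  // last_use of the latest range in each buffer

  for (std::size_t idx : order) {
    const LiveRange& r = ranges[idx];
    assert(r.first_use <= r.last_use);
    int chosen = -1;
    for (std::size_t b = 0; b < busy_until.size(); ++b) {
      if (busy_until[b] < r.first_use) {  // no overlap with previous occupant
        chosen = static_cast<int>(b);
        break;
      }
    }
    if (chosen < 0) {  // every existing buffer is still live: open a new one
      chosen = static_cast<int>(busy_until.size());
      busy_until.push_back(r.last_use);
      result.buffer_sizes.push_back(r.size);
    } else {
      busy_until[chosen] = r.last_use;
      // A canonical buffer must be large enough for its biggest occupant.
      result.buffer_sizes[chosen] =
          std::max(result.buffer_sizes[chosen], r.size);
    }
    result.buffer_of[idx] = chosen;
  }
  return result;
}
```

For a simple chain of four activations where each one is produced by one layer and consumed by the next, this sketch alternates between two canonical buffers, which is the same ping-pong behaviour one would expect for a purely sequential network. The strict comparison matters: a layer reads its input while writing its output, so two ranges that meet at the same layer index must not share a buffer.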

Depending on the model, the buffer reuse can also lead to some non-trivial performance improvements at inference time.

To enable this, just pass Predictor::Optimization::MEMORY to the Predictor constructor.

Project homepage: http://www.baiduhome.net/lib/view/home/1448027087259
