Microsoft Releases Candidate Version of Its Deep Learning Framework, Microsoft Cognitive Toolkit 2.0

Posted by jopen, 7 years ago

Microsoft Cognitive Toolkit, formerly known as CNTK, is the deep learning framework that Microsoft open-sourced last year.

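Well known as a development tool in the speech recognition field, Microsoft Cognitive Toolkit offers good scalability, speed, and accuracy. For building deep learning applications on massive datasets, it provides commercial-grade stability and compatibility with mainstream programming languages and algorithms.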

It is now about to reach its next-generation 2.0 release.

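Since releasing the 2.0 beta last October, Microsoft has added more than 100 new features, improvements, and bug fixes to Microsoft Cognitive Toolkit 2.0. Leiphone (雷鋒網) recently learned that Microsoft has published RC1, the first release candidate, on GitHub, which marks the end of the beta phase.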

This brings the official release of Microsoft Cognitive Toolkit 2.0 one step closer.

Two days ago, Microsoft wrote on its blog:

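“We are pleased to announce that Microsoft Cognitive Toolkit 2.0 has moved out of beta, and today we are publishing its first release candidate. The toolkit, previously known as CNTK, is a deep learning system used to accelerate advances in areas such as speech recognition, image recognition, and search relevance. It runs on CPUs or NVIDIA GPUs. Microsoft Cognitive Toolkit can run locally or in the cloud on Azure GPUs.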

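Microsoft Cognitive Toolkit is used widely across a range of Microsoft products. Its users include companies around the world that need to deploy deep learning at scale, as well as students interested in the latest algorithms and techniques. Since October 2016, we have shipped more than ten beta releases containing hundreds of new features, performance improvements, and fixes.”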

Major upgrades

More bindings beyond BrainScript. Version 2.0 exposes Cognitive Toolkit as a library with the following bindings:

Python (versions 2.7, 3.4, and 3.5).

C++.

C#/.NET Managed.

Python examples and tutorials (Jupyter Notebooks)

Recognizing the importance of Python in deep learning, Microsoft has prepared a set of Python examples and tutorials (the tutorials run as Jupyter Notebooks). See:

Python Examples.

Python Tutorials (Jupyter Notebooks).

You can also run the Jupyter Notebooks tutorials using the Cognitive Toolkit Docker containers.

Layers

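The Layers library has been substantially upgraded. Many common “layers” are now predefined, which makes it easy to write simple networks built from standard layers.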
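To make the idea of predefined layers concrete, here is a minimal pure-Python sketch of composing standard layers into a simple network. The helpers `dense`, `relu`, and `sequential` are hypothetical names written for this illustration only; they are not the Cognitive Toolkit Layers API, and they only mimic the composition style that such a library makes possible.

```python
import random


def dense(in_dim, out_dim, seed=0):
    """Hypothetical fully connected layer: returns a function x -> W x + b."""
    rng = random.Random(seed)
    # Small random weights and zero biases, fixed by the seed for repeatability.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim

    def apply(x):
        if len(x) != in_dim:
            raise ValueError("expected input of length %d" % in_dim)
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, bias)]

    return apply


def relu(x):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def sequential(layers):
    """Chain layers so that the output of one feeds the next."""
    def apply(x):
        for layer in layers:
            x = layer(x)
        return x

    return apply


# A two-layer network assembled from the predefined pieces above.
model = sequential([dense(4, 8, seed=1), relu, dense(8, 3, seed=2)])
output = model([1.0, 0.5, -0.5, 2.0])
```

The point of the sketch is the last two lines: once common layers exist as reusable building blocks, a simple network is a short list of them rather than hand-written matrix code.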

New evaluation library

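Leiphone (雷鋒網) has learned that the new Cognitive Toolkit evaluation library has been substantially improved in both usability and performance. The library can be used on Windows and Linux from C++, Python, C#, and other .NET languages.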

New features

The ability to extend Cognitive Toolkit functions, learners, trainers and optimizers with your own algorithms in Python and C++.

Enhanced, built-in distributed readers for speech, image, and text deep learning tasks.
The ability to use TensorBoard visualizations from Cognitive Toolkit.
Pretrained models available for use.
Performance improvements.
Support of distributed scenarios in the Python API. See more in the sections on distributed scenarios in the ConvNet and ResNet examples.
Support of Asynchronous Stochastic Gradient Descent (ASGD)/Hogwild! training parallelization using Microsoft’s Parameter Server (Project Multiverso).
Support for training on one-hot and sparse arrays via NumPy.
Support of object recognition using Fast R-CNN algorithm.
Integration with NVIDIA NCCL, a stand-alone library of standard collective communication routines, such as all-gather, reduce, broadcast, etc., that have been optimized to achieve high bandwidth over PCIe. See how to enable NCCL in the Cognitive Toolkit Wiki.
Lambda rank and NDCG at 1 are accessible from Python for real this time.
Performance Profiler for BrainScript and Python.
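
One of the items above exposes NDCG at 1 as a ranking metric in Python. As a rough illustration of what that metric measures, the sketch below computes NDCG at k in plain Python. It assumes the common exponential gain `2^rel - 1` with a `log2` position discount; the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and the toolkit's own implementation may differ in its details.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (assumed exponential gain)."""
    return sum((2 ** rel - 1) / math.log2(position + 2)
               for position, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores, relevances, k=1):
    """NDCG at k: DCG of the model's ranking divided by DCG of the ideal ranking."""
    # Rank documents by predicted score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    # The ideal ranking sorts by true relevance.
    ideal = sorted(relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked, k) / ideal_dcg


# Three documents for one query: the model puts a relevance-1 document on top,
# while the best available document has relevance 3.
example = ndcg_at_k(scores=[0.2, 0.9, 0.5], relevances=[3, 1, 2], k=1)
```

With `k=1` only the top-ranked document matters, so the metric reduces to the gain of the document the model ranked first divided by the gain of the best document for that query.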