lightning: a Python library for large-scale linear classification, regression and ranking

Posted by jopen 9 years ago | 23K reads | Machine Learning, Lightning

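lightning is a Python library for large-scale linear classification, regression and ranking. It supports SDCA, Prox-SDCA, SGD, AdaGrad, SAG, SVRG, FISTA and SpaRSA. Its highlights: it follows the same API conventions as scikit-learn, natively supports both dense and sparse data representations, and implements its computationally demanding modules in Cython.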

Highlights:

  • follows the scikit-learn API conventions
  • supports natively both dense and sparse data representations
  • computationally demanding parts implemented in Cython
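To make the dense/sparse distinction concrete, here is a minimal standard-library sketch. It is not lightning's internal storage format; the helper names `to_sparse`, `dot_dense` and `dot_sparse` are hypothetical and exist only for this illustration. It shows that a sparse row stores only the nonzero entries while yielding the same dot product as its dense counterpart.

```python
# Toy illustration only: lightning's own data structures are written in Cython
# and are not reproduced here. All function names below are hypothetical.

def to_sparse(row):
    """Keep only the nonzero entries of a dense row as (index, value) pairs."""
    return [(j, v) for j, v in enumerate(row) if v != 0.0]


def dot_dense(row, w):
    """Dot product that visits every feature, including the zeros."""
    return sum(x * wj for x, wj in zip(row, w))


def dot_sparse(pairs, w):
    """Dot product that visits only the stored nonzero features."""
    return sum(v * w[j] for j, v in pairs)


dense_row = [0.0, 2.0, 0.0, 0.0, 1.5, 0.0]
weights = [0.5, -1.0, 2.0, 0.25, 4.0, -0.5]

sparse_row = to_sparse(dense_row)
dense_result = dot_dense(dense_row, weights)
sparse_result = dot_sparse(sparse_row, weights)
```

For text data such as News20, where most features are zero for any given document, the sparse form does far less work per row.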

Solvers supported:

  • primal coordinate descent
  • dual coordinate descent (SDCA, Prox-SDCA)
  • SGD, AdaGrad, SAG, SVRG
  • FISTA, SpaRSA
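As a rough idea of what the simplest solver in this list does, the following is a plain-Python sketch of SGD on an L2-regularized squared hinge loss for binary labels in {-1, +1}. It is an assumption-laden toy, not lightning's Cython implementation: the function names, the fixed step size `eta` and the tiny dataset are all made up for illustration.

```python
import random


def sgd_squared_hinge(X, y, alpha=1e-3, eta=0.05, n_epochs=50, seed=0):
    """Toy SGD on mean squared hinge loss plus (alpha / 2) * ||w||^2.

    X is a list of dense rows, y a list of labels in {-1, +1}.
    Hypothetical helper for illustration; not part of lightning.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [0.0] * n_features
    order = list(range(len(X)))
    for _ in range(n_epochs):
        rng.shuffle(order)  # visit the samples in a new random order
        for i in order:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            slack = max(0.0, 1.0 - margin)  # zero once the margin reaches 1
            for j in range(n_features):
                # Gradient of slack^2 plus the L2 term for this sample.
                grad_j = alpha * w[j] - 2.0 * slack * y[i] * X[i][j]
                w[j] -= eta * grad_j
    return w


def predict(w, X):
    """Return the sign of the linear score for each row."""
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1
            for row in X]


# Tiny linearly separable dataset (made up for this sketch).
X_toy = [[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]]
y_toy = [1, 1, -1, -1]

w = sgd_squared_hinge(X_toy, y_toy)
predictions = predict(w, X_toy)
```

The other solvers in the list differ mainly in how they pick and scale each update (per-coordinate steps, adaptive step sizes, variance reduction, or proximal steps), not in the kind of model being fitted.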

Example:

    from sklearn.datasets import fetch_20newsgroups_vectorized
    from lightning.classification import CDClassifier
    
    # Load News20 dataset from scikit-learn.
    bunch = fetch_20newsgroups_vectorized(subset="all")
    X = bunch.data
    y = bunch.target
    
    # Set classifier options.
    clf = CDClassifier(penalty="l1/l2",
                       loss="squared_hinge",
                       multiclass=True,
                       max_iter=20,
                       alpha=1e-4,
                       C=1.0 / X.shape[0],
                       tol=1e-3)
    
    # Train the model.
    clf.fit(X, y)
    
    # Accuracy on the training set
    print(clf.score(X, y))
    
    # Percentage of selected features
    print(clf.n_nonzero(percentage=True))
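The "selected features" figure is easiest to read with the `l1/l2` (group lasso) penalty in mind: it tends to zero out a feature's weights for all classes at once, so a feature counts as selected when at least one class still gives it a nonzero weight. The sketch below illustrates that counting on a made-up coefficient matrix. The function `selected_feature_fraction` is hypothetical and only assumed to mirror the idea behind the `n_nonzero` call; lightning's actual implementation may differ.

```python
def selected_feature_fraction(coef):
    """Fraction of feature columns with a nonzero weight in any class.

    coef is a list of per-class weight rows, shape (n_classes, n_features).
    Hypothetical helper for illustration; not part of lightning.
    """
    n_features = len(coef[0])
    selected = sum(
        1 for j in range(n_features)
        if any(row[j] != 0.0 for row in coef)
    )
    return selected / n_features


# Made-up weights for 3 classes and 4 features: columns 0 and 2 are all zero.
coef = [
    [0.0, 0.8, 0.0, -0.2],
    [0.0, -0.1, 0.0, 0.5],
    [0.0, 0.3, 0.0, 0.0],
]

fraction = selected_feature_fraction(coef)
```

A smaller fraction means the model relies on fewer input features, which is usually the reason for choosing a sparsity-inducing penalty in the first place.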

Project homepage: http://www.baiduhome.net/lib/view/home/1421574088171
