Software
Open-source libraries
- Metric-learn [GitHub] [PyPI] [Documentation] [Companion paper]
Python package implementing various metric learning algorithms, including Large Margin Nearest Neighbor (LMNN), Neighborhood Components Analysis (NCA), Information-Theoretic Metric Learning (ITML), Relative Component Analysis (RCA) and Mahalanobis Metric for Clustering (MMC). As part of scikit-learn-contrib, metric-learn has an API compatible with scikit-learn, a prominent library for machine learning in Python. This makes it possible to use all the scikit-learn routines (for pipelining, model selection, etc.) with metric learning algorithms through a unified interface. The package is distributed under the MIT license.
Open-source code from papers
- Distributed Frank Wolfe for Low-Rank Learning [GitHub]
Python / PySpark implementation of the distributed optimization algorithm proposed in our paper A Distributed Frank-Wolfe Framework for Learning Low-Rank Matrices with the Trace Norm (Machine Learning 2018). The code can run on Spark clusters and allows users to define their own objective functions.
- GoSta [GitHub]
Basic Python implementation of the gossip algorithm for computing U-statistics proposed in our paper Extending Gossip Algorithms to Distributed Estimation of U-statistics (NIPS 2015). A few other gossip averaging algorithms are also implemented. Distributed under the Apache 2.0 license.
- HDSL [GitHub]
Matlab/MEX implementation of the similarity learning method proposed in our papers Similarity Learning for High-Dimensional Sparse Data (AISTATS 2015) and Escaping the Curse of Dimensionality in Similarity Learning: Efficient Frank-Wolfe Algorithm and Generalization Bounds (Neurocomputing 2019). It allows scalable learning of sparse bilinear similarity functions on high-dimensional data. Distributed under the GNU/GPL 3 license.
- Distributed Frank Wolfe for Sparse Learning [mloss]
C++/MPI implementation of the distributed optimization algorithm proposed in our paper A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning (SDM 2015). In its current version, it can tackle two problems: kernel SVM with distributed training examples and LASSO regression with distributed attributes. Distributed under the GNU/GPL 3 license.
- SCML [GitHub] [mloss]
Matlab/MEX implementation of the metric learning method proposed in our paper Sparse Compositional Metric Learning (AAAI 2014). It allows scalable learning of global, multi-task and multiple local Mahalanobis metrics for multi-class data under a unified framework based on sparse combinations of rank-one basis metrics. Distributed under the GNU/GPL 3 license.
- GESL [mloss]
C library implementing the string edit similarity learning method proposed in our papers Learning Good Edit Similarities with Generalization Guarantees (ECML/PKDD 2011) and Good edit similarity learning by loss minimization (MLJ 2012). It can handle multi-class problems and also includes a classification library for learning a sparse classifier from an edit similarity. The archive includes the datasets used in the papers. Distributed under the GNU/GPL 3 license.
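To give a rough idea of what an edit-based string comparison parameterized by costs looks like, here is a minimal Python sketch of the classic dynamic-programming weighted edit distance. It is not the GESL method and not a port of the C library: GESL learns the edit costs from data, whereas here the cost table `sub_cost`, the function name, and the default costs are all hypothetical values fixed by hand for illustration.

```python
def weighted_edit_distance(a, b, sub_cost, indel_cost=1.0):
    """Cost of the cheapest edit sequence turning string a into string b.

    sub_cost maps (char_in_a, char_in_b) pairs to substitution costs
    (hypothetical, hand-set values); unlisted substitutions cost 1.0.
    """
    # dist[i][j] holds the cost of turning a[:i] into b[:j].
    dist = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dist[i][0] = i * indel_cost  # delete every character of a[:i]
    for j in range(1, len(b) + 1):
        dist[0][j] = j * indel_cost  # insert every character of b[:j]

    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                substitution = 0.0  # matching characters cost nothing
            else:
                substitution = sub_cost.get((a[i - 1], b[j - 1]), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,        # deletion
                dist[i][j - 1] + indel_cost,        # insertion
                dist[i - 1][j - 1] + substitution,  # substitution or match
            )
    return dist[len(a)][len(b)]


# Hypothetical cost table: replacing "a" with "e" is made cheap.
costs = {("a", "e"): 0.2}
cheap = weighted_edit_distance("cat", "cet", costs)
default = weighted_edit_distance("cat", "cet", {})
print(cheap, default)
```

Lowering the cost of one substitution makes strings that differ only by that substitution closer, which is the kind of degree of freedom a learned cost table exploits.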