| | |
| | |
Stat |
Members: 3645 Articles: 2'504'928 Articles rated: 2609
25 April 2024 |
|
| | | |
|
Article overview
| |
|
Large Scale Distributed Distance Metric Learning | Pengtao Xie
; Eric Xing
; | Date: |
18 Dec 2014 | Abstract: | In large scale machine learning and data mining problems with high feature
dimensionality, the Euclidean distance between data points can be
uninformative, and Distance Metric Learning (DML) is often desired to learn a
proper similarity measure (using side information such as example data pairs
being similar or dissimilar). However, high dimensionality and large volume of
pairwise constraints in modern big data can lead to prohibitive computational
cost for both the original DML formulation in Xing et al. (2002) and later
extensions. In this paper, we present a distributed algorithm for DML, and a
large-scale implementation on a parameter server architecture. Our approach
builds on a parallelizable reformulation of Xing et al. (2002), and an
asynchronous stochastic gradient descent optimization procedure. To our
knowledge, this is the first distributed solution to DML, and we show that, on
a system with 256 CPU cores, our program is able to complete a DML task on a
dataset with 1 million data points, 22-thousand features, and 200 million
labeled data pairs, in 15 hours; and the learned metric shows great
effectiveness in properly measuring distances. | Source: | arXiv, 1412.5949 | Services: | Forum | Review | PDF | Favorites |
|
|
No review found.
Did you like this article?
Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.
browser Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
|
| |
|
|
|
| News, job offers and information for researchers and scientists:
| |