Article overview
Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning
Authors: Mario Lucic; Mesrob I. Ohannessian; Amin Karbasi; Andreas Krause
Date: 2 May 2016
Abstract: Faced with massive data, is it possible to trade off (statistical) risk, and
(computational) space and time? This challenge lies at the heart of large-scale
machine learning. Using k-means clustering as a prototypical unsupervised
learning problem, we show how we can strategically summarize the data (control
space) in order to trade off risk and time when data is generated by a
probabilistic model. Our summarization is based on coreset constructions from
computational geometry. We also develop an algorithm, TRAM, to navigate the
space/time/data/risk tradeoff in practice. In particular, we show that for a
fixed risk (or data size), as the data size increases (resp. risk increases)
the running time of TRAM decreases. Our extensive experiments on real data sets
demonstrate the existence and practical utility of such tradeoffs, not only for
k-means but also for Gaussian Mixture Models.
Source: arXiv, 1605.0529
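
To illustrate what "summarizing the data" can look like for k-means, here is a minimal sketch of a weighted summary built by importance sampling. It is a generic scheme and an assumption for illustration only: it is not the coreset construction from the paper and it is not TRAM. The function names `sample_summary` and `kmeans_cost`, the 50/50 sampling mixture, and the synthetic data are all hypothetical choices.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_summary(points, m, seed=0):
    """Draw m weighted points by importance sampling (illustrative scheme only).

    Each point is sampled with probability q that mixes a uniform term with a
    term proportional to its squared distance from the data mean, and gets
    weight 1 / (m * q) so that weighted costs are unbiased estimates.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    d2 = [sq_dist(p, mean) for p in points]
    total = sum(d2)
    # Fall back to purely uniform sampling when all points coincide.
    q = [0.5 / n + (0.5 * d / total if total > 0 else 0.5 / n) for d in d2]
    idx = rng.choices(range(n), weights=q, k=m)
    return [(points[i], 1.0 / (m * q[i])) for i in idx]


def kmeans_cost(weighted_points, centers):
    """Weighted k-means cost: sum of w * squared distance to the nearest center."""
    return sum(w * min(sq_dist(p, c) for c in centers) for p, w in weighted_points)


# Synthetic two-cluster data (assumed for the demo, not from the paper).
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
data += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(1000)]
centers = [(0.0, 0.0), (5.0, 5.0)]

full_cost = kmeans_cost([(p, 1.0) for p in data], centers)
summary_cost = kmeans_cost(sample_summary(data, 200), centers)
print(full_cost, summary_cost)
```

In this sketch the summary size `m` is the knob: a smaller `m` makes every cost evaluation cheaper (less space and time) while making the cost estimate noisier (more risk), which is the kind of tradeoff the abstract describes.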