Low-Rank Approximation and Regression in Input Sparsity Time

Clarkson, Kenneth L ; Woodruff, David P

Journal of the ACM, 2017-02, Vol.63 (6), p.1-45 [Peer-reviewed journal]

New York: Association for Computing Machinery

Full text available

  • Title:
    Low-Rank Approximation and Regression in Input Sparsity Time
  • Author: Clarkson, Kenneth L ; Woodruff, David P
  • Subjects: Algorithms ; Approximation ; Indexing ; Mathematical analysis ; Polynomials ; Principal components analysis ; Probability ; Regression ; Running in ; Semantics ; Studies ; Subspaces
  • Is part of: Journal of the ACM, 2017-02, Vol.63 (6), p.1-45
  • Description: We design a new distribution over m × n matrices S so that, for any fixed n × d matrix A of rank r, with probability at least 9/10, ∥SAx∥₂ = (1 ± ε)∥Ax∥₂ simultaneously for all x ∈ ℝ^d. Here, m is bounded by a polynomial in rε⁻¹, and the parameter ε ∈ (0, 1]. Such a matrix S is called a subspace embedding. Furthermore, SA can be computed in O(nnz(A)) time, where nnz(A) is the number of nonzero entries of A. This improves over all previous subspace embeddings, for which computing SA required at least Ω(nd log d) time. We call these S sparse embedding matrices. Using our sparse embedding matrices, we obtain the fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and ℓp regression. More specifically, let b be an n × 1 vector, ε > 0 a small enough value, and integers k, p ⩾ 1. Our results include the following. — Regression: The regression problem is to find a d × 1 vector x′ for which ∥Ax′ − b∥p ⩽ (1 + ε) min_x ∥Ax − b∥p. For the Euclidean case p = 2, we obtain an algorithm running in O(nnz(A)) + Õ(d³ε⁻²) time, and another in O(nnz(A) log(1/ε)) + Õ(d³ log(1/ε)) time. (Here, Õ(f) = f · log^{O(1)}(f).) More generally, for p ∈ [1, ∞), we obtain an algorithm running in O(nnz(A) log n) + O(rε⁻¹)^C time, for a fixed C. — Low-rank approximation: We give an algorithm to obtain a rank-k matrix Âk such that ∥A − Âk∥F ⩽ (1 + ε)∥A − Ak∥F, where Ak is the best rank-k approximation to A. (That is, Ak is the output of principal components analysis, produced by a truncated singular value decomposition, useful for latent semantic indexing and many other statistical problems.) Our algorithm runs in O(nnz(A)) + Õ(nk²ε⁻⁴ + k³ε⁻⁵) time. — Leverage scores: We give an algorithm to estimate the leverage scores of A, up to a constant factor, in O(nnz(A) log n) + Õ(r³) time.
  • Publisher: New York: Association for Computing Machinery
  • Language: English
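
The abstract's sparse embedding matrix S has a simple shape: each of its n columns holds a single ±1 entry in a uniformly random row, so SA can be formed in one pass over the nonzeros of A. The sketch below is an illustrative CountSketch-style construction under that description, not the authors' exact code; the function name `sparse_embed` and its parameters are assumptions for the example.

```python
import numpy as np

def sparse_embed(A, m, seed=None):
    """Apply an m x n sparse embedding S to A (n x d) in O(nnz(A)) time.

    Each input row i is hashed to one output row h(i) and multiplied by a
    random sign s(i), mirroring the 'single ±1 per column of S' structure
    described in the abstract. Illustrative sketch, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    rows = rng.integers(0, m, size=n)        # h(i): target row for input row i
    signs = rng.choice([-1.0, 1.0], size=n)  # s(i): random sign
    SA = np.zeros((m, d))
    for i in range(n):                       # one pass over the rows of A
        SA[rows[i]] += signs[i] * A[i]
    return SA
```

Because S is linear and determined only by the random hashes and signs, re-running with the same seed on a scaled input scales the sketch accordingly; taking m to be a suitable polynomial in rε⁻¹ gives the (1 ± ε) norm preservation claimed in the abstract with probability at least 9/10.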
