# JMLR Workshop and Conference Proceedings

## Volume 37: Proceedings of The 32nd International Conference on Machine Learning

**Editors:** Francis Bach, David Blei

### Accepted Papers

Attribute Efficient Linear Regression with Distribution-Dependent Sampling

Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

Accelerated Online Low Rank Tensor Learning for Multivariate Spatiotemporal Streams

A Modified Orthant-Wise Limited Memory Quasi-Newton Method with Convergence Analysis

Generalization error bounds for learning to rank: Does the length of document lists matter?

PeakSeg: constrained optimal segmentation and supervised penalty learning for peak detection in count data

Paired-Dual Learning for Fast Training of Latent Variable Hinge-Loss MRFs

A Provable Generalized Tensor Spectral Method for Uniform Hypergraph Partitioning

Budget Allocation Problem with Multiple Advertisers: A Game Theoretic View

Tracking Approximate Solutions of Parameterized Optimization Problems over Multi-Dimensional (Hyper-)Parameter Domains

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds

The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling

Ordered Stick-Breaking Prior for Sequential MCMC Inference of Bayesian Nonparametric Models

A Unifying Framework of Anytime Sparse Gaussian Process Regression Models with Stochastic Variational Inference for Big Data

Online Tracking by Learning Discriminative Saliency Map with Convolutional Neural Network

Fast Kronecker Inference in Gaussian Processes with non-Gaussian Likelihoods

Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares

On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence

Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data

Improving the Gaussian Process Sparse Spectrum Approximation by Representing Uncertainty in Frequency Inputs

Ranking from Stochastic Pairwise Preferences: Recovering Condorcet Winners and Tournament Solution Sets at the Top

Log-Euclidean Metric Learning on Symmetric Positive Definite Manifold with Application to Image Set Classification

Multi-view Sparse Co-clustering via Proximal Alternating Linearized Minimization

Latent Topic Networks: A Versatile Probabilistic Programming Framework for Topic Models

Random Coordinate Descent Methods for Minimizing Decomposable Submodular Functions

DP-space: Bayesian Nonparametric Subspace Clustering with Small-variance Asymptotics

HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades

Large-scale log-determinant computation through stochastic Chebyshev expansions

Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)

Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Safe Subspace Screening for Nuclear Norm Regularized Least Squares Problems

Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays

Non-Linear Cross-Domain Collaborative Filtering via Hyper-Structure Transfer

The Power of Randomization: Distributed Submodular Maximization on Massive Datasets

Non-Gaussian Discriminative Factor Models via the Max-Margin Rank-Likelihood

Convergence rate of Bayesian tensor estimator and its minimax optimality

On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments

Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal Likelihood

Double Nyström Method: An Efficient and Accurate Nyström Scheme for Large-Scale Data Sets

A Deterministic Analysis of Noisy Sparse Subspace Clustering for Dimensionality-reduced Data

\(\ell_{1,p}\)-Norm Regularization: Error Bounds and Convergence Rate Analysis of First-Order Methods

Entropy evaluation based on confidence intervals of frequency estimates : Application to the learning of decision trees

An Empirical Study of Stochastic Variational Inference Algorithms for the Beta Bernoulli Process

Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection

Predictive Entropy Search for Bayesian Optimization with Unknown Constraints

Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP)

Robust Estimation of Transition Matrices in High Dimensional Heavy-tailed Vector Autoregressive Processes

Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks

Preference Completion: Large-scale Collaborative Ranking from Pairwise Comparisons

Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

Unsupervised Riemannian Metric Learning for Histograms Using Aitchison Transformations

Algorithms for the Hard Pre-Image Problem of String Kernels and the General Problem of String Prediction

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

A Fast Variational Approach for Learning Markov Random Field Language Models

Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes

Intersecting Faces: Non-negative Matrix Factorization With New Guarantees

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret

Optimizing Neural Networks with Kronecker-factored Approximate Curvature

A Convex Exemplar-based Approach to MAD-Bayes Dirichlet Process Mixture Models

Multi-instance multi-label learning in the presence of novel class instances

An Asynchronous Distributed Proximal Gradient Method for Composite Convex Optimization

Boosted Categorical Restricted Boltzmann Machine for Computational Prediction of Splice Junctions

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

A trust-region method for stochastic variational inference with applications to streaming data

Inference in a Partially Observed Queuing Model with Applications in Ecology

On the Optimality of Multi-Label Classification under Subset Zero-One Loss for Distributions Satisfying the Composition Property

Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization
