A Linear Tempering Paradigm for Hidden Markov Models – Nonstationary inference has proved highly successful in many tasks, such as data mining and classification; sparse inference, however, remains an inflexible problem. In this work, we approach the problem from the sparsity perspective. We argue that sparse inference is an important problem in data science, because a sparse solution is more flexible. Specifically, we formulate the problem as a linear domain over nonlinear terms, and propose a formulation that avoids the need for regularization. We prove a lower bound on the solution, and give an algorithm that requires no regularization, thereby establishing the existence of a sparse solution. We further present a regularization-free algorithm for sparse inference and show that it handles the nonlinearity of the problem. Finally, we give a sparse-inference algorithm that is both efficient and applicable to many general models.
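The abstract's central claim is sparse inference without a regularization penalty. Purely as an illustration of that idea (this is not the paper's algorithm), one standard regularization-free route is greedy support selection in the style of orthogonal matching pursuit: sparsity is enforced by construction, by limiting the support size, rather than by an $\ell_1$ penalty term.

```python
import numpy as np

def omp(A, y, k):
    """Greedy sparse recovery: select at most k columns of A.
    Sparsity comes from the hard support-size limit, not a penalty."""
    m, n = A.shape
    residual = y.copy()
    support = []
    x = np.zeros(n)
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares refit restricted to the selected support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        residual = y - A @ x
    return x
```

On a noiseless overdetermined system with a genuinely sparse ground truth, this typically recovers the support exactly; the point of the sketch is only that the sparsity constraint replaces the regularizer.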

In this paper, we propose a new model for learning density-dependent linear networks in which the input density is estimated from the data. The model proceeds in two steps: first, we infer the linear unit as a function of the density dependence, and then the model generates a parameter $\alpha$. Our algorithm combines a set of two-step networks; once the model is trained, we obtain a first model that maps the input density to $\alpha$. We show that the algorithm can learn low-dimensional networks from data. We illustrate the network on several examples and compare it with existing deep learning approaches. We also apply the algorithm to the MNIST dataset and to a clustering task with a non-differentiable deep network; on both, it outperforms the state of the art by a large margin.
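The two-step recipe (estimate the input density from data, then fit a linear unit mapping density features to a scalar $\alpha$) can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's method: step one is assumed to be a Gaussian kernel density estimate, step two an ordinary least-squares linear unit, and both function names are hypothetical.

```python
import numpy as np

def gaussian_kde_density(x, data, bandwidth=0.5):
    """Step 1 (assumed): estimate the input density at points x
    from samples `data` with a Gaussian kernel density estimate."""
    diffs = x[:, None] - data[None, :]
    kernels = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    return kernels.mean(axis=1) / (bandwidth * np.sqrt(2.0 * np.pi))

def fit_alpha(density_features, targets):
    """Step 2 (assumed): a linear unit mapping density features to a
    scalar alpha via least squares; returns (slope, intercept)."""
    X = np.column_stack([density_features, np.ones_like(density_features)])
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return w
```

The density estimate plays the role of the "input density estimated from data", and the linear unit is the map from that density to $\alpha$; any richer network could replace either step.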

A Unified View of Deep Learning


Online Density-Regularized Bayesian Network Models