Generative vs. Discriminative Models

Published by Daniel on November 1, 2021

The pattern theory developed by Grenander in the 1970s is a unified mathematical framework for representing, learning, and recognizing patterns encountered in science and engineering. The objects in this theory tend to be rich in complexity and dimensionality, and their patterns are characterized by algebraic structures and probability distributions. Different types of models therefore exist to handle tasks involving these patterns, approaching them from different perspectives and producing different kinds of outputs. For simplicity, models can be grouped into two big families: generative and discriminative.(1)

Generative Models

Generative models are called “generative” because sampling from them can generate synthetic data points. They build a probability density model over all the variables in a system and use it to derive classification and regression functions. In these models, a system’s input and output are represented homogeneously by a joint probability distribution; that is, they define a distribution over all variables.(2)
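A minimal sketch of this idea, assuming a one-dimensional input and one Gaussian per class (the toy data and the helper names `fit_generative`, `joint`, `classify`, and `sample` are illustrative and not taken from the cited sources). The model stores p(y) and p(x | y), which together define the joint p(x, y); the same joint is then used both to classify and to draw synthetic points.

```python
import math
import random
import statistics

def fit_generative(xs, ys):
    """Estimate p(y) and a 1-D Gaussian p(x | y) for each class label."""
    model = {}
    for label in sorted(set(ys)):
        points = [x for x, y in zip(xs, ys) if y == label]
        model[label] = {
            "prior": len(points) / len(xs),      # p(y)
            "mean": statistics.fmean(points),    # parameters of p(x | y)
            "std": statistics.pstdev(points),
        }
    return model

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def joint(model, x, label):
    """Joint density p(x, y) = p(y) * p(x | y)."""
    params = model[label]
    return params["prior"] * gaussian_pdf(x, params["mean"], params["std"])

def classify(model, x):
    """Pick the label with the largest joint density; p(x) is the same for all labels."""
    return max(model, key=lambda label: joint(model, x, label))

def sample(model, rng):
    """Draw a synthetic (x, y) pair: first y from p(y), then x from p(x | y)."""
    labels = list(model)
    weights = [model[label]["prior"] for label in labels]
    label = rng.choices(labels, weights=weights)[0]
    x = rng.gauss(model[label]["mean"], model[label]["std"])
    return x, label

# Hypothetical toy data: two well-separated classes.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_generative(xs, ys)
rng = random.Random(0)
synthetic = [sample(model, rng) for _ in range(5)]
```

Because the model covers every variable, `sample` needs nothing beyond the fitted parameters; a model that only maps inputs to labels offers no comparable operation.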

They can be trained in a semi-supervised manner, and also in an unsupervised manner, where the dataset contains only input signals and the corresponding outputs are unlabeled.(1)

These models can be used for compression, denoising, inpainting, and texture synthesis; they have also proven helpful in medical diagnosis, genomics, and bioinformatics. Because the applications are so varied, generative models are usually formulated, trained, and evaluated in different ways.(3)

Some popular models are Gaussians, Naive Bayes, Mixture of Multinomials, Hidden Markov Models, Sigmoidal Belief Networks, and Bayesian Networks.(1)

Discriminative Models

Discriminative models estimate posterior probabilities directly, without attempting to model the underlying probability distributions. They concentrate on the task at hand to obtain better performance: they are primarily interested in optimizing a mapping from inputs to outputs, and only the resulting classification boundaries are adjusted.

Here, what matters is the final mapping from an input (x) to an output (y), and only the final estimate is considered. Even the estimation of a conditional distribution is sometimes viewed as unnecessary. In parsing, for example, these models use only the conditional probability of a candidate analysis given the input sentence. As a result, the joint probability can no longer be derived.
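The relationship between the joint and the conditional can be written out explicitly. The notation here (p for a probability, and a sum over candidate outputs y') is a common convention assumed for illustration and is not taken from the cited sources.

```latex
\begin{align}
p(x, y) &= p(y \mid x)\, p(x) \\
p(y \mid x) &= \frac{p(x, y)}{\sum_{y'} p(x, y')}
\end{align}
```

A model of the joint p(x, y) yields the conditional through the second line. A model of only p(y | x) has no estimate of p(x), so the first line cannot be evaluated and the joint stays out of reach.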

The main advantage of discriminative models is that they provide more freedom to define features and to incorporate arbitrary features over the input.(4)

The downside is that discriminative models typically require numerical optimization techniques that can be computationally demanding, and as the model grows more complex, the parsing problem gets harder.(4)

Discriminative models have been successful in a variety of tasks like image and document classification, problems in biosequence analysis, and time sequence prediction. Popular models are Logistic Regression, Support Vector Machines, Traditional Neural Networks, Nearest Neighbor, and Conditional Random Fields.(2)
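As a small illustration of the discriminative approach, the sketch below fits a one-dimensional logistic regression by plain gradient descent. The toy data, learning rate, epoch count, and the helper names `fit_logistic` and `posterior` are assumptions chosen for the example. The model learns p(y = 1 | x) directly and never estimates how the inputs themselves are distributed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the mean log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def posterior(w, b, x):
    """Estimated posterior p(y = 1 | x); no model of p(x) is involved."""
    return sigmoid(w * x + b)

# Hypothetical toy data: two well-separated classes.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
```

The only learned quantities are `w` and `b`, which fix the decision boundary at `-b / w`. The iterative fitting loop is also a small instance of the numerical optimization cost mentioned above.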

To better illustrate the difference between generative and discriminative models, consider the task of determining the language that someone speaks: the generative approach is to learn each language and then determine which language the speech belongs to, while the discriminative approach is to learn the linguistic differences between languages without learning any language at all.(2)

References