Wiki
We have compiled hundreds of related entries to help you understand "artificial intelligence"
The cumulative error back-propagation algorithm (ABP algorithm) is a variant of the standard back-propagation (BP) algorithm. Whereas standard BP updates the parameters after each individual training example, deriving the update rule from minimizing the cumulative error over the entire training set yields the cumulative error back-propagation algorithm.
The hinge loss function is named for its hinge-like shape. It is used mainly in support vector machines. The loss is zero only when a sample is not just correctly classified but classified with sufficient confidence (margin), so the hinge loss places a higher demand on the learner. Its formula is L(z) = max(0, 1 − z), where z = y(w · x + b).
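The margin behavior described above can be sketched in numpy; the labels and raw scores here are made-up illustrative values:

```python
import numpy as np

def hinge_loss(y, scores):
    """Hinge loss: zero only when the sample is on the correct side
    of the margin (y * score >= 1), linear penalty otherwise."""
    return np.maximum(0.0, 1.0 - y * scores)

y = np.array([1, 1, -1])            # true labels in {-1, +1}
scores = np.array([2.0, 0.5, 0.3])  # raw outputs w.x + b
losses = hinge_loss(y, scores)
# sample 1: correct and confident -> loss 0
# sample 2: correct but inside the margin -> loss 0.5
# sample 3: wrong side of the boundary -> loss 1.3
```

Note how a correctly classified point still incurs loss when its margin is below 1, which is exactly the "higher requirement" the entry refers to.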
Hybrid computing is an integrated form of computing that combines traditional hard computing with emerging soft computing, drawing on the strengths of each to overcome the other's limitations. The main characteristics of hard computing: a standard mathematical model of the problem is easy to establish; the resulting model is easy to solve and achieves a high degree of accuracy; it has good stability. Soft […]
The Gaussian kernel function is a commonly used kernel function that maps finite-dimensional data into a high-dimensional feature space. It is defined as follows: $latex k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right) $, where $latex \sigma $ is the bandwidth parameter.
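A minimal numpy sketch of this definition; the points and the bandwidth σ are arbitrary example values:

```python
import numpy as np

def gaussian_kernel(x, x2, sigma=1.0):
    # k(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))
    return np.exp(-np.sum((x - x2) ** 2) / (2.0 * sigma ** 2))

a = np.array([0.0, 0.0])
b = np.array([1.0, 0.0])

k_same = gaussian_kernel(a, a)  # identical points -> kernel value 1.0
k_near = gaussian_kernel(a, b)  # similarity decays with squared distance
```

The kernel value is always in (0, 1], peaking when the two points coincide.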
The Gaussian mixture model (GMM) is built from Gaussian probability density functions and can smoothly approximate a density distribution of almost any shape. Because a GMM combines multiple component models, its fine-grained partitioning makes it suitable for modeling complex objects. Suppose there is a batch of observed data X = […]
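To illustrate the "weighted sum of Gaussians" idea, here is a minimal numpy sketch of a one-dimensional mixture density; the component weights, means and standard deviations are made-up values:

```python
import numpy as np

def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture: sum_k w_k * N(x; mu_k, sigma_k)."""
    x = np.asarray(x, dtype=float)
    dens = np.zeros_like(x)
    for w, mu, s in zip(weights, means, stds):
        dens += w * np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    return dens

# a symmetric bimodal mixture: two equally weighted components
p = gmm_pdf([-2.0, 0.0, 2.0],
            weights=[0.5, 0.5], means=[-2.0, 2.0], stds=[1.0, 1.0])
# density peaks near the two component means and dips between them
```

With enough components, such mixtures can approximate densities of essentially arbitrary shape, which is the point the entry makes.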
The generalized linear model (GLM) is a flexible generalization of linear regression that allows the dependent variable to follow a distribution other than the normal distribution. Definition: the generalized linear model extends ordinary least-squares regression by assuming that each observation comes from an exponential-family distribution; the mean of the distribution […]
Inductive bias can be seen as the set of assumptions a machine-learning algorithm makes; it serves as the necessary assumptions about the objective function. The most typical example is Occam's razor. Inductive bias is grounded in mathematical logic, but in practical applications a learner's inductive bias may be only a very rough description, or even simpler. In comparison, the theoretical value […]
The kernel method is a class of pattern-recognition algorithms whose aim is to find and learn the relationships within a set of data. It is based on the following assumption: "point sets that cannot be linearly separated in a low-dimensional space may become linearly separable once mapped into a high-dimensional space". Basic understanding of the kernel method: patterns in raw data can be […]
The generalized Rayleigh quotient can be seen as an extension of the Rayleigh quotient; it refers to the function R(A, B, x): $latex R(A, B, x) = \frac{x^H A x}{x^H B x} $, where $latex A $ and $latex B $ are Hermitian matrices and $latex B $ is positive definite.
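A numpy sketch of the quotient for real symmetric matrices; the matrices and vector are made-up, and with B = I it reduces to the ordinary Rayleigh quotient:

```python
import numpy as np

def rayleigh_quotient(A, B, x):
    # R(A, B, x) = (x^T A x) / (x^T B x) for real symmetric A, positive-definite B
    return (x @ A @ x) / (x @ B @ x)

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
B = np.eye(2)               # B = I: ordinary Rayleigh quotient
x = np.array([1.0, 0.0])    # an eigenvector of A

r = rayleigh_quotient(A, B, x)  # picks out the corresponding eigenvalue of A
```

The extreme values of the quotient over x are the extreme generalized eigenvalues of the pair (A, B), which is why it appears in methods such as linear discriminant analysis.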
Inductive logic programming (ILP) is a symbolic rule-learning method that introduces function symbols and nested logical expressions into first-order rule learning, using first-order logic as its representation language. ILP gives machine-learning systems far more powerful expressive capabilities. It can also be seen as an application of machine learning, mainly used to solve […]
The kernel trick is a method of using a kernel function to compute $latex \langle\phi(x), \phi(z)\rangle $ directly, avoiding the separate computation of $latex \phi(x) $ and $latex \phi(z) $ and thereby speeding up kernel-method calculations […]
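To see why the trick works, compare the explicit route with the kernel route for the degree-2 homogeneous polynomial kernel in 2-D, whose feature map φ(x) = (x₁², √2·x₁x₂, x₂²) is known in closed form; the input vectors are made-up:

```python
import numpy as np

def phi(x):
    # explicit feature map for k(x, z) = (x . z)^2 in two dimensions
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = phi(x) @ phi(z)  # map to feature space, then take inner product
tricked = (x @ z) ** 2      # kernel trick: never form phi at all
# both routes give the same number
```

For high-degree kernels the explicit feature space grows combinatorially, while the kernel evaluation stays a single dot product followed by a power, which is the entire computational point of the trick.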
A recursive neural network is a representation-learning method that maps words, sentences, paragraphs and articles into the same vector space according to their semantics; that is, it can represent composable (tree- or graph-structured) information as meaningful vectors.
Negative correlation means that two variables change in opposite directions: when one variable changes, the other changes with the opposite trend.
A univariate decision tree is a decision tree that tests a single variable at each node: every time a node splits, only one feature from the feature set is selected. As a result, the classification boundary of such a tree is composed of segments parallel to the coordinate axes.
Negative log-likelihood is a loss function used for classification problems. It is the negative natural logarithm of the likelihood function and can be used to measure the discrepancy between two probability distributions. The negative sign makes the maximum of the likelihood correspond to the minimum of the loss. It is a common functional form in maximum-likelihood estimation and related fields. In machine learning, it is customary to use optimization […]
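A minimal numpy sketch; the predicted class probabilities and labels are made-up values:

```python
import numpy as np

def nll(probs, labels):
    """Negative log-likelihood: mean of -log p(correct class)."""
    probs = np.asarray(probs)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

# predicted probabilities for 2 samples over 3 classes
p = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1]]
loss = nll(p, labels=[0, 1])  # high probability on correct classes -> low loss
```

Maximizing the likelihood of the data is equivalent to minimizing this loss, which is why the negative sign is there.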
Non-convex optimization is an approach used in machine learning and signal processing. It refers to solving a non-convex problem directly, optimizing the non-convex formulation itself rather than resorting to a convex relaxation.
A nonlinear model is a mathematical expression in which the relationship between the independent and dependent variables is nonlinear. In contrast to a linear model, the dependent variable cannot be expressed as a linear function of the independent variables in the coordinate space.
Non-metric distance refers to a distance measure between objects that does not satisfy the triangle inequality.
Non-negative matrix factorization (NMF) is a matrix-decomposition method in which all elements of the factor matrices satisfy a non-negativity constraint.
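A minimal numpy sketch using the classic Lee–Seung multiplicative update rules, which preserve non-negativity by construction; the matrix size, rank and iteration count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((6, 5))        # non-negative data matrix to factorize
k = 2                         # target rank

W = rng.random((6, k)) + 0.1  # non-negative initial factors
H = rng.random((k, 5)) + 0.1

eps = 1e-9  # avoid division by zero
for _ in range(200):
    # multiplicative updates: ratios of non-negative terms keep W, H >= 0
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

approx = W @ H  # low-rank non-negative approximation of V
```

Because the factors stay non-negative, the decomposition admits a parts-based interpretation, which is NMF's main appeal over unconstrained factorizations.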
The norm is a basic function in mathematics, often used to measure the length or size of a vector in a vector space (or of a matrix). The norm of a model's parameters can serve as a regularization term.
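A short numpy illustration of common norms and their use as a regularization term; the weight vector, data-loss value and coefficient λ are made-up:

```python
import numpy as np

w = np.array([3.0, -4.0])  # model parameters

l1 = np.linalg.norm(w, 1)  # L1 norm: |3| + |-4| = 7
l2 = np.linalg.norm(w)     # L2 norm: sqrt(9 + 16) = 5

# using the norm as a regularizer: penalize large weights
data_loss = 0.2
lam = 0.01
total_loss = data_loss + lam * l2 ** 2  # L2 (ridge-style) penalty
```

An L1 penalty instead encourages sparse parameters, which is why the choice of norm matters for regularization.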
ODE (one-dependent estimator) is the most commonly used strategy for semi-naive Bayes classifiers. ODE assumes that each attribute depends on at most one other attribute besides the class attribute.
The polynomial kernel function is a kernel function expressed in polynomial form. It is a non-standard kernel function suited to orthonormalized data. A common form is $latex k(x, z) = (x^\top z + c)^d $.
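A minimal sketch of the common inhomogeneous form k(x, z) = (x · z + c)^d; the input vectors, degree and constant are arbitrary example values:

```python
import numpy as np

def poly_kernel(x, z, degree=3, coef0=1.0):
    # k(x, z) = (x . z + c)^d  -- inhomogeneous polynomial kernel
    return (x @ z + coef0) ** degree

x = np.array([1.0, 2.0])
z = np.array([0.5, 1.0])

k = poly_kernel(x, z)  # (0.5 + 2.0 + 1.0)^3 = 3.5^3
```

The degree d controls how many feature interactions the implicit feature space contains, and the constant c trades off the influence of lower-order versus higher-order terms.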
The principle of multiple interpretations is the idea that all hypotheses that are consistent with empirical observations should be retained.
Hyperplane separation means that if two disjoint convex sets are both open, then there exists a hyperplane that separates them.