Wiki
We have compiled hundreds of related entries to help you understand "artificial intelligence".
The holdout method is a model evaluation method that divides the dataset D into two mutually exclusive sets: a training set S and a test set T, such that D = S ∪ T and S ∩ T = ∅. The training/test split should keep the data distribution as consistent as possible. To avoid […]
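A minimal sketch of a holdout split, assuming scikit-learn is available (the iris dataset and the 70/30 ratio below are illustrative choices; `stratify=y` keeps the class distribution of S and T consistent with D):

```python
# Holdout split: D = S ∪ T, S ∩ T = ∅
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions of S and T consistent with D
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
print(len(X_train), len(X_test))  # 105 45
```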
Pruning is a method of cutting back or halting the growth of a decision tree's branches. It is the main strategy for addressing overfitting in decision trees.
Hypothesis testing is a method for testing statistical hypotheses, used mainly in inferential statistics. A "statistical hypothesis" is a scientific hypothesis that can be tested by observing a model of random variables. Provided the unknown parameters can be estimated, appropriate inferences about their values can be made from the results. In statistics, a parametric hypothesis is an assumption about one or more […]
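As an illustrative sketch, assuming SciPy is available, a one-sample t-test of the hypothesis H0: population mean = 0 (the synthetic data and the 5% significance level are assumptions for this example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.5, scale=1.0, size=30)  # synthetic sample

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value < alpha:
    print("Reject H0 at the 5% significance level")
else:
    print("Fail to reject H0")
```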
Ensemble learning is the idea of combining multiple models into a higher-accuracy model and is used mainly in machine learning. It is not a single machine learning algorithm; rather, it completes a learning task by building and combining multiple learners. Ensemble learning can be used for classification, regression, feature selection, outlier detection, and more. It can be said that all machine learning […]
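A minimal sketch of combining learners, assuming scikit-learn; the three base learners and the synthetic dataset are illustrative, and the ensemble here combines them by simple majority vote:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier()),
                ("nb", GaussianNB())],
    voting="hard",  # majority vote over the individual learners
)
ensemble.fit(X_tr, y_tr)
print("ensemble accuracy:", ensemble.score(X_te, y_te))
```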
The error-correcting output coding (ECOC) method converts a multi-class problem into multiple two-class problems, and the error-correcting output code itself has error-correction capability, which can improve the prediction accuracy of supervised learning algorithms. Output-category encoding reduces a multi-class problem to binary ones: each class corresponds to a binary string of length n, forming m codewords in total, which […]
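A sketch of the ECOC reduction, assuming scikit-learn's OutputCodeClassifier; the base learner and the `code_size` value are illustrative (`code_size` controls the codeword length relative to the number of classes):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)

# Each class is assigned a random binary codeword; one binary classifier
# is trained per bit, and prediction decodes to the nearest codeword.
ecoc = OutputCodeClassifier(
    estimator=LogisticRegression(max_iter=1000),
    code_size=2.0,  # codeword length = 2 * number of classes
    random_state=0,
)
ecoc.fit(X, y)
print("training accuracy:", ecoc.score(X, y))
```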
Empirical risk measures the model's ability to fit the training samples. It is obtained by computing the loss function once for every training sample and then accumulating and averaging. The loss function is the basis of expected risk, empirical risk, and structural risk: it is defined for a single specific sample and represents the gap between the model's predicted value and the true value. […]
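A minimal sketch of this definition, assuming NumPy; the squared loss below is one illustrative choice of per-sample loss:

```python
import numpy as np

def empirical_risk(y_true, y_pred, loss=lambda t, p: (t - p) ** 2):
    # compute the loss once per training sample, then accumulate and average
    return np.mean([loss(t, p) for t, p in zip(y_true, y_pred)])

y_true = np.array([1.0, 0.0, 2.0])
y_pred = np.array([0.9, 0.2, 1.5])
print(empirical_risk(y_true, y_pred))  # mean squared loss over the samples
```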
K-means clustering is a vector quantization method originally used in signal processing; today it is mainly used as a cluster analysis method in data mining. The goal of k-means clustering is to partition n points into k clusters so that each point belongs to the cluster with the nearest mean, using this as the clustering criterion. This type of problem […]
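A minimal NumPy sketch of Lloyd's algorithm, the standard iteration for k-means (random initialization; empty clusters are not handled, which a production implementation would need to address):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    for _ in range(n_iter):
        # assign each point to the cluster with the nearest mean
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # move each center to the mean of its assigned points
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centers, labels = kmeans(X, k=2)
print(centers)
```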
Margin theory is a concept from support vector machines, where the margin refers to the minimum distance from the separating hyperplane to the samples of the two classes. Margin theory can be used to explain why, even after the AdaBoost algorithm's training error reaches 0, continued training can further improve the model's generalization performance. Let x and y represent the input and […]
The perceptron is a binary linear classification model, invented by Frank Rosenblatt in 1957, that can be regarded as the simplest form of feedforward neural network. Its input is the feature vector of an instance and its output is the instance's class.
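A minimal sketch of the classic perceptron learning rule, assuming NumPy and labels in {-1, +1} (the toy dataset and learning rate are illustrative):

```python
import numpy as np

def perceptron(X, y, lr=1.0, epochs=10):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # update only on misclassified points: y * (w·x + b) <= 0
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

X = np.array([[2.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
w, b = perceptron(X, y)
print(w, b)  # a separating hyperplane w·x + b = 0, if the data are separable
```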
The International Conference on Neural Information Processing Systems (NIPS, renamed NeurIPS in 2018) is a top conference in the field of machine learning and neural computation, held by the NeurIPS Foundation every December.
Normalization maps data to a specified range in order to remove the units and scales of features measured in different dimensions, improving the comparability of different data indicators.
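A minimal sketch of min-max normalization, assuming NumPy; each feature (column) is mapped to [0, 1] (constant columns, which would divide by zero, are not handled here):

```python
import numpy as np

def min_max_normalize(X):
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)  # assumes no constant column

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
print(min_max_normalize(X))
# columns with very different units become directly comparable
```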
The proximal gradient method (PGD, for proximal gradient descent) is a special gradient descent method used mainly to solve optimization problems whose objective functions are not differentiable everywhere.
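As an illustrative sketch, one well-known instance is ISTA for the lasso problem min ||Ax − b||²/2 + λ||x||₁: the smooth term is handled by a gradient step and the non-differentiable L1 term by its proximal operator, which is soft-thresholding (the problem sizes and λ below are assumptions):

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam=0.1, n_iter=200):
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                          # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)   # proximal step
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
x_true = np.zeros(20)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
print(np.round(ista(A, b), 2))  # recovers a sparse solution
```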
Post-pruning refers to the pruning operation performed after the decision tree is generated.
A probabilistic graphical model is a probabilistic model that uses a graph structure to express the relationship between variables.
Regression is a supervised learning approach for modeling and predicting continuous numerical variables.
Rule learning learns from training data a set of IF-THEN rules composed of atomic propositions. It is a type of supervised learning and is usually applied to classification tasks.
The root node is the first node in a tree data structure. An ordinary node may have both a parent node and child nodes, but since the root node is the first node, it has only child nodes.
Particle Swarm Optimization (PSO) is an optimization algorithm based on swarm intelligence theory. The particles in the swarm carry out the optimization of the problem through iterative search.
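A minimal NumPy sketch of PSO minimizing f(x) = Σx², with standard velocity/position updates; the inertia weight w and acceleration coefficients c1, c2 are illustrative hyperparameter choices:

```python
import numpy as np

def pso(f, dim=2, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))  # particle positions
    v = np.zeros_like(x)                        # particle velocities
    pbest = x.copy()                            # each particle's best position so far
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_val.argmin()]           # swarm's best position so far
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        # pull each particle toward its own best and the swarm's best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()]
    return gbest

print(pso(lambda x: np.sum(x ** 2)))  # converges toward the origin
```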
The rule engine evolved from the inference engine and is a component embedded in the application. It separates business decisions from application code and writes business decisions using predefined semantic modules.
The nuclear norm is the sum of the singular values of a matrix; it is used as a convex surrogate for matrix rank to encourage low-rank solutions.
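A minimal sketch of the definition, assuming NumPy (the matrix is illustrative):

```python
import numpy as np

A = np.array([[3.0, 0.0], [0.0, 4.0], [0.0, 0.0]])

# nuclear norm = sum of the singular values
singular_values = np.linalg.svd(A, compute_uv=False)
print(singular_values.sum())     # 7.0
print(np.linalg.norm(A, "nuc"))  # same value via NumPy's built-in
```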
Association analysis finds frequent patterns, associations, correlations, or causal structures among items or object sets in transactional data, relational data, or other information carriers. A classic association analysis method is the Apriori algorithm, a basic algorithm for mining the frequent itemsets required to generate Boolean association rules. It makes […]
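A sketch of the Apriori idea in plain Python: level-wise frequent-itemset mining that relies on the downward-closure property (every subset of a frequent itemset is frequent). The transactions and the absolute support threshold are assumptions for this example, and the candidate-pruning step is omitted for brevity:

```python
from itertools import combinations

transactions = [{"milk", "bread"}, {"milk", "diaper", "beer"},
                {"bread", "diaper", "beer"}, {"milk", "bread", "diaper"}]
min_support = 2  # absolute support threshold (assumed)

def frequent_itemsets(transactions, min_support):
    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items]
    result = {}
    k = 1
    while level:
        # count the support of each candidate at this level
        counts = {c: sum(c <= t for t in transactions) for c in level}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # generate (k+1)-candidates by joining frequent k-itemsets
        keys = list(frequent)
        level = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return result

for itemset, support in frequent_itemsets(transactions, min_support).items():
    print(set(itemset), support)
```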
An individual learner is a relative concept: it is a learner before combination in ensemble learning. According to how the individual learners are generated, ensemble learning methods fall into two categories: methods with strong dependencies between learners, which must be generated serially (such as Boosting); and methods without strong dependencies, whose learners can be generated simultaneously in parallel (such as Bagging).
Induction is a reasoning process that generalizes a general principle from a series of specific facts; it refers to a way of thinking that abstracts general concepts, principles, or conclusions from many particular things. Induction can be divided into complete induction and incomplete induction. Complete induction covers all objects of the type in question and thus yields a conclusion about that type as a whole.
Inductive learning is a machine learning method usually used for symbolic learning. Given a series of known positive and negative examples of a concept, it summarizes a description of that concept. Inductive learning can acquire new concepts, create new rules, and discover new theories. Its basic operations are generalization and specialization, where generalization refers to the expansion of […]