From: A genetic algorithm based framework for software effort prediction

Learning algorithm (LA) | Description |
---|---|

GaussianProcesses (GP) | This algorithm implements the Bayesian Gaussian process technique for non-linear regression. This method is equivalent to kernel ridge regression. |
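To make the stated equivalence concrete, the following is a minimal pure-Python sketch of the Gaussian process posterior mean on a two-point toy problem; with a squared-exponential kernel and noise variance it is exactly the kernel ridge predictor k_*^T (K + σ²I)^{-1} y. The kernel, data, and noise level are illustrative assumptions, not WEKA's implementation.

```python
import math

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel (an assumed choice for the sketch).
    return math.exp(-(a - b) ** 2 / (2 * ell ** 2))

# Two training points and a noise variance (toy values).
X, y, noise = [0.0, 1.0], [0.0, 1.0], 0.1

# Entries of K + noise * I for the 2x2 case.
a = rbf(X[0], X[0]) + noise
b = rbf(X[0], X[1])
c = rbf(X[1], X[0])
d = rbf(X[1], X[1]) + noise

# Closed-form 2x2 inverse applied to y: alpha = (K + noise*I)^-1 y.
det = a * d - b * c
alpha = [(d * y[0] - b * y[1]) / det, (-c * y[0] + a * y[1]) / det]

def predict(x_star):
    # GP posterior mean = kernel ridge prediction: k_*^T alpha.
    return rbf(x_star, X[0]) * alpha[0] + rbf(x_star, X[1]) * alpha[1]
```

Because of the noise term the predictor smooths rather than interpolates exactly, which is the same regularization effect as the ridge penalty.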

LeastMedSq (LMS) | This is a robust linear regression algorithm that minimizes the median (rather than the mean) of the squared deviations from the regression line. It repeatedly applies standard linear regression to subsamples of the data and outputs the solution that has the smallest median-squared error. |
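The subsample-and-select loop can be sketched in a few lines of Python. This is a toy illustration under assumed data (a line y = 2x with planted outliers), not WEKA's LeastMedSq code:

```python
import random
import statistics

random.seed(0)

# Toy data: y = 2x with a few gross outliers that would wreck plain OLS.
xs = [float(i) for i in range(20)]
ys = [2 * x for x in xs]
ys[3], ys[7], ys[15] = 100.0, -50.0, 120.0   # planted outliers

def ols(px, py):
    # Standard least-squares line for a (sub)sample.
    mx, my = statistics.fmean(px), statistics.fmean(py)
    sxx = sum((x - mx) ** 2 for x in px)
    slope = sum((x - mx) * (y - my) for x, y in zip(px, py)) / sxx
    return slope, my - slope * mx

best, best_med = None, float("inf")
for _ in range(200):                       # repeated random subsamples
    idx = random.sample(range(len(xs)), 5)
    slope, icept = ols([xs[i] for i in idx], [ys[i] for i in idx])
    med = statistics.median((y - (slope * x + icept)) ** 2
                            for x, y in zip(xs, ys))
    if med < best_med:                     # keep smallest median-square error
        best, best_med = (slope, icept), med

slope, icept = best
```

A subsample that happens to contain no outliers fits the true line, and its median squared residual over the full data is near zero, so that fit wins the selection.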

LinearRegression (LR) | This algorithm performs standard least-squares multiple linear regression and can optionally perform attribute selection, either greedily using backward elimination or by building a full model from all attributes and dropping the terms one by one, in decreasing order of their standardized coefficients. |
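The backward-elimination variant can be illustrated with standardized coefficients computed from an ordinary least-squares fit. The data and helper names below are illustrative assumptions (one informative attribute, one pure-noise attribute), not WEKA's LinearRegression code:

```python
import random
import statistics

random.seed(1)

# Toy data: y depends on x1 only; x2 is pure noise (assumed setup).
n = 50
x1 = [random.uniform(0, 10) for _ in range(n)]
x2 = [random.uniform(0, 10) for _ in range(n)]
y = [3 * a + 1 + random.gauss(0, 0.1) for a in x1]

def solve(m, v):
    # Tiny Gauss-Jordan elimination for the normal equations.
    size = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(size):
        piv = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(size):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]

# Least-squares fit of y on [1, x1, x2] via the normal equations X^T X b = X^T y.
cols = [[1.0] * n, x1, x2]
xtx = [[sum(a * b for a, b in zip(c1, c2)) for c2 in cols] for c1 in cols]
xty = [sum(a * b for a, b in zip(c, y)) for c in cols]
beta = solve(xtx, xty)

# Standardized coefficients: |beta_j| * stdev(x_j) / stdev(y).
std_coefs = {name: abs(beta[j]) * statistics.stdev(cols[j]) / statistics.stdev(y)
             for j, name in ((1, "x1"), (2, "x2"))}
# Backward elimination drops the term with the smallest standardized coefficient first.
drop_first = min(std_coefs, key=std_coefs.get)
```

Standardizing puts attributes on a common scale, so the smallest standardized coefficient identifies the term contributing least to the model.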

MultilayerPerceptron (MP) | This is a neural network that trains using back-propagation. The network can be built by hand, created by an algorithm, or both, and it can be monitored and modified at training time. The nodes are all sigmoid, except when the class is numeric, in which case the output nodes become unthresholded linear units. |
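The sigmoid-hidden-layer, linear-output configuration for a numeric class can be sketched in pure Python. The architecture size, learning rate, and toy target function are illustrative assumptions, not WEKA's implementation:

```python
import math
import random

random.seed(2)

def sigmoid(z):
    z = max(-60.0, min(60.0, z))   # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

# Toy numeric-class data (assumed target function y = x^2).
data = [(x / 10.0, (x / 10.0) ** 2) for x in range(-10, 11)]

H = 4                                          # hidden sigmoid units
w1 = [[random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)] for _ in range(H)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(H + 1)]   # output weights + bias

def forward(x):
    h = [sigmoid(w[0] * x + w[1]) for w in w1]
    out = sum(wj * hj for wj, hj in zip(w2, h)) + w2[H]  # unthresholded linear output
    return h, out

def sse():
    return sum((forward(x)[1] - t) ** 2 for x, t in data)

loss_before = sse()
lr = 0.1
for _ in range(300):                           # epochs of on-line back-propagation
    for x, t in data:
        h, out = forward(x)
        err = out - t                          # gradient of 1/2 squared error w.r.t. out
        for j in range(H):
            grad = err * w2[j] * h[j] * (1 - h[j])   # chain rule through the sigmoid
            w1[j][0] -= lr * grad * x
            w1[j][1] -= lr * grad
            w2[j] -= lr * err * h[j]
        w2[H] -= lr * err
loss_after = sse()
```

Because the output unit is linear, the error signal propagates back unchanged through the output and is scaled by the sigmoid derivative h(1-h) at each hidden node.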

RBFNetwork (RBFN) | This algorithm implements a Gaussian radial basis function network, deriving the centers and widths of the hidden units using k-means and combining the outputs obtained from the hidden layer using logistic regression if the class is nominal and linear regression if it is numeric. |

SMOreg (SMO) | This method implements the sequential minimal-optimization algorithm for learning a support vector regression model. The parameters can be learned using various algorithms. The algorithm is selected by setting the RegOptimizer. |

AdditiveRegression (AR) | This algorithm enhances the performance of a regression base classifier. Each iteration fits a model to the residuals left by the classifier on the previous iteration. Prediction is accomplished by adding the predictions of all the classifiers. |
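The fit-to-residuals loop can be sketched with a simple regression stump as the base learner; the base learner choice, toy data, and iteration count are assumptions for illustration only:

```python
def fit_stump(xs, ys):
    # Base learner: one-split regression stump (xs assumed sorted, distinct).
    best = None
    for i in range(1, len(xs)):
        thr = (xs[i - 1] + xs[i]) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lmean) ** 2 for y in left)
               + sum((y - rmean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean

xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]

models = []
residuals = ys[:]
for _ in range(10):                 # each iteration fits the previous residuals
    m = fit_stump(xs, residuals)
    models.append(m)
    residuals = [r - m(x) for x, r in zip(xs, residuals)]

def predict(x):
    # Prediction = sum of the predictions of all fitted models.
    return sum(m(x) for m in models)

error = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))
```

Training error is non-increasing across iterations, because each new stump can always do at least as well as predicting zero on the current residuals.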

Bagging (BAG) | This is an algorithm that reduces variance. It can work for both classification and regression, depending on the base learner. In the case of classification, predictions are generated by averaging probability estimates, not by voting. |
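The bootstrap-then-average-probabilities scheme can be sketched for a two-class problem. The base learner, toy data, and ensemble size below are assumed for illustration; WEKA's Bagging wraps an arbitrary base classifier:

```python
import random
import statistics

random.seed(3)

# Toy 1-D two-class data (assumed): class 1 tends to have larger x.
data = [(random.gauss(0, 1), 0) for _ in range(30)] + \
       [(random.gauss(2, 1), 1) for _ in range(30)]

def fit_prob_stump(sample):
    # Base learner: threshold at the midpoint of the class means, with
    # leaf class-probability estimates (a deliberately simple choice).
    m0 = statistics.fmean(x for x, c in sample if c == 0)
    m1 = statistics.fmean(x for x, c in sample if c == 1)
    thr = (m0 + m1) / 2
    lo = [c for x, c in sample if x <= thr]
    hi = [c for x, c in sample if x > thr]
    p_lo = sum(lo) / len(lo) if lo else 0.5   # P(class 1) in each leaf
    p_hi = sum(hi) / len(hi) if hi else 0.5
    return lambda x: p_lo if x <= thr else p_hi

# Bagging: train each model on a bootstrap resample of the data.
models = [fit_prob_stump([random.choice(data) for _ in range(len(data))])
          for _ in range(25)]

def predict_proba(x):
    return statistics.fmean(m(x) for m in models)   # averaged, not voted

def predict(x):
    return 1 if predict_proba(x) >= 0.5 else 0
```

Averaging the probability estimates retains calibration information that a hard majority vote would discard.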

ConjunctiveRule (CR) | The algorithm learns a single rule that predicts either a numeric or a nominal class value. Uncovered test instances are assigned the default class value (or distribution) of the uncovered training instances. |

DecisionTable (DT) | This algorithm consists of a hierarchical table in which each entry in a higher level table gets broken down by the values of a pair of additional attributes to form another table. The structure is similar to dimensional stacking (Becker 1998). |

M5Rules (M5R) | This algorithm obtains regression rules from model trees built using M5. (More details in Quinlan 1992; Holmes et al. 1999; Wang and Witten 1997.) |

Ridor (Ripple Down Rule learner) | This method learns rules with exceptions by generating a default rule, using incremental reduced-error pruning to find the exceptions with the smallest error rate, finding the best exceptions for each exception, and iterating. |

ZeroR (ZR) | This algorithm predicts the test data’s majority class (if the class is nominal) or its average value (if it is numeric). It is the simplest classification method: it relies only on the target and ignores all predictors. |
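ZeroR is simple enough to state fully in code. This sketch (function names assumed) returns a constant predictor in both the nominal and the numeric case:

```python
import statistics
from collections import Counter

def zero_r(targets):
    # ZeroR: ignore all predictors and return a constant predictor.
    # Numeric targets -> mean; nominal targets -> majority class.
    numeric = all(isinstance(t, (int, float)) and not isinstance(t, bool)
                  for t in targets)
    const = statistics.fmean(targets) if numeric \
        else Counter(targets).most_common(1)[0][0]
    return lambda _instance: const   # the instance is ignored entirely

predict_nom = zero_r(["yes", "no", "yes", "yes"])
predict_num = zero_r([1.0, 2.0, 3.0])
```

Despite its triviality, ZeroR is the usual baseline: any useful learner must beat it.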

DecisionStump (DS) | This is an algorithm designed for use with boosting methods. It builds one-level binary decision trees for datasets with either a categorical or a numeric class, dealing with missing values by treating them as a separate value and extending a third branch from the stump. |
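The three-branch treatment of missing values can be sketched for a numeric class; the split threshold is taken as given to keep the example short, and all names and data are illustrative:

```python
from statistics import fmean

def fit_stump3(xs, ys, threshold):
    # One-level tree with a third branch for missing values (None).
    leaves = {"le": [], "gt": [], "missing": []}
    for x, y in zip(xs, ys):
        key = "missing" if x is None else ("le" if x <= threshold else "gt")
        leaves[key].append(y)
    # Each leaf predicts its mean; empty leaves fall back to the overall mean.
    means = {k: fmean(v) if v else fmean(ys) for k, v in leaves.items()}
    def predict(x):
        if x is None:
            return means["missing"]        # missing treated as its own value
        return means["le"] if x <= threshold else means["gt"]
    return predict

xs = [1.0, 2.0, None, 4.0, 5.0, None]
ys = [10.0, 10.0, 7.0, 2.0, 2.0, 7.0]
predict = fit_stump3(xs, ys, threshold=3.0)
```

Routing missing values to their own branch lets the stump learn a dedicated prediction for them instead of imputing.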

M5P (M5P) | This algorithm combines a conventional decision tree with the possibility of linear regression functions at the nodes. First, a decision-tree induction algorithm is used to build a tree, but instead of maximizing the information gain at each inner node, a splitting criterion is used that minimizes the intra-subset variation in the class values down each branch. |
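The splitting criterion that distinguishes M5P from a conventional decision tree can be sketched directly: instead of information gain, choose the threshold that minimizes the intra-subset variation in the class values. The data and names are illustrative assumptions:

```python
import statistics

def intra_subset_variation(ys_left, ys_right):
    # Sum of squared deviations from each branch's own mean.
    def ss(ys):
        m = statistics.fmean(ys)
        return sum((y - m) ** 2 for y in ys)
    return ss(ys_left) + ss(ys_right)

def best_split(xs, ys):
    # Pick the threshold minimizing intra-subset variation (M5-style criterion).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best_thr, best_score = None, float("inf")
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        thr = (xs[i] + xs[i - 1]) / 2
        score = intra_subset_variation(ys[:i], ys[i:])
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr

# Two clusters of class values; the criterion should split between them.
xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
ys = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
thr = best_split(xs, ys)
```

In full M5P this criterion drives tree growth, after which linear regression functions are fitted at the nodes.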

REPTree (RT) | This algorithm builds a decision or regression tree using information gain/variance reduction and prunes it using reduced-error pruning. Optimized for speed, it only sorts values for numeric attributes once. It deals with missing values by splitting instances into pieces. |