SSL methods
A semi-supervised classification method for the prognosis of ACLF was proposed in 2015 , where the authors constructed an imbalanced prediction model based on the small sphere and large margin (SSLM) approach, which classifies two classes of samples (improved patients, deceased patients) by maximizing the margin between them. SSLM was shown to outperform the One-Class SVM and Support Vector Data Description (SVDD) methods. The authors also experimented with a semi-supervised Twin SVM by adding unlabeled patients to the dataset.
Transductive graph-based semi-supervised learning methods usually build an undirected graph using both labeled and unlabeled samples as vertices. These methods propagate label information from labeled samples to their neighbors along the graph edges in order to predict the labels of unlabeled samples. Most popular semi-supervised learning approaches are sensitive to the initial label distribution, which is skewed in imbalanced labeled datasets: in imbalanced classification, the class boundary is severely skewed toward the majority classes. In , the authors propose a simple and effective approach to alleviate the unfavorable influence of the imbalance problem by iteratively selecting a few unlabeled samples and adding them to the minority classes, forming a balanced labeled dataset for the subsequent learning methods. Experiments on UCI datasets and the MNIST handwritten digits dataset showed that the proposed approach outperforms existing state-of-the-art methods.
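The two ingredients above, graph-based propagation and iterative balancing of the labeled set, can be sketched as follows. The RBF affinity graph, the propagation rule, and all function names are illustrative assumptions, not the cited authors' implementation:

```python
import numpy as np

def propagate_labels(X, y, alpha=0.9, sigma=1.0, iters=100):
    """Basic graph label propagation: y[i] == -1 marks an unlabeled sample.
    Builds an RBF affinity graph and iteratively spreads label mass."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    S = W / W.sum(1, keepdims=True)          # row-stochastic transition matrix
    classes = np.unique(y[y >= 0])
    Y = np.zeros((n, len(classes)))
    for k, c in enumerate(classes):
        Y[y == c, k] = 1.0
    F = Y.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y  # spread labels, softly clamp seeds
    return F                                 # soft score per class

def balance_by_selection(X, y, per_round=1, rounds=5):
    """Sketch of the balancing idea: repeatedly move the most confident
    unlabeled samples into the minority class's labeled set."""
    y = y.copy()
    for _ in range(rounds):
        counts = {c: (y == c).sum() for c in np.unique(y[y >= 0])}
        minority = min(counts, key=counts.get)
        if counts[minority] >= max(counts.values()):
            break                            # labeled set is balanced
        F = propagate_labels(X, y)
        k = list(np.unique(y[y >= 0])).index(minority)
        unl = np.where(y == -1)[0]
        if len(unl) == 0:
            break
        best = unl[np.argsort(-F[unl, k])[:per_round]]
        y[best] = minority                   # pseudo-label into the minority class
    return y
```

Each round re-propagates with the enlarged labeled set, so later selections benefit from the earlier pseudo-labels.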
An SSL method in  uses a transductive learning approach to build on a graph-based phase field model  that handles imbalanced class distributions. This method can encourage or penalize the membership of data points in different classes according to an explicit a priori model, which avoids biased classifications. Experiments conducted on real-world benchmarks support the superior performance of the model compared to several state-of-the-art semi-supervised learning algorithms.
Predicting splice sites in a genome using a semi-supervised learning approach  is a challenging problem due to the highly imbalanced distribution of the data, i.e., the small number of splice sites compared to the number of non-splice sites. To address this challenge, the authors propose ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Experiments on five highly imbalanced splice site datasets, with positive-to-negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches are a good choice even when labeled data makes up less than 1% of all training data. In particular, ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations improve over the corresponding supervised ensemble baselines.
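A minimal sketch of self-training with per-class balanced pseudo-labeling follows. The nearest-centroid base learner and all names are illustrative stand-ins, not the classifiers used in the cited ensembles:

```python
import numpy as np

class CentroidClassifier:
    """Toy nearest-centroid base learner (an assumption standing in for
    the paper's actual base classifiers)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(0) for c in self.classes_])
        return self
    def predict_proba(self, X):
        d = ((X[:, None, :] - self.centroids_[None]) ** 2).sum(-1)
        p = np.exp(-d)
        return p / p.sum(1, keepdims=True)

def balanced_self_training(Xl, yl, Xu, rounds=3, per_class=2):
    """Self-training that adds the same number of high-confidence
    pseudo-labels per class each round, so the labeled set stays balanced
    during the semi-supervised iterations (a sketch of the
    dynamic-balancing idea)."""
    Xl, yl, Xu = Xl.copy(), yl.copy(), Xu.copy()
    for _ in range(rounds):
        if len(Xu) == 0:
            break
        clf = CentroidClassifier().fit(Xl, yl)
        proba = clf.predict_proba(Xu)
        take_idx, take_lab = [], []
        for k, c in enumerate(clf.classes_):
            added = 0
            for i in np.argsort(-proba[:, k]):   # most confident first
                if i not in take_idx:
                    take_idx.append(int(i))
                    take_lab.append(c)
                    added += 1
                if added == per_class:
                    break
        Xl = np.vstack([Xl, Xu[take_idx]])
        yl = np.concatenate([yl, take_lab])
        Xu = np.delete(Xu, take_idx, axis=0)
    return CentroidClassifier().fit(Xl, yl)
```

Because both classes gain the same number of pseudo-labels per round, the minority class is never drowned out, even when it starts from a single labeled example.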
Semi-supervised learning from both labeled and unlabeled instances, in particular self-training with decision tree learners as base learners, is proposed in . The authors show that a standard decision tree algorithm is not effective as the base learner in a self-training algorithm. The main reason is that the basic decision tree learner does not produce reliable probability estimates for its predictions, and therefore cannot provide a proper selection criterion for self-training. They examined several modifications of the basic decision tree learner that produce better probability estimates than the raw class distributions at the leaves: Naive Bayes Tree, a combination of no-pruning and Laplace correction, grafting, and a distance-based measure. They show that these modifications do not improve performance when only the labeled data are used, but that they benefit more from the unlabeled data in self-training. They then extended this improvement to ensembles of decision trees and showed that the ensemble learner yields a further improvement over the adapted decision tree learners.
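The Laplace correction mentioned above is a standard smoothing of leaf frequencies; a minimal sketch (function name is illustrative) shows why it matters for confidence-based selection:

```python
def leaf_probability(counts, laplace=True):
    """Class-probability estimate at a decision-tree leaf from the class
    counts reaching it. The Laplace correction p(c) = (n_c + 1) / (n + K)
    smooths away the over-confident 0/1 estimates that raw leaf
    frequencies produce, which matters when self-training ranks
    predictions by confidence."""
    n, K = sum(counts), len(counts)
    if laplace:
        return [(c + 1) / (n + K) for c in counts]
    return [c / n for c in counts]

# A pure leaf holding only 3 samples: raw frequencies claim certainty,
# the Laplace-corrected estimate does not.
leaf_probability([3, 0], laplace=False)  # → [1.0, 0.0]
leaf_probability([3, 0])                 # → [0.8, 0.2]
```

Without the correction, every pure leaf reports probability 1.0 regardless of how few samples support it, so self-training cannot distinguish well-supported predictions from accidental ones.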
In , the authors describe the stochastic semi-supervised learning approach used in their submission to all six tasks of the 2009-2010 Active Learning Challenge. The method was designed to tackle binary classification when the number of labeled data points is extremely small and the two classes are highly imbalanced. It starts with only one positive seed given by the contest organizers. The authors randomly picked additional unlabeled data points and treated them as "negative" seeds, exploiting the fact that the positive label is rare across all datasets. A classifier was trained on these "labeled" data points and then used to predict the unlabeled dataset; the final result was taken to be the average over n stochastic iterations. Once a large number of labels had been purchased, supervised learning was used instead. The approach was shown to work well on 5 out of 6 datasets, ranking the authors 3rd in the contest.
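The random-negative-seeds idea can be sketched as follows. The linear scoring rule and all names are illustrative assumptions, not the contest submission itself:

```python
import numpy as np

def stochastic_ssl_scores(X, pos_seed, n_iters=20, n_neg=10, rng=None):
    """Since positives are rare, randomly drawn unlabeled points can be
    treated as "negative" seeds and will mostly be correct. Each
    iteration scores all points along the direction separating the one
    positive seed from the random negatives; the final score averages
    the n stochastic iterations."""
    rng = np.random.default_rng(rng)
    pool = np.setdiff1d(np.arange(len(X)), [pos_seed])
    scores = np.zeros(len(X))
    for _ in range(n_iters):
        neg = rng.choice(pool, size=n_neg, replace=False)  # random "negative" seeds
        w = X[pos_seed] - X[neg].mean(0)                   # positive-vs-negative direction
        s = X @ w
        scores += (s - s.mean()) / (s.std() + 1e-12)       # standardize, then average
    return scores / n_iters
```

Averaging washes out the occasional iteration in which a true positive was accidentally drawn as a negative seed, which is the point of the stochastic repetition.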
A framework to address the imbalanced data problem using semi-supervised learning is proposed in . Specifically, starting from a supervised problem, the authors create a semi-supervised problem and then use a semi-supervised learning method to identify the most relevant instances with which to establish a well-defined training set. They present extensive experimental results demonstrating that the proposed framework significantly outperforms all other sampling algorithms in 67% of the cases across three different classifiers, and ranks second best in the remaining 33% of the cases.
A combined co-training and random subspace generation technique is proposed in  for sentiment classification problems. The dynamic strategy for generating random subspaces has two main advantages over the static strategy. First, it keeps the subspace classifiers quite different from each other even when their training data become similar after some iterations. Second, since the most helpful features for sentiment classification (e.g., sentiment words) usually account for only a small portion of the feature set, a single random subspace may contain few useful features. When this happens under the static strategy, the corresponding subspace classifier performs badly at selecting correct samples from the unlabeled data, causing semi-supervised learning to fail. The dynamic strategy avoids this phenomenon.
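The dynamic strategy can be sketched in a minimal co-training skeleton; the nearest-centroid view learner and all names are illustrative assumptions, not the cited method:

```python
import numpy as np

def cotrain_dynamic(Xl, yl, Xu, n_iters=5, k_sub=None, per_view=2, rng=None):
    """Co-training skeleton with a *dynamic* subspace strategy: both views
    re-sample their feature subsets at every iteration, so the two
    subspace classifiers stay diverse even after their training sets grow
    similar. Each view pseudo-labels its most confident unlabeled points,
    which are pooled into the shared labeled set."""
    rng = np.random.default_rng(rng)
    d = Xl.shape[1]
    k_sub = k_sub or max(1, d // 2)
    Xl, yl = Xl.copy(), yl.copy()
    Xu = list(Xu)
    for _ in range(n_iters):
        if not Xu:
            break
        Xu_arr = np.asarray(Xu)
        picked = {}
        for _view in range(2):
            feats = rng.choice(d, size=k_sub, replace=False)  # fresh random subspace
            classes = np.unique(yl)
            cents = np.array([Xl[yl == c][:, feats].mean(0) for c in classes])
            dist = ((Xu_arr[:, feats][:, None, :] - cents[None]) ** 2).sum(-1)
            conf = -dist.min(1)                               # confidence = closeness
            for i in np.argsort(-conf)[:per_view]:
                picked.setdefault(int(i), classes[dist[i].argmin()])
        for i, c in picked.items():
            Xl = np.vstack([Xl, Xu_arr[i]])
            yl = np.append(yl, c)
        Xu = [x for j, x in enumerate(Xu) if j not in picked]
    return Xl, yl
```

Under the static strategy, `feats` would be drawn once before the loop; redrawing it inside the loop is exactly what keeps a view from being stuck with a subspace that happens to contain few useful features.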