MicroRNAs (miRNAs) are a class of small, single-stranded RNAs, 21-29 nt in length, that are produced by non-protein-coding RNA genes. By base pairing to mRNAs, they regulate the expression of protein-coding genes at the post-transcriptional level, as well as the degradation of mRNAs. Mature miRNAs are processed from 60-90 nt RNA hairpin structures called pre-miRNAs. At present, most machine learning methods for pre-miRNA prediction are based on two-class SVMs and use structural information from pre-miRNA hairpins. These methods all require a negative dataset for training, and feature selection on both the training and testing datasets. To avoid selecting false negative examples of miRNA hairpins for the training dataset, which may mislead the classifier, we present MirBio, a microRNA prediction algorithm based on miRNA biogenesis that is trained only on information from the positive miRNA class. It can predict both pre-miRNAs and miRNAs, and it achieved relatively satisfactory results in this study.
LIU Yuan-ning, YAN Wen, ZHANG Hao, LI Zhi, LU Hui-jun, LI Xin
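To illustrate the general idea of training on the positive class only, the following is a minimal sketch of a one-class classifier. It is not the MirBio algorithm, whose biogenesis-based features are not described in this abstract. The two features (sequence length and GC fraction), the z-score threshold, the class name `OneClassZScore`, and the toy sequences are all assumptions made for illustration; the sequences are synthetic repeats, not real pre-miRNAs.

```python
from statistics import mean, pstdev


def features(seq):
    """Return two illustrative features: length and GC fraction (assumed, not MirBio's)."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC") / len(seq)
    return (float(len(seq)), gc)


class OneClassZScore:
    """Hypothetical one-class model: accept a candidate only if every feature
    lies within `threshold` standard deviations of the positive-class mean."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold

    def fit(self, positives):
        # Learn per-feature mean and spread from positive examples only;
        # no negative dataset is needed.
        columns = list(zip(*(features(s) for s in positives)))
        self.means = [mean(c) for c in columns]
        self.stds = [pstdev(c) or 1e-9 for c in columns]  # guard against zero spread
        return self

    def score(self, seq):
        # Largest absolute z-score across features; lower means more typical.
        return max(
            abs(x - m) / s
            for x, m, s in zip(features(seq), self.means, self.stds)
        )

    def predict(self, seq):
        return self.score(seq) <= self.threshold


# Synthetic stand-ins for positive hairpins (60-90 nt), not real sequences.
toy_positives = ["GCAU" * 16, "GGCAU" * 15, "GCAUA" * 17, "GGCAUU" * 12]
model = OneClassZScore().fit(toy_positives)

print(model.predict("GCAUU" * 14))  # 70 nt candidate, similar to the training set
print(model.predict("GCAU" * 5))    # 20 nt candidate, far shorter than any positive
```

Because the model only describes what positive examples look like, a mislabeled negative can never enter training. The trade-off is that the decision boundary depends entirely on how well the chosen features and threshold characterize the positive class.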