Abstract: Translation initiation sites (TISs) are important signals in cDNA sequences. In many previous attempts to predict TISs in cDNA sequences, three major factors affected the prediction performance: the nature of the cDNA sequence sets, the relevant features selected, and the classification methods used. In this paper, we examine different approaches to select and integrate relevant features for TIS prediction. The top selected significant features include the features from the position weight matrix and the propensity matrix, the number of C nucleotides in the sequence downstream of the ATG, the number of downstream stop codons, the number of upstream ATGs, and the counts of some amino acids, such as amino acids A and D. With the numerical data generated from these features, different classification methods, including decision tree, naive Bayes, and support vector machine, were applied to three independent sequence sets. The identified significant features were found to be biologically meaningful, and the experiments showed promising results.
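The count-based features listed above could be computed for a candidate ATG roughly as in the following Python sketch. It is an illustration under stated assumptions, not the procedure used in the paper. The function name `count_features`, the window size of 99 nucleotides, the choice to count stop codons and amino acids in frame with the candidate ATG, and the choice to count amino acids A and D downstream only are all hypothetical. The position weight matrix and propensity matrix features are not covered.

```python
# Standard genetic code: stop codons and the codons for alanine (A) and aspartate (D).
STOP_CODONS = {"TAA", "TAG", "TGA"}
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
ASP_CODONS = {"GAT", "GAC"}


def count_features(seq, atg_pos, window=99):
    """Return count-based features for the candidate ATG at 0-based index atg_pos.

    Assumptions (not taken from the paper): a fixed window on each side,
    and in-frame counting of stop codons and amino acids downstream.
    """
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos does not point at an ATG")

    # Sequence before the candidate ATG and after its three nucleotides.
    upstream = seq[max(0, atg_pos - window):atg_pos]
    downstream = seq[atg_pos + 3:atg_pos + 3 + window]

    # Complete codons downstream, read in frame with the candidate ATG.
    codons = [downstream[i:i + 3] for i in range(0, len(downstream) - 2, 3)]

    return {
        "downstream_C": downstream.count("C"),
        "downstream_stop_codons": sum(c in STOP_CODONS for c in codons),
        # "ATG" cannot overlap itself, so str.count finds every occurrence.
        "upstream_ATGs": upstream.count("ATG"),
        "downstream_Ala": sum(c in ALA_CODONS for c in codons),
        "downstream_Asp": sum(c in ASP_CODONS for c in codons),
    }


if __name__ == "__main__":
    # Toy sequence with two ATGs; the second one (index 7) is the candidate.
    print(count_features("CCATGAAATGGCTGACCCCTAAGG", 7))
```

Each candidate ATG then yields one numeric vector, which could be passed to a decision tree, naive Bayes, or support vector machine classifier alongside the matrix-based features.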