Title: PREDICTION OF TRANSCRIPTION START SITES BASED ON FEATURE SELECTION USING AMOSA
DOI No: 10.1142/9781860948732_0021
Source: COMPUTATIONAL SYSTEMS BIOINFORMATICS (pp 183-193)
Author(s): Xi Wang
The first two authors are joint first authors.

Bioinformatics Division, TNLIST and Dep. of Automation, Tsinghua Univ., Beijing 100084, China

Sanghamitra Bandyopadhyay

Bioinformatics Division, TNLIST and Dep. of Automation, Tsinghua Univ., Beijing 100084, China

Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, India

Zhenyu Xuan
Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA

Xiaoyue Zhao
Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA

Michael Q. Zhang
Bioinformatics Division, TNLIST and Dep. of Automation, Tsinghua Univ., Beijing 100084, China

Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA

Xuegong Zhang
Bioinformatics Division, TNLIST and Dep. of Automation, Tsinghua Univ., Beijing 100084, China

Abstract: Identifying transcription start sites (TSSs) is a primary and important step toward understanding the regulation of gene expression. To improve computational prediction accuracy, we focus on the most challenging task: identifying TSSs within 50 bp in non-CpG-related promoter regions. Because non-CpG-related promoters are diverse, a large number of features are extracted. Effective feature selection can minimize noise, improve prediction accuracy, and help discover biologically meaningful intrinsic properties. In this paper, a newly proposed multi-objective optimization method based on simulated annealing, Archive Multi-Objective Simulated Annealing (AMOSA), is integrated with Linear Discriminant Analysis (LDA) to yield a combined feature selection and classification system. This system is found to be comparable to, and often better than, several existing methods in terms of different quantitative performance measures.
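To illustrate the general idea of coupling an annealing search with an LDA classifier, the following is a minimal sketch and not the authors' AMOSA implementation. It assumes a simplified single-objective simulated annealing over binary feature masks, a two-class Fisher LDA scored by training accuracy, and a per-feature penalty standing in for a second objective. AMOSA itself is multi-objective and keeps an archive of solutions, which this sketch omits. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * v for x, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def lda_accuracy(X, y, mask, ridge=1e-3):
    """Training accuracy of a two-class Fisher LDA restricted to the masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    rows = {c: [[x[j] for j in idx] for x, t in zip(X, y) if t == c] for c in (0, 1)}
    mean = {c: [sum(col) / len(rows[c]) for col in zip(*rows[c])] for c in (0, 1)}
    d = len(idx)
    # Within-class scatter matrix, with a small ridge term (assumption) for stability.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for c in (0, 1):
        for r in rows[c]:
            dev = [v - mu for v, mu in zip(r, mean[c])]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += dev[i] * dev[j]
    # Fisher direction w = Sw^{-1} (mean1 - mean0); threshold at the projected midpoint.
    w = solve(sw, [p - q for p, q in zip(mean[1], mean[0])])
    mid = sum(wi * (p + q) / 2 for wi, p, q in zip(w, mean[0], mean[1]))
    hits = sum((sum(wi * v for wi, v in zip(w, r)) > mid) == (c == 1)
               for c in (0, 1) for r in rows[c])
    return hits / len(y)

def anneal_features(X, y, steps=300, t0=0.1, cooling=0.98, penalty=0.01, seed=0):
    """Single-objective simulated annealing over feature masks (a simplification, not AMOSA)."""
    rng = random.Random(seed)
    n = len(X[0])

    def score(mask):
        # Accuracy minus a per-feature penalty (hypothetical stand-in for a second objective).
        return lda_accuracy(X, y, mask) - penalty * sum(mask)

    current = [rng.random() < 0.5 for _ in range(n)]
    cur_score = score(current)
    best, best_score = current[:], cur_score
    t = t0
    for _ in range(steps):
        cand = current[:]
        k = rng.randrange(n)
        cand[k] = not cand[k]  # neighbour move: flip one feature in or out
        s = score(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if s >= cur_score or rng.random() < math.exp((s - cur_score) / t):
            current, cur_score = cand, s
            if s > best_score:
                best, best_score = cand[:], s
        t *= cooling
    return best, best_score

# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 do not.
noise = [(1.0, 1.0), (1.0, -1.0), (-1.0, -1.0), (-1.0, 1.0)]
offsets = [0.0, 0.2, 0.0, 0.2]
X = [[base + o, a, b] for base in (0.0, 5.0) for o, (a, b) in zip(offsets, noise)]
y = [0] * 4 + [1] * 4

best, best_score = anneal_features(X, y)
print("selected mask:", best, "score:", best_score)
```

In practice the score would be estimated on held-out data rather than on the training set, and a multi-objective method would return a set of trade-off solutions instead of a single mask.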

Copyright © 2012 World Scientific Publishing Co. All rights reserved.