基因演算法中的雜訊處理問題
- 1999-09-20 (Mon.), 10:30 AM
- 二樓交誼廳
- 許 聞 廉 教授
Abstract
Computational issues in biology are usually very complex due to errors in the lab data. We shall illustrate this using an example with a nice mathematical structure. This problem is to test the consecutive ones property of a (0,1)-matrix: that is, whether it is possible to permute the columns so that each row of the resulting matrix has the ones occur in a consecutive block. This is useful for DNA sequence assembly. The linear time algorithm by Booth and Lueker (1975) for this problem has a serious drawback: the data must be error-free. However, laboratory work is never flawless. We devised a new algorithm for this problem, which, under modest assumptions, can accommodate the following four types of errors: false negatives, false positives, non- unique probes and chimeric clones. Clustering turns out to be a crucial technique in this algorithm.