Gene Selection

Microarray technology lets researchers capture the expression of thousands of genes in a single experiment. Despite past successes, inherent problems such as technical variation, small sample sizes, and asymmetric data structure remain an impediment to developing robust computational tools for gene selection, whether the aim is to infer a gene's function or its association with a particular disease or biological process. Testing for differential expression requires analyzing complex multi-factor experiments, constructing joint expression profiles that discriminate between groups of interest, and assessing the significance of the selected genes. Dr. Yeasin and his team are researching various issues in developing robust computational methods, based on statistical analysis, supervised and unsupervised machine learning, and visualization, for knowledge discovery from microarray data. Other data sources can also support gene selection. These include, but are not limited to, protein data and sequence data. The literature is another potential source of information, as is the Gene Ontology database. The main goal of data integration methods is to merge the results from multiple databases and experiments to improve the robustness of gene selection.
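
As a rough illustration of the data integration idea, the sketch below scores each gene for differential expression in two separate experiments and then merges the per-experiment rankings by mean rank. Welch's t-statistic and mean-rank aggregation are stand-ins chosen for illustration only, not the team's method, and the function names, gene names, and expression values are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two lists of expression values."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0:
        return 0.0  # no variation at all: treat as no evidence
    return (mean(group_a) - mean(group_b)) / se


def rank_by_score(scores):
    """Map gene -> rank (1 = strongest), ranking by descending score."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def aggregate_ranks(rankings):
    """Order genes by mean rank, keeping only genes present in every source."""
    common = set(rankings[0]).intersection(*rankings[1:])
    mean_rank = {g: mean(r[g] for r in rankings) for g in common}
    return sorted(common, key=lambda g: (mean_rank[g], g))


# Hypothetical toy data: gene -> (control samples, disease samples).
experiment_1 = {
    "geneA": ([1.0, 1.2, 0.9], [3.1, 2.9, 3.3]),
    "geneB": ([2.0, 2.1, 1.9], [2.0, 2.2, 1.8]),
    "geneC": ([5.0, 4.0, 6.0], [5.5, 6.5, 4.5]),
}
experiment_2 = {
    "geneA": ([0.8, 1.1, 1.0], [2.7, 3.0, 2.8]),
    "geneB": ([1.5, 1.6, 1.4], [1.6, 1.5, 1.4]),
    "geneC": ([4.0, 4.2, 3.8], [5.0, 5.3, 4.9]),
}

# Rank genes within each experiment by the size of the group difference.
rankings = []
for experiment in (experiment_1, experiment_2):
    scores = {g: abs(welch_t(ctrl, dis)) for g, (ctrl, dis) in experiment.items()}
    rankings.append(rank_by_score(scores))

# Merge the per-experiment rankings into one consensus ordering.
merged = aggregate_ranks(rankings)
print(merged)
```

A gene that ranks highly in only one experiment is pulled down by its weaker rank elsewhere, which is the sense in which merging sources can make a selection more robust. A real analysis would also need significance estimates and multiple-testing correction, which this sketch omits.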