Person:
Wang, May Dongmei

Publication Search Results

  • Item
    Histological image classification using biologically interpretable shape-based features
    (Georgia Institute of Technology, 2013) Kothari, Sonal ; Phan, John H. ; Young, Andrew N. ; Wang, May Dongmei
    Background: Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. Methods: We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. Results: The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Conclusions: Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions.
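The abstract above names Fourier shape descriptors as the basis of its shape features. As a rough illustration of that general technique only (not the authors' pipeline, which also involves stain-based segmentation and a multi-class prediction model), the sketch below computes descriptors of a closed contour that do not change under translation, rotation, or uniform scaling. The function names, the `n_harmonics` parameter, and the ellipse test shape are all assumptions made for this example, and the contour is assumed to be sampled counterclockwise.

```python
import cmath
import math


def fourier_descriptors(contour, n_harmonics=4):
    """Invariant Fourier descriptors of a closed contour.

    contour: list of (x, y) points in boundary order (assumed counterclockwise).
    Returns magnitudes for harmonics +1, -1, +2, -2, ... relative to harmonic +1.
    """
    z = [complex(x, y) for x, y in contour]  # treat each point as x + iy
    n = len(z)
    if n <= 2 * n_harmonics:
        raise ValueError("need more contour points than 2 * n_harmonics")

    def dft(k):
        # Plain discrete Fourier coefficient for harmonic k (k may be negative).
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n

    # Harmonic 0 (the centroid) is skipped, which removes translation.
    # Dividing by |harmonic +1| removes scale; taking magnitudes removes
    # rotation and the choice of starting point.
    scale = abs(dft(1))
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")

    features = []
    for k in range(1, n_harmonics + 1):
        features.append(abs(dft(k)) / scale)
        features.append(abs(dft(-k)) / scale)
    return features


def ellipse(a, b, n=64):
    """Hypothetical test shape: an axis-aligned ellipse sampled counterclockwise."""
    return [(a * math.cos(2 * math.pi * t / n), b * math.sin(2 * math.pi * t / n))
            for t in range(n)]
```

Calling `fourier_descriptors(ellipse(2.0, 1.0))` gives a short feature vector that stays the same when the ellipse is moved, rotated, or resized, which is the property that makes such descriptors usable as classifier inputs.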
  • Item
    caCORRECT2: improving accuracy and reliability of microarray data in the presence of artifacts
    (Georgia Institute of Technology, 2011) Moffitt, Richard A. ; Yin-Goen, Qiqin ; Stokes, Todd H. ; Parry, R. M. ; Torrance, James H. ; Phan, John H. ; Young, Andrew N. ; Wang, May Dongmei
    Background. In previous work, we reported the development of caCORRECT, a novel microarray quality control system built to identify and correct spatial artifacts commonly found on Affymetrix arrays. We have made recent improvements to caCORRECT, including the development of a model-based data-replacement strategy and integration with typical microarray workflows via caCORRECT's web portal and caBIG grid services. In this report, we demonstrate that caCORRECT improves the reproducibility and reliability of experimental results across several common Affymetrix microarray platforms. caCORRECT represents an advance over state-of-the-art quality control methods such as Harshlighting, and acts to improve gene expression calculation techniques such as PLIER, RMA and MAS5.0, because it incorporates spatial information into outlier detection as well as outlier information into probe normalization. The ability of caCORRECT to recover accurate gene expressions from low-quality probe intensity data is assessed using a combination of real and synthetic artifacts with PCR follow-up confirmation and the affycomp spike-in data. The caCORRECT tool can be accessed at the website: http://cacorrect.bme.gatech.edu. Results. We demonstrate that (1) caCORRECT's artifact-aware normalization avoids the undesirable global data warping that happens when any damaged chips are processed without caCORRECT; (2) when used upstream of RMA, PLIER, or MAS5.0, the data imputation of caCORRECT generally improves the accuracy of microarray gene expression in the presence of artifacts more than using Harshlighting or not using any quality control; (3) biomarkers selected from artifactual microarray data which have undergone the quality control procedures of caCORRECT are more likely to be reliable, as shown by both spike-in and PCR validation experiments. Finally, we present a case study of the use of caCORRECT to reliably identify biomarkers for renal cell carcinoma, yielding two diagnostic biomarkers with potential clinical utility, PRKAB1 and NNMT. Conclusions. caCORRECT is shown to improve the accuracy of gene expression, and the reproducibility of experimental results in clinical application. This study suggests that caCORRECT will be useful to clean up possible artifacts in new as well as archived microarray data.
  • Item
    An interactive visualization tool and data model for experimental design in systems biology
    (Georgia Institute of Technology, 2008-08) Kapoor, Shray ; Quo, Chang Feng ; Merrill, Alfred H. ; Wang, May Dongmei
    Experimental design is important, but is often under-supported, in systems biology research. To improve experimental design, we extend the visualization of complex sphingolipid pathways to study biosynthetic origin in SphinGOMAP. We use the ganglio-series sphingolipid dataset as a test bed and the Java Universal Network / Graph Framework (JUNG) visualization toolkit. The result is an interactive visualization tool and data model for experimental design in lipid systems biology research. We improve the current SphinGOMAP in terms of interactive visualization by allowing (i) choice of four different network layouts, (ii) dynamic addition / deletion of on-screen molecules and (iii) mouse-over to reveal detailed molecule data. Future work will focus on systematically integrating various lipid-relevant data, i.e., SphinGOMAP biosynthetic data, Lipid Bank molecular data (Japan) and Lipid MAPS metabolic pathway data (USA). We aim to build a comprehensive and interactive communication platform to improve experimental design for scientists globally in high-throughput lipid systems biology research.
  • Item
    ArrayWiki: an enabling technology for sharing public microarray data repositories and meta-analyses
    (Georgia Institute of Technology, 2008) Stokes, Todd H. ; Torrance, J. T. ; Li, Henry ; Wang, May Dongmei
    Background. A survey of microarray databases reveals that most of the repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers), and that the repositories provide only basic biological keywords linking to PubMed. As a result, it is difficult to find datasets using research context or analysis parameter information beyond a few keywords. For example, to reduce the "curse-of-dimension" problem in microarray analysis, the number of samples is often increased by merging array data from different datasets. Knowing chip data parameters such as pre-processing steps (e.g., normalization, artefact removal, etc.), and knowing any previous biological validation of the dataset is essential due to the heterogeneity of the data. However, most of the microarray repositories do not have meta-data information in the first place, and do not have a mechanism to add or insert this information. Thus, there is a critical need to create "intelligent" microarray repositories that (1) enable update of meta-data with the raw array data, and (2) provide standardized archiving protocols to minimize bias from the raw data sources. Results. To address the problems discussed, we have developed a community-maintained system called ArrayWiki that unites disparate meta-data of microarray meta-experiments from multiple primary sources with four key features. First, ArrayWiki provides a user-friendly knowledge management interface in addition to a programmable interface using standards developed by Wikipedia. Second, ArrayWiki includes automated quality control processes (caCORRECT) and novel visualization methods (BioPNG, Gel Plots), which provide extra information about data quality unavailable in other microarray repositories. Third, it provides a user-curation capability through the familiar Wiki interface. Fourth, ArrayWiki provides users with simple text-based searches across all experiment meta-data, and exposes data to search engine crawlers (Semantic Agents) such as Google to further enhance data discovery. Conclusions. Microarray data and meta information in ArrayWiki are distributed and visualized using a novel and compact data storage format, BioPNG. Also, they are open to the research community for curation, modification, and contribution. By making a small investment of time to learn the syntax and structure common to all sites running MediaWiki software, domain scientists and practitioners can all contribute to make better use of microarray technologies in research and medical practices. ArrayWiki is available at http://www.bio-miblab.org/arraywiki.
  • Item
    Computational modeling of a metabolic pathway in ceramide de novo synthesis
    (Georgia Institute of Technology, 2007-08) Dhingra, Shobhika ; Freedenberg, Melissa ; Quo, Chang Feng ; Merrill, Alfred H. ; Wang, May Dongmei
    Studies have implicated ceramide as a key molecular agent in regulating programmed cell death, or apoptosis. Consequently, there is significant potential in targeting intracellular ceramide as a cancer therapeutic agent. The cell’s major ceramide source is the ceramide de novo synthesis pathway, which consists of a complex network of interdependent enzyme-catalyzed biochemical reactions. To understand how ceramide works, we have initiated the study of the ceramide de novo synthesis pathway using computational modeling based on fundamental principles of biochemical kinetics. Specifically, we designed and developed the model in MATLAB SIMULINK for the behavior of dihydroceramide desaturase. Dihydroceramide desaturase is one of three key enzymes in the ceramide de novo synthesis pathway, and it converts a relatively inert precursor molecule, dihydroceramide into biochemically reactive ceramide. A major issue in modeling is parameter estimation. We solved this problem by adopting a heuristic strategy based on a priori knowledge from literature and experimental data. We evaluated model accuracy by comparing the model prediction results with interpolated experimental data. Our future work includes more experimental validation of the model, dynamic rate constants assessment, and expansion of the model to include additional enzymes in the ceramide de novo synthesis pathway.
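The abstract above describes a kinetic model of a single enzyme, dihydroceramide desaturase, built in MATLAB SIMULINK, but does not give its rate law or parameters. Purely as an illustration of what a single-enzyme kinetic simulation can look like, the sketch below assumes irreversible Michaelis-Menten kinetics and integrates the substrate-to-product conversion with a fourth-order Runge-Kutta step. The function name, the rate law, and every numeric value are hypothetical and are not taken from the authors' model.

```python
def simulate_desaturase(s0, vmax, km, t_end, dt=0.01):
    """Simulate substrate -> product conversion by one enzyme.

    Assumed rate law (not from the source): v = vmax * s / (km + s).
    s0: initial substrate (e.g. dihydroceramide) concentration.
    Returns a list of (time, substrate, product) tuples.
    """
    def rate(s):
        return vmax * s / (km + s)

    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    steps = int(round(t_end / dt))
    for i in range(steps):
        # Classical RK4 for ds/dt = -rate(s).
        k1 = rate(s)
        k2 = rate(s - 0.5 * dt * k1)
        k3 = rate(s - 0.5 * dt * k2)
        k4 = rate(s - dt * k3)
        converted = dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        s -= converted   # substrate consumed in this step
        p += converted   # the same amount appears as product (e.g. ceramide)
        trajectory.append(((i + 1) * dt, s, p))
    return trajectory


# Hypothetical parameters in arbitrary units, chosen only to exercise the code.
example_run = simulate_desaturase(s0=10.0, vmax=1.0, km=2.0, t_end=5.0)
```

In a model like the one the abstract describes, `vmax` and `km` would be the quantities estimated from literature and experimental data, and the simulated trajectory would be compared against interpolated measurements.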
  • Item
    Dynamic pathway modeling of sphingolipid metabolism
    (Georgia Institute of Technology, 2004-09) Henning, Peter A. ; Merrill, Alfred H. ; Wang, May Dongmei
    We report our research results on a computational metabolome study. The goal of this research is to extend the integrated experimental modeling methodologies in sphingolipid metabolism study to other complex biological process studies such as signal transduction or gene regulation. Another feature of this work is that the 3-D information representation enables the user to orchestrate the simulated pathways in real time.