Antiretroviral Therapy Optimisation without Genotype Resistance Testing: A Perspective on Treatment History Based Models

Written by Scott Christley et al. on October 29, 2010 – 7:00 am -

Background

Although genotypic resistance testing (GRT) is recommended to guide combination antiretroviral therapy (cART), funding and/or facilities to perform GRT may not be available in low to middle income countries. Since treatment history (TH) impacts response to subsequent therapy, we investigated a set of statistical learning models to optimise cART in the absence of GRT information.

Methods and Findings

The EuResist database was used to extract 8-week and 24-week treatment change episodes (TCE) with GRT and additional clinical, demographic and TH information. Random Forest (RF) classification was used to predict 8- and 24-week success, defined as undetectable HIV-1 RNA, comparing nested models including (i) GRT+TH and (ii) TH without GRT, using multiple cross-validation and area under the receiver operating characteristic curve (AUC). Virological success was achieved in 68.2% and 68.0% of TCE at 8 and 24 weeks (n = 2,831 and 2,579), respectively. RF (i) and (ii) showed comparable performance, with an average (st.dev.) AUC of 0.77 (0.031) vs. 0.757 (0.035) at 8 weeks and 0.834 (0.027) vs. 0.821 (0.025) at 24 weeks. Sensitivity analyses, carried out on a data subset that included antiretroviral regimens commonly used in low to middle income countries, confirmed our findings. Training on subtype B and validation on non-B isolates resulted in a decline in performance for both models (i) and (ii).
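The AUC used to compare the two nested models can be illustrated with a minimal sketch. It computes AUC as the fraction of (success, failure) pairs that a model ranks correctly. The labels and scores below are invented toy values, not EuResist data, and the names `scores_grt_th` and `scores_th_only` are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = virological success, 0 = failure (invented for illustration).
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores_grt_th = [0.9, 0.8, 0.7, 0.6, 0.4, 0.55, 0.3, 0.2]    # hypothetical model (i)
scores_th_only = [0.85, 0.6, 0.7, 0.65, 0.5, 0.45, 0.3, 0.2]  # hypothetical model (ii)

auc_grt_th = auc(labels, scores_grt_th)
auc_th_only = auc(labels, scores_th_only)
print(auc_grt_th, auc_th_only)
```

In a cross-validated comparison such as the one described, this quantity would be computed on each held-out fold and then averaged.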

Conclusions

Treatment history-based RF prediction models are comparable to GRT-based models for classification of virological outcome. These results may be relevant for therapy optimisation in areas where availability of GRT is limited. Further investigations are required in order to account for different demographics, subtypes and therapy switching strategies.


Posted in Computational biology

MicroRNA-Integrated and Network-Embedded Gene Selection with Diffusion Distance

Written by Scott Christley et al. on October 29, 2010 – 7:00 am -

Gene network information has been used to improve gene selection in microarray-based studies by selecting marker genes based both on their expression and on the coordinate expression of genes within their gene network under a given condition. Here we propose a new network-embedded gene selection model. In this model, we first address the limitations of microarray data. Microarray data, although widely used for gene selection, measure only mRNA abundance, which does not always reflect the ultimate gene phenotype because it does not account for post-transcriptional effects. To overcome this important (and in certain cases critical) limitation, which almost all existing studies ignore, we design a new strategy to integrate microarray data with information on microRNA, the major post-transcriptional regulatory factor. We also handle the challenges posed by gene collaboration mechanisms. To incorporate the biological facts that genes without direct interactions may work closely together due to signal transduction and that two genes may be functionally connected through multiple paths, we adopt the concept of diffusion distance. This concept permits us to simulate biological signal propagation and therefore to estimate the collaboration probability for all gene pairs, directly or indirectly connected, according to the multiple paths connecting them. We demonstrate, using type 2 diabetes (DM2) as an example, that the proposed strategies can enhance the identification of functional gene partners, which is the key issue in a network-embedded gene selection model. More importantly, we show that our gene selection model outperforms related ones. Genes selected by our model 1) have improved classification capability; 2) agree with biological evidence of DM2-association; and 3) are involved in many well-known DM2-associated pathways.
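The abstract does not give a formula for diffusion distance. As a rough sketch, the commonly used random-walk definition compares the t-step transition probabilities of two nodes, weighted by the stationary distribution. The four-gene network below and the function names are hypothetical, and this may differ from the authors' exact formulation.

```python
import math


def transition_matrix(adj):
    """Row-normalise an adjacency matrix into random-walk probabilities."""
    return [[a / sum(row) for a in row] for row in adj]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_distance(adj, i, j, t):
    """Distance between nodes i and j after t >= 1 random-walk steps.

    Assumes an undirected graph with no isolated nodes.
    """
    p = transition_matrix(adj)
    pt = p
    for _ in range(t - 1):
        pt = mat_mul(pt, p)
    total_degree = sum(sum(row) for row in adj)
    stationary = [sum(row) / total_degree for row in adj]
    return math.sqrt(sum((pt[i][k] - pt[j][k]) ** 2 / stationary[k]
                         for k in range(len(adj))))


# Hypothetical network: genes 0, 1, 2 form a triangle; gene 3 hangs off gene 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
d01 = diffusion_distance(adj, 0, 1, t=2)
d03 = diffusion_distance(adj, 0, 3, t=2)
print(d01, d03)
```

Genes 0 and 3 share no edge, yet they still receive a finite distance because the walk reaches both through gene 2. That is the sense in which indirectly connected pairs are scored.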


Posted in Computational biology

Systematical Detection of Significant Genes in Microarray Data by Incorporating Gene Interaction Relationship in Biological Systems

Written by Scott Christley et al. on October 29, 2010 – 7:00 am -

Many methods, including parametric, nonparametric, and Bayesian methods, have been used for detecting differentially expressed genes based on the assumption that biological systems are linear, which ignores the nonlinear characteristics of most biological systems. More importantly, those methods do not simultaneously consider means, variances, and higher moments, resulting in a relatively high false-positive rate. To overcome these limitations, the SWang test is proposed to determine differentially expressed genes according to the equality of distributions between case and control. Our method not only latently incorporates functional relationships among genes to account for nonlinear biological systems but also considers the mean, variance, skewness, and kurtosis of expression profiles simultaneously. To illustrate the biological significance of higher moments, we construct a nonlinear gene interaction model, demonstrating that skewness and kurtosis can contain useful information about functional association among genes in microarrays. Simulations and real microarray results show that the false-positive rate of SWang is lower than that of currently popular methods (T-test, F-test, SAM, and Fold-change), with much higher statistical power. Additionally, SWang can uniquely detect significant genes in real microarray data with imperceptible differential expression but greater variation in kurtosis and skewness. The identified genes were confirmed by previously published literature or by RT-PCR experiments performed in our lab.
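To see why higher moments can matter, consider a small illustration. This is not the SWang test itself, whose statistic is not given here. The two invented samples below share the same mean and variance, so a comparison of means or variances sees no difference, yet they differ clearly in skewness and kurtosis.

```python
def moments(xs):
    """Return (mean, variance, skewness, excess kurtosis) using population formulas."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, m2, skewness, excess_kurtosis


# Invented expression values, chosen so that mean and variance coincide.
control = [-3, -1, 0, 1, 3]
case = [-1, -1, -1, -1, 4]

control_moments = moments(control)
case_moments = moments(case)
print(control_moments)
print(case_moments)
```

Both samples have mean 0 and variance 4, while the skewness is 0 for `control` and 1.5 for `case`.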


Posted in Computational biology

Identifying Shared Genetic Structure Patterns among Pacific Northwest Forest Taxa: Insights from Use of Visualization Tools and Computer Simulations

Written by Scott Christley et al. on October 29, 2010 – 7:00 am -

Background

Identifying causal relationships in phylogeographic and landscape genetic investigations is notoriously difficult, but can be facilitated by use of multispecies comparisons.

Methodology/Principal Findings

We used data visualizations to identify common spatial patterns within single lineages of four taxa inhabiting Pacific Northwest forests (northern spotted owl: Strix occidentalis caurina; red tree vole: Arborimus longicaudus; southern torrent salamander: Rhyacotriton variegatus; and western white pine: Pinus monticola). Visualizations suggested that, despite occupying the same geographical region and habitats, species responded differently to prevailing historical processes. S. o. caurina and P. monticola demonstrated directional patterns of spatial genetic structure where genetic distances and diversity were greater in southern versus northern locales. A. longicaudus and R. variegatus displayed opposite patterns where genetic distances were greater in northern versus southern regions. Statistical analyses of directional patterns subsequently confirmed observations from visualizations. Based upon regional climatological history, we hypothesized that observed latitudinal patterns may have been produced by range expansions. Subsequent computer simulations confirmed that directional patterns can be produced by expansion events.

Conclusions/Significance

We discuss phylogeographic hypotheses regarding historical processes that may have produced observed patterns. Inferential methods used here may become increasingly powerful as detailed simulations of organisms and historical scenarios become plausible. We further suggest that inter-specific comparisons of historical patterns take place prior to drawing conclusions regarding effects of current anthropogenic change within landscapes.


Posted in Computational biology

Do Seasons Have an Influence on the Incidence of Depression? The Use of an Internet Search Engine Query Data as a Proxy of Human Affect

Written by Scott Christley et al. on October 28, 2010 – 7:00 am -

Background

Seasonal depression has generated considerable clinical interest in recent years. Despite a common belief that people in higher latitudes are more vulnerable to low mood during the winter, it has never been demonstrated that human moods are subject to seasonal change on a global scale. The aim of this study was to investigate large-scale seasonal patterns of depression using Internet search query data as a signature and proxy of human affect.

Methodology/Principal Findings

Our study was based on a publicly available search engine database, Google Insights for Search, which provides time series data of weekly search trends from January 1, 2004 to June 30, 2009. We applied an empirical mode decomposition method to isolate seasonal components of health-related search trends of depression in 54 geographic areas worldwide. We identified a seasonal trend of depression that was opposite between the northern and southern hemispheres; this trend was significantly correlated with seasonal oscillations of temperature (USA: r = −0.872, p<0.001; Australia: r = −0.656, p<0.001). Based on analyses of search trends over 54 geographic locations worldwide, we found that the degree of correlation between searching for depression and temperature was latitude-dependent (northern hemisphere: r = −0.686; p<0.001; southern hemisphere: r = 0.871; p<0.0001).
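The reported r values are Pearson correlation coefficients. A minimal sketch of that calculation follows, using a synthetic twelve-month temperature cycle and a synthetic search-interest series that runs opposite to it. Both series are invented for illustration, and the empirical mode decomposition step used in the study is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


months = range(12)
# Synthetic seasonal temperature (arbitrary units).
temperature = [15 + 10 * math.sin(2 * math.pi * m / 12) for m in months]
# Synthetic search interest: opposite phase plus a small alternating disturbance.
searches = [50 - 8 * math.sin(2 * math.pi * m / 12) + (2 if m % 2 else -2)
            for m in months]

r = pearson_r(temperature, searches)
print(round(r, 3))
```

A strongly negative r, as here, means search interest peaks when temperature is lowest.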

Conclusions/Significance

Our findings indicate that Internet searches for depression by people in higher latitudes are more sensitive to seasonal change, whereas this phenomenon is obscured in tropical areas. This phenomenon exists universally across countries, regardless of language. This study provides novel, Internet-based evidence for the epidemiology of seasonal depression.


Posted in Computer Science

Identification and Optimization of Classifier Genes from Multi-Class Earthworm Microarray Dataset

Written by Scott Christley et al. on October 28, 2010 – 7:00 am -

Monitoring, assessment and prediction of the environmental risks that chemicals pose demand rapid and accurate diagnostic assays. A variety of toxicological effects have been associated with the explosive compounds TNT and RDX. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. We have developed an earthworm microarray containing 15,208 unique oligo probes and have used it to profile gene expression in 248 earthworms exposed to TNT, RDX or neither. We assembled a new machine learning pipeline consisting of several well-established feature filtering/selection and classification techniques to analyze the 248-array dataset in order to construct classifier models that can separate earthworm samples into three groups: control, TNT-treated, and RDX-treated. First, a total of 869 genes differentially expressed in response to TNT or RDX exposure were identified using a univariate statistical algorithm of class comparison. Then, decision tree-based algorithms were applied to select a subset of 354 classifier genes, which were ranked by their overall weight of significance. A multiclass support vector machine (MC-SVM) method and an unsupervised K-means clustering method were applied to independently refine the classifier, producing smaller subsets of 39 and 30 classifier genes, respectively, with the 11 genes common to both being potential biomarkers. The combined 58 genes were considered the refined subset and used to build MC-SVM and clustering models with classification accuracies of 83.5% and 56.9%, respectively. This study demonstrates that the machine learning approach can be used to identify and optimize a small subset of classifier/biomarker genes from high dimensional datasets and generate classification models of acceptable precision for multiple classes.
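The gene counts follow from simple set arithmetic: 39 + 30 − 11 = 58. The sketch below checks this with placeholder gene identifiers. The names are hypothetical stand-ins, not the actual earthworm probes.

```python
# Placeholder identifiers; the real gene lists are not given in the abstract.
svm_genes = {f"gene_{i}" for i in range(39)}          # 39 genes from MC-SVM refinement
kmeans_genes = {f"gene_{i}" for i in range(28, 58)}   # 30 genes from clustering refinement

common_genes = svm_genes & kmeans_genes    # candidate biomarkers found by both methods
combined_genes = svm_genes | kmeans_genes  # refined subset used to build the final models

print(len(svm_genes), len(kmeans_genes), len(common_genes), len(combined_genes))
```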


Posted in Computational biology

Cooperation under Indirect Reciprocity and Imitative Trust

Written by Scott Christley et al. on October 27, 2010 – 7:00 am -

Indirect reciprocity, a key concept in behavioral experiments and evolutionary game theory, provides a mechanism that allows reciprocal altruism to emerge in a population of self-regarding individuals even when repeated interactions between pairs of actors are unlikely. Recent empirical evidence shows that humans typically follow complex assessment strategies involving both reciprocity and social imitation when making cooperative decisions. However, we currently have no systematic understanding of how imitation, a mechanism that may also generate negative effects via a process of cumulative advantage, affects cooperation when repeated interactions are unlikely or information about a recipient's reputation is unavailable. Here we extend existing evolutionary models, which use an image score for reputation to track how individuals cooperate by contributing resources, by introducing a new imitative-trust score, which tracks whether actors have been the recipients of cooperation in the past. We show that imitative trust can co-exist with indirect reciprocity mechanisms up to a threshold, after which cooperation reverses, revealing the elusive nature of cooperation. Moreover, we find that when information about a recipient's reputation is limited, trusting the action of third parties towards her (i.e. imitating) does favor a higher collective cooperation compared to random-trusting and share-alike mechanisms. We believe these results shed new light on the factors favoring social imitation as an adaptive mechanism in populations of cooperating social actors.
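One way to picture the two scores is the sketch below. It is a loose illustration only: the thresholds `k` and `h`, the update rules, and the fallback to the trust score when reputation is unknown are all assumptions made here, not the model specified by the authors.

```python
class Actor:
    def __init__(self, image=0, trust=0):
        self.image = image  # reputation: rises when the actor donates, falls when it refuses
        self.trust = trust  # imitative trust: tracks whether others have donated to the actor


def donor_cooperates(recipient, image_known, k=0, h=0):
    """Assumed rule: use the image score if visible, otherwise imitate third parties."""
    if image_known:
        return recipient.image >= k
    return recipient.trust >= h


def interact(donor, recipient, image_known, k=0, h=0):
    """Play one donation round and update both scores (assumed +1 / -1 updates)."""
    helped = donor_cooperates(recipient, image_known, k, h)
    step = 1 if helped else -1
    donor.image += step
    recipient.trust += step
    return helped


donor = Actor()
recipient = Actor(image=-1, trust=2)
first = interact(donor, recipient, image_known=True)    # poor image is visible: refuse
second = interact(donor, recipient, image_known=False)  # image hidden: fall back on trust
print(first, second, donor.image, recipient.trust)
```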


Posted in Computer Science

How Wealth Accumulation Can Promote Cooperation

Written by Scott Christley et al. on October 27, 2010 – 7:00 am -

Explaining the emergence and stability of cooperation has been a central challenge in biology, economics and sociology. Unfortunately, the mechanisms known to promote it either require elaborate strategies or hold only under restrictive conditions. Here, we report the emergence, survival, and frequent domination of cooperation in a world characterized by selfishness and a strong temptation to defect, when individuals can accumulate wealth. In particular, we study games with local adaptation such as the prisoner's dilemma, to which we add heterogeneity in payoffs. In our model, agents accumulate wealth and invest some of it in their interactions. The larger the investment, the more can potentially be gained or lost, so that present gains affect future payoffs. We find that cooperation survives for a far wider range of parameters than without wealth accumulation and, even more strikingly, that it often dominates defection. This is in stark contrast to the traditional evolutionary prisoner's dilemma in particular, in which cooperation rarely survives and almost never thrives. With the inequality we introduce, on the contrary, cooperators do better than defectors, even without any strategic behavior or exogenously imposed strategies. These results have important consequences for our understanding of the type of social and economic arrangements that are optimal and efficient.
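The coupling between wealth and payoff can be sketched as a single round in which each agent stakes a fraction of its wealth and the prisoner's dilemma payoff scales with that stake. The payoff multipliers, the investment fraction, and the choice to scale by the agent's own stake are hypothetical values picked for illustration. The abstract does not specify them.

```python
# Hypothetical payoff per unit invested, ordered T > R > P > S as in a prisoner's dilemma.
PAYOFF = {
    ("C", "C"): 0.3,   # R: mutual cooperation
    ("C", "D"): -0.5,  # S: cooperator exploited
    ("D", "C"): 0.5,   # T: temptation to defect
    ("D", "D"): 0.0,   # P: mutual defection
}


def play_round(wealth_a, move_a, wealth_b, move_b, invest_frac=0.2):
    """Return both agents' wealth after one interaction.

    Each agent stakes invest_frac of its own wealth; gains and losses
    are proportional to that stake (an assumption of this sketch).
    """
    stake_a = invest_frac * wealth_a
    stake_b = invest_frac * wealth_b
    new_a = wealth_a + stake_a * PAYOFF[(move_a, move_b)]
    new_b = wealth_b + stake_b * PAYOFF[(move_b, move_a)]
    return new_a, new_b


equal_cc = play_round(10, "C", 10, "C")
exploit = play_round(10, "D", 10, "C")
rich_poor_cc = play_round(100, "C", 10, "C")
print(equal_cc, exploit, rich_poor_cc)
```

Because the stake grows with wealth, a gain in one round enlarges what can be won or lost in the next, which is the feedback the abstract describes.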


Posted in Computer Science

A Novel Side-Chain Orientation Dependent Potential Derived from Random-Walk Reference State for Protein Fold Selection and Structure Prediction

Written by Scott Christley et al. on October 27, 2010 – 7:00 am -

Background

An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. However, the reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which ignores the inherent sequence connectivity and entropic elasticity of proteins.

Methodology

First, we developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as the reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. Second, we incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential.
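Knowledge-based potentials of this kind generally take an inverse-Boltzmann form: the energy of a distance bin is the negative log of the ratio between observed pair counts and the counts expected under the reference state. The sketch below shows only that generic form, in units of kT. The counts are invented, and the random-walk derivation of the expected counts used by RW is not reproduced.

```python
import math


def statistical_potential(observed, expected, pseudocount=1e-9):
    """Energy per distance bin in units of kT: E = -ln(N_obs / N_exp).

    `expected` holds the counts predicted by the reference state and should
    be normalised to the same total as `observed`. The pseudocount avoids
    log(0) for empty bins.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return [-math.log((o + pseudocount) / (e + pseudocount))
            for o, e in zip(observed, expected)]


# Invented counts for three distance bins of one atom-pair type.
observed_counts = [15, 10, 5]
expected_counts = [10, 10, 10]  # hypothetical reference-state prediction

energies = statistical_potential(observed_counts, expected_counts)
print(energies)
```

Bins seen more often than the reference state predicts receive negative (favourable) energies, and under-represented bins receive positive ones. The choice of reference state therefore directly shapes the potential.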

Significance

RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked against a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of a random-walk chain as a reference state that correctly accounts for the sequence connectivity and entropic elasticity of proteins. The potentials show potential usefulness in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.


Posted in Computational biology

A Comparative Analysis of Gene-Expression Data of Multiple Cancer Types

Written by Scott Christley et al. on October 27, 2010 – 7:00 am -

A comparative study of public gene-expression data of seven types of cancers (breast, colon, kidney, lung, pancreatic, prostate and stomach cancers) was conducted with the aim of deriving marker genes, along with associated pathways, that are either common to multiple types of cancers or specific to individual cancers. The analysis results indicate that (a) each of the seven cancer types can be distinguished from its corresponding control tissue based on the expression patterns of a small number of genes, e.g., 2, 3 or 4; (b) the expression patterns of some genes can distinguish multiple cancer types from their corresponding control tissues, potentially serving as general markers for all or some groups of cancers; (c) the proteins encoded by some of these genes are predicted to be blood secretory, thus providing potential cancer markers in blood; (d) the numbers of differentially expressed genes across different cancer types in comparison with their control tissues correlate well with the five-year survival rates associated with the individual cancers; and (e) some metabolic and signaling pathways are abnormally activated or deactivated across all cancer types, while other pathways are more specific to certain cancers or groups of cancers. The novel findings of this study offer considerable insight into these seven cancer types and have the potential to provide exciting new directions for diagnostic and therapeutic development.


Posted in Computational biology