Publications

Identifying representative sequences of protein families using submodular optimization

Identifying representative sequences for groups of functionally similar proteins and enzymes poses significant computational …

Current approaches and outstanding challenges of functional annotation of metabolites: a comprehensive review

Metabolite profiling is a powerful approach for the clinical diagnosis of complex diseases, ranging from cardiometabolic diseases, …

CCPA: cloud-based, self-learning modules for consensus pathway analysis using GO, KEGG and Reactome

This manuscript describes the development of a resource module that is part of a learning platform named ‘NIGMS Sandbox for Cloud-based …

SSA: a novel method for Single-cell and Spatial transcriptomics Alignment

Single-cell RNA sequencing (scRNA-seq) provides expression profiles of individual cells but fails to preserve crucial spatial …

RCPA: An Open-Source R Package for Data Processing, Differential Analysis, Consensus Pathway Analysis, and Visualization

Identifying impacted pathways is important because it provides insights into the biology underlying conditions beyond the detection of …

Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges

Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each …

A novel approach for predicting upstream regulators (PURE) that affect gene expression

External factors such as exposure to a chemical, drug, or toxicant (CDT), or conversely, the lack of certain chemicals can cause many …

Acyltransferase families that act on thioesters: Sequences, structures, and mechanisms

Acyltransferases (AT) are enzymes that catalyze the transfer of an acyl group to a receptor molecule. This review focuses on ATs that act …

Machine Learning Techniques for Cancer Subtype Discovery and Single-Cell RNA Sequencing Data Analysis

Cancer is an umbrella term that includes a range of disorders, from those that are aggressive and life-threatening to indolent lesions …

Novel Techniques for Single-cell RNA Sequencing Data Imputation and Clustering

Advances in single-cell technologies have shifted genomics research from the analysis of bulk tissues toward a comprehensive …

A robust and accurate single-cell data trajectory inference method using ensemble pseudotime

Background: The advance in single-cell RNA sequencing technology has enhanced the analysis of cell development by profiling …

ViT-DeiT: An Ensemble Model for Breast Cancer Histopathological Images Classification

Breast cancer is the most common cancer in the world and the second most common type of cancer that causes death in women. The timely …

DWEN: A novel method for accurate estimation of cell type compositions from bulk data samples

Advances in single-cell RNA sequencing (scRNAseq) technologies have allowed us to study the heterogeneity of cell populations. The cell …

A comprehensive survey of the approaches for pathway analysis using multi-omics data integration

Pathway analysis has been widely used to detect pathways and functions associated with complex disease phenotypes. The proliferation of …

Mouse genomic associations with in vitro sensitivity to simulated space radiation

Exposure to ionizing radiation is considered by NASA to be a major health hazard for deep space exploration missions. Ionizing …

scCAN: single-cell clustering using autoencoder and network fusion

Unsupervised clustering of single-cell RNA sequencing data (scRNA-seq) is important because it allows us to identify putative cell …

DrGA: cancer driver gene analysis in a simpler manner

Background: To date, cancer is still one of the leading causes of death worldwide, in which the cumulative of genes carrying mutations …

A novel method for single-cell data imputation using subspace regression

Recent advances in biochemistry and single-cell RNA sequencing (scRNA-seq) have allowed us to monitor the biological systems at the …

Provenance documentation to enable explainable and trustworthy AI: A literature review

Recently, artificial intelligence (AI) and machine learning (ML) models have demonstrated remarkable progress with applications …

Identification and Validation of a Novel Three Hub Long Noncoding RNAs With m6A Modification Signature in Low-Grade Gliomas

It has been evident that N6-methyladenosine (m6A)-modified long noncoding RNAs (m6A-lncRNAs) are involved in regulating tumorigenesis, …

Thioesterase enzyme families: Functions, structures, and mechanisms

Thioesterases are enzymes that hydrolyze thioester bonds in numerous biochemical pathways, for example in fatty acid synthesis. This …

Single-cell RNA sequencing data imputation using similarity preserving network

Recent advancements in single-cell RNA sequencing (scRNA-seq) technologies have allowed us to monitor the gene expression of individual …

scIDS: Single-cell Imputation by combining Deep autoencoder neural networks and Subspace regression

Single-cell RNA-sequencing (scRNA-seq) has emerged as a powerful high throughput technique that enables the characterization of …

SMRT: Randomized Data Transformation for Cancer Subtyping and Big Data Analysis

Cancer is an umbrella term that includes a range of disorders, from those that are fast-growing and lethal to indolent lesions with low …

Re-Identification of Patient Subgroups in Uveal Melanoma

Uveal melanoma (UM) is a comparatively rare cancer but requires serious consideration since patients with developing metastatic UM …

Cell-to-cell and type-to-type heterogeneity of signaling networks: insights from the crowd

Recent technological developments allow us to measure the status of dozens of proteins in individual cells. This opens the way to …

CPA: a web-based platform for consensus pathway analysis and interactive visualization

In molecular biology and genetics, there is a large gap between the ease of data collection and our ability to extract knowledge from …

Single-Cell RNA Sequencing Data Imputation Using Deep Neural Network

Recent research in biology has shifted the focus toward single-cell data analysis. The new single-cell technologies have allowed us to …

Fast and precise single-cell data analysis using a hierarchical autoencoder

A primary challenge in single-cell RNA sequencing (scRNA-seq) studies comes from the massive amount of data and the excess noise level. …

Disease subtyping using community detection from consensus networks

Cancer is a complex disease including a range of disorders that are activated simultaneously by multiple biological processes on …

Analysis of Short-read Aligners using Genome Sequence Complexity

Next generation sequencing technologies have the capability to provide large numbers of short reads inexpensively and accurately. …

Multi-Omics Analysis Detects Novel Prognostic Subgroups of Breast Cancer

The unprecedented proliferation of recent large-scale and multi-omics databases of cancers has given us many new insights into genomic …

A comprehensive survey of regulatory network inference methods using single-cell RNA sequencing data

A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living …

A Novel Method for Cancer Subtyping and Risk Prediction Using Consensus Factor Analysis

Cancer is an umbrella term that includes a range of disorders, from those that are fast-growing and lethal to indolent lesions with low …

NBIA: a network-based integrative analysis framework – applied to pathway analysis

With the explosion of high-throughput data, effective integrative analyses are needed to decipher the knowledge accumulated in …

Integrated Cancer Subtyping using Heterogeneous Genome-Scale Molecular Datasets

Vast repositories of heterogeneous data from existing sources present unique opportunities. Taken individually, each of the datasets …

RIA: a novel Regression-based Imputation Approach for single-cell RNA sequencing

Advances in single-cell technologies have shifted genomics research from the analysis of bulk tissues toward a comprehensive …

Identifying Significantly Impacted Pathways: A Comprehensive Review and Assessment

Background: Many high-throughput experiments compare two phenotypes such as disease vs. healthy, with the goal of understanding the …

GSMA: an approach to identify robust global and test Gene Signatures using Meta-Analysis

Motivation: Recent advances in biomedical research have made massive amounts of transcriptomic data available in public repositories …

R Tutorial: Detection of Differentially Interacting Chromatin Regions From Multiple Hi-C Datasets

The three‐dimensional (3D) interactions of chromatin regulate cell‐type‐specific gene expression, recombination, X‐chromosome …

Functional analysis tools for post‐translational modification: a post‐translational modification database for analysis of proteins and metabolic pathways

Post‐translational modifications (PTMs) are critical regulators of protein function, and nearly 200 different types of PTM have been …

Statistical Software

This article discusses selected statistical software, aiming to help readers find the right tool for their needs. We categorize …

A comprehensive survey of tools and software for active subnetwork identification

A recent focus of computational biology has been to integrate the complementary information available in molecular profiles as well as …

Orthogonal approach to integrate independent omic data

Methods and devices for integrating a plurality of data types are provided. The methods include obtaining, via a processor, a plurality …

A Multi-Cohort and Multi-Omics Meta-Analysis Framework to Identify Network-Based Gene Signatures

Although massive amounts of condition-specific molecular profiles are being accumulated in public repositories every day, meaningful …

Robust Fuzzy Cluster Ensemble on Cancer Gene Expression Data

Noise remains a particularly challenging and ubiquitous problem in cancer gene expression data clustering research, which may cause …

MIA: Multi-cohort Integrated Analysis for Biomarker Identification

Advanced high-throughput technologies have produced vast amounts of biological data. Data integration is the key to obtain the power …

PINSPlus: A tool for tumor subtype discovery in integrated genomic data

Since cancer is a heterogeneous disease, tumor subtyping is crucial for improved treatment and prognosis. We have developed a subtype …

Network‐Based Approaches for Pathway Level Analysis

Identification of impacted pathways is an important problem because it allows us to gain insights into the underlying biology beyond …

A survey of the approaches for identifying differential methylation using bisulfite sequencing data

DNA methylation is an important epigenetic mechanism that plays a crucial role in cellular regulatory systems. Recent advancements in …

A Systems Biology Approach for Unsupervised Clustering of High-Dimensional Data

One main challenge in modern medicine is the discovery of molecular disease subtypes characterized by relevant clinical differences, …

MicroRNA-augmented pathways (mirAP) and their applications to pathway analysis and disease subtyping

MicroRNAs play important roles in the development of many complex diseases. Because of their importance, the analysis of signaling …

TOMAS: A novel TOpology-aware Meta-Analysis approach applied to System biology

With the explosion of high-throughput data, an effective integrative analysis is needed to decipher the knowledge accumulated in …

PINS: A Perturbation Clustering Approach for Data Integration and Disease Subtyping

Disease subtyping is accomplished by a computer-implemented algorithm that manipulates a first genetic dataset to construct a set of …

A novel approach for data integration and disease subtyping

Advances in high-throughput technologies allow for measurements of many types of omics data, yet the meaningful integration of several …

Overcoming the matched-sample bottleneck: an orthogonal approach to integrate omic data

MicroRNAs (miRNAs) are small non-coding RNA molecules whose primary function is to regulate the expression of gene products via …

DANUBE: Data-Driven Meta-ANalysis Using UnBiased Empirical Distributions—Applied to Biological Pathway Analysis

Identifying the pathways and mechanisms that are significantly impacted in a given phenotype is challenging. Issues include patient …

A novel bi-level meta-analysis approach: applied to biological pathway analysis

Motivation: The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of …

MarkovBin: An Algorithm to Cluster Metagenomic Reads Using a Mixture Modeling of Hierarchical Distributions

Metagenomics is the study of genomic content of microorganisms from environmental samples without isolation and cultivation. Recently …

QSEA for fuzzy subgraph querying of KEGG pathways

As biological pathway databases continually increase in size and availability, efficient tools and techniques to query these databases …

SPATA: A seeding and patching algorithm for de novo transcriptome assembly

RNA-seq reads are sampled from the underlying human transcriptome sequence, consisting of hundreds of thousands of mRNA transcripts. De …

SPATA: A highly accurate GUI tool for de novo transcriptome assembly

Transcript quantification using RNA-seq is central to contemporary and future transcriptomics research. The existing tools are useful …

iQuant: A fast yet accurate GUI tool for transcript quantification

Transcript quantification using RNA-seq is central to contemporary and future transcriptomics research. The existing tools are useful …