Associate Professor of Applied and Computational Mathematics and Statistics
My research focuses on developing statistical and computational methods for analyzing big data from biology, transcriptomics, proteomics, biochemistry, biophysics, and other fields. In particular, I am interested in: (1) modeling the data: determining which statistical distribution or model best describes the data; (2) testing the data: identifying which factors or predictors are relevant to the outcome or differ between groups; and (3) mining the data: using existing data mining tools such as linear regression and random forests, or developing new tools such as deep neural networks, to classify, cluster, or reduce the dimension of the data. I am very interested in collaborating with researchers in various fields and helping them solve challenging problems with real data.
1. "Widespread position-specific conservation of synonymous rare codons within coding sequences" Chaney, J.L.; Steele, A.; Carmichael, R.; Rodriguez, A.; Specht, A.T.; Ngo, K.; Li, J.; Emrich, S.; Clark, P.L. PLoS Comput Biol. 2017, 13, e1005531.
2. "GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison" Faisal, F.E.; Newaz, K.; Chaney, J.L.; Li, J.; Emrich, S.J.; Clark, P.L.; Milenkovic, T. Sci Rep. 2017, 7, 14890.
3. "A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data" Barron, M.; Zhang, S.; Li, J. Nucleic Acids Res. 2018, 46, e14.
4. "DiPhiSeq: Robust comparison of expression levels on RNA-Seq data with large sample sizes" Li, J.; Lamere, A.T. Bioinformatics. 2019, bty952.