Professor of Applied and Computational Mathematics and Statistics
My research focuses on developing statistical and computational methods for analyzing big data from biology, transcriptomics, proteomics, biochemistry, biophysics, and other fields. In particular, I am interested in: (1) modeling the data: which statistical distribution or model best describes the data; (2) testing the data: which factors or predictors are relevant to the outcome or differ between groups; and (3) mining the data: using existing data mining tools such as linear regression and random forests, or developing new tools such as deep neural networks, to classify, cluster, or reduce the dimension of the data. I am very interested in collaborating with researchers in various fields and helping them solve challenging problems with real data.
1. "DiPhiSeq: Robust comparison of expression levels on RNA-Seq data with large sample sizes" Li, J.; Lamere, A.T. Bioinformatics. 2019, bty952.
2. "A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data" Barron, M.; Zhang, S.; Li, J. Nucleic Acids Res. 2018, 46, e14.
3. "GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison" Faisal, F.E.; Newaz, K.; Chaney, J.L.; Li, J.; Emrich, S.J.; Clark, P.L.; Milenkovic, T. Sci Rep. 2017, 7, 14890.
4. "Widespread position-specific conservation of synonymous rare codons within coding sequences" Chaney, J.L.; Steele, A.; Carmichael, R.; Rodriguez, A.; Specht, A.T.; Ngo, K.; Li, J.; Emrich, S.; Clark, P.L. PLoS Comput Biol. 2017, 13, e1005531.