Donald Geman




FALL 2012

Computational Molecular Medicine, 550.450

Biomedical research has been transformed by the development of new technologies for sequencing genomes and measuring RNA and protein expression levels. Due to the massive number of interacting components, the traditional approach, which is experimental and component-by-component, is no longer adequate. In contrast, statistical learning, modeling and inference have emerged as core methodologies for analyzing these data and uncovering the relationships between molecules, networks and disease, where knowledge extraction is formulated as a problem in high-dimensional pattern recognition. We will cover selected aspects of this methodology (e.g., measuring associations, testing multiple hypotheses, learning predictors and network models, and stochastic simulation) and illustrate how it enhances our ability to discover molecular disease networks, detect disease, predict clinical outcomes, and characterize disease progression.

FALL 2011

Topics in Bioinformatics, 550.635

A "readings" course organized around research articles in the recent computational biology literature. This term, the topics covered will include: inferring phenotype from genotype based on gene microarray data; discovering gene regulatory patterns and networks from sequence and expression data; predicting active sites and detecting harmful mutations in proteins; and stochastic modeling of carcinogenesis. One major objective is to prepare students to comfortably read articles that involve extensive mathematical and statistical modeling as well as techniques from pattern recognition and machine learning. The papers will be presented by the students. However, all student expositions will be preceded by comprehensive "tutorials" by the instructor on the various "theoretical" issues in learning, modeling and inference required for understanding the papers, such as performance metrics, properly estimating generalization error, over-fitting, statistical genetics, graphical models (e.g., Bayesian networks and hidden Markov models), classification algorithms (e.g., SVMs) and stochastic simulation.

FALL 2009

Statistical Learning with Applications, 550.435

Statistical modeling and inference, inductive learning and information theory together provide a cohesive framework for machine perception, which amounts to building a data-description machine converting physical measurements (images, molecular counts, etc.) to interpretations or descriptions. Recurring themes include quantifying uncertainty, estimating generalization error, model complexity, the bias/variance dilemma, small-sample learning and estimating interactions. Various problems in computational vision, speech and biology will be analyzed in this context, including visual tracking, object recognition, language modeling, molecular cancer diagnosis and learning gene networks.