Eugene Katsevich

Assistant Professor of Statistics and Data Science

Contact Information

Primary Email:
ekatsevi@wharton.upenn.edu

office Address:
311 Academic Research Building
265 South 37th Street
Philadelphia, PA 19104

Research Interests: high-dimensional variable selection, multiple testing, applications to genetics and genomics

Links: Personal Website

Research

Eugene Katsevich and Aaditya Ramdas (2020), Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings, Annals of Statistics, (to appear) ().
Eugene Katsevich and Aaditya Ramdas (Working), A theoretical treatment of conditional independence testing under Model-X.
Eugene Katsevich and Aaditya Ramdas (Working), The leave-one-covariate-out conditional randomization test.
Eugene Katsevich, Chiara Sabatti, Marina Bogomolov (Working), Filtering the rejection set while preserving false discovery rate control.
Eugene Katsevich and Chiara Sabatti (2020), Multilayer knockoff filter: Controlled variable selection at multiple resolutions, Annals of Applied Statistics, 13 (1), pp. 1-33.
Matteo Sesia, Eugene Katsevich, Stephen Bates, Emmanuel Candes, Chiara Sabatti (2020), Multi-resolution localization of causal variants across the genome, Nature Communications, 11:1093 ().
Junjie Zhu, Qian Zhu, Eugene Katsevich, Chiara Sabatti (2019), Exploratory gene ontology analysis with interactive visualization, Nature Scientific Reports, 9:7793 ().
Joakim Anden, Eugene Katsevich, Amit Singer (2015), Covariance estimation using conjugate gradient for 3D classification in cryo-EM, Proceedings of the IEEE International Symposium on Biomedical Imaging.
Eugene Katsevich, Alexander Katsevich, Amit Singer (2015), Covariance matrix estimation for the cryo-EM heterogeneity problem, SIAM Journal on Imaging Sciences, 8 (1), pp. 126-185.
Bibo Shi, Eugene Katsevich, Be-Shan Chiang, Alexander Katsevich, Alexander Zamyatin (2014), Image registration for motion estimation in cardiac CT, Proceedings of SPIE Medical Imaging, 9033 ().

Teaching

All Courses

AMCS5999 - Independent Study
Independent Study allows students to pursue academic interests not available in regularly offered courses. Students must consult with their academic advisor to formulate a project directly related to the student’s research interests. All independent study courses are subject to the approval of the AMCS Graduate Group Chair.
AMCS9999 - Ind Study & Research
Study under the direction of a faculty member.
STAT4710 - Modern Data Mining
With the advent of the internet age, data are being collected at unprecedented scale in almost all realms of life, including business, science, politics, and healthcare. Data mining—the automated extraction of actionable insights from data—has revolutionized each of these realms in the 21st century. The objective of the course is to teach students the core data mining skills of exploratory data analysis, selecting an appropriate statistical methodology, applying the methodology to the data, and interpreting the results. The course will cover a variety of data mining methods including linear and logistic regression, penalized regression (including lasso and ridge regression), tree-based methods (including random forests and boosting), and deep learning. Students will learn the conceptual basis of these methods as well as how to apply them to real data using the programming language R. This course may be taken concurrently with the prerequisite with instructor permission.
STAT5710 - Modern Data Mining
Modern Data Mining: Statistics or Data Science has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression technique continue to be the major tools we go beyond to include methods built on top of linear models such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbor), Random Forest, Support Vector Machines, Principal Component Analyses (PCA), the bootstrap and others are also covered. Text mining especially through PCA is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets but we also learn how to use the free, powerful software "R" in connection with each of the methods exposed in the class. Prerequisite: two courses at the statistics 4000 or 5000 level or permission from instructor.
STAT9610 - Statistical Methodology
This is a course that prepares 1st year PhD students in statistics for a research career. This is not an applied statistics course. Topics covered include: linear models and their high-dimensional geometry, statistical inference illustrated with linear models, diagnostics for linear models, bootstrap and permutation inference, principal component analysis, smoothing and cross-validation.
STAT9910 - Sem in Adv Appl of Stat
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advance probability, statistical inference, robust methods, and decision theory with principal emphasis on applications.
STAT9950 - Dissertation
Dissertation

Activity

Latest Research

Eugene Katsevich and Aaditya Ramdas (2020), Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings, Annals of Statistics, (to appear) ().

All Research

Wharton Stories

Large blue banners with white letters spell out

Get to Know the 20 New Faculty Members Joining Wharton This Year

This upcoming academic year, the Wharton School will welcome 20 new faculty members. These brilliant minds are leading experts in a wide range of fields, including business, social science, finance, economics, public policy, management, marketing, statistics, real estate, and operations. One of the most exciting additions to the Wharton community…

Wharton Stories - 08/17/2020

All Stories

Eugene Katsevich

Contact Information

Overview

Education

Academic Positions Held

Research

Teaching

All Courses

AMCS5999 - Independent Study

AMCS9999 - Ind Study & Research

STAT4710 - Modern Data Mining

STAT5710 - Modern Data Mining

STAT9610 - Statistical Methodology

STAT9910 - Sem in Adv Appl of Stat

STAT9950 - Dissertation

Awards and Honors

Activity

Latest Research

Wharton Stories