
Young Researchers Workshop — 2024

The 2024 Young Researchers Workshop will be held on Cornell's Ithaca campus Wednesday, October 9 through Friday, October 11!

All participants will need to register for this workshop.

(You will find travel, hotel, and parking information below the schedule and presenter information.)

Schedule


Wednesday 10/9

6:00pm - 9:00pm Reception: Baker Portico, Physical Sciences Building, 245 East Avenue.
Very limited parking onsite.


Thursday 10/10

8:30am - 9:45am Sign-in, Breakfast, and Poster Session 1: Duffield Hall Atrium, 343 Campus Road.

10:00am-11:00am Opening Session (G10 Biotech) Hosted by Adrian Lewis

  • Yuchen Wu—Stochastic Runge-Kutta Methods: Provable Acceleration of Diffusion Models
  • Yu Ma—M3H: Multimodal Multitask Machine Learning for Healthcare

11:00am - 11:30am Coffee Break

11:30am - 1:00pm Optimization I (G10 Biotech) Hosted by Soroosh Shafiee

  • Xin Jiang—Graph sequences with finite-time convergence for decentralized average consensus and its application in distributed optimization
  • Xiaopeng Li—Proximal random reshuffling under local Lipschitz continuity
  • Tianjiao Li—A simple uniformly optimal method without line search for convex optimization
  • Zikai Xiong—Enhancing the Solution Methods for Huge-Scale Linear Programming via Level-Set Geometry

1:00pm - 2:00pm Lunch (Biotech Atrium)

2:00pm - 3:30pm Reinforcement Learning (G10 Biotech) Hosted by Ziv Scully

  • Yuchen Hu—Policy Evaluation in Dynamic Experiments
  • Lucy Huo—Asymptotic product-form steady state for multiclass queueing networks in multi-scale heavy traffic
  • Yashaswini Murthy—Learning and Control in Countable State-Space
  • Ming Yin—Understanding Q*: from sequential decision making to Large language model alignment

3:30pm - 4:00pm Coffee Break

4:00pm - 5:30pm Markets and Decisions (G10 Biotech) Hosted by Paul Gölz 

  • Nicolas Christianson—Reliable AI-Augmented Algorithms for Energy and Sustainability
  • Meena Jagadeesan—Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry
  • Devansh Jalota—Algorithm and Incentive Design for Sustainable Resource Allocation: Beyond Classical Fisher Markets
  • Jiaqi Zhang—Designing and learning from large-scale interventions

6:30pm – 9:00pm Dinner. Ithaca Downtown Conference Center, 116 East Green Street


Friday 10/11

8:30am - 9:30am Breakfast and Poster Session 2: Duffield Hall Atrium, 343 Campus Road

9:30am - 9:45am Walk to Statler Hall, Room 265, 106 Statler Drive
                             
9:45am - 11:00am Optimization II (Statler Hall, Room 265) Hosted by Peter Frazier

  • Yifan Hu—Contextual Stochastic Bilevel Optimization and Three-Stage Stochastic Programming
  • Akshit Kumar—Dynamic Resource Allocation: Algorithmic Design Principles and Spectrum of Achievable Performances
  • Xinyi Chen—A Non-stochastic Control Approach to Optimization

11:00am - 11:30am Coffee Break

11:30am - 12:45pm Causal Inference & Distribution Shift (Statler Hall, Room 265) Hosted by Christina Lee Yu

  • Ayoub Foussoul—Distributionally Robust Newsvendor on a Metric
  • Yihong Gu—Causality Pursuit from Heterogeneous Environments
  • Reese Pathak—Towards optimal learning under distribution shift: new complexity measures and algorithms for tackling covariate shift

1:00pm - 2:00pm Lunch (available “to go”), Duffield Atrium

Session 1 Poster Presenters and Titles
Thursday, October 10, 8:30-9:45am Duffield Hall Atrium

Presenter and Title

Jerry Anunrojwong

Battery Operations in Electricity Markets: Strategic Behavior and Distortions

Manuel Arnese

Convergence of Coordinate Ascent Variational Inference for log-concave measures via optimal transport

Sandeep Chitla

Consumers' Cart-Building Behavior in Online Grocery: Impact of Cart-Level Promotions

Natalie Collina

Algorithmic Collusion Without Threats

Anna Deza

Myungeun Eom

Batching and Greedy Policies: How Good Are They in Dynamic Matching?

Andrei Graur

Sparse Submodular Function Minimization

Feihong Hu

"Uber" Your Cooking: The Sharing-Economy Operations of a Ghost-Kitchen Platform

Marouane Ibn Brahim

Maximum Load Assortment Optimization: Approximation Algorithms and Adaptivity Gaps

Youngseo Kim

Estimate then Predict: Convex Formulation for Travel Demand Forecasting

Thodoris Koukouvinos

A novel algorithm for a broad class of nonconvex optimization problems

Eitan Levin

Any-dimensional optimization

Brian Liu

FASTopt: An Optimization Framework for Fast Additive Segmentation in Transparent ML

Calum MacRury

On Online Contention Resolution Schemes for the Matching Polytope of Graphs

Jason Milionis

A Myersonian Framework for Optimal Liquidity Provision in Automated Market Makers

Naren Manoj

On the Robustness of Spectral Algorithms for Semirandom Stochastic Block Models

Laurel Newman

When Lines Become Contagious: Analyzing the Spread of Infection in the M/M/1 Queue

Sloan Nietert

Robust Decision Making with Local and Global Adversarial Corruptions

Haripriya Pulyassary

Network Flow Problems with Electric Vehicles

Ayush Sekhari 

Learning-Unlearning Schemes

Richard Shapley

An additive approximation for unsplittable multicommodity flow in outerplanar graphs

Devin Smedira

Deterministically Computing Volumes and Integrals in High Dimensions

Asterios Tsiourvas

Overcoming the Optimizer’s Curse: Obtaining Realistic Prescriptions from Neural Networks

Jie Wang

Regularization for Adversarial Robust Learning

Guanghui Wang

Faster Margin Maximization Rates for Generic and Adversarially Robust Optimization Methods

Priyank Agrawal

Optimistic Q-learning for average reward and episodic reinforcement learning

Mallory Gaspard

Monotone Causality in Opportunistically Stochastic Shortest Path Problems

Jaingze Han

Probabilistic design in the operation of service systems

Boya Hou

Nonparametric Sparse Learning of Dynamical Systems

Laixi Shi

Can We Break the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning?

Jonah Botvinick-Greenhouse

Abstracts

Opening Session—Hosted by Adrian Lewis
Thursday, October 10, (10 a.m. — 11 a.m.) (G10 Biotech)

Yuchen Wu—Stochastic Runge-Kutta Methods: Provable Acceleration of Diffusion Models

Diffusion models have become pivotal in the realm of modern generative modeling, achieving state-of-the-art performance across multiple domains. Despite the high sample quality they obtain, diffusion models generally require hundreds to thousands of sequential neural network evaluations, incurring considerably higher computational costs compared to single-step generators like generative adversarial networks (GANs) and variational auto-encoders (VAEs). While various acceleration methods have been proposed, the theoretical foundations for accelerating diffusion models remain largely underexplored. In this talk, I will discuss a training-free acceleration algorithm for SDE-based diffusion samplers based on the stochastic Runge-Kutta method. Our sampler requires $\widetilde O(d^{3/2} / \varepsilon)$ network function evaluations to attain $\varepsilon$ error measured in KL divergence, outperforming the state-of-the-art guarantee $\widetilde O(d^{3} / \varepsilon)$ in terms of dimensional dependency. Numerical simulations validate the efficiency of the proposed method.

Yu Ma—M3H: Multimodal Multitask Machine Learning for Healthcare

Developing an integrated many-to-many framework leveraging multimodal data for multiple tasks is crucial to unifying healthcare applications ranging from diagnoses to operations. In resource-constrained hospital environments, a scalable and unified machine learning framework that improves previous forecast performances could improve hospital operations and save costs. We introduce M3H, an explainable Multimodal Multitask Machine Learning for Healthcare framework that consolidates learning from tabular, time-series, language, and vision data for supervised binary/multiclass classification, regression, and unsupervised clustering. It features a novel attention mechanism balancing self-exploitation (learning source-task), and cross-exploration (learning cross-tasks), and offers explainability through a proposed TIM score, shedding light on the dynamics of task learning interdependencies. M3H encompasses an unprecedented range of medical tasks and machine learning problem classes and consistently outperforms traditional single-task models by on average 11.6% across 40 disease diagnoses from 16 medical departments, three hospital operation forecasts, and one patient phenotyping task. The modular design of the framework ensures its generalizability in data processing, task definition, and rapid model prototyping, making it production ready for both clinical and operational healthcare settings, especially those in constrained environments.

Optimization I—Hosted by Soroosh Shafiee
Thursday, October 10, 11:30 a.m. — 1:00 p.m.  (G10 Biotech)

Xin Jiang—Graph sequences with finite-time convergence for decentralized average consensus and its application in distributed optimization

In decentralized computing, computational resources are abstracted as agents that collaborate to solve a global problem through peer-to-peer interactions. Such a paradigm avoids the communication bottleneck induced by a central coordinator, as arises in distributed computing. A fundamental challenge in decentralized computing is computing the average of the values held by all agents, also referred to as the average consensus problem.

This talk presents an approach to the average consensus problem which leverages the finite-time consensus property of a sequence of graphs. Graph sequences with the finite-time consensus property have the desirable feature that iterating through the entire graph sequence is equivalent to performing global or exact averaging. Prior results were limited to establishing the finite-time consensus property when the total number of agents is a power of two. In contrast, we develop two novel sequences of graphs, the p-peer hyper-cuboids and sequential doubly stochastic (SDS) graphs, that not only satisfy the finite-time consensus property for an arbitrary number of agents but also account for the inter- and intra-cluster communication structure found in high-performance computing resources.

Xiaopeng Li—Proximal random reshuffling under local Lipschitz continuity

We study proximal random reshuffling for minimizing the sum of locally Lipschitz functions and a proper lower semicontinuous convex function without assuming coercivity or the existence of limit points. The algorithmic guarantees pertaining to near approximate stationarity rely on a new tracking lemma linking the iterates to trajectories of conservative fields. One of the novelties in the analysis consists in handling conservative fields with unbounded values.

Tianjiao Li—A simple uniformly optimal method without line search for convex optimization

Line search (or backtracking) procedures have been widely employed in first-order methods for solving convex optimization problems, especially those with unknown problem parameters (e.g., Lipschitz constant). In this work, we show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori. In particular, we present a novel accelerated gradient descent type algorithm called the auto-conditioned fast gradient method (AC-FGM) that can achieve an optimal O(1/k^2) rate of convergence for smooth convex optimization without requiring the estimate of a global Lipschitz constant or the employment of line search procedures. We then extend AC-FGM to solve convex optimization problems with Hölder continuous gradients and show that it automatically achieves the optimal rates of convergence uniformly for all problem classes with the desired accuracy of the solution as the only input. Finally, we report some encouraging numerical results that demonstrate the advantages of AC-FGM over the previously developed parameter-free methods for convex optimization.

Zikai Xiong—Enhancing the Solution Methods for Huge-Scale Linear Programming via Level-Set Geometry

Just in the last several years, we have witnessed a dramatic shift in the way linear programs (LPs) are actually solved -- the classic methods (simplex method and interior-point methods) are being replaced by the primal-dual hybrid gradient method (PDHG) for solving huge-scale LP problems. PDHG, with heuristic enhancements and GPU implementation, has been very successful in solving huge-scale LP problems; however, its performance can have substantial variability, and an intuitive understanding of the drivers of its performance has been lacking. In this talk I will present new results on both the theoretical foundation and the practical performance of PDHG. Furthermore, these results extend to general conic convex optimization. Our analysis shows that the geometry of the primal-dual (sub-)level sets plays a critical role in the performance of PDHG -- both in theory and practice. This is both good and bad, because it means that unfavorable geometry of some instances leads to poor performance of PDHG both in theory and practice. To address this issue, we show how to construct central-path-based linear transformations -- including conic rescaling -- that can markedly enhance the convergence of PDHG by changing the geometry. Last of all, we present computational results that demonstrate how our rescalings accelerate convergence to high-accuracy solutions and lead to more efficient methods for huge-scale LP problems.

Reinforcement Learning —Hosted by Ziv Scully
Thursday, October 10, 2:00 p.m. — 3:30 p.m.  (G10 Biotech)

Yuchen Hu—Policy Evaluation in Dynamic Experiments

Experiments where treatment assignment varies over time, such as micro-randomized trials and switchback experiments, are essential for guiding dynamic decisions. These experiments often exhibit nonstationarity due to factors like hidden states or unstable environments, posing substantial challenges for accurate policy evaluation.

In this talk, I will discuss how mixing assumptions inform the design of simple yet robust methodologies for policy evaluation in Partially Observed Markov Decision Processes (POMDPs). In the first part of the talk, I will discuss the properties of switchback experiments in finite-sample, non-stationary dynamic systems. We find that, in this setting, standard switchback designs suffer considerably from carryover bias, but judicious use of burn-in periods can considerably improve the situation and enable errors that decay nearly at the parametric rate. In the second part of the talk, I will discuss policy evaluation in micro-randomized experiments and provide further theoretical grounding on mixing-based policy evaluation methodologies. Under a sequential ignorability assumption, we provide rate-matching upper and lower bounds that sharply characterize the hardness of off-policy evaluation in POMDPs. These findings demonstrate the promise of using stochastic modeling techniques to enhance tools for causal inference. Our formal results are mirrored in empirical evaluations based on ride-sharing and mobile health simulators.

Lucy Huo—Asymptotic product-form steady state for multiclass queueing networks in multi-scale heavy traffic

Multiclass queueing networks are essential models for data centers and complex service systems. In this work, we study their stationary distribution under static buffer priority service policies in a multi-scale heavy traffic regime. We prove that the scaled queue lengths converge weakly to a product-form limit, whose components are independent and each component follows an exponential distribution. This closed-form result not only advances the theoretical understanding of multi-class networks, but also provides practical insights into the design and optimization of service policies. 

Yashaswini Murthy—Learning and Control in Countable State-Spaces

We consider policy optimization methods in reinforcement learning settings where the state space is arbitrarily large, or even countably infinite. The motivation arises from control problems in communication networks, matching markets, and other queueing systems. Specifically, we consider the popular Natural Policy Gradient (NPG) algorithm, which has been studied in the past only under the assumption that the cost is bounded and the state space is finite, neither of which holds for the aforementioned control problems. Assuming a Lyapunov drift condition, which is naturally satisfied in some cases and can be satisfied in other cases at a small cost in performance, we design a state-dependent step-size rule which dramatically improves the performance of NPG for our intended applications. In addition to experimentally verifying the performance improvement, we also theoretically show that the iteration complexity of NPG can be made independent of the size of the state space. The key analytical tool we use is the connection between NPG stepsizes and the solution to Poisson's equation. In particular, we provide policy-independent bounds on the solution to Poisson's equation, which are then used to guide the choice of NPG stepsizes.

Ming Yin—Understanding Q*: from sequential decision making to Large language model alignment

Sequential decision-making problems are ubiquitous in everyday life, and different problems have different levels of hardness for making good decisions. Given a problem at hand, understanding its “hardness” is vital for deploying strategies. In the first part of this talk, I will provide a fundamental characterization of offline learning hardness, which we termed the “intrinsic learning bound,” when the sequential decision-making problem has discrete state and action spaces. To this end, we design a fine-grained pessimistic uncertainty estimation that adapts to individual instances and uses sharp statistical analysis tools to recover the hardness of the learning problems. Our result explains the major sources of hardness in learning: distribution shift (between data distribution and optimal distribution) and environmental variation (over Q*). Due to its generic form, we believe the intrinsic bound could help illuminate what makes a specific problem hard and reveal the fundamental challenges in offline RL.

In the second part of the talk, I will cast the Large Language Model alignment problem as a sequential decision-making problem, and design principled decoding by leveraging an implicitly estimated Q* function. One key feature of our algorithm is that we do not need training for alignment. Compared to other training-free LLM alignment methods, our method provides the best performance, and can easily transfer to other similar but different alignment tasks.

Markets and Decisions—Hosted by Paul Gölz
Thursday, October 10, 4:00 p.m. — 5:30 p.m. (G10 Biotech)

Nicolas Christianson—Reliable AI-Augmented Algorithms for Energy and Sustainability

Modern AI and ML algorithms can deliver transformative performance improvements for decision-making under uncertainty, where traditional, worst-case algorithms are often too conservative. This can potentially be transformative for energy and sustainability applications, where rapid improvements are crucial to facilitate the energy transition and reduce carbon emissions. However, AI and ML lack worst-case guarantees, hindering their deployment to real-world problems where safety and reliability are critical. In this talk, I will discuss recent work on developing AI-augmented algorithms with provable performance guarantees and their application across the domains of energy systems and sustainable computing. In particular, I will discuss progress on the question of designing optimal algorithms integrating black-box AI “advice” and classic, worst-case strategies in general online optimization problems, while highlighting recent steps toward leveraging uncertainty quantification for risk-aware decision-making in these settings.

Meena Jagadeesan—Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry

Emerging marketplaces for large language models and other large-scale machine learning (ML) models appear to exhibit market concentration, which has raised concerns about whether there are insurmountable barriers to entry in such markets. In this work, we study this issue from both an economic and an algorithmic point of view, focusing on a phenomenon that reduces barriers to entry. Specifically, an incumbent company risks reputational damage unless its model is sufficiently aligned with safety objectives, whereas a new company can more easily avoid reputational damage. To study this issue formally, we define a multi-objective high-dimensional regression framework that captures reputational damage, and we characterize the number of data points that a new company needs to enter the market. Our results demonstrate how multi-objective considerations can fundamentally reduce barriers to entry -- the required number of data points can be significantly smaller than the incumbent company's dataset size. En route to proving these results, we develop scaling laws for high-dimensional linear regression in multi-objective environments, showing that the scaling rate becomes slower when the dataset size is large, which could be of independent interest.

Devansh Jalota—Algorithm and Incentive Design for Sustainable Resource Allocation: Beyond Classical Fisher Markets

Technological advances have opened new avenues for designing market mechanisms for resource allocation, from enhancing resource allocation efficiency with widespread data availability to enabling real-time algorithm implementation. While these technological advancements hold significant promise, they also introduce new societal challenges pertaining to equity, privacy, data uncertainty, and security that existing market mechanisms often fail to address. My research develops data-driven and online learning algorithms and incentive schemes to address these challenges of traditional market mechanisms, thereby advancing the science and practice of market design for sustainable and society-aware resource allocation.

In this talk, I focus on addressing data uncertainty and privacy issues in the context of Fisher markets, a classical framework for fair resource allocation where the problem of computing equilibrium prices relies on complete information of user attributes, which are typically unavailable in practice. Motivated by this practical limitation, we study a modified online incomplete information variant of Fisher markets, where users with privately known utility and budget parameters, drawn i.i.d. from a distribution, arrive sequentially. In this novel market, we establish the limitations of static pricing and design dynamic posted-price algorithms with improved guarantees. Our main result is a posted-price algorithm that solely relies on revealed preference (RP) feedback, i.e., observations of user consumption, achieving the best-known guarantees for first-order algorithms in the RP setting while providing a regret analysis of a fairness-promoting logarithmic objective, unlike typical non-negative and bounded efficiency-promoting objectives in online learning.

Jiaqi Zhang—Designing and learning from large-scale interventions

Recent technological advances have made it possible to perform and measure the effects of external interventions in many fields. For example, it is now possible to target and perturb single or combinatorial genes in living cells. These provide a unique opportunity to study underlying causal mechanisms behind complex systems, such as the regulatory mechanisms behind cell states. In our work, we focus on three key aspects that have arisen from these new capabilities: (1) how to define and learn the causal programs governing high-dimensional or perceptual data; (2) how to model interventional effects in scenarios where the measurements are incomplete; and (3) how to design the next experiments if one is interested in achieving a desired state. For (1), we establish new causal disentanglement theories that guarantee identifiability of the underlying causal programs, given sufficient samples and regularizing conditions. For (2), we develop a scalable algorithm that models the interventional effect using the discrepancy between distributions. It can capture nuanced changes at the sample level and extrapolate to identify non-additive combinatorial perturbations. Lastly, for (3), we discuss our proposals borrowing ideas from extreme event theory for finding desirable interventions more efficiently.

Session 2 Poster Presenters and Titles
Friday, October 11, 8:30-9:45am Duffield Hall Atrium

Presenter and Title

Matheus Jun Ota

Benders cuts via corner polyhedra: an application to the stochastic vehicle routing problem

Marios Papachristou

Optimal Resource Allocation for Remediating Networked Contagions

Vrishabh Patil

Addressing Estimation Errors on Expected Asset Returns through Robust Portfolio Optimization

Ali Shirali

Allocation Requires Prediction Only if Inequality is Low

Man Yiu Tsang

On the Trade-off Between Distributional Belief and Ambiguity: Conservatism, Finite-Sample Guarantees, and Asymptotic Properties

Kabir Verchand

Sharp guarantees for iterative nonconvex optimization with random data

Chonghuan Wang

Adaptive Experimental Design In Operations

Qi Wang

Stochastic Constrained Optimization

Zhuoyu Xiao

Synchronous and Asynchronous Gradient-Response Schemes for Computing Quasi-Nash Equilibria under Uncertainty

Kunhe Yang

Platforms for Efficient and Incentive-Aware Collaboration

Wen Yun

Vague Pricing Optimization

Anthony Karahalios

Column Elimination

Yoav Kolumbus

Tu Ni

Abdellah Aznag

An active learning framework for multi-group mean estimation

Qian Xie

Cost-aware Bayesian optimization via the Pandora's Box Gittins index

Yifang Chen

Algorithmic data efficient learning in the era of large model

Rares Cristian

Robust End-to-End Learning under Endogenous Uncertainty

Haiyun He

Information-Theoretic Generalization Bounds for Deep Neural Networks

Vikas Deep

Asymptotically Optimal Adaptive A/B tests for Average Treatment Effect

Zhicheng Guo

Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning

Su Jia

Sammy Khalife

Sample Complexity of Data-Driven Algorithm Design using Neural Networks

Yueying Li

Sustainable LLM Lifecycle

Varun Suriyanarayana

Online Generalized Flow and Load Balancing with recourse

Zongyi Li

Neural operator for partial differential equations

Zhou Lu

When is Inductive Inference Possible?

Jennifer Sun

Online Control in Population Dynamics

Yige Hong

Unichain and Aperiodicity are Sufficient for Asymptotic Optimality of Average-Reward Restless Bandits

Taylan Kargin

Infinite-Horizon Distributionally Robust Regret-Optimal Control

Minda Zhao

Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action

Abstracts

Optimization II—Hosted by Peter Frazier
Friday, October 11, 9:45 a.m.— 11:00 a.m. (Statler Hall, Room 265)

Yifan Hu—Contextual Stochastic Bilevel Optimization and Three-Stage Stochastic Programming

We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization to settings where the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information, and where there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. Extending to three-stage stochastic programming, our results break the long-standing belief that three-stage stochastic programming is harder than classical stochastic optimization, and open up new directions for algorithmic design for three-stage problems.

Akshit Kumar—Dynamic Resource Allocation: Algorithmic Design Principles and Spectrum of Achievable Performances

We consider a broad class of dynamic resource allocation problems, and study the performance of practical algorithms. In particular, we focus on the interplay between the distribution of request types and achievable performance, given the broad set of configurations that can be encountered in practical settings. While prior literature studied either a small number of request types or a continuum of types with no gaps, the present work allows for a large class of type distributions. Using initially the prototypical multi-secretary problem to explore fundamental performance limits as a function of type distribution properties, we develop a new algorithmic property “conservativeness with respect to gaps,” that guarantees near-optimal performance. In turn, we introduce a practically-motivated, simulation-based algorithm, and establish its near-optimal performance, not only for multi-secretary problems, but also for general dynamic resource allocation problems.

Xinyi Chen—A Non-stochastic Control Approach to Optimization

Selecting the best hyperparameters for a particular optimization instance, such as the learning rate and momentum, is an important but nonconvex problem. As a result, iterative optimization methods such as hypergradient descent lack global optimality guarantees in general. We propose an online nonstochastic control methodology for mathematical optimization. First, we formalize the setting of meta-optimization, an online learning formulation of learning the best optimization algorithm from a class of methods. The meta-optimization problem over gradient-based methods can be framed as a feedback control problem over the choice of hyperparameters, including the learning rate, momentum, and the preconditioner. We show how recent methods from online nonstochastic control can be applied to develop a convex relaxation, and obtain regret guarantees vs. the best offline solution. This guarantees that in meta-optimization, given a sequence of optimization problems, we can learn a method with performance comparable to that of the best method in hindsight from a class of methods. We end with experiments on a variety of tasks, from regression to deep neural network training, that demonstrate the practical effectiveness of our method.

Causal Inference and Distribution Shift—Hosted by Christina Lee Yu
Friday, October 11, 11:30 a.m. — 12:45 p.m. (Statler Hall, Room 265)

Ayoub Foussoul—Distributionally Robust Newsvendor on a Metric

We study the distributionally robust newsvendor on a metric problem, a fundamental generalization of the distributionally robust newsvendor problem of Scarf (1957), where the decision maker needs to jointly determine the inventory levels at multiple locations on a metric and design an online fulfillment policy for the uncertain demand that realizes sequentially over time. The goal is to minimize the total expected inventory and fulfillment costs. We design a near-optimal policy for the problem with theoretical guarantees on its performance. Our policy generalizes the classical solution of Scarf (1957), maintaining its simplicity and interpretability: it identifies a hierarchical set of clusters, assigns a “virtual” underage cost to each cluster, then makes sure that each cluster holds at least the inventory suggested by Scarf’s solution if the cluster behaved as a single point with “virtual” underage cost and original overage cost. As demand arrives sequentially, our policy fulfills orders from nearby clusters, minimizing transshipment costs, while balancing inventory consumption across the clusters to avoid depleting any single one. In addition to its theoretical performance, numerical experiments show that our policy performs well in practice.

Yihong Gu—Causality Pursuit from Heterogeneous Environments

Pursuing causality from data is a fundamental problem in scientific discovery, treatment intervention, and transfer learning. This paper introduces a novel algorithmic method for addressing nonparametric invariance and causality learning in regression models across multiple environments, where the joint distribution of response variables and covariates varies, but the conditional expectations of outcome given an unknown set of quasi-causal variables are invariant. The challenge of finding such an unknown set of quasi-causal or invariant variables is compounded by the presence of endogenous variables that have heterogeneous effects across different environments; including even one of them in the regression would make the estimation inconsistent. The proposed Focused Adversarial Invariant Regularization (FAIR) framework utilizes an innovative minimax optimization approach that breaks down these barriers, driving regression models toward prediction-invariant solutions through adversarial testing. Leveraging the representation power of neural networks, FAIR neural networks (FAIR-NN) are introduced for causality pursuit. It is shown that FAIR-NN can find the invariant variables and quasi-causal variables under a minimal identification condition and that the resulting procedure is adaptive to low-dimensional composition structures in a non-asymptotic analysis. Under a structural causal model, variables identified by FAIR-NN represent pragmatic causality and provably align with exact causal mechanisms under conditions of sufficient heterogeneity. Computationally, FAIR-NN employs a novel Gumbel approximation with decreasing temperature and a stochastic gradient descent-ascent algorithm. The procedures are convincingly demonstrated using simulated and real-data examples.
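The Gumbel approximation mentioned at the end is a standard device for making discrete variable selection differentiable: each candidate variable gets a relaxed Bernoulli "gate," and the relaxation temperature is annealed toward zero so gates approach hard 0/1 selections. A minimal, self-contained sketch of that device alone (the logit value and sample sizes are illustrative, and none of FAIR-NN's minimax training is reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_sigmoid_gate(logit, tau, rng):
    """Differentiable relaxed Bernoulli sample: perturb the logit with
    Logistic(0, 1) noise, then squash at temperature tau."""
    u = rng.uniform(1e-6, 1 - 1e-6)
    noise = np.log(u) - np.log(1.0 - u)          # Logistic(0, 1) sample
    arg = np.clip((logit + noise) / tau, -50, 50)  # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-arg))

logit = 2.0                                      # gate favors "select"
for tau in [1.0, 0.5, 0.1, 0.01]:                # decreasing temperature
    samples = [gumbel_sigmoid_gate(logit, tau, rng) for _ in range(1000)]
    frac_on = np.mean(np.array(samples) > 0.5)
    print(f"tau={tau}: fraction of gates > 0.5 = {frac_on:.3f}")
```

At high temperature the gates are soft values spread over (0, 1); as tau shrinks they concentrate near 0 or 1 while the selection probability stays governed by the logit, which is what lets gradients flow through an otherwise discrete subset-selection step.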

Reese Pathak—Towards optimal learning under distribution shift: new complexity measures and algorithms for tackling covariate shift

Common machine learning practice implicitly assumes that training data will resemble future data. However, this assumption is often violated, as seen in evolving e-commerce trends, shifting patient populations in healthcare, and changing environments in autonomous driving. Ignoring such distribution shifts can lead to alarming consequences like misguided recommendations, ineffective medical treatments, or risky maneuvers in self-driving cars. In this talk, we describe new advances in learning under a form of distribution shift known as covariate shift. Our main contribution is establishing the (minimax) statistical complexity of prediction and estimation under covariate shift, within the rich framework of reproducing kernel Hilbert spaces (RKHSs). A key tool we develop is a novel, semidefinite programming (SDP)-based complexity measure through which we determine the optimal statistical rate. Our analysis provides concrete recommendations for new, rate-optimal procedures based on a form of distribution-dependent shrinkage. Unlike previous results in this area, our work is comparatively fine-grained and assumption-light: we impose essentially no restrictions on covariate distributions and require neither the existence nor boundedness of likelihood ratios, thereby broadening the applicability of our theory.

Poster Presenter Information

Poster sessions will be held 8:30am-9:45am Thursday, Oct 10 (with sign-in) and 8:30am-9:30am Friday, Oct 11 in the atrium of Duffield Hall (343 Campus Rd, map here), alongside breakfast (provided). You can either bring a physical copy of your poster with you, or bring a digital copy on a USB drive and pay to print locally at Cornell University’s Mann Library. Posters should be at most 36" x 48".


Transportation in and around Ithaca

Tompkins County Area Transit (TCAT) serves the greater Ithaca area and the Cornell campus. TCAT has a free app you can use to find route information as well as to pay fares. You can learn more and download the TCAT app here.

Lyft and Uber are also available.

Parking on campus

Parking rules and regulations are in effect Mon-Fri 7:30am-5:00pm, and some locations have posted night and weekend restrictions as well. If you would like to park on campus, you can pay for parking through the Parkmobile app on your phone. Short-term parking options are also available.


Travel Information

Airports and Buses

Ithaca Airport: The closest option for flying is the Ithaca airport. It is only three miles from the Cornell campus and an easy taxi, Uber, or Lyft ride away. Many hotels provide shuttle service to and from the airport as well as around town.

Syracuse Airport: While Syracuse has many more flight options, it is a bit over an hour by Lyft/Uber from Syracuse to Ithaca. Alternatively, you can rent a car via this link.

Other nearby airports (less than an hour drive) are Elmira and Binghamton. Both are quite small.

NYC Airports: The last option is flying into any of the major NYC airports and taking a bus to Ithaca. This is the longest route, as most buses from NYC to Ithaca take about 4.5 hours. Some bus options include:

- Cornell Campus to Campus: The most comfortable option; it leaves from the Cornell Club in midtown NYC and drops you off right on the Cornell campus in Ithaca.

- OurBus: Offers departures from either the George Washington Bridge or Port Authority to downtown Ithaca. From downtown it is a short cab ride to Cornell.

Ithaca Hotels:

Cayuga Blu Hotel

2310 N Triphammer Rd, Ithaca, NY 14850•(607) 257-3100

Courtyard by Marriott Ithaca Airport/University

29 Thornwood Dr, Ithaca, NY 14850•(607) 330-1000

Fairfield Inn & Suites by Marriott Ithaca

359 Elmira Rd, Ithaca, NY 14850•(607) 277-1000

Hampton Inn Ithaca

337 Elmira Rd, Ithaca, NY 14850•(607) 277-5500

Hotels in the Greater Ithaca Area:

Best Western Plus Finger Lakes Inn & Suites

3175 Fingerlake East Dr, Cortland, NY 13045•(607) 756-2233

Fairfield by Marriott Inn & Suites Cortland

3707 NY-281, Cortland, NY 13045•(607) 299-4455

Hampton Inn Cortland

26 River St, Cortland, NY 13045•(607) 662-0007

Where the Locals Stay: 

The Dorm Hotel

518 Stewart Ave, Ithaca, NY 14850•(607) 319-4611

Firelight Camps 

1150 Danby Road, Ithaca, NY•(607) 229-1644 

Reservations: reservations@firelightcamps.com

Ithaca also has many premier bed & breakfasts, Airbnbs, and VRBOs.