Publications
International Conference (26 papers)
ICML 2026
AAMAS 2026 (Extended abstract)
Time-Varyingness in Auction Breaks Revenue Equivalence
Theme: Bandits & Online Learning
AISTATS 2026
Policy Testing in Markov Decision Processes
Theme: Reinforcement Learning
NeurIPS 2025
Last Iterate Convergence in Monotone Mean Field Games
Theme: Learning in Games
NeurIPS 2025
Learning from Delayed Feedback in Games via Extra Prediction
Theme: Learning in Games
AAMAS 2025 (Full paper)
Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry
Theme: Learning in Games
AAMAS 2025 (Extended abstract)
Nash Equilibrium and Learning Dynamics in Three-Player Matching m-Action Games
Theme: Learning in Games
NAACL 2025
Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment
Theme: LLM Alignment
ICLR 2025
Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games
Theme: Learning in Games
WSDM 2025 (Industry day talks)
Efficient Creative Selection in Online Advertising using Top-Two Thompson Sampling
Theme: Bandits & Online Learning
AAAI 2025
Approximate State Abstraction for Markov Games
Theme: Learning in Games
AAAI 2025
Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium
Theme: Learning in Games
EMNLP 2024
Filtered Direct Preference Optimization
Theme: LLM Alignment
Reinforcement Learning Conference (RLC) 2024
Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes
Theme: Reinforcement Learning
ICML 2024
Adaptively Perturbed Mirror Descent for Learning in Games
Theme: Learning in Games
ICML 2024
Model-Based Minimum Bayes Risk Decoding
Theme: LLM Alignment
WWW 2024
Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems
Theme: Fairness & Allocation
AISTATS 2024
Learning Fair Division from Bandit Feedback
Theme: Fairness & Allocation
AAAI 2024
Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games
Theme: Learning in Games
IJCAI 2023
Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium
Theme: Learning in Games
SIGIR 2023 (Short Paper)
Exploration of Unranked Items in Safe Online Learning to Re-Rank
Theme: Bandits & Online Learning
AISTATS 2023
Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games
Theme: Learning in Games
UAI 2022
Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games
Theme: Learning in Games
IJCAI 2022
Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search
Theme: Fairness & Allocation
ICML 2022
Thresholded LASSO Bandit
Theme: Bandits & Online Learning
AAMAS 2021 (Full Paper)
Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
Theme: Learning in Games
Journal (2 papers)
Transactions on Machine Learning Research
Return-Aligned Decision Transformer
Theme: Reinforcement Learning
Transactions on Machine Learning Research
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
Theme: LLM Alignment
International Workshop (7 papers)
NeurIPS 2025 Workshop on Aligning Reinforcement Learning Experimentalists and Theorists
Policy Testing in Markov Decision Processes
Theme: Reinforcement Learning
ICML 2024 Workshop on Models of Human Feedback for AI Alignment
Filtered Direct Preference Optimization
Theme: LLM Alignment
ICML 2024 Workshop on Models of Human Feedback for AI Alignment
Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment
Theme: LLM Alignment
RecSys 2022 FAccTRec Workshop
Fair Matrix Factorisation for Large-Scale Recommender Systems
Theme: Fairness & Allocation
AAAI 2022 Workshop on Reinforcement Learning in Games (Oral Presentation)
Computing Strategies of American Football via Counterfactual Regret Minimization
Theme: Learning in Games
NeurIPS 2021 Workshop on Deep Reinforcement Learning
Direct Expected Quadratic Utility Maximization for Mean-Variance Controlled Reinforcement Learning
Theme: Reinforcement Learning
AAAI 2020 Workshop on Reinforcement Learning in Games
Online Learning for Bidding Agent in First Price Auction
Theme: Bandits & Online Learning
Preprints (4 papers)
arXiv
The Power of Perturbation under Sampling in Solving Extensive-Form Games
Theme: Learning in Games
arXiv
Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative
Theme: LLM Alignment
arXiv
A Practical Guide of Off-Policy Evaluation for Bandit Problems
Theme: Bandits & Online Learning
arXiv
A Simple Heuristic for Bayesian Optimization with A Low Budget