<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Publications on Kenshi Abe</title><link>https://bakanaouji.github.io/publications/</link><description>Recent content in Publications on Kenshi Abe</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>© 2026 Kenshi Abe</copyright><lastBuildDate>Tue, 07 Jul 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://bakanaouji.github.io/publications/index.xml" rel="self" type="application/rss+xml"/><item><title>Asymmetric Perturbation in Solving Bilinear Saddle-Point Optimization</title><link>https://bakanaouji.github.io/publications/asymmetric-perturbation-bilinear-icml-2026/</link><pubDate>Tue, 07 Jul 2026 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/asymmetric-perturbation-bilinear-icml-2026/</guid><description/></item><item><title>Time-Varyingness in Auction Breaks Revenue Equivalence</title><link>https://bakanaouji.github.io/publications/time-varyingness-auction-aamas-2026/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/time-varyingness-auction-aamas-2026/</guid><description/></item><item><title>Policy Testing in Markov Decision Processes</title><link>https://bakanaouji.github.io/publications/policy-testing-mdp-aistats-2026/</link><pubDate>Sat, 02 May 2026 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/policy-testing-mdp-aistats-2026/</guid><description/></item><item><title>Policy Testing in Markov Decision Processes</title><link>https://bakanaouji.github.io/publications/policy-testing-mdp-neurips-2025/</link><pubDate>Sat, 06 Dec 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/policy-testing-mdp-neurips-2025/</guid><description/></item><item><title>Last Iterate Convergence in Monotone Mean Field Games</title><link>https://bakanaouji.github.io/publications/monotone-mean-field-games-neurips-2025/</link><pubDate>Wed, 03 Dec 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/monotone-mean-field-games-neurips-2025/</guid><description/></item><item><title>Learning from Delayed Feedback in Games via Extra Prediction</title><link>https://bakanaouji.github.io/publications/learning-from-delayed-feedback-neurips-2025/</link><pubDate>Wed, 03 Dec 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/learning-from-delayed-feedback-neurips-2025/</guid><description/></item><item><title>Return-Aligned Decision Transformer</title><link>https://bakanaouji.github.io/publications/return-aligned-decision-transformer-tmlr-2025/</link><pubDate>Sun, 08 Jun 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/return-aligned-decision-transformer-tmlr-2025/</guid><description/></item><item><title>Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry</title><link>https://bakanaouji.github.io/publications/global-behavior-zero-sum-games-aamas-2025/</link><pubDate>Wed, 21 May 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/global-behavior-zero-sum-games-aamas-2025/</guid><description/></item><item><title>Nash Equilibrium and Learning Dynamics in Three-Player Matching m-Action Games</title><link>https://bakanaouji.github.io/publications/three-player-matching-games-aamas-2025/</link><pubDate>Wed, 21 May 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/three-player-matching-games-aamas-2025/</guid><description/></item><item><title>Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment</title><link>https://bakanaouji.github.io/publications/regularized-best-of-n-naacl-2025/</link><pubDate>Wed, 30 Apr 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/regularized-best-of-n-naacl-2025/</guid><description/></item><item><title>Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games</title><link>https://bakanaouji.github.io/publications/boosting-perturbed-gradient-ascent-iclr-2025/</link><pubDate>Thu, 24 Apr 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/boosting-perturbed-gradient-ascent-iclr-2025/</guid><description/></item><item><title>Efficient Creative Selection in Online Advertising using Top-Two Thompson Sampling</title><link>https://bakanaouji.github.io/publications/creative-selection-online-advertising-wsdm-2025/</link><pubDate>Tue, 11 Mar 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/creative-selection-online-advertising-wsdm-2025/</guid><description/></item><item><title>Approximate State Abstraction for Markov Games</title><link>https://bakanaouji.github.io/publications/state-abstraction-markov-games-aaai-2025/</link><pubDate>Thu, 27 Feb 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/state-abstraction-markov-games-aaai-2025/</guid><description/></item><item><title>Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium</title><link>https://bakanaouji.github.io/publications/synchronization-periodic-zero-sum-aaai-2025/</link><pubDate>Thu, 27 Feb 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/synchronization-periodic-zero-sum-aaai-2025/</guid><description/></item><item><title>Evaluation of Best-of-N Sampling Strategies for Language Model Alignment</title><link>https://bakanaouji.github.io/publications/evaluation-best-of-n-tmlr-2025/</link><pubDate>Sat, 15 Feb 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/evaluation-best-of-n-tmlr-2025/</guid><description/></item><item><title>The Power of Perturbation under Sampling in Solving Extensive-Form Games</title><link>https://bakanaouji.github.io/publications/perturbation-under-sampling-efg-arxiv-2025/</link><pubDate>Tue, 28 Jan 2025 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/perturbation-under-sampling-efg-arxiv-2025/</guid><description/></item><item><title>Filtered Direct Preference Optimization</title><link>https://bakanaouji.github.io/publications/filtered-dpo-emnlp-2024/</link><pubDate>Tue, 12 Nov 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/filtered-dpo-emnlp-2024/</guid><description/></item><item><title>Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes</title><link>https://bakanaouji.github.io/publications/policy-gradient-mcts-rlc-2024/</link><pubDate>Sat, 10 Aug 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/policy-gradient-mcts-rlc-2024/</guid><description/></item><item><title>Filtered Direct Preference Optimization</title><link>https://bakanaouji.github.io/publications/filtered-dpo-icml-2024/</link><pubDate>Fri, 26 Jul 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/filtered-dpo-icml-2024/</guid><description/></item><item><title>Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment</title><link>https://bakanaouji.github.io/publications/regularized-best-of-n-icml-2024/</link><pubDate>Fri, 26 Jul 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/regularized-best-of-n-icml-2024/</guid><description/></item><item><title>Adaptively Perturbed Mirror Descent for Learning in Games</title><link>https://bakanaouji.github.io/publications/adaptively-perturbed-mirror-descent-icml-2024/</link><pubDate>Tue, 23 Jul 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/adaptively-perturbed-mirror-descent-icml-2024/</guid><description/></item><item><title>Model-Based Minimum Bayes Risk Decoding</title><link>https://bakanaouji.github.io/publications/model-based-mbr-icml-2024/</link><pubDate>Tue, 23 Jul 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/model-based-mbr-icml-2024/</guid><description/></item><item><title>Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems</title><link>https://bakanaouji.github.io/publications/scalable-fair-exposure-control-www-2024/</link><pubDate>Tue, 14 May 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/scalable-fair-exposure-control-www-2024/</guid><description/></item><item><title>Learning Fair Division from Bandit Feedback</title><link>https://bakanaouji.github.io/publications/learning-fair-division-aistats-2024/</link><pubDate>Thu, 02 May 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/learning-fair-division-aistats-2024/</guid><description/></item><item><title>Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games</title><link>https://bakanaouji.github.io/publications/memory-asymmetry-heteroclinic-orbits-aaai-2024/</link><pubDate>Thu, 22 Feb 2024 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/memory-asymmetry-heteroclinic-orbits-aaai-2024/</guid><description/></item><item><title>Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium</title><link>https://bakanaouji.github.io/publications/multi-memory-games-ijcai-2023/</link><pubDate>Tue, 22 Aug 2023 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/multi-memory-games-ijcai-2023/</guid><description/></item><item><title>Exploration of Unranked Items in Safe Online Learning to Re-Rank</title><link>https://bakanaouji.github.io/publications/safe-online-learning-to-rerank-sigir-2023/</link><pubDate>Mon, 24 Jul 2023 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/safe-online-learning-to-rerank-sigir-2023/</guid><description/></item><item><title>Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative</title><link>https://bakanaouji.github.io/publications/guided-dialog-adversarial-arxiv-2023/</link><pubDate>Thu, 13 Jul 2023 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/guided-dialog-adversarial-arxiv-2023/</guid><description/></item><item><title>Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games</title><link>https://bakanaouji.github.io/publications/last-iterate-full-noisy-feedback-aistats-2023/</link><pubDate>Tue, 25 Apr 2023 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/last-iterate-full-noisy-feedback-aistats-2023/</guid><description/></item><item><title>Fair Matrix Factorisation for Large-Scale Recommender Systems</title><link>https://bakanaouji.github.io/publications/scalable-fair-exposure-control-recsys-2022/</link><pubDate>Fri, 23 Sep 2022 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/scalable-fair-exposure-control-recsys-2022/</guid><description/></item><item><title>Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games</title><link>https://bakanaouji.github.io/publications/mutation-driven-ftrl-uai-2022/</link><pubDate>Tue, 02 Aug 2022 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/mutation-driven-ftrl-uai-2022/</guid><description/></item><item><title>Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search</title><link>https://bakanaouji.github.io/publications/medical-residency-match-ijcai-2022/</link><pubDate>Tue, 26 Jul 2022 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/medical-residency-match-ijcai-2022/</guid><description/></item><item><title>Thresholded LASSO Bandit</title><link>https://bakanaouji.github.io/publications/thresholded-lasso-bandit-icml-2022/</link><pubDate>Tue, 19 Jul 2022 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/thresholded-lasso-bandit-icml-2022/</guid><description/></item><item><title>Computing Strategies of American Football via Counterfactual Regret Minimization</title><link>https://bakanaouji.github.io/publications/american-football-cfr-aaai-2022/</link><pubDate>Mon, 28 Feb 2022 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/american-football-cfr-aaai-2022/</guid><description/></item><item><title>Direct Expected Quadratic Utility Maximization for Mean-Variance Controlled Reinforcement Learning</title><link>https://bakanaouji.github.io/publications/direct-quadratic-utility-maximization-neurips-2021/</link><pubDate>Mon, 13 Dec 2021 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/direct-quadratic-utility-maximization-neurips-2021/</guid><description/></item><item><title>Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games</title><link>https://bakanaouji.github.io/publications/off-policy-exploitability-evaluation-aamas-2021/</link><pubDate>Wed, 05 May 2021 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/off-policy-exploitability-evaluation-aamas-2021/</guid><description/></item><item><title>A Practical Guide of Off-Policy Evaluation for Bandit Problems</title><link>https://bakanaouji.github.io/publications/off-policy-evaluation-bandits-guide-arxiv-2020/</link><pubDate>Fri, 23 Oct 2020 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/off-policy-evaluation-bandits-guide-arxiv-2020/</guid><description/></item><item><title>Online Learning for Bidding Agent in First Price Auction</title><link>https://bakanaouji.github.io/publications/bidding-agent-first-price-auction-aaai-2020/</link><pubDate>Sat, 08 Feb 2020 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/bidding-agent-first-price-auction-aaai-2020/</guid><description/></item><item><title>A Simple Heuristic for Bayesian Optimization with A Low Budget</title><link>https://bakanaouji.github.io/publications/bayesian-optimization-low-budget-arxiv-2019/</link><pubDate>Mon, 18 Nov 2019 00:00:00 +0000</pubDate><guid>https://bakanaouji.github.io/publications/bayesian-optimization-low-budget-arxiv-2019/</guid><description/></item></channel></rss>