Publications on Kenshi Abe

https://bakanaouji.github.io/publications/
Recent content in Publications on Kenshi Abe
© 2026 Kenshi Abe
Tue, 07 Jul 2026

Asymmetric Perturbation in Solving Bilinear Saddle-Point Optimization
https://bakanaouji.github.io/publications/asymmetric-perturbation-bilinear-icml-2026/
Tue, 07 Jul 2026

Time-Varyingness in Auction Breaks Revenue Equivalence
https://bakanaouji.github.io/publications/time-varyingness-auction-aamas-2026/
Wed, 27 May 2026

Policy Testing in Markov Decision Processes
https://bakanaouji.github.io/publications/policy-testing-mdp-aistats-2026/
Sat, 02 May 2026

Policy Testing in Markov Decision Processes
https://bakanaouji.github.io/publications/policy-testing-mdp-neurips-2025/
Sat, 06 Dec 2025

Last Iterate Convergence in Monotone Mean Field Games
https://bakanaouji.github.io/publications/monotone-mean-field-games-neurips-2025/
Wed, 03 Dec 2025

Learning from Delayed Feedback in Games via Extra Prediction
https://bakanaouji.github.io/publications/learning-from-delayed-feedback-neurips-2025/
Wed, 03 Dec 2025

Return-Aligned Decision Transformer
https://bakanaouji.github.io/publications/return-aligned-decision-transformer-tmlr-2025/
Sun, 08 Jun 2025

Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry
https://bakanaouji.github.io/publications/global-behavior-zero-sum-games-aamas-2025/
Wed, 21 May 2025

Nash Equilibrium and Learning Dynamics in Three-Player Matching m-Action Games
https://bakanaouji.github.io/publications/three-player-matching-games-aamas-2025/
Wed, 21 May 2025

Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment
https://bakanaouji.github.io/publications/regularized-best-of-n-naacl-2025/
Wed, 30 Apr 2025

Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games
https://bakanaouji.github.io/publications/boosting-perturbed-gradient-ascent-iclr-2025/
Thu, 24 Apr 2025

Efficient Creative Selection in Online Advertising using Top-Two Thompson Sampling
https://bakanaouji.github.io/publications/creative-selection-online-advertising-wsdm-2025/
Tue, 11 Mar 2025

Approximate State Abstraction for Markov Games
https://bakanaouji.github.io/publications/state-abstraction-markov-games-aaai-2025/
Thu, 27 Feb 2025

Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium
https://bakanaouji.github.io/publications/synchronization-periodic-zero-sum-aaai-2025/
Thu, 27 Feb 2025

Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
https://bakanaouji.github.io/publications/evaluation-best-of-n-tmlr-2025/
Sat, 15 Feb 2025

The Power of Perturbation under Sampling in Solving Extensive-Form Games
https://bakanaouji.github.io/publications/perturbation-under-sampling-efg-arxiv-2025/
Tue, 28 Jan 2025

Filtered Direct Preference Optimization
https://bakanaouji.github.io/publications/filtered-dpo-emnlp-2024/
Tue, 12 Nov 2024

Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes
https://bakanaouji.github.io/publications/policy-gradient-mcts-rlc-2024/
Sat, 10 Aug 2024

Filtered Direct Preference Optimization
https://bakanaouji.github.io/publications/filtered-dpo-icml-2024/
Fri, 26 Jul 2024

Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment
https://bakanaouji.github.io/publications/regularized-best-of-n-icml-2024/
Fri, 26 Jul 2024

Adaptively Perturbed Mirror Descent for Learning in Games
https://bakanaouji.github.io/publications/adaptively-perturbed-mirror-descent-icml-2024/
Tue, 23 Jul 2024

Model-Based Minimum Bayes Risk Decoding
https://bakanaouji.github.io/publications/model-based-mbr-icml-2024/
Tue, 23 Jul 2024

Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems
https://bakanaouji.github.io/publications/scalable-fair-exposure-control-www-2024/
Tue, 14 May 2024

Learning Fair Division from Bandit Feedback
https://bakanaouji.github.io/publications/learning-fair-division-aistats-2024/
Thu, 02 May 2024

Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games
https://bakanaouji.github.io/publications/memory-asymmetry-heteroclinic-orbits-aaai-2024/
Thu, 22 Feb 2024

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium
https://bakanaouji.github.io/publications/multi-memory-games-ijcai-2023/
Tue, 22 Aug 2023

Exploration of Unranked Items in Safe Online Learning to Re-Rank
https://bakanaouji.github.io/publications/safe-online-learning-to-rerank-sigir-2023/
Mon, 24 Jul 2023

Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative
https://bakanaouji.github.io/publications/guided-dialog-adversarial-arxiv-2023/
Thu, 13 Jul 2023

Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games
https://bakanaouji.github.io/publications/last-iterate-full-noisy-feedback-aistats-2023/
Tue, 25 Apr 2023

Fair Matrix Factorisation for Large-Scale Recommender Systems
https://bakanaouji.github.io/publications/scalable-fair-exposure-control-recsys-2022/
Fri, 23 Sep 2022

Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games
https://bakanaouji.github.io/publications/mutation-driven-ftrl-uai-2022/
Tue, 02 Aug 2022

Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search
https://bakanaouji.github.io/publications/medical-residency-match-ijcai-2022/
Tue, 26 Jul 2022

Thresholded LASSO Bandit
https://bakanaouji.github.io/publications/thresholded-lasso-bandit-icml-2022/
Tue, 19 Jul 2022

Computing Strategies of American Football via Counterfactual Regret Minimization
https://bakanaouji.github.io/publications/american-football-cfr-aaai-2022/
Mon, 28 Feb 2022

Direct Expected Quadratic Utility Maximization for Mean-Variance Controlled Reinforcement Learning
https://bakanaouji.github.io/publications/direct-quadratic-utility-maximization-neurips-2021/
Mon, 13 Dec 2021

Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
https://bakanaouji.github.io/publications/off-policy-exploitability-evaluation-aamas-2021/
Wed, 05 May 2021

A Practical Guide of Off-Policy Evaluation for Bandit Problems
https://bakanaouji.github.io/publications/off-policy-evaluation-bandits-guide-arxiv-2020/
Fri, 23 Oct 2020

Online Learning for Bidding Agent in First Price Auction
https://bakanaouji.github.io/publications/bidding-agent-first-price-auction-aaai-2020/
Sat, 08 Feb 2020

A Simple Heuristic for Bayesian Optimization with A Low Budget
https://bakanaouji.github.io/publications/bayesian-optimization-low-budget-arxiv-2019/
Mon, 18 Nov 2019