Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

Open Access • Posted Content

- 13 Feb 2021

- arXiv: Computer Science and Game Theory

9

TL;DR: In this paper, the authors formalize behavioral deviations as a general class of deviations that respect the structure of extensive-form games, and introduce extensive-form regret minimization (EFR), an algorithm that achieves hindsight rationality for any given set of behavioral deviations with computation that scales closely with the complexity of the set.

Citations

•Posted Content

Simple Uncoupled No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium.

Gabriele Farina, +3 more

- 04 Apr 2021

- arXiv: Computer Science and Game Theory

TL;DR: The existence of uncoupled no-regret learning dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems.

3

•Journal Article•10.1609/aaai.v36i5.20431

Fast Payoff Matrix Sparsification Techniques for Structured Extensive-Form Games

28 Jun 2022

- Proceedings of the ... AAAI Conference o...

TL;DR: In this paper, the existence of extremely sparse factorizations in poker games is tied to their particular Kronecker-product structure, and two ways of computing strong sparsifications of poker games (as well as any other game with a similar structure) are given.

•Posted Content

Efficient Decentralized Learning Dynamics for Extensive-Form Coarse Correlated Equilibrium: No Expensive Computation of Stationary Distributions Required.

Gabriele Farina, +2 more

- 16 Sep 2021

- arXiv: Computer Science and Game Theory

TL;DR: In this paper, the authors show that, from a learning perspective, EFCCE is more akin to NFCCE than to EFCE, and that any learning dynamics for EFCE automatically guarantee convergence to EFCCE.

•Posted Content

The Partially Observable History Process.

Dustin Morrill, +2 more

- 16 Nov 2021

- arXiv: Artificial Intelligence

TL;DR: The partially observable history process (POHP) formalism for reinforcement learning provides a streamlined interface for designing algorithms that defy categorization as exclusively single- or multi-agent, and for developing theory that applies across these domains.

•Proceedings Article•10.1609/aaai.v38i9.28859

On the Outcome Equivalence of Extensive-Form and Behavioral Correlated Equilibria

Brian Hu Zhang, +1 more

- 24 Mar 2024

- Proceedings of the ... AAAI Conference o...

TL;DR: The extensive-form and behavioral correlated equilibria are outcome-equivalent.

References

•Journal Article•10.1016/0304-4068(74)90037-8

Subjectivity and correlation in randomized strategies

Robert J. Aumann

- 01 Mar 1974

- Journal of Mathematical Economics

TL;DR: This paper examined the consequences of basing mixed strategies on subjective random devices, i.e. devices on the probabilities of whose outcomes people may disagree (such as horse races, elections, etc.).

2K

•Journal Article•10.1111/1468-0262.00153

A simple adaptive procedure leading to correlated equilibrium

Sergiu Hart, +1 more

- 01 Sep 2000

- Econometrica

TL;DR: In this article, regret-matching is proposed for playing a game, where players may depart from their current play with probabilities that are proportional to measures of regret for not having used other strategies in the past.

1.4K

•Book Chapter•10.1515/9781400881970-012

11. Extensive Games and the Problem of Information

H. W. Kuhn

- 31 Dec 1953

1K

•Proceedings Article•10.7939/R3Q23R282

Regret Minimization in Games with Incomplete Information

Martin Zinkevich, +3 more

- 03 Dec 2007

TL;DR: It is shown how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. The technique is demonstrated in the domain of poker, where it can solve abstractions of limit Texas Hold'em with as many as 10^12 states, two orders of magnitude larger than previous methods.

877

•Journal Article•10.1023/A:1007424614876

Tracking the Best Expert

Mark Herbster, +1 more

- 01 Aug 1998

- Machine Learning

TL;DR: The generalization allows the sequence to be partitioned into segments, and the goal is to bound the additional loss of the algorithm over the sum of the losses of the best experts for each segment to model situations in which the examples change and different experts are best for certain segments of the sequence of examples.

616