Combinatorial game theory in Health Sciences #03 - Cooling games and hot strategies
Time is money. Be cool.
Now that we are familiar with the games, we can analyze our meta decisions. We will learn how to freeze games and how to apply the hot strategy (Hotstrat) to deal with a board full of options.
After all, when should we make each decision?
If we have several concurrent games, how do we decide on the order of actions?
To do this, we will explore the concepts of cooling and heating.
The concepts of temperature, heating and cooling apply to games with trade-offs between time (moves) and score, such as the turn-based games chess, checkers and Go. In a game like Go, each move lets you place a piece on the board, occupying territory and earning 1 point. We can therefore weigh the cost of a move against its reward. Some scenarios require several moves for a low final score, and others the reverse.
Fortunately, for a good range of situations, “time is money,” which makes combinatorial game theory useful.
Contracts
We can see games as contracts in which both parties have the option of implementing their position and redeeming a value.
Using the form presented previously, G: {G_L|G_R}, we adopt the convention that positive values ({10|}, {3|}, …) favor Belle and negative values ({|-2}, {|-10}, …) favor Wright. In the game {1000|-1000}, whoever has the initiative, Belle or Wright, can win 1000 coins. In the game {1002|1000}, Belle can choose to win 1002 coins or, alternatively, Wright can choose to have Belle win 1000 coins. The first contract clearly has higher priority: 1000 coins are at stake for whoever plays first. The second game guarantees Belle 1000 or 1002 coins, so it is less sensitive to the players' decisions. As we will see, the first has temperature t=1000 and average value m=0, while the second has temperature t=1 and average value m=1001. This reflects the neutrality and urgency of the first situation, while the second has a positive average value (it favors Belle) and is less 'hot'.
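The two numbers quoted for each contract can be checked with a few lines of Python. This is a minimal sketch that only covers simple switches {a|b} whose two options are plain numbers with a ≥ b; the function name `switch_stats` is our own choice for illustration.

```python
from fractions import Fraction

def switch_stats(left, right):
    """Return (mean, temperature) of the switch {left | right}.

    Assumes both options are plain numbers and left >= right.
    """
    left, right = Fraction(left), Fraction(right)
    mean = (left + right) / 2         # midpoint between the two outcomes
    temperature = (left - right) / 2  # half of what the first mover gains
    return mean, temperature

# The two contracts from the text.
print(switch_stats(1000, -1000))  # mean 0, temperature 1000
print(switch_stats(1002, 1000))   # mean 1001, temperature 1
```

The temperature is half the gap between the two outcomes, which is why the first contract is so much more urgent than the second.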
Let's spice things up a little. Consider the case in which Wright rents a property from Belle for 1 salary, on an ongoing basis, and is in default. The terms say that if Wright is in default, Belle can either (a) charge the 1 salary of rent or (b) invoke a penalty clause, under which Wright must pay 2 salaries to settle the debt. If Wright still doesn't pay, she can charge a new fine of 3 salaries. We have the game: G = {1 , {3|2} | 1 }. The GL2 game (the second option available to Belle, {3|2}) represents the option to activate the penalty clause.
If Wright has the opportunity, his best move is to act and pay quickly, since 1 salary (GR) costs less than risking Belle's sanctions through the {3|2} game. A delay could cost Wright 2 salaries (paying the fine). If Wright again fails to settle the debt on his turn, Belle can charge 3 salaries (the additional fine).
However, this changes in scenarios with other concurrent games. Let's assume that Wright has another contract in place, H = {2 , {4|3} | 1/2 }. In this other rental contract, Wright can pay 'on time', 1/2 salary, avoiding possible charges of 2, 3 or 4 salaries in late fines. If he has limited time/money, between G and H, he should prioritize game H. Let's understand how to formalize this case by cooling things down a bit.
Cooling
When we cool a game, we subtract t from the options on the left and add t to the options on the right. In the convention we have adopted here, both sides are harmed: Belle's options (positive) are reduced by t and Wright's options (negative) are increased by t. A game G cooled by a temperature t is given by: Gt = {GL - t | GR + t}, where the options GL and GR are themselves cooled by t.
In cooled contracts, the return to Belle is lower and the amount to be paid by Wright is higher. The cooling process is especially interesting in games like G={4|3}. No number is both greater than 4 and less than 3, so G is not a number, but cooling G leads to games that are “almost numbers”, like {1|1} (the same as 1 + *), which differ from a number only infinitesimally. Cooling an almost-number any further turns it into a number. For example, cooling by t=1/2, we have G1/2 = {4 - 1/2 | 3 + 1/2} = {7/2 | 7/2 } = 7/2 + * (star). Thus the average value of G is 7/2 = 3.5, and G stays frozen at that value when cooled by any t greater than 1/2. Note that, in this convention, a larger t means more cooling.
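To see the freezing happen step by step, here is a small sketch that cools a switch {a|b} by increasing amounts t. It assumes plain-number options with a ≥ b, and the helper name `cool_switch` is hypothetical. It returns the pair (left, right) of the cooled game; at exactly the freezing point the pair {m|m} stands for m + *, and beyond it the game is simply the number m.

```python
from fractions import Fraction

def cool_switch(left, right, t):
    """Cool the switch {left | right} by t and return (new_left, new_right).

    Assumes plain-number options with left >= right.
    Once t reaches the freezing point, both sides sit at the mean.
    """
    left, right, t = Fraction(left), Fraction(right), Fraction(t)
    mean = (left + right) / 2
    freezing_point = (left - right) / 2
    if t >= freezing_point:
        return mean, mean            # frozen: m + * at the point, m beyond it
    return left - t, right + t       # still hot: both sides pay the tax t

for t in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    cooled_left, cooled_right = cool_switch(4, 3, t)
    print(f"t = {t}: {{{cooled_left} | {cooled_right}}}")
```

For {4|3} the two sides meet at 7/2 when t = 1/2 and do not move afterwards, which is the behavior described above.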
In Conway's analogy, games can be seen as vibrating between two states, their values L(G) and R(G). Thus, the cooling process consists of finding the average numerical value m(G) of a game by extinguishing these vibrations to a freezing point.
Knowing how games behave at different temperatures, we can analyze priorities in a more procedural way. Consider the two contract games in force for Wright:
G = {1 , {3|2} | 1 }
H = {2 , {4|3} | 1/2 }
The cooled forms for low temperatures are:
Gt = {1 - t, {3 - t | 2 + t} - t | 1 + t}
Ht = {2 - t, {4 - t | 3 + t} - t | 1/2 + t}
The process here consists of gradually cooling the games, observing 'state changes', such as the above transformation of games “{a|b}” into “c + *”.
Cooling G and H at t=1/2:
G1/2 = {1 - 1/2, {3 - 1/2 | 2 + 1/2} - 1/2 | 1 + 1/2}
H1/2 = {2 - 1/2, {4 - 1/2 | 3 + 1/2} - 1/2 | 1/2 + 1/2}
G1/2 = {1/2, {2|2} | 3/2} = {1/2, 2 + * | 3/2}. To reach the simple form, we eliminate the 'dominated' (smaller) options in GL. Since 1/2 < 2 + *, we have G1/2 = {2 + * | 3/2}.
H1/2 = {3/2, {3|3} | 1} = {3/2, 3 + * | 1}. Here 3/2 (HL1) is dominated by 3 + * (HL2), so in simple form H1/2 = {3 + * | 1}.
At a temperature of 1/2, the games GL2 and HL2 ('the penalty games') reach the form {k|k}, with values 2 + * for GL2 and 3 + * for HL2. As we have seen, games of the form k + * (star) are not numbers: they differ from k only infinitesimally, being neither larger nor smaller than k.
G1/2 = {1/2, 2 + * | 3/2}
H1/2 = {3/2, 3 + * | 1}.
Let us cool by a further 1/4, to t=3/4. The penalty games froze at t=1/2, so from here on they contribute their mean values (5/2 and 7/2) minus t:
G3/4 = {1/4 , 7/4 | 7/4 } = {7/4 | 7/4} = 7/4 + * (in simple form, eliminating GL1 = 1/4 < GL2)
H3/4 = {5/4 , 11/4 | 5/4 } = {11/4 | 5/4}
At t=3/4, G reaches its freezing point, with mean value m = 7/4 = 1.75. Cooling further, to t=6/4:
G6/4 = 7/4 = 1.75 (a game cooled beyond its freezing point keeps its mean value)
H6/4 = {1/2 , 2 | 2} = {2 | 2} = 2 + *.
The freezing temperature (the point at which a game becomes a number or an almost-number such as k + *) is higher for H (6/4) than for G (3/4), and both are higher than for the penalty games GL2 and HL2 (1/2). From t=1.5 (6/4) onward, both games are frozen at their mean values: mG = 1.75 and mH = 2.
By playing in H, Belle denies Wright the bargain (HR: +1/2) and secures a fine of 3 or 4 salaries (HL2), although Wright can then pay the other rent (GR: +1) and avoid the default fine in G (2 salaries in GL2_R, or 3 in GL2_L if he is 'late' again). Conversely, it is also in Wright's interest to play in H first: he pays 1/2 salary (HR), avoids the larger fines (3 or 4 salaries in HL2) and accepts the smaller fines in GL2.
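The cooling steps for G and H can be retraced with exact fractions. The sketch below is written only for contracts of the shape {a, {b|c} | d} used in this article, with b ≥ c; the function name `cooled_scaffolds` is hypothetical. It returns Belle's best cooled option and Wright's cooled option at a given t. While the left value exceeds the right value the game is still hot, and the first t at which they meet is the freezing temperature.

```python
from fractions import Fraction

def cooled_scaffolds(a, b, c, d, t):
    """Cooled left and right values of the contract {a, {b|c} | d} at t.

    Assumes a, b, c, d are plain numbers and b >= c.
    """
    a, b, c, d, t = (Fraction(x) for x in (a, b, c, d, t))
    penalty_freeze = (b - c) / 2   # the inner game {b|c} freezes here
    penalty_mean = (b + c) / 2
    # Wright's side of the cooled penalty game: c + t until it freezes.
    penalty_right = c + t if t < penalty_freeze else penalty_mean
    left = max(a, penalty_right) - t   # Belle's best option, taxed by t
    right = d + t                      # Wright's option, taxed by t
    return left, right

contracts = {"G": (1, 3, 2, 1), "H": (2, 4, 3, Fraction(1, 2))}
for name, contract in contracts.items():
    for t in (Fraction(1, 2), Fraction(3, 4), Fraction(3, 2)):
        left, right = cooled_scaffolds(*contract, t)
        state = "hot" if left > right else "frozen"
        print(f"{name} at t = {t}: left {left}, right {right} ({state})")
```

For G the two values meet at 7/4 when t = 3/4, and for H they meet at 2 when t = 3/2, matching the mean values and freezing temperatures above. Past the freezing point the printed values cross; the game itself simply stays at its mean.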
Knowing average values and temperatures, we can examine how the set of games behaves in different situations and act using some strategies, or heuristics of our own.
Hotstrat consists of always playing in the game with the highest temperature. That is, given a choice among many games G1, G2, G3…, play in the hottest one. After each round (the opponent's move), recalculate the temperatures and play in the hottest game again.
Using Hotstrat for the games above, with either Wright or Belle having the initiative:
Wright with initiative: Wright plays HR (+1/2), ending H. Belle invokes the penalty clause (GL2), and Wright pays 2 salaries (GL2_R). Total: +2.5 in Belle's favor.
W: HR (+1/2), B: GL2 , W: GL2_R (+2).
Belle with initiative: Belle plays HL2. Now we have G = {1 , {3|2} | 1 } and HL2 = {4|3}. The temperature of G, as we have seen, is 3/4. That of HL2 is t=1/2, where it freezes at its mean value {4 - 1/2 | 3 + 1/2} = 7/2 = 3.5. On his turn, Wright should play GR, since G is the hotter game (3/4 > 1/2), ending it with GR=1 (+1). Belle then enforces the penalty contract, closing H with HL2_L (+4). Total: +5 in Belle's favor.
B: HL2, W: GR (+1), B: HL2_L (+4).
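Both lines of play can be reproduced with a small simulation. What follows is a rough sketch, not a full thermography implementation. It represents a game as either a number or a pair (left options, right options), estimates each temperature numerically by bisection, and assumes every non-number game is hot and gives both players at least one option. The move chosen inside the hottest game (the option that looks best at that game's temperature) is our reading of Hotstrat, and all function names are hypothetical.

```python
from functools import lru_cache

# A game is a number, or a pair (left_options, right_options) of tuples.
PENALTY_G = ((3,), (2,))
PENALTY_H = ((4,), (3,))
G = ((1, PENALTY_G), (1,))
H = ((2, PENALTY_H), (0.5,))

def is_number(game):
    return not isinstance(game, tuple)

def scaffolds(game, t):
    """Best cooled option for each side, before checking for freezing."""
    lefts, rights = game
    left = max(walls(g, t)[1] for g in lefts) - t
    right = min(walls(g, t)[0] for g in rights) + t
    return left, right

@lru_cache(maxsize=None)
def freeze(game):
    """Approximate (temperature, mean) of a hot game by bisection."""
    lo, hi = 0.0, 1.0
    while True:                      # grow hi until the scaffolds have met
        left, right = scaffolds(game, hi)
        if left <= right:
            break
        hi *= 2
    for _ in range(60):              # shrink [lo, hi] around the meeting point
        mid = (lo + hi) / 2
        left, right = scaffolds(game, mid)
        if left > right:
            lo = mid
        else:
            hi = mid
    left, right = scaffolds(game, hi)
    return hi, (left + right) / 2

def walls(game, t):
    """Left and right values of the game cooled by t."""
    if is_number(game):
        return game, game
    temperature, mean = freeze(game)
    return (mean, mean) if t >= temperature else scaffolds(game, t)

def hotstrat(components, belle_to_move):
    """Play a sum of games with Hotstrat and return the final total."""
    components = list(components)
    while any(not is_number(g) for g in components):
        live = [i for i, g in enumerate(components) if not is_number(g)]
        hottest = max(live, key=lambda i: freeze(components[i])[0])
        t = freeze(components[hottest])[0]
        lefts, rights = components[hottest]
        if belle_to_move:
            components[hottest] = max(lefts, key=lambda g: walls(g, t)[1])
        else:
            components[hottest] = min(rights, key=lambda g: walls(g, t)[0])
        belle_to_move = not belle_to_move
    return sum(components)

print(hotstrat([G, H], belle_to_move=False))  # Wright has the initiative
print(hotstrat([G, H], belle_to_move=True))   # Belle has the initiative
```

With Wright moving first the total comes out at 2.5, and with Belle moving first at 5, in line with the two sequences above.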
Soon, we will see specific diagrams to study game temperatures and their sums. The thermographs below represent our games, G and H with all the suggested phase transitions.