Deniz Yuret's Homepage: Mathematical Go

[Math_ Books_ Games_] --DRAFT-- Berlekamp and Wolfe's Mathematical Go is a book on the wonderful subject of the relation between Combinatorial Game Theory (CGT) and end games in Go. I am a computer scientist interested in game playing programs and it seems the ideas of CGT may help conquer Go, one of the last frontiers of computer game playing (in most other popular games, computers have approached or passed the world championship level). I have tried to read and understand the book a couple of times, each time getting more frustrated. This may have three possible reasons: (i) I am dumb, (ii) the concepts are just too complicated, or (iii) there is something wrong with the exposition.

Now, it is a fact that more people in the world know about Go than
CGT. So it would make sense to use the language of Go to explain
difficult concepts in CGT. But the authors of course (being
mathematicians) have chosen to use the arcane language of CGT to
explain its relation to Go. It is no surprise that mathematicians
prefer a cryptic but exact language to an intuitive but fuzzy one. Yet
I have a suspicion that there could be a better way to explain these
ideas, and this is my attempt. The target audience is people who know
a little bit about Go and a little bit about game search algorithms.

Motivation

A Go game usually revolves around several semi-independent regions on
the 19x19 board. This is unlike chess, where a move in one corner of
the board usually effects the whole game. Thus typical chess programs
perform a (optimized) search on a tree of all possible legal
moves. Such a tree contains around b^n moves to analyze if (i) we want
to look ahead n moves and (ii) the number of legal moves at each turn
(the branching factor) is around b. The optimizations bring the actual
analyzed number to around b^(n/2). For chess, the branching factor is
around 10-20, and 12-15 move look-aheads are routinely performed by
modern programs.

Go starts with a 360+ branching factor. Thus searching a single tree
of all legal moves is out of the question. But note what happens if we
can successfully split the game into k independent regions. Each
region will have b/k legal moves on average. An optimized n move
look-ahead search will need to analyze (b/k)^(n/2) moves for each
region. If we could combine the results of these searches and compute
the best move in constant time using CGT, we will only need to search
k * (b/k)^(n/2) which is much smaller than b^(n/2). For example if b =
100 and we want a 10 move look ahead, the full search will cost us 10
billion moves (about three hours at a million moves per second). If we
could split the board into 5 equally sized independent games, search
each one 10 moves ahead and combine the results, the search would cost
16 million moves (16 seconds at a million moves per second).

Of course it is rarely possible to isolate one part of a Go board from
the rest. The examples in Mathematical Go exclusively deal with
completely isolated regions surrounded by unconditionally live
groups. Such a region has no effect on the rest of the board except
for the timing of the moves. However with further progress, it may be
possible to apply the theory to weakly interacting regions. Spatially
dividing the game seems to me the only option if search is going to
have any chance.

An Empty Corridor

So let's try to understand what CGT is trying to tell us. Here is an
isolated region (assuming the surrounding white and black stones are
alive).


$$ Empty corridor 
$$ O X X X X
$$ O a b c X
$$ O X X X X

The search tree is simple. If it is white's turn, he'll move into (a)
and black will defend at (b) to get 1 point of territory, if it is
black's turn, he'll move into (a) and get 2 points of territory. If
this was the only active region left on the board this is all the
analysis we would need.

However we should consider potential moves in other parts of the
board. If black has a better prospect elsewhere, he may choose not to
defend his corridor, in effect "passing" in this region of the
board. The Go term for such "playing elsewhere" is tenuki. When would
black do such a thing? If it is black's turn we said he can grab 2
points. If he plays tenuki, and white attacks at (a) black can take at
most 1. If he plays tenuki twice, white may not give him any points in
the corridor. So every tenuki costs black one point up to two
points. Black will choose to play elsewhere if such a move is worth
more than a point. He may be indifferent if such a move is worth one
point.

To decide whether to spend our next move in this region rather than
some other part of the board, does the location or the exact shape of
the corridor matter? Obviously not. Does the size of the corridor
matter? We will see that sometimes it does. What is the exact
information we need to make the correct decision? All we need is the
analysis from the last paragraph, i.e. black can play tenuki twice at
a regional cost of one point each. I'd like to write this out as
[1,1,0]. CGT represents it using the number 1+1/4.

The first big idea of the CGT is that we can represent these isolated
regions with such small bits of information and mostly forget about
the details. The second big idea is to use these small bits of
information to calculate (i) which region contains the best move, and
(ii) who is winning by how much assuming ideal play. To illustrate how
this calculation works, we need an example with more than one region.

Lots of empty corridors

Let's assume that our endgame consists of n copies of the above
corridor. If it is white's turn, he can attack the first corridor,
black can defend and get one point, white can attack the second
corridor, etc. and black will have n points at the end.


$$w Bad black strategy
$$ O X X X X
$$ O 1 2 . X
$$ O X X X X
$$ O 1 2 . X
$$ O X X X X

But this is not the best possible strategy. Black may choose not to
defend where white attacks and instead may block another corridor to
get 2 points. In fact he can get an extra point using this strategy
if there are four corridors:


$$w Better black strategy
$$ O X X X X _ _ O X X X X
$$ O 1 5 . X _ _ O 1 3 . X
$$ O X X X X _ _ O X X X X
$$ O 2 . . X _ _ O 2 . . X
$$ O X X X X _ _ O X X X X
$$ O 3 6 . X _ _ O 4 . . X
$$ O X X X X _ _ O X X X X
$$ O 4 . . X _ _ O 5 6 . X
$$ O X X X X _ _ O X X X X

Verifying the following is left as an exercise for the reader:

* The ideal strategy for both black and white is to play at the edge
of the longest remaining corridor.
* If white attacks first, playing the ideal strategy, black can get
one point for each corridor plus an extra point for every fourth
corridor.
* If black defends first, he gets two points for the first corridor
and refer to the previous rule for the other corridors.

Now the CGT value of 1+1/4 for this position may make a little more
sense. If you add up the values of all the corridors you get a
fractional number of points. The player on move will be able to round
the fraction in his favor.

Draft...

Here is a slightly different example:

This time if it is black's turn, he can get 4 points (one for the prisoner). If he plays elsewhere, and white attacks at "a", he can get 3 points. If he tenukis twice and white keeps attacking, he can still get 2 points. But if he tenukis a third time, white will save his stone and black will get zero points. So the tenuki costs for black this time are [1,1,2]. CGT represents this situation with an infinitesimal (something that is greater in absolute value than zero yet smaller than any positive real number) called "double up star" (written "^^*").

A third example to introduce another member of the CGT menagerie:

In the top corridor black will get 4, 3, 0 points respectively if he tenukis 0, 1, and 2 times. In my notation this is [1,3]. CGT calls it TINY-1. The bottom corridor gives black 5, 4, 0 points to black if he tenukis 0, 1, and 2 times. The costs of the tenukis are [1,4]. According to CGT, this is TINY-2. Tinies are things that are smaller than infinitesimals yet still larger than zero. Opposite of a tiny is called a miny.

So the first big idea of the CGT is that we can represent these isolated regions with small amounts of information and mostly forget about the details. The second big idea is to use these small bits of information to calculate (i) who is winning by how much, and (ii) in which region is the best move.


$$ X O O O O O
$$ X a b c d O
$$ X O O O O O
$$ X e f O . .
$$ X O O O . .

2+1/8, 1/2
wa,be - 3
we,ba,wb - 3
ba,we,bb,wc - 2
be,wa - 3
if g: be,wf,ba,wb - 3 (2+1/8,1+1/2)
ba,wb,be,wf - 3

Related link

Deniz Yuret's Homepage

May 20, 2005

Mathematical Go

No comments:

Labels

Popular Posts

My Blog List

Archive