An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
In multiagent systems, each agent must evaluate the activities of other agents and their impact on its own well-being. The unpredictability of these other agents might bring variables into the agent’s problem-solving process. Adversarial search difficulties emerge in competitive settings when the objectives of actors are contradictory.
Minmax Algorithm
The minimax algorithm determines the minimax choice based on the present condition. It employs a straightforward recursive calculation of the minimax values for each successor state, simply applying the defining equations. The recursion continues until the leaves of the tree, after which the minimax values are propagated back through the tree as the recursion unwinds.
Algorithm
function MINIMAX-DECISION(state) returns an action return argmaxa ∈ ACTIONS(s) MIN-VALUE(RESULT(state,a)) |
function MAX-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v ← −∞ for each a in ACTIONS(state) do v ← MAX(v, MIN-VALUE(RESULT(s, a))) return v |
function MIN-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v←∞ for each a in ACTIONS(state) do v ← MIN(v, MAX-VALUE(RESULT(s, a))) return v |
Alpha Beta Pruning
The whole minimax search looks at certain areas of the tree it does not need. Applying Alpha Beta pruning to a regular minimax tree results in the same action as minimax would, but prunes away branches that cannot potentially affect the ultimate choice.
In other words, the value of the root and hence the minimax decision are independent of the values of the pruned leaves x and y.
function ALPHA-BETA-SEARCH(state) returns an action v ← MAX-VALUE(state, −∞, +∞) |
function MAX-VALUE(state,α,β) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v ← −∞ for each a in ACTIONS(state) do v ←MAX(v, MIN-VALUE(RESULT(s,a),α,β)) If v ≥ β then return v return v |
function MIN-VALUE(state,α,β) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v ← +∞ v ←MIN(v, MAX-VALUE(RESULT(s,a) ,α,β)) If v ≤ α then return v return v |
(i) The first leaf below B has the value 4. Hence, B, which is a MIN node, has a value of at most 4.
(ii) The second leaf below B has a value of 13; MIN would avoid this move, so the value of B is still at most 4.
(iii) The third leaf underneath B has a value of 9; having examined all of B’s successor states, the value of B is precisely 4. This may deduce that the root’s value is no less than 4, since MAX has an option valued at 4 at the root.
(iv) The first leaf below C has the value 3. Hence, C, which is a MIN node,has a value of atmost 3. But we know that B is worth 4, so MAX would never choose C. Therefore, there is no point in looking at the other successor states of C. This is an example of alpha–beta pruning.
(v) The first leaf below D has the value 15, so D is worth at most 15. This is still higher than MAX’s best alternative (i.e., 4), D’s successor states has to be explored. Additionally, there are now limits on all successors of the root, indicating that the root’s value cannot exceed 15.
(vi) The second successor of D is worth 6, so again we need to keep exploring. The third successor is worth 3, so now D is worth exactly 3. MAX’s decision at the root is to move to B, giving a value of 4.
MINIMAX(root) = max(min(4,13,9),min(3,x,y),min(15,6,3)) = max(4,min(3,x,y),3) = max(4,z,3) where z = min(3,x,y) ≤ 3 = 4. |
Effectiveness of Pruning
The outcome is contingent upon the sequence in which children are visited. If the children of a node are traversed in the least favorable sequence, it is feasible that no pruning takes place. To optimize node exploration, we prioritize visiting the most advantageous child first, hence avoiding the inefficiency of examining less favorable possibilities among the other children. For min nodes, the most disadvantageous child should be prioritized first, from our viewpoint rather than that of the opponent. Two evident sources of this knowledge exist:
- The static evaluator function may rank the child nodes.
- Prior analyses of the game tree, such as those derived from earlier moves, conducted minimax assessments of several game situations. These values may be used to rank the nodes, if accessible.
When the ideal child is chosen at each opportunity, alpha-beta pruning eliminates all other children at every subsequent level of the tree; just that singular child is examined. This indicates that, on average, the tree may be searched twice as deeply as previously—significantly enhancing search performance.
Bibliography:
[1] Stuart Russell and Peter Norvig, “Artificial Intelligence: A Modern Approach, Third Edition.” Chapter 5
[2] Andrew Myers, “CS312 Recitation 21Minimax search and Alpha-Beta Pruning.”
[3] Vamsikrishna Gopikrishna, “Instructions on Alpha-Beta Search: CSE 5360: Artificial Intelligence I”