Search Algorithms for Discrete Optimization Problems

References

V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Chapter 8, The Benjamin/Cummings Publishing Company, Inc., 1994.

Discrete Optimization Problems (Combinatorial Problems): Find the optimal solution from a finite or countably infinite set of possible solutions.

-The optimal solution minimizes the objective function.

-The set of all possible solutions is too large to be searched exhaustively.

Model: State space graph

-nodes: states

-edges (associated with costs): transitions from one state to another

Example: 8-puzzle problem
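
As a minimal sketch of the state space model for the 8-puzzle, the code below treats each board as a node and each legal slide of a tile into the blank as an edge. The tuple encoding (row-major, 0 for the blank) and the name successors are assumptions made for illustration only.

```python
from typing import Iterator, Tuple

# Assumed encoding: 9 entries in row-major order, 0 marks the blank.
State = Tuple[int, ...]

def successors(state: State) -> Iterator[State]:
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    # Try moving the blank up, down, left, and right.
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)
```

A blank in the center has four neighbors, while a blank in a corner has only two, so the graph is not regular even for this small puzzle.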

Search mechanisms

Depth First Search (DFS). Search down to the leftmost leaf, then backtrack and repeat. A stack can be used to hold the nodes that are still unexplored.
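
The stack-based search can be sketched as follows. This is an illustrative version only: the callbacks children and is_goal are hypothetical names, and the search space is assumed to be a tree, so no visited set is kept.

```python
def depth_first_search(root, children, is_goal):
    """Return (goal_node_or_None, order_in_which_nodes_were_visited)."""
    stack = [root]
    visited_order = []
    while stack:
        node = stack.pop()
        visited_order.append(node)
        if is_goal(node):
            return node, visited_order
        # Push children in reverse so the leftmost child is popped first.
        stack.extend(reversed(children(node)))
    return None, visited_order

# Small hand-made tree used only for illustration.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
found, order = depth_first_search(
    "A", lambda n: tree.get(n, []), lambda n: n == "F"
)
```

In this run the whole left subtree under B is visited before the search backtracks to C.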

Best First Search. Use a heuristic to expand the most promising node first, so that only part of the solution set is searched.
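
A minimal sketch of best-first search, assuming the open list is kept in a binary heap ordered by heuristic value (lower is better). The callbacks children, heuristic, and is_goal and the sample values are hypothetical.

```python
import heapq
from itertools import count

def best_first_search(start, children, heuristic, is_goal):
    """Expand the node with the smallest heuristic value first."""
    tie = count()  # tie-breaker so the heap never compares nodes directly
    open_list = [(heuristic(start), next(tie), start)]
    seen = {start}
    expanded = []
    while open_list:
        _, _, node = heapq.heappop(open_list)
        expanded.append(node)
        if is_goal(node):
            return node, expanded
        for child in children(node):
            if child not in seen:
                seen.add(child)
                heapq.heappush(open_list, (heuristic(child), next(tie), child))
    return None, expanded

# Illustrative tree and heuristic values (assumed, not from the source).
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
h = {"A": 3, "B": 2, "C": 1, "D": 5, "E": 4, "F": 0}
goal, expanded = best_first_search(
    "A", lambda n: tree.get(n, []), h.get, lambda n: n == "F"
)
```

With these heuristic values the subtree under B is never expanded, which is the sense in which only part of the solution set is searched.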

The characteristics of the above search algorithms:

The tree grows dynamically.

The tree is unstructured (not balanced).

Static partitioning does not work, because the size of each subtree is not known in advance.

Dynamic Load Balancing

When a processor runs out of work, it requests work from another processor. When a processor receives a request, it gives away some of its work if it has any to spare; otherwise it rejects the request.

    while not finished
        if work stack not empty
            do some work;
            poll message buffer;
            if requests pending, give some work;
        else
            repeat
                send a request to a processor;
                receive a message from that processor;
            until not a rejection;
            if work received, put the work on the work stack;
        end
    end
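
The loop above can be imitated in a single process to see how work spreads. The sketch below is a deliberately simplified simulation, not a parallel implementation: processors take turns in rounds, work items are interchangeable units, a donor gives away half of its stack, requests go to targets in round-robin order, and termination is detected by inspecting all stacks at once. All names are hypothetical.

```python
def simulate(num_workers, total_units):
    """Return the number of work units each simulated processor completed."""
    stacks = [[] for _ in range(num_workers)]
    stacks[0] = list(range(total_units))  # P0 starts with all the work
    done = [0] * num_workers
    targets = [(i + 1) % num_workers for i in range(num_workers)]
    while any(stacks):  # simplification: global view of all stacks
        for i in range(num_workers):
            if stacks[i]:
                stacks[i].pop()  # do some work
                done[i] += 1
            else:
                donor = targets[i]  # send a request
                targets[i] = (targets[i] + 1) % num_workers
                # The donor rejects unless it has at least two units.
                if donor != i and len(stacks[donor]) >= 2:
                    half = len(stacks[donor]) // 2
                    stacks[i] = stacks[donor][:half]
                    stacks[donor] = stacks[donor][half:]
    return done

done = simulate(4, 100)
```

Every unit is completed exactly once, and all four simulated processors end up doing part of the work even though P0 initially holds all of it.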

Issues in the above model

How to split work when serving a request?

-Avoid giving away small pieces of work: nodes beyond a cutoff depth are not given away. Three possible ways to choose what to send:

1. Send nodes near the bottom of the stack (works well with a uniform search space).

2. Send nodes near the cutoff depth (works well with a strong heuristic).

3. Send half the nodes between the bottom and the cutoff depth (works well with a uniform or highly irregular search space).
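
The third strategy could look roughly like the sketch below. It assumes the work stack is a list of (depth, node) pairs with the bottom of the stack first, and that entries at depth greater than cutoff_depth are never given away; both the representation and the name split_half are assumptions for illustration.

```python
def split_half(stack, cutoff_depth):
    """Split a work stack into (kept, donated) parts.

    Every other entry at or above the cutoff depth is donated;
    entries deeper than the cutoff always stay with the donor.
    """
    kept, donated = [], []
    eligible_seen = 0
    for depth, node in stack:
        if depth <= cutoff_depth:
            eligible_seen += 1
            if eligible_seen % 2 == 1:
                donated.append((depth, node))
                continue
        kept.append((depth, node))
    return kept, donated

stack = [(1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e"), (4, "f")]
kept, donated = split_half(stack, cutoff_depth=3)
```

Here the node at depth 4 lies beyond the cutoff and stays with the donor, while the five eligible nodes are shared alternately.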

Whom to send work request to?

-Asynchronous Round Robin. Keep a variable target. Send request to target. After the request is sent, target = (target + 1) mod p.

-Global Round Robin. P0 keeps target. A processor making a request gets target from P0, and P0 then updates target to (target + 1) mod p.

-Random Polling. Select a donor processor randomly.
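
The three donor-selection schemes can be compared side by side in a small sketch. The class and function names are hypothetical, the shared counter of Global Round Robin is modeled as a plain object rather than a message exchange with P0, and skipping one's own rank is an added assumption. At least two processors are assumed.

```python
import random

class AsyncRoundRobin:
    """Each processor keeps its own private target counter."""
    def __init__(self, rank, p):
        self.rank, self.p = rank, p
        self.target = (rank + 1) % p

    def next_target(self):
        donor = self.target
        self.target = (self.target + 1) % self.p
        # Assumption: a processor never asks itself for work.
        return self.next_target() if donor == self.rank else donor

class GlobalRoundRobin:
    """One shared counter, standing in for the variable kept by P0."""
    def __init__(self, p):
        self.p = p
        self.target = 0

    def next_target(self, rank):
        donor = self.target
        self.target = (self.target + 1) % self.p
        return self.next_target(rank) if donor == rank else donor

def random_polling(rank, p, rng):
    """Pick a donor uniformly among the other p - 1 processors."""
    donor = rng.randrange(p - 1)
    return donor if donor < rank else donor + 1

arr = AsyncRoundRobin(rank=2, p=4)
arr_sequence = [arr.next_target() for _ in range(4)]
grr = GlobalRoundRobin(p=3)
grr_sequence = [grr.next_target(1), grr.next_target(2), grr.next_target(2)]
rng = random.Random(0)
random_sequence = [random_polling(1, 4, rng) for _ in range(20)]
```

Global Round Robin spreads requests evenly over donors, but every request has to go through the single shared counter, whereas the other two schemes need no shared state.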

When to terminate?

-Token Termination. When P0 becomes idle (its work stack becomes empty), it initiates a token and makes no further requests for work. When Pi receives the token, it holds the token until it becomes idle. Then Pi passes the token to Pi+1 and makes no further requests for work. When P0 gets the token back, the program terminates. This algorithm can be inefficient and can cause load imbalance, because processors that have passed the token stop taking on work. The modified algorithm:

1. When P0 becomes idle, it initiates termination by making itself white and sending a white token to P1;

2. If Pi sends work to Pj and i>j (sending work backwards), then processor Pi becomes black;

3. If Pi has the token and is idle, then it passes the token to Pi+1. If Pi is black, it makes the token black before sending it to Pi+1. If Pi is white, it leaves the color of the token unchanged;

4. After Pi passes the token to Pi+1, it becomes white;

5. The algorithm terminates when P0 receives a white token.
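
The five rules can be traced with a small single-process sketch. It models only the colors: each call to token_round is one full circuit of the token and assumes every processor is idle by the time the token reaches it. The class name Ring and the idea of simply starting another circuit after a black token returns are assumptions, not part of the notes.

```python
WHITE, BLACK = "white", "black"

class Ring:
    def __init__(self, p):
        self.color = [WHITE] * p

    def send_work(self, i, j):
        """Rule 2: sending work backwards (i > j) turns Pi black."""
        if i > j:
            self.color[i] = BLACK

    def token_round(self):
        """Run one circuit; return True if P0 gets a white token back."""
        self.color[0] = WHITE  # rule 1: P0 turns white, sends white token
        token = WHITE
        for i in range(1, len(self.color)):
            if self.color[i] == BLACK:
                token = BLACK  # rule 3: a black processor blackens the token
            self.color[i] = WHITE  # rule 4: Pi turns white after passing it
        return token == WHITE  # rule 5: terminate only on a white token

ring = Ring(4)
ring.send_work(3, 1)           # P3 sends work backwards to P1
first = ring.token_round()     # token comes back black
second = ring.token_round()    # no backward sends since, so white
```

The black token signals that work may have moved to a processor the token had already passed, so termination cannot be declared on that circuit.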

-Tree Based Termination. Each work piece is associated with a weight. Initially, P0 has all the work, with weight 1.0. When a work piece is split, its weight is divided between the parts. As processors complete their work, P0 accumulates the weights of the completed work pieces. When the accumulated weight reaches 1.0, the computation terminates. Warning: repeated division of weights can lead to underflow and rounding errors.
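
One way to sidestep the underflow and rounding problem is to keep weights as exact rationals. The sketch below uses Python's fractions.Fraction for this; the names WeightTracker and split are hypothetical, and exact rationals are a substituted technique chosen for illustration, not something prescribed by the notes.

```python
from fractions import Fraction

class WeightTracker:
    """Stands in for P0, which accumulates the weights of finished work."""
    def __init__(self):
        self.returned = Fraction(0)

    def report(self, weight):
        self.returned += weight

    def finished(self):
        return self.returned == 1

def split(weight):
    """Divide a work piece's weight between the two resulting pieces."""
    half = weight / 2
    return half, weight - half

# Split the same piece 2000 times, far beyond what a float could represent.
pieces = [Fraction(1)]
for _ in range(2000):
    a, b = split(pieces.pop())
    pieces.extend([a, b])

tracker = WeightTracker()
for w in pieces[:-1]:
    tracker.report(w)
almost = tracker.finished()   # one piece is still outstanding
tracker.report(pieces[-1])
complete = tracker.finished()
```

With double-precision floats the same repeated halving would eventually produce a weight of zero, and the accumulated total could then never signal the last outstanding piece correctly. The exact representation costs more memory and arithmetic per weight.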