CS 104: Introduction to Computer Science

Speeding Convergence (cont.)


•	Training on the update sequence *in reverse order*
	speeds convergence.

•	Tradeoff: Requires more memory, since the whole
	episode must be stored before training on it.
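
To illustrate the reverse-order idea, here is a minimal sketch, assuming a deterministic Q-learning update of the form Q(s,a) <- r + gamma * max Q(s',a'). The 5-state corridor, the action names, the reward of 100, and the discount factor 0.9 are all hypothetical choices for the example, not part of the slide. The same stored episode is trained on once in forward order and once in reverse order.

```python
from collections import defaultdict

GAMMA = 0.9                  # discount factor (assumed value)
ACTIONS = ("left", "right")  # hypothetical action set


def q_update(q, transition):
    """Deterministic Q-learning update: Q(s,a) <- r + gamma * max_a' Q(s',a')."""
    s, a, r, s_next = transition
    q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)


def train(episode, reverse):
    """Apply one pass of updates over a stored episode, forward or reversed."""
    q = defaultdict(float)  # all Q values start at 0
    order = reversed(episode) if reverse else episode
    for transition in order:
        q_update(q, transition)
    return q


# One stored episode on a hypothetical corridor: states 0..4,
# always moving right, reward 100 on entering the goal state 4.
episode = [(s, "right", 100.0 if s + 1 == 4 else 0.0, s + 1) for s in range(4)]

forward = train(episode, reverse=False)
backward = train(episode, reverse=True)

# Count how many Q entries received a nonzero value after a single pass.
forward_nonzero = sum(1 for v in forward.values() if v != 0.0)
backward_nonzero = sum(1 for v in backward.values() if v != 0.0)
print(forward_nonzero, backward_nonzero)
```

In this sketch the forward pass only learns a value for the step that reaches the goal, while the reverse pass propagates the reward back through every step of the episode. The `episode` list is the extra memory the tradeoff refers to.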

•	Suppose exploration and learning are costly in
	time or expense.

•	Can retrain on the same stored data repeatedly.

•	The ratio of old to new update sequences depends on
	the relative costs in the problem domain.

•	Tradeoff: Requires more memory and gives less
	diversity of state/action pairs.
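
Retraining on stored data can be sketched as follows, again assuming a deterministic Q-learning update. The names `learn`, `memory`, and `replays_per_new_episode` are hypothetical, as are the corridor task, the buffer size, and the constants. The parameter `replays_per_new_episode` stands in for the old/new ratio mentioned above.

```python
from collections import defaultdict, deque

GAMMA = 0.9                  # discount factor (assumed value)
ACTIONS = ("left", "right")  # hypothetical action set


def q_update(q, transition):
    """Deterministic Q-learning update: Q(s,a) <- r + gamma * max_a' Q(s',a')."""
    s, a, r, s_next = transition
    q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)


def learn(q, memory, new_episode, replays_per_new_episode):
    """Train once on a new episode, then retrain on all stored episodes.

    replays_per_new_episode plays the role of the old/new ratio; a good
    value depends on how costly new exploration is in the domain.
    """
    memory.append(new_episode)  # memory cost: past episodes are kept
    for transition in new_episode:
        q_update(q, transition)
    for _ in range(replays_per_new_episode):
        for episode in memory:
            for transition in episode:
                q_update(q, transition)


memory = deque(maxlen=50)  # bounded store of past episodes (size is an assumption)
q = defaultdict(float)

# One costly real episode on a hypothetical corridor: states 0..4,
# always moving right, reward 100 on entering the goal state 4.
episode = [(s, "right", 100.0 if s + 1 == 4 else 0.0, s + 1) for s in range(4)]

learn(q, memory, episode, replays_per_new_episode=3)

# States whose Q value for some action is now nonzero.
learned = sorted(s for (s, a), v in q.items() if v != 0.0)
print(learned)
```

Each replay pass here pushes the reward one step further back along the corridor without any new interaction with the environment. The replayed transitions are always the same ones, which is the loss of diversity in state/action pairs that the tradeoff describes.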