• Another possibility: a probabilistic approach
• Choose actions probabilistically, such that there is always a positive probability of choosing each action.
• One example: $P(a_i \mid s) = \dfrac{k^{Q(s, a_i)}}{\sum_j k^{Q(s, a_j)}}$
• Greater k → greater greedy exploitation
• Lesser k → greater random exploration
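
The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it reads the formula as k raised to the power Q(s, a), assumes k > 0, and takes the Q values for the current state as a plain list. The function names `action_probabilities` and `choose_action` are hypothetical.

```python
import random


def action_probabilities(q_values, k):
    """Return P(a_i | s) = k**Q(s, a_i) / sum_j k**Q(s, a_j).

    q_values: Q(s, a_i) for each action in the current state (assumed input).
    k: positive constant; assumed k > 0 so every weight is positive.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    weights = [k ** q for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_values, k, rng=random):
    """Sample an action index according to the probabilities above."""
    probs = action_probabilities(q_values, k)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

For example, with Q values [1.0, 2.0] and k = 2 the weights are 2 and 4, so the probabilities are 1/3 and 2/3. With k = 1 every weight is 1 and the choice is uniformly random; raising k above 1 shifts more probability onto the action with the highest Q value, while every action keeps a nonzero chance of being chosen.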