Template-Type: ReDIF-Paper 1.0
Title: Updating Strategies Through Observed Play - Optimization Under Bounded Rationality
Author-Name: R. Cressman
Author-Name: K.H. Schlag
Author-Postal:
Author-Phone:
Author-Homepage:
Classification-JEL: C72, C79
Keywords: Multi-Armed Bandit, improving, undominated behavioral rule, play-wise imitating, replicator dynamic, monotone dynamics
Abstract: Individuals repeatedly face a multi-decision task with unknown payoff distributions. They have minimal memory and update their strategy by observing previous play (and not strategy) of someone else. We select behavior rules that increase average payoffs as often as possible in a large population where all use the same rule. Here imitation generalizes to a pasting procedure. When decisions within the task are unrelated, individuals eventually learn the efficient strategy but the underlying dynamic is not monotone. However, when choices influence which decisions are subsequently faced in the task, play may not be efficient in the long run as it approaches a Nash equilibrium of the agent normal form.
Series: Sonderforschungsbereich 303, University of Bonn, Germany
Length:
Creation-Date: 1998-04
Revision-Date:
File-URL: http://www.wiwi.uni-bonn.de/bgsepapers/bonsfb/bonsfb432.pdf
File-Format: application/pdf
File-Size: 5500000 bytes
File-URL: http://www.wiwi.uni-bonn.de/bgsepapers/bonsfb/bonsfb432.ps
File-Format: application/postscript
File-Size: 8421376 bytes
Handle: RePEc:bon:bonsfb:432