RT Conference Proceedings
T1 A unified framework for linear function approximation of value functions in stochastic control
A1 Sánchez-Fernández, Matilde
A1 Valcárcel, Sergio
A1 Zazo, Santiago
AB This paper contributes a unified formulation that merges previous analyses of predicting the performance (value function) of a given sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
PB IEEE - The Institute of Electrical and Electronics Engineers, Inc
YR 2013
FD 2013-09
LK https://hdl.handle.net/10016/21205
UL https://hdl.handle.net/10016/21205
LA eng
NO Proceedings of the 21st European Signal Processing Conference (EUSIPCO 2013), held September 9-13, 2013, in Marrakech (Morocco).
NO This work has been partly funded by the Spanish Ministry of Science and Innovation with the project GRE3N (TEC 2011-29006-C03-01/02/03) and in the program CONSOLIDER-INGENIO 2010 under project COMONSENS (CSD 2008-00010). This work was supported in part by the Spanish Ministry of Science and Innovation under the grants TEC2009-14219-C03-01, TEC2010-21217-C02-02-CR4HFDVL and in the program CONSOLIDER-INGENIO 2010 under the grant CSD2008-00010 COMONSENS; and by the European Commission under the grant FP7-ICT-2009-4-248894-WHERE-2.
DS e-Archivo
RD 19 May 2024