Abstract:
In this talk, we discuss a finite-time analysis of linear stochastic approximation (LSA) algorithms with a fixed step size, a core method in statistics and machine learning. We cover both the setting of independent and identically distributed noise variables and that of noise driven by a uniformly geometrically ergodic Markov chain. We derive p-th moment and high-probability deviation bounds for the iterates defined by LSA and its Polyak-Ruppert averaged version. Our finite-time instance-dependent bounds for the averaged LSA iterates are sharp in the sense that the leading term we obtain coincides with the local asymptotic minimax limit. Moreover, the remainder terms of our bounds have a tight dependence on the mixing time of the underlying noise sequence. Our results yield new finite-time error bounds for temporal difference learning with linear function approximation and an instance-independent step size. The talk is based on the recent paper https://arxiv.org/abs/2207.04475.
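For readers unfamiliar with the recursion behind the talk, here is a minimal NumPy sketch of fixed-step LSA with Polyak-Ruppert averaging on a toy i.i.d.-noise problem; the function names, samplers, and parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def lsa_polyak_ruppert(A_sampler, b_sampler, dim, alpha, n_steps, rng):
    """Fixed-step LSA with Polyak-Ruppert averaging (illustrative sketch).

    Runs theta_{k+1} = theta_k - alpha * (A_k @ theta_k - b_k), where
    (A_k, b_k) are noisy observations of (A, b) and the target solution
    theta* satisfies A @ theta* = b.
    """
    theta = np.zeros(dim)
    theta_bar = np.zeros(dim)
    for k in range(n_steps):
        A_k = A_sampler(rng)                        # noisy observation of A
        b_k = b_sampler(rng)                        # noisy observation of b
        theta = theta - alpha * (A_k @ theta - b_k)
        theta_bar += (theta - theta_bar) / (k + 1)  # running average of iterates
    return theta, theta_bar

# Toy example with i.i.d. noise: A = I, b = ones, so theta* = ones.
rng = np.random.default_rng(0)
dim = 3
A_sampler = lambda rng: np.eye(dim) + 0.1 * rng.standard_normal((dim, dim))
b_sampler = lambda rng: np.ones(dim) + 0.1 * rng.standard_normal(dim)
last, avg = lsa_polyak_ruppert(A_sampler, b_sampler, dim,
                               alpha=0.05, n_steps=10_000, rng=rng)
print(last, avg)  # the averaged iterate concentrates near theta* = ones
```

With a fixed step size the last iterate keeps fluctuating around theta*, while the averaged iterate is the quantity whose finite-time deviations the talk's bounds control.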