Abstract:
We consider the minimax setup for the two-armed bandit problem in application to processing of large amounts of data. The data can be processed by one of two alternative methods with fixed but unknown efficiencies. One should design the processing so that to determine the most effective method and then to ensure its preferable application. It is allowed to aggregate data into groups and then to process them in parallel. The essence of the result is that even for sufficiently small number of stages the parallel processing does not lead actually to increasing of the minimax risk.
We consider the strategy which compares methods at initial stages of the control and then applies the most effective method at the final stage. For this strategy the asymptotically optimal parameters are given.