Article collection "Mathematical Problems of Cybernetics" ╣13, Moscow, 2004
Combinatorial approach to generalization bounds of learning algorithms
A combinatorial approach is proposed for estimating the generalizing ability of learning algorithms. We introduce combinatorial functionals based on cross-validation principle, which characterize the generalization performance of a learning algorithm on a final data set. The upper bounds for such functionals are derived without using excessive assumptions about the probabilistic nature of the data. At the same time, these estimates turn out to be more accurate in comparison with the statistical VapnikľChervonenkis (VC) bounds. The effect of localization of a family of algorithms is described in terms of a local growth function. Some of the statements of the statistical VC theory are revised, including: the uniform convergence bounds and its relative form, the role of the correctness of the learning algorithm, the method of structural risk minimization and the concept of the effective VC-dimension. We show that complexity-based generalization bounds are highly overestimated, and we analyze the main causes of their weakness. Two new combinatorial generalization bounds are proposed for classification algorithms. Instead of complexity, they are based on using additional information about the compactness of classes or the monotonicity of the target function.
statistical learning theory, generalization bounds, VapnikľChervonenkis bounds, growth function, local growth function, empirical risk minimization, structural risk minimization, correctness of the learning algorithm, compactness profile, monotonicity profile
Publication language:russian, pages:32 (p. 5-36)
Mathematical problems and theory of numerical methods