Comparison of Optimized Backpropagation Algorithms

W. Schiffmann, M. Joost, R. Werner
University of Koblenz
Institute of Physics
Rheinau 3–4
W-5400 Koblenz
e-mail: evol@infko.uni-koblenz.de
Presented at ESANN 93, Brüssel

Abstract
Backpropagation is one of the most famous training algorithms for multilayer perceptrons. Unfortunately, it can be very slow for practical applications. Over the last years, many improvement strategies have been developed to speed up backpropagation. It is very difficult to compare these different techniques, because most of them have been tested only on specific data sets. Most of the reported results are based on small, artificial training sets such as XOR, encoder, or decoder problems. It is doubtful whether these results carry over to more complicated practical applications.
In this report, an overview of many different speedup techniques is given. All of them were assessed on a very hard practical classification task consisting of a large medical data set. As will be seen, many of these optimized algorithms fail to learn the data set.

1 Introduction

This report summarizes our experience with many different speedup techniques for the backpropagation algorithm. We have tested 16 different algorithms on a very hard classification task. Most of these algorithms use several parameters that have to be tuned by hand, so hundreds of test runs had to be performed. It is beyond the scope of this paper to discuss every approach in detail. Instead, we group the different approaches into classes of algorithms and discuss these classes. A much more detailed report will be available via ftp.
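To make the role of these hand-tuned parameters concrete, the following minimal sketch shows one training step of plain backpropagation with a learning rate and a momentum term for a one-hidden-layer perceptron. The NumPy implementation, layer sizes, and parameter values are illustrative assumptions and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(W1, W2, x, t, lr=0.1, momentum=0.9, vel=None):
    # Forward pass through hidden and output layers.
    h = sigmoid(W1 @ x)          # hidden activations
    y = sigmoid(W2 @ h)          # output activations
    # Backward pass: delta rule with sigmoid derivative s * (1 - s).
    delta_out = (y - t) * y * (1.0 - y)
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)
    g2 = np.outer(delta_out, h)  # gradient w.r.t. W2
    g1 = np.outer(delta_hid, x)  # gradient w.r.t. W1
    if vel is None:
        vel = [np.zeros_like(W1), np.zeros_like(W2)]
    # Momentum update: v <- mu * v - lr * g ; W <- W + v.
    # lr and momentum are exactly the kind of parameters that must be
    # tuned by hand for plain backpropagation.
    vel[0] = momentum * vel[0] - lr * g1
    vel[1] = momentum * vel[1] - lr * g2
    return W1 + vel[0], W2 + vel[1], vel

# Toy usage: 21 inputs (as in the thyroid task below), 10 hidden units,
# 3 output classes; all sizes here are illustrative.
W1 = rng.normal(scale=0.1, size=(10, 21))
W2 = rng.normal(scale=0.1, size=(3, 10))
x = rng.random(21)
t = np.array([1.0, 0.0, 0.0])
vel = None
for _ in range(100):
    W1, W2, vel = train_step(W1, W2, x, t, lr=0.1, momentum=0.9, vel=vel)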

2 Thyroid-Data

In order to compare the many different approaches, we used measurements of the thyroid gland [Quinlan, 1987]. Each measurement vector consists of 21 values: 15 binary and 6 continuous ones.
This work is supported by the Deutsche Forschungsgemeinschaft (DFG) as part of the project FE-generator (grant Schi
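As a hedged illustration of how such measurement vectors might be prepared for training, the sketch below reads rows of 21 feature values followed by an integer class label and converts the labels to one-hot target vectors. The file name, column layout, and the assumed count of three classes are illustrative assumptions, not specifications from the paper.

import numpy as np

def load_thyroid(path="ann-train.data", n_inputs=21, n_classes=3):
    # Each row is assumed to hold n_inputs features followed by an
    # integer class label in 1..n_classes (an assumption, not a fact
    # stated in the paper).
    raw = np.loadtxt(path)
    X = raw[:, :n_inputs]                      # 21 measurement values
    labels = raw[:, n_inputs].astype(int) - 1  # shift labels to 0-based
    T = np.eye(n_classes)[labels]              # one-hot target vectors
    return X, T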



