**Network architecture.**

In our implementation and examples we have restricted ourselves to feed-forward neural networks (FFN). They are the neural networks most often used for regression and classification tasks, but too often they are treated as black boxes in daily use. The advantage of flexibility is largely offset by the non-transparency of the training process and of the final model. However, the idea of visualizing the inner geometry of the network can be applied to nearly any kind of neural network.
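As a point of reference, here is a minimal sketch of the forward pass of such a feed-forward network. The single hidden layer, sigmoid activations, and layer sizes are illustrative assumptions for this sketch, not details taken from our implementation.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, a common choice for FFN units.
    return 1.0 / (1.0 + np.exp(-z))

def ffn_forward(x, W1, b1, W2, b2):
    # One-hidden-layer feed-forward pass: input -> hidden -> output.
    h = sigmoid(W1 @ x + b1)       # hidden-layer activations
    return sigmoid(W2 @ h + b2)    # output activations

# Illustrative sizes only: 20 inputs, 3 hidden units, 1 output unit.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 20)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
print(ffn_forward(rng.normal(size=20), W1, b1, W2, b2))
```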

**Visualization.**

To understand how a neural network works and what it learns, it is necessary to understand the topological behaviour of the weights of the network during and after the training process. In statistical modelling it is often desirable to interpret the model, that is, to find out which variables contribute to one or more response variables and what kind the contributions are (e.g., linear, nonlinear).

To see the topological behaviour of the weights we visualize them using a statistical technique called "multidimensional scaling" (MDS). We interpret the weights of the FFN as distances between the locations of the units. For this purpose we propose an intuitive non-linear transformation of the weights and a "better" behaving linear transformation. Thus units located near each other are connected by large weights.
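The two transformations are not spelled out above, so the following sketch shows one plausible reading: a non-linear map d = exp(-|w|) and a linear rescaling of |w|, both decreasing in |w|, so that large weights yield small distances. The concrete formulas here are assumptions for illustration only.

```python
import numpy as np

def dist_nonlinear(w):
    # Hypothetical non-linear weight-to-distance map: d = exp(-|w|).
    # Large weights give small distances, so strongly connected
    # units end up close together.
    return np.exp(-np.abs(w))

def dist_linear(w):
    # Hypothetical linear map: rescale |w| so that the largest
    # weight maps to distance 0 and the smallest to distance 1.
    a = np.abs(w)
    return (a.max() - a) / (a.max() - a.min())

w = np.array([0.1, 0.5, 2.0, -3.0])
print(dist_nonlinear(w))   # decreasing in |w|
print(dist_linear(w))
```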

Now we select the locations of the units, based on these distances, through an optimization algorithm. The drawback of MDS is that the true structure is high-dimensional; if we use two or three dimensions for visualization, we obtain only the best two- or three-dimensional approximation of the true structure. Nevertheless, we should be able to grasp some important properties of the network structure.
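To illustrate the embedding step, the sketch below uses classical (Torgerson) MDS to place units in two dimensions given a distance matrix. Our implementation uses an optimization algorithm, which is not detailed here, so this eigendecomposition-based variant is only a stand-in.

```python
import numpy as np

def classical_mds(D, dim=2):
    # Classical MDS: embed n points in `dim` dimensions so that
    # their Euclidean distances approximate the distance matrix D.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]    # keep the largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Toy example: 4 units with pairwise distances derived from weights.
D = np.array([[0, 1, 2, 3],
              [1, 0, 1, 2],
              [2, 1, 0, 1],
              [3, 2, 1, 0]], dtype=float)
print(classical_mds(D, dim=2))            # 2-D unit locations
```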

**Implementation in XploRe 3.2.**

The implementation consists of four commands:

- NNINIT, which checks whether a connection and weight matrix (CWM) describes an FFN (see the sketch after the example below),
- NNFUNC, which computes the output for a specific input and CWM,
- NNVISU, which visualizes the geometry of the network, and
- NNANAL, which allows the analysis of the input and output of a single unit.

The macro NN finally allows one to generate a multi-layer FFN, to train it (via a test set or cross-validation and early stopping), and to analyze it for classification or regression.

```
proc()=main()
  func("nn")                                  ; load the NN-macro
  x=read("kredit")                            ; load the credit data
  t=read("tkredit")                           ; load training, test and validation set
  y=x[,1]                                     ; create y
  x=x[,2:21]                                  ; create x
  x=(x-mean(x)´)./sqrt(var(x)´)~matrix(1000)  ; standardize the data
  nn(x y t)                                   ; run the NN-macro
endp
```
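Returning to NNINIT: checking whether a CWM describes an FFN presumably amounts to verifying that the connection graph contains no cycles. The following sketch illustrates such a check via Kahn's topological sort; the edge convention (C[i, j] != 0 meaning "unit i feeds unit j") and the algorithm are assumptions, not XploRe's actual code.

```python
import numpy as np

def is_feed_forward(C):
    # A connection matrix describes a feed-forward net iff the
    # directed graph with edges i -> j where C[i, j] != 0 is acyclic.
    C = (np.asarray(C) != 0)
    n = C.shape[0]
    indeg = C.sum(axis=0)                     # incoming edges per unit
    queue = [i for i in range(n) if indeg[i] == 0]
    seen = 0
    while queue:                              # Kahn's topological sort
        i = queue.pop()
        seen += 1
        for j in np.flatnonzero(C[i]):
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    return seen == n                          # all units ordered => no cycle

C = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])   # strictly upper triangular: feed-forward
print(is_feed_forward(C))   # True
```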

**Applications.**

**1.** We apply our technique to the credit scoring data of Fahrmeir and Hammerle (1981) with different numbers of units in the hidden layer (see Figure 1 for the network with no hidden units). The aim is to predict from some variables whether a client is "good" or "bad", i.e., whether repayment of the credit will be a problem.

**Figure 1:** The best-generalizing network for the logistic regression with one output unit. We can easily see that there are 3 important variables (near the output unit o) and 5 less important variables (far away from the output unit).

**2.** The second application comes from a very popular field of molecular biology: protein structure prediction. First we consider only one of the simplest cases. As input variables we chose the relative amino acid frequencies within the protein. Secondary structural elements (e.g., alpha-helix, beta-strand, coil) are used for a rather rough definition of four supersecondary structural classes. Thus we have four output units here.
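For concreteness, one direct reading of this input encoding is sketched below: each protein is mapped to a fixed-length 20-dimensional vector of relative amino acid frequencies, which a network with four output units would then classify. The helper function and the toy sequence are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids

def relative_aa_frequencies(sequence):
    # Encode a protein as its relative amino acid frequencies,
    # giving a fixed-length input vector regardless of protein length.
    counts = Counter(sequence.upper())
    n = max(len(sequence), 1)
    return [counts[a] / n for a in AMINO_ACIDS]

# Toy protein; a real network would map this 20-vector to four
# output units, one per supersecondary structural class.
print(relative_aa_frequencies("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```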
