Thursday, April 4, 2024

A daughter neural network fails to learn the structure of its parent network.


I have two neural networks: a parent network and a daughter network. The two networks have the same architecture, with the same activation function, number of parameters, and so on. The daughter network is trained to imitate the behavior of the parent network, but it only sees the input-output behavior of the parent; it is never trained on the parent's inner workings. Concretely, the daughter is trained to minimize the squared distance between D(x) and P(x), where D and P denote the daughter and parent networks and the inputs x are drawn from a Gaussian distribution, with fresh inputs generated after every update to the weights and biases. Both networks are of the form Chain(SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+)). We use skip connections so that it is sensible to show three separate weight matrices in the animation. (Sketches of this setup, of the cosine-similarity computation, and of the row/column zeroing appear below.)

The animation shows the three weight matrices of the daughter network during training, colored red, green, and blue. During training, I zeroed out rows and columns of the daughter network's weight matrices in order to remove the randomness left over from initialization, giving the daughter a better chance of learning the parent.

In this visualization, the trained daughter network has a structure that only marginally resembles that of its parent: the cosine similarities between the corresponding weight matrices of the daughter and parent networks are 0.60, 0.21, and -0.16, respectively. I have trained daughter networks that learned their parents perfectly, so this performance is relatively poor, but the poor performance is due to the difficulty of the task rather than the way the network was trained.

The notion of a neural network is not my own; I am simply making an animation to illustrate the advantages and disadvantages of neural networks, especially with regard to AI safety. It is not good for AI safety when a network is unable to learn the structure of its parent network. Keep in mind that a daughter network's ability to learn the structure of its parent says more about the parent's interpretability than about its capabilities: a network could be difficult to learn simply because it performs a difficult task.

We also observe that the daughter network has structure that is not present in the parent network; in other words, the daughter's structure is non-random. A daughter network that fails to learn its parent but produces structure absent from that parent may be more dangerous and unpredictable. Randomness left over after training does not point a network toward specific but unpredictable behaviors, whereas learned structure that is not present in the training data does.
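For concreteness, here is a minimal Flux.jl sketch of the training setup. The post specifies the architecture, the squared-error objective, and fresh Gaussian inputs after every update; the width mn, the batch size, the optimizer, and the step count below are illustrative assumptions.

```julia
using Flux

mn = 10  # layer width; the actual value is not stated in the post

# Three residual blocks, as in the post:
# Chain(SkipConnection(Dense(mn,mn,atan),+), ...)
make_net() = Chain(
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
)

parent_net   = make_net()  # fixed teacher; only its input-output behavior is used
daughter_net = make_net()  # student, trained to imitate the teacher

opt_state = Flux.setup(Adam(1e-3), daughter_net)  # optimizer choice is an assumption

for step in 1:100_000
    x = randn(Float32, mn, 64)  # fresh Gaussian inputs after every update
    y = parent_net(x)           # black-box queries of the parent
    loss, grads = Flux.withgradient(daughter_net) do d
        Flux.mse(d(x), y)       # squared distance between D(x) and P(x)
    end
    Flux.update!(opt_state, daughter_net, grads[1])
end
```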
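The cosine similarities quoted above can be computed by treating each weight matrix as a flat vector. A short sketch, assuming the reported numbers compare the Dense layers blockwise (the Dense layer sits in each SkipConnection's layers field, and its matrix in weight):

```julia
using LinearAlgebra

# Frobenius cosine similarity between two weight matrices.
cossim(A, B) = dot(A, B) / (norm(A) * norm(B))

# Pull the Dense weight matrix out of each SkipConnection block.
block_weights(net) = [block.layers.weight for block in net.layers]

sims = map(cossim, block_weights(daughter_net), block_weights(parent_net))
# For the run shown in the animation: approximately 0.60, 0.21, and -0.16.
```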
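The post does not say which rows and columns were zeroed or on what schedule. Purely as an illustration of the operation itself, zeroing the i-th row and column of one block's weight matrix could look like this (the function name and the call are hypothetical):

```julia
# Illustrative only: zero row i and column i of the weight matrix in
# block b of the network, in place, erasing the random structure that
# initialization placed there.
function zero_row_col!(net, b, i)
    W = net.layers[b].layers.weight
    W[i, :] .= 0
    W[:, i] .= 0
    return net
end

zero_row_col!(daughter_net, 1, 3)  # hypothetical: clear row/column 3 of block 1
```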
Unless otherwise stated, all algorithms featured on this channel are my own. You can go to https://github.com/sponsors/jvanname to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety, such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and they can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems are used responsibly and that we avoid catastrophic AI scenarios. There is currently nobody else working on LSRDRs, so your support will ensure a unique approach to AI safety.
