A resource of free step-by-step video how-to guides to get you started with machine learning.
Thursday, April 4, 2024
A daughter neural network fails to learn the structure of its parent network.
I have two neural networks: a parent and a daughter. The two networks have the same architecture, with the same activation function, number of parameters, and so on. The daughter network is trained to imitate the behavior of the parent network, but it only sees the parent's input-output behavior; it is never trained on the parent's inner workings. Concretely, the daughter is trained to minimize the squared distance between the daughter's output C(x) and the parent's output P(x) for inputs x drawn from a Gaussian distribution, and fresh inputs are generated after every update to the weights and biases. Both networks are of the form Chain(SkipConnection(Dense(mn,mn,atan),+), SkipConnection(Dense(mn,mn,atan),+), SkipConnection(Dense(mn,mn,atan),+)). The skip connections make it sensible to show three separate weight matrices in the animation, which displays the daughter's three weight matrices, colored red, green, and blue, over the course of training. During training, I zeroed out rows and columns in the daughter's weight matrices in order to remove the randomness present at initialization, giving the daughter a better chance of learning the parent. In this visualization, the trained daughter network only marginally resembles its parent: the cosine similarities between the corresponding weight matrices of the daughter and the parent are 0.60, 0.21, and -0.16, respectively. I have trained daughter networks that learned their parent networks perfectly, so this performance is relatively poor, but the poor performance is due to the difficulty of the task rather than the way the network was trained. The notion of a neural network is not my own.
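The original animation uses Julia's Flux layers (Chain, SkipConnection, Dense). As a minimal illustration of the same training scheme, here is a pure-Python sketch using a single skip-connection layer f(x) = x + atan(Wx), fresh Gaussian inputs at every update, and the same cosine-similarity measurement between daughter and parent weights. The network size, learning rate, step count, and function names are my own illustrative choices, not the post's actual code; with three stacked layers as in the post, the problem becomes much harder, which is the point of the animation.

```python
import math
import random

random.seed(0)

N = 3          # toy width; the post uses larger mn-by-mn layers
LR = 0.05      # illustrative learning rate (not from the post)
STEPS = 5000

def rand_matrix(n, scale=1.0):
    return [[random.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]

def forward(W, x):
    # One skip-connection layer: f(x) = x + atan(W x), atan applied elementwise.
    z = [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
    return [x[i] + math.atan(z[i]) for i in range(len(x))], z

P = rand_matrix(N)        # parent weights, fixed
D = rand_matrix(N, 0.5)   # daughter weights, trained

for _ in range(STEPS):
    # Fresh Gaussian input after every weight update, as in the post.
    x = [random.gauss(0.0, 1.0) for _ in range(N)]
    p_out, _ = forward(P, x)
    d_out, z = forward(D, x)
    err = [d_out[i] - p_out[i] for i in range(N)]
    # Loss = sum_i (C(x)_i - P(x)_i)^2;
    # dLoss/dD[i][j] = 2 * err_i * atan'(z_i) * x_j, with atan'(z) = 1/(1+z^2).
    for i in range(N):
        g = 2.0 * err[i] / (1.0 + z[i] * z[i])
        for j in range(N):
            D[i][j] -= LR * g * x[j]

def cosine_similarity(A, B):
    # Cosine similarity between two weight matrices, flattened to vectors.
    fa = [v for row in A for v in row]
    fb = [v for row in B for v in row]
    dot = sum(a * b for a, b in zip(fa, fb))
    na = math.sqrt(sum(a * a for a in fa))
    nb = math.sqrt(sum(b * b for b in fb))
    return dot / (na * nb)

print(cosine_similarity(D, P))
```

For this single-layer toy problem the daughter can recover the parent almost exactly (cosine similarity near 1), since matching atan(Wx) to atan(Px) on all Gaussian inputs forces W toward P; the depth-3 version in the animation does not identify its parent nearly as well.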
I am simply making an animation to illustrate the advantages and disadvantages of neural networks, especially with regard to AI safety. It is not good for AI safety when a network is unable to learn the structure of its parent network. Keep in mind that the ability of a daughter network to learn the structure of its parent network says more about the interpretability of a neural network than about its capabilities: a network may be difficult to learn simply because it is performing a difficult task. We also observe that the daughter network develops additional structure that is not present in the parent network; in other words, the structure of the daughter network is non-random. A daughter network that fails to learn its parent network but produces structure absent from its parent may be more dangerous and unpredictable: leftover randomness after training does not point the network toward specific but unpredictable behaviors, whereas learned structure that is not present in the training data does. Unless otherwise stated, all algorithms featured on this channel are my own. You can go to https://github.com/sponsors/jvanname to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety, such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and they can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems are used responsibly and that we avoid catastrophic AI scenarios. There is currently nobody else working on LSRDRs, so your support will ensure a unique approach to AI safety.