Build a Neural Net in 4 Minutes

Hello world, welcome to Sirajology. Today, we’re going to be building a neural net in four minutes, so let’s get started. There are like a million and one machine learning models out there, but neural nets in particular have gotten really popular recently because of two things: faster computers and more data. They’ve helped produce some amazing breakthroughs in everything from image recognition to generating rap songs.

There are really just three steps involved in machine learning: build it, train it, and test it. Once we build our model, we can train it against our input and output data to make it better and better at pattern recognition.

So, let’s build our model: a three-layer neural network in Python. We’ll want to start off by importing
NumPy, which is my go-to library for scientific computing in Python. Then, we’ll want to create a function that maps any value to a value between zero and one. This is called a sigmoid. This function will be run in every neuron of our network when data hits it. It’s useful for turning numbers into probabilities.

Once we’ve created that, let’s initialize our input data set as a matrix. Each row is a different training example, and each column represents a different neuron. So we have four training examples with three input neurons each. Then we’ll create our output data set: four examples, one output neuron each.

Since we’ll be generating random numbers in a second, let’s seed them to make them deterministic. This just means giving the generated random numbers the same starting point, or seed, so that we’ll get the same sequence of numbers every time we run our program. This is useful for debugging.

Next, we’ll create our synapse matrices. Synapses are the connections
between every neuron in one layer and every neuron in the next layer. Since we will have three layers in our network, we need two synapse matrices. Each synapse has a random weight assigned to it.

After that, we’ll begin the training code. We’ll create a for-loop that iterates over the training data to optimize the network for the given data set. We’ll start off by creating our first layer; it’s just our input data.

Now comes the prediction step. We’ll perform matrix multiplication
between each layer and its synapse. Then we’ll run our sigmoid function on all the values in the resulting matrix to create the next layer, which contains a prediction of the output data. Then we do the same thing on that layer to get our final layer, which is a more refined prediction.

So now that we have a prediction of the output value in layer two, let’s compare it to the expected output data using subtraction to get the error. We’ll also want to print out the average error at a set interval to make sure it goes down every time.

Next, we’ll multiply our error by the result of our sigmoid function, this time used to get the derivative of our output prediction from layer two. This will give us a delta, which we’ll use to reduce the error of our predictions when we update our synapses every iteration.

Then we’ll want to see how much layer
one contributed to the error in layer two. This is called backpropagation. We’ll get this error by multiplying layer two’s delta by synapse one’s transpose. Then we’ll get layer one’s delta by multiplying its error by the result of our sigmoid function, used here to get the derivative of layer one.

Now that we have deltas for each of our layers, we can use them to update our synapse weights to reduce the error more and more every iteration. This is an algorithm called gradient descent. To do this, we’ll just multiply each layer’s transpose by its delta and add the result to the corresponding synapse matrix. Finally, let’s print the predicted output, and there you have it.

Let’s run this in Terminal and see what we get. Awesome, we can see that our error decreases every iteration and the predicted output is very, very close to the actual output. There is so much we can do to improve our neural network. For more information, check out the links in the description below, and please subscribe for more technology videos. Thanks for watching.
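The whole walkthrough above fits in one short script. Here is a minimal sketch of the idea under some assumptions: the variable names, the hidden layer of four neurons, and the 60,000 training iterations are my choices, and the code linked in the description may differ.

```python
import numpy as np

def nonlin(x, deriv=False):
    # Sigmoid: squashes any value into (0, 1).
    # With deriv=True, x is assumed to already be a sigmoid output,
    # so the slope is simply x * (1 - x).
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# Input data: four training examples, three input neurons each
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

# Output data: four examples, one output neuron each
y = np.array([[0], [1], [1], [0]])

np.random.seed(1)  # seed so the random weights are the same every run

# Synapse matrices with random weights in [-1, 1)
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

for j in range(60000):
    # Forward pass: multiply each layer by its synapse, then apply the sigmoid
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))
    l2 = nonlin(np.dot(l1, syn1))

    # Compare the prediction to the expected output
    l2_error = y - l2
    if j % 10000 == 0:
        print("Error:", np.mean(np.abs(l2_error)))

    # Backpropagation: weight the errors by the sigmoid derivatives
    l2_delta = l2_error * nonlin(l2, deriv=True)
    l1_error = l2_delta.dot(syn1.T)
    l1_delta = l1_error * nonlin(l1, deriv=True)

    # Gradient descent: update the synapse weights
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)

print("Output after training:")
print(l2)
```

One design note: `nonlin(l2, deriv=True)` works because `l2` has already passed through the sigmoid, so `x * (1 - x)` is its derivative without recomputing the exponential.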

100 Replies to “Build a Neural Net in 4 Minutes”

  1. wait how do I get the pyton version like yours mine looks like the one that you get when downloading windows and the one they use in the matrix with the green letters and black background I forgot what it's called?

    SyntaxError: invalid syntax
    PS C:\Users\Jan Zienkiewicz\Desktop\KasztaneqBOT Xkkk> py
    File "", line 11
    SyntaxError: invalid syntax

  3. I think you should do 40mins video of the same thing exactly – but this time SLOWER…. MUCH MUCH SLOWER….And with more details. Only THEN it'll be a good video. As speed is not the target here – but understanding.

  4. Something similar in javascript???? How to keep the experience that neural network has got, so next time you run it it remembers all the previous training and can continue to train more for a better result?

  5. Hi Siraj,
    I appreciate your effort to educate people.
    Looks like your verbal delivered code and github codes are different. Please upload the same code or email to me.

  6. Amazing video !!! please tell me why have you multiplied the synapse matrix by 2 and subtracted 1 in line 24 and 25 ?

  7. Hello man
    Your videos are more confusing then informative.
    Anyways thanks for a litttttttttle information your videos give

  8. Anyone know of a good place to learn the context of this code? I know, I know, I'm here on the lazy person's tutorial. But longer videos I've watched seem to still expect me to know more. Anyone know of a good series to follow for absolute beginners?

  9. Thank you for posting the suggested articles in your description. Those are helping out soooo much for understanding neural networks!!!!

  10. can someome direct me to a really good video on HOW TO MAKE A LEARNING AI i wanna make some experiments too

  11. Awesome! Can you do long& thorough podcasts with engineers working in AI& ML& Data Analysis fields ? 😀

  12. can someone explain what exactly happened with those numbers? is this some logic gate simulation? like 0,0,1 —> 0 and 1,0,1 –>1?

  13. Clear, concise, eloquent. I think Siraj is actually an AI bot who has been brought back from the future to help save humanity. Or to hasten our conquest by our robot overlord masters. Not sure yet, but I'm on board.

  14. I understand all of it, except why we need a hidden layer. Why can't we just do a single layer perceptron for this?

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(x):
        return x * (1 - x)

    # outputs equal the first input column, so the problem is linearly separable
    training_inputs = np.array([[0, 0, 1],
                                [1, 1, 1],
                                [1, 0, 1],
                                [0, 1, 1]])

    training_outputs = np.array([[0, 1, 1, 0]]).T

    synaptic_weights = 2 * np.random.random((3, 1)) - 1
    print('random starting synaptic weights: ')
    print(synaptic_weights)

    for iteration in range(10000):
        input_layer = training_inputs
        outputs = sigmoid(np.dot(input_layer, synaptic_weights))
        error = training_outputs - outputs
        adjustments = error * sigmoid_derivative(outputs)
        synaptic_weights += np.dot(input_layer.T, adjustments)

    print("synaptic weights after training")
    print(synaptic_weights)
    print('outputs after training: ')
    print(outputs)

  15. Iv'e seen the same example so many times and dont know how to actually apply this to a real world example. someone help please

  16. This video name shold be: "I don't understand what I am doing and my onscreen code consists of mistakes". Guys, don't waste your precious 4 minutes of yor life.

  17. I honestly believe people who make 4 minute videos on huge topics like neural networks and especially "showing" them how to code it is such a bad way to learn. You want to learn how to build a NN? Can't be done in 4 minutes. You can read an article for 4 minutes on Neural Networks and learn so much more with actual understanding rather than this dude pretending like he actually knows what he is talking about.

  18. um, so, what i got was, You need to propagate the perpendicular bisectors, that way the circumference eliminates the overall variable decay.

    or something like that

  19. Well may I add, the code in the background is complete dogshit, and aside from the fact, that he very obviously copied some of that code from somewhere else, there are so many syntax errors in this code that i'm just disappointed. I mean I'm not the greatest programmer in the world, but dude. You literally used dots instead of commas in line 12, closed a bracket too much immedeately after, imported numpy as numpy but used np everywhere, then wrote a comma instead of a dot in line 8, which makes it look like you're adding the module numpy to one, in a tuple, which then divides 1 and returns the result. I'm not sure, but it looks like you wrote a – instead of a = in line 3, and I can't tell, because the video resolution is so bad.
    But aside from that, your code works. The guy who actually wrote it deserves respect.

  20. ah neh. you talk too fast. whats the point of spitting so fast and no one can grab what you talk. for those already know what you talking, they wont even be watching you spit.

  21. line 39: l2_delta = l2_error * nonlin(l2, deriv=True)
    is 'l2_delta' just 'weighted l2 errors' so that the l2 elements with the non-extreme values(ie. closer to 0.5) get larger update steps?
    i get that it makes the corrections less severe as l2 values approach 0 or 1 but doesn't that make error corrections of WRONG l2 values at the extremes also less severe?

    line 46: syn1 +=
    [l1]*[syn1 update matrix]=[l2_delta]
    so [l1]^T*[l1]*[syn1 update matrix]=[l1]^T*[l2_delta]
    so [syn1 update matrix]=[l1]^T*[l2_delta]

    is this the logic behind line 46? similarly for line 41?

  22. Hello Siraj sir,
    I had made my first Neural network at

    Please review my code and give me suggestion to improve it.

Leave a Reply

Your email address will not be published. Required fields are marked *