It is now easy to build your own facial recognition system with a face recognition algorithm based on image recognition.

Facial recognition is a biometric method of identifying an individual by comparing a face captured by a camera, or taken from a digital image, with the stored data of that person. It has become a popular approach to user authentication.

Apple recently launched its new iPhone X, which uses facial recognition to authenticate the authorized user of the device. Similarly, Baidu uses facial recognition instead of ID cards to let its employees enter the premises.

If you are new to facial recognition, it can seem like magic: the camera captures a face and the device unlocks. But it is not magic. You can build your own facial recognition system in Python for your own devices. If you have not done this in Python before, don’t worry; this article walks you through the main steps.

The code for facial recognition in Python is linked at the end of the article. Download it, follow the steps, and start using your own facial recognition system on your own devices.


This recognition system uses FaceNet, a neural network that learns a mapping from face images to a compact Euclidean space in which distance reflects how similar two faces are. It can be used to act on a face almost instantly, with a delay of no more than 1 second. Distance and face similarity are inversely related: the more similar two face images are, the smaller the distance between them.
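To make the idea of distance in the embedding space concrete, here is a minimal sketch using NumPy. The 4-dimensional vectors and the names `alice_1`, `alice_2`, and `bob` are made up for illustration; real FaceNet encodings in this article have 128 dimensions and come from the network, not from hand-written numbers.

```python
import numpy as np

def embedding_distance(emb_a, emb_b):
    """Euclidean (L2) distance between two face embeddings."""
    return float(np.linalg.norm(np.asarray(emb_a) - np.asarray(emb_b)))

# Hypothetical embeddings, invented for this example only.
alice_1 = np.array([0.10, 0.80, 0.30, 0.50])
alice_2 = np.array([0.12, 0.79, 0.31, 0.48])  # second photo of the same person
bob = np.array([0.90, 0.10, 0.70, 0.20])      # a different person

same_person = embedding_distance(alice_1, alice_2)
different_person = embedding_distance(alice_1, bob)

# Similar faces should end up closer together than different faces.
print(same_person < different_person)  # True
```

The smaller value for the two photos of the same person is exactly the property the rest of the article relies on.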

Triplet Loss

FaceNet is trained with a loss function called triplet loss. Each training example is a triplet: an anchor image, a positive image with the same identity as the anchor, and a negative image with a different identity. Triplet loss minimizes the distance between the anchor and the positive, and maximizes the distance between the anchor and the negative. This matches the FaceNet definition of distance: images of the same identity end up close together and images of different identities end up far apart, so a small distance is accepted as a match and a large distance is rejected. For a single triplet, the standard form of the loss is:

$$\mathcal{L} = \left[\, \lVert f(a) - f(p) \rVert_2^2 - \lVert f(a) - f(n) \rVert_2^2 + \alpha \,\right]_+$$

where:

  • f(a) refers to the output encoding of the anchor
  • f(p) refers to the output encoding of the positive
  • f(n) refers to the output encoding of the negative
  • alpha is a margin constant that keeps the network from trivially minimizing the loss by driving f(a)-f(p)=f(a)-f(n)=0
  • […]+ is equal to max(0, sum), so negative values inside the brackets are clipped to zero
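The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not the training code from the repository: it assumes squared Euclidean distances as in the FaceNet paper, and the margin `alpha=0.2` and the 2-dimensional example encodings are assumptions chosen for the example.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Triplet loss for a batch of encodings, each of shape (batch, dims).

    alpha=0.2 is an assumed margin; the article does not state the value used.
    """
    pos_dist = np.sum((f_a - f_p) ** 2, axis=-1)  # anchor-positive distance
    neg_dist = np.sum((f_a - f_n) ** 2, axis=-1)  # anchor-negative distance
    # [...]+ : clip negative values to zero, then sum over the batch
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))

# Hypothetical encodings, invented for this example only.
anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.1, 0.0]])
easy_negative = np.array([[1.0, 0.0]])  # already far from the anchor
hard_negative = np.array([[0.2, 0.0]])  # almost as close as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
print(easy_loss)  # 0.0, the margin is already satisfied
print(hard_loss)  # roughly 0.17, the network is still penalised
```

The easy triplet contributes nothing to the loss, while the hard triplet does, which is why training pushes negatives at least `alpha` further away than positives.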

Siamese Network

FaceNet is a Siamese network, a type of neural network architecture that learns to tell two inputs apart. This lets the system learn which images are similar and which are different from each other.

A Siamese network consists of two identical neural networks with exactly the same weights. Each network takes one of the two images as input and processes it. The output of the last layer of each network is then passed to a function that determines whether the images show the same identity, which it does by calculating the distance between the two outputs.
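The weight-sharing idea can be shown with a toy model. In the sketch below a single small matrix stands in for the whole network; it is a hypothetical stand-in, not FaceNet, and the 3-dimensional "images" are invented for illustration. The point is only that both inputs pass through the same weights before their outputs are compared.

```python
import numpy as np

# One shared weight matrix plays the role of both "twin" networks
# (hypothetical toy model, not the real FaceNet weights).
shared_weights = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [0.5, -0.5]])

def encode(image_vector, weights=shared_weights):
    """Map an input vector to an L2-normalised embedding."""
    embedding = image_vector @ weights
    return embedding / np.linalg.norm(embedding)

def siamese_distance(image_1, image_2):
    """Run both inputs through the same weights and compare the outputs."""
    return float(np.linalg.norm(encode(image_1) - encode(image_2)))

image = np.array([1.0, 0.0, 0.0])
near_copy = np.array([1.0, 0.05, 0.0])  # slightly perturbed version
other = np.array([0.0, 1.0, 0.0])       # a clearly different input

print(siamese_distance(image, near_copy) < siamese_distance(image, other))  # True
```

Because the weights are shared, identical inputs always produce identical embeddings, so their distance is exactly zero.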


The implementation uses both Keras and TensorFlow. Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano.

TensorFlow was developed by researchers and engineers working on the Google Brain team within Google’s Machine Intelligence research organization. It is an open source software library for numerical computation using data flow graphs.

Along with Keras and TensorFlow, this implementation also uses two utility files, taken from the repository, that abstract all interactions with the FaceNet network:

  • py – It contains all the functions required to feed images to the network and get the encodings of those images.

  • – It contains all the functions required to prepare and compile the FaceNet network.

Compiling the FaceNet Network

Compiling the FaceNet network is the first step, so that the network can be used in the face recognition system.

Before anything else, we need to set the input shape. We initialize our network with an input shape of (3, 96, 96): the first dimension holds the Red-Green-Blue (RGB) channels of the image volume, and every image fed to the network must be 96×96 pixels.
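Most image libraries load pictures with the channels last, as (height, width, channels), so an image usually has to be rearranged before it matches the (3, 96, 96) input shape. The helper below is a minimal sketch of that step. The function name `to_channels_first`, the scaling to the range [0, 1], and the added batch dimension are assumptions for illustration and may differ from what the repository's utility file does.

```python
import numpy as np

def to_channels_first(image_hwc):
    """Convert a (96, 96, 3) RGB image to a (1, 3, 96, 96) network input."""
    if image_hwc.shape != (96, 96, 3):
        raise ValueError(f"expected a 96x96 RGB image, got shape {image_hwc.shape}")
    scaled = image_hwc.astype(np.float32) / 255.0  # assumed scaling to [0, 1]
    chw = np.transpose(scaled, (2, 0, 1))          # (H, W, C) -> (C, H, W)
    return chw[np.newaxis, ...]                    # add a batch dimension

# A synthetic pure-red image stands in for a real photo.
frame = np.zeros((96, 96, 3), dtype=np.uint8)
frame[..., 0] = 255
batch = to_channels_first(frame)
print(batch.shape)  # (1, 3, 96, 96)
```

Images of any other size would have to be resized to 96×96 first; the sketch rejects them rather than guessing how to resize.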

After setting the input shape, we define the triplet loss function according to the triplet loss equation given above.

Once we have the loss function, we can compile the face recognition model using Keras. We use the Adam optimizer to minimize the loss calculated by the triplet loss function.

Database Preparation

Now that FaceNet is compiled, the next step is to prepare a database of the individuals we want the system to recognize. We use all the images in the image directory to build this database. We keep only one image per individual, because the FaceNet network is powerful enough to recognize an individual from a single image.

We call the function img_path_to_encoding, which converts an image into an encoding of 128 floating-point numbers. The function takes the path to an image, feeds the image to our face recognition network, and returns the network's output as the encoding of the image. Each encoding is added to our database, after which the system can start recognizing individuals.
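The database step can be sketched as a dictionary that maps each person's name, taken from the image file name, to that image's encoding. The exact signature of img_path_to_encoding is not shown in this article, so the sketch accepts any encoder function and demonstrates it with `fake_encoder`, a hypothetical stand-in that returns 128 made-up floats. The file name `ana.jpg` is also invented for the example.

```python
import os
import tempfile
import numpy as np

def build_database(image_dir, encode_image):
    """Map each image's base name (e.g. 'oli' for 'oli.jpg') to its encoding."""
    database = {}
    for file_name in sorted(os.listdir(image_dir)):
        name, extension = os.path.splitext(file_name)
        if extension.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip anything that is not an image
        database[name] = encode_image(os.path.join(image_dir, file_name))
    return database

def fake_encoder(image_path):
    """Hypothetical stand-in for img_path_to_encoding: returns 128 floats."""
    seed = sum(os.path.basename(image_path).encode("utf-8"))
    return np.random.default_rng(seed).random(128)

# Demonstrate with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as image_dir:
    for file_name in ("oli.jpg", "ana.jpg", "notes.txt"):
        open(os.path.join(image_dir, file_name), "wb").close()
    database = build_database(image_dir, fake_encoder)

print(sorted(database))  # ['ana', 'oli']
```

In a real run you would pass a function that wraps the network's encoder in place of `fake_encoder`.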

Recognizing a Face

As we know, FaceNet is trained to minimize the distance between images of the same individual and to maximize the distance between images of different individuals. Our system uses this property to determine which individual a new image most likely shows.

To find the individual, we go through our database and calculate the distance between the new image and each individual stored in it. The candidate is the individual with the lowest distance to the new image.
Then, to confirm the match, we check whether the person in the candidate image and the person in the new image are really the same.

This whole process serves only to determine the most likely individual. Here is the code snippet which comes in handy:

This code implies:

  • If the distance is above 0.52, we conclude that the individual in the new image does not exist in our database
  • Otherwise, we conclude that they are the same individual.

The value 0.52 was found by trial and error. The best value may be slightly lower or higher depending on your implementation and data.
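The lookup and threshold check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the repository: the function name `who_is_it`, the use of plain Euclidean distance, and the 3-dimensional example encodings are all chosen for the example, and only the 0.52 threshold comes from the article.

```python
import numpy as np

def who_is_it(new_encoding, database, threshold=0.52):
    """Return (name, distance) of the closest match, or (None, distance) if too far."""
    best_name, best_distance = None, float("inf")
    for name, stored_encoding in database.items():
        distance = float(np.linalg.norm(new_encoding - stored_encoding))
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance > threshold:
        return None, best_distance  # nobody in the database is close enough
    return best_name, best_distance

# Hypothetical 3-dimensional encodings, invented for this example only.
toy_database = {
    "oli": np.array([0.1, 0.2, 0.3]),
    "ana": np.array([0.9, 0.8, 0.7]),
}

match, _ = who_is_it(np.array([0.12, 0.18, 0.33]), toy_database)
stranger, _ = who_is_it(np.array([3.0, 3.0, 3.0]), toy_database)
print(match)     # oli
print(stranger)  # None
```

Returning the distance alongside the name makes it easier to tune the threshold on your own data.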

Building System for Face Recognition

Now that the face recognition algorithm is working, we can build a system around it.

In my case, the database contains an image named ‘oli.jpg’, which is included in the GitHub repository, and my webcam feeds video frames to the face recognition algorithm. When the algorithm recognizes the individual in a frame, the demo plays an audio message that welcomes the user by the name stored in the database. Here is the image from my case:

An image captured at the exact moment when the network recognized the individual in the image. The name of the image in the database was “oli.jpg” so the audio message played was “Welcome oli, have a nice day!”

This should clarify the concepts behind a facial recognition system and help you make your own. If you want to play around with it, here is the link to my GitHub repository.

GitHub Repository for Facial Recognition system:

What do you think about it?