diff --git a/docs/source/README_pygad_cnn_ReadTheDocs.rst b/docs/source/cnn.rst similarity index 97% rename from docs/source/README_pygad_cnn_ReadTheDocs.rst rename to docs/source/cnn.rst index 6dc9050..cb578bf 100644 --- a/docs/source/README_pygad_cnn_ReadTheDocs.rst +++ b/docs/source/cnn.rst @@ -1,748 +1,748 @@ -.. _pygadcnn-module: - -``pygad.cnn`` Module -==================== - -This section of the PyGAD's library documentation discusses the -**pygad.cnn** module. - -Using the **pygad.cnn** module, convolutional neural networks (CNNs) are -created. The purpose of this module is to only implement the **forward -pass** of a convolutional neural network without using a training -algorithm. The **pygad.cnn** module builds the network layers, -implements the activations functions, trains the network, makes -predictions, and more. - -Later, the **pygad.gacnn** module is used to train the **pygad.cnn** -network using the genetic algorithm built in the **pygad** module. - -Supported Layers -================ - -Each layer supported by the **pygad.cnn** module has a corresponding -class. The layers and their classes are: - -1. **Input**: Implemented using the ``pygad.cnn.Input2D`` class. - -2. **Convolution**: Implemented using the ``pygad.cnn.Conv2D`` class. - -3. **Max Pooling**: Implemented using the ``pygad.cnn.MaxPooling2D`` - class. - -4. **Average Pooling**: Implemented using the - ``pygad.cnn.AveragePooling2D`` class. - -5. **Flatten**: Implemented using the ``pygad.cnn.Flatten`` class. - -6. **ReLU**: Implemented using the ``pygad.cnn.ReLU`` class. - -7. **Sigmoid**: Implemented using the ``pygad.cnn.Sigmoid`` class. - -8. **Dense** (Fully Connected): Implemented using the - ``pygad.cnn.Dense`` class. - -In the future, more layers will be added. - -Except for the input layer, all of listed layers has 4 instance -attributes that do the same function which are: - -1. ``previous_layer``: A reference to the previous layer in the CNN - architecture. - -2. ``layer_input_size``: The size of the input to the layer. - -3. ``layer_output_size``: The size of the output from the layer. - -4. ``layer_output``: The latest output generated from the layer. It - default to ``None``. - -In addition to such attributes, the layers may have some additional -attributes. The next subsections discuss such layers. - -.. _pygadcnninput2d-class: - -``pygad.cnn.Input2D`` Class ---------------------------- - -The ``pygad.cnn.Input2D`` class creates the input layer for the -convolutional neural network. For each network, there is only a single -input layer. The network architecture must start with an input layer. - -This class has no methods or class attributes. All it has is a -constructor that accepts a parameter named ``input_shape`` representing -the shape of the input. - -The instances from the ``Input2D`` class has the following attributes: - -1. ``input_shape``: The shape of the input to the pygad.cnn. - -2. ``layer_output_size`` - -Here is an example of building an input layer with shape -``(50, 50, 3)``. - -.. code:: python - - input_layer = pygad.cnn.Input2D(input_shape=(50, 50, 3)) - -Here is how to access the attributes within the instance of the -``pygad.cnn.Input2D`` class. - -.. code:: python - - input_shape = input_layer.input_shape - layer_output_size = input_layer.layer_output_size - - print("Input2D Input shape =", input_shape) - print("Input2D Output shape =", layer_output_size) - -This is everything about the input layer. - -.. 
_pygadcnnconv2d-class: - -``pygad.cnn.Conv2D`` Class --------------------------- - -Using the ``pygad.cnn.Conv2D`` class, convolution (conv) layers can be -created. To create a convolution layer, just create a new instance of -the class. The constructor accepts the following parameters: - -- ``num_filters``: Number of filters. - -- ``kernel_size``: Filter kernel size. - -- ``previous_layer``: A reference to the previous layer. Using the - ``previous_layer`` attribute, a linked list is created that connects - all network layers. For more information about this attribute, please - check the **previous_layer** attribute section of the ``pygad.nn`` - module documentation. - -- ``activation_function=None``: A string representing the activation - function to be used in this layer. Defaults to ``None`` which means - no activation function is applied while applying the convolution - layer. An activation layer can be added separately in this case. The - supported activation functions in the conv layer are ``relu`` and - ``sigmoid``. - -Within the constructor, the accepted parameters are used as instance -attributes. Besides the parameters, some new instance attributes are -created which are: - -- ``filter_bank_size``: Size of the filter bank in this layer. - -- ``initial_weights``: The initial weights for the conv layer. - -- ``trained_weights``: The trained weights of the conv layer. This - attribute is initialized by the value in the ``initial_weights`` - attribute. - -- ``layer_input_size`` - -- ``layer_output_size`` - -- ``layer_output`` - -Here is an example for creating a conv layer with 2 filters and a kernel -size of 3. Note that the ``previous_layer`` parameter is assigned to the -input layer ``input_layer``. - -.. code:: python - - conv_layer = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=input_layer, - activation_function=None) - -Here is how to access some attributes in the dense layer: - -.. code:: python - - filter_bank_size = conv_layer.filter_bank_size - conv_initail_weights = conv_layer.initial_weights - - print("Filter bank size attributes =", filter_bank_size) - print("Initial weights of the conv layer :", conv_initail_weights) - -Because ``conv_layer`` holds a reference to the input layer, then the -number of input neurons can be accessed. - -.. code:: python - - input_layer = conv_layer.previous_layer - input_shape = input_layer.num_neurons - - print("Input shape =", input_shape) - -Here is another conv layer where its ``previous_layer`` attribute points -to the previously created conv layer and it uses the ``ReLU`` activation -function. - -.. code:: python - - conv_layer2 = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=conv_layer, - activation_function="relu") - -Because ``conv_layer2`` holds a reference to ``conv_layer`` in its -``previous_layer`` attribute, then the attributes in ``conv_layer`` can -be accessed. - -.. code:: python - - conv_layer = conv_layer2.previous_layer - filter_bank_size = conv_layer.filter_bank_size - - print("Filter bank size attributes =", filter_bank_size) - -After getting the reference to ``conv_layer``, we can use it to access -the number of input neurons. - -.. code:: python - - conv_layer = conv_layer2.previous_layer - input_layer = conv_layer.previous_layer - input_shape = input_layer.num_neurons - - print("Input shape =", input_shape) - -.. 
_pygadcnnmaxpooling2d-class: - -``pygad.cnn.MaxPooling2D`` Class --------------------------------- - -The ``pygad.cnn.MaxPooling2D`` class builds a max pooling layer for the -CNN architecture. The constructor of this class accepts the following -parameter: - -- ``pool_size``: Size of the window. - -- ``previous_layer``: A reference to the previous layer in the CNN - architecture. - -- ``stride=2``: A stride that default to 2. - -Within the constructor, the accepted parameters are used as instance -attributes. Besides the parameters, some new instance attributes are -created which are: - -- ``layer_input_size`` - -- ``layer_output_size`` - -- ``layer_output`` - -.. _pygadcnnaveragepooling2d-class: - -``pygad.cnn.AveragePooling2D`` Class ------------------------------------- - -The ``pygad.cnn.AveragePooling2D`` class is similar to the -``pygad.cnn.MaxPooling2D`` class except that it applies the max pooling -operation rather than average pooling. - -.. _pygadcnnflatten-class: - -``pygad.cnn.Flatten`` Class ---------------------------- - -The ``pygad.cnn.Flatten`` class implements the flatten layer which -converts the output of the previous layer into a 1D vector. The -constructor accepts only the ``previous_layer`` parameter. - -The following instance attributes exist: - -- ``previous_layer`` - -- ``layer_input_size`` - -- ``layer_output_size`` - -- ``layer_output`` - -.. _pygadcnnrelu-class: - -``pygad.cnn.ReLU`` Class ------------------------- - -The ``pygad.cnn.ReLU`` class implements the ReLU layer which applies the -ReLU activation function to the output of the previous layer. - -The constructor accepts only the ``previous_layer`` parameter. - -The following instance attributes exist: - -- ``previous_layer`` - -- ``layer_input_size`` - -- ``layer_output_size`` - -- ``layer_output`` - -.. _pygadcnnsigmoid-class: - -``pygad.cnn.Sigmoid`` Class ---------------------------- - -The ``pygad.cnn.Sigmoid`` class is similar to the ``pygad.cnn.ReLU`` -class except that it applies the sigmoid function rather than the ReLU -function. - -.. _pygadcnndense-class: - -``pygad.cnn.Dense`` Class -------------------------- - -The ``pygad.cnn.Dense`` class implement the dense layer. Its constructor -accepts the following parameters: - -- ``num_neurons``: Number of neurons in the dense layer. - -- ``previous_layer``: A reference to the previous layer. - -- ``activation_function``: A string representing the activation - function to be used in this layer. Defaults to ``"sigmoid"``. - Currently, the supported activation functions in the dense layer are - ``"sigmoid"``, ``"relu"``, and ``softmax``. - -Within the constructor, the accepted parameters are used as instance -attributes. Besides the parameters, some new instance attributes are -created which are: - -- ``initial_weights``: The initial weights for the dense layer. - -- ``trained_weights``: The trained weights of the dense layer. This - attribute is initialized by the value in the ``initial_weights`` - attribute. - -- ``layer_input_size`` - -- ``layer_output_size`` - -- ``layer_output`` - -.. _pygadcnnmodel-class: - -``pygad.cnn.Model`` Class -========================= - -An instance of the ``pygad.cnn.Model`` class represents a CNN model. The -constructor of this class accepts the following parameters: - -- ``last_layer``: A reference to the last layer in the CNN architecture - (i.e. dense layer). - -- ``epochs=10``: Number of epochs. - -- ``learning_rate=0.01``: Learning rate. 
- -Within the constructor, the accepted parameters are used as instance -attributes. Besides the parameters, a new instance attribute named -``network_layers`` is created which holds a list with references to the -CNN layers. Such a list is returned using the ``get_layers()`` method in -the ``pygad.cnn.Model`` class. - -There are a number of methods in the ``pygad.cnn.Model`` class which -serves in training, testing, and retrieving information about the model. -These methods are discussed in the next subsections. - -.. _getlayers: - -``get_layers()`` ----------------- - -Creates a list of all layers in the CNN model. It accepts no parameters. - -``train()`` ------------ - -Trains the CNN model. - -Accepts the following parameters: - -- ``train_inputs``: Training data inputs. - -- ``train_outputs``: Training data outputs. - -This method trains the CNN model according to the number of epochs -specified in the constructor of the ``pygad.cnn.Model`` class. - -It is important to note that no learning algorithm is used for training -the pygad.cnn. Just the learning rate is used for making some changes -which is better than leaving the weights unchanged. - -.. _feedsample: - -``feed_sample()`` ------------------ - -Feeds a sample in the CNN layers and returns results of the last layer -in the pygad.cnn. - -.. _updateweights: - -``update_weights()`` --------------------- - -Updates the CNN weights using the learning rate. It is important to note -that no learning algorithm is used for training the pygad.cnn. Just the -learning rate is used for making some changes which is better than -leaving the weights unchanged. - -``predict()`` -------------- - -Uses the trained CNN for making predictions. - -Accepts the following parameter: - -- ``data_inputs``: The inputs to predict their label. - -It returns a list holding the samples predictions. - -``summary()`` -------------- - -Prints a summary of the CNN architecture. - -Supported Activation Functions -============================== - -The supported activation functions in the convolution layer are: - -1. Sigmoid: Implemented using the ``pygad.cnn.sigmoid()`` function. - -2. Rectified Linear Unit (ReLU): Implemented using the - ``pygad.cnn.relu()`` function. - -The dense layer supports these functions besides the ``softmax`` -function implemented in the ``pygad.cnn.softmax()`` function. - -Steps to Build a Neural Network -=============================== - -This section discusses how to use the ``pygad.cnn`` module for building -a neural network. The summary of the steps are as follows: - -- Reading the Data - -- Building the CNN Architecture - -- Building Model - -- Model Summary - -- Training the CNN - -- Making Predictions - -- Calculating Some Statistics - -Reading the Data ----------------- - -Before building the network architecture, the first thing to do is to -prepare the data that will be used for training the network. - -In this example, 4 classes of the **Fruits360** dataset are used for -preparing the training data. The 4 classes are: - -1. `Apple - Braeburn `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/apple - -2. `Lemon - Meyer `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/lemon - -3. `Mango `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/mango - -4. 
`Raspberry `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry - -Just 20 samples from each of the 4 classes are saved into a NumPy array -available in the -`dataset_inputs.npy `__ -file: -https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy - -The shape of this array is ``(80, 100, 100, 3)`` where the shape of the -single image is ``(100, 100, 3)``. - -The -`dataset_outputs.npy `__ -file -(https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy) -has the class labels for the 4 classes: - -1. `Apple - Braeburn `__: - Class label is **0** - -2. `Lemon - Meyer `__: - Class label is **1** - -3. `Mango `__: - Class label is **2** - -4. `Raspberry `__: - Class label is **3** - -Simply, download and reach the 2 files to return the NumPy arrays -according to the next 2 lines: - -.. code:: python - - train_inputs = numpy.load("dataset_inputs.npy") - train_outputs = numpy.load("dataset_outputs.npy") - -After the data is prepared, next is to create the network architecture. - -Building the Network Architecture ---------------------------------- - -The input layer is created by instantiating the ``pygad.cnn.Input2D`` -class according to the next code. A network can only have a single input -layer. - -.. code:: python - - import pygad.cnn - sample_shape = train_inputs.shape[1:] - - input_layer = pygad.cnn.Input2D(input_shape=sample_shape) - -After the input layer is created, next is to create a number of layers -layers according to the next code. Normally, the last dense layer is -regarded as the output layer. Note that the output layer has a number of -neurons equal to the number of classes in the dataset which is 4. - -.. code:: python - - conv_layer1 = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=input_layer, - activation_function=None) - relu_layer1 = pygad.cnn.Sigmoid(previous_layer=conv_layer1) - average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2, - previous_layer=relu_layer1, - stride=2) - - conv_layer2 = pygad.cnn.Conv2D(num_filters=3, - kernel_size=3, - previous_layer=average_pooling_layer, - activation_function=None) - relu_layer2 = pygad.cnn.ReLU(previous_layer=conv_layer2) - max_pooling_layer = pygad.cnn.MaxPooling2D(pool_size=2, - previous_layer=relu_layer2, - stride=2) - - conv_layer3 = pygad.cnn.Conv2D(num_filters=1, - kernel_size=3, - previous_layer=max_pooling_layer, - activation_function=None) - relu_layer3 = pygad.cnn.ReLU(previous_layer=conv_layer3) - pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2, - previous_layer=relu_layer3, - stride=2) - - flatten_layer = pygad.cnn.Flatten(previous_layer=pooling_layer) - dense_layer1 = pygad.cnn.Dense(num_neurons=100, - previous_layer=flatten_layer, - activation_function="relu") - dense_layer2 = pygad.cnn.Dense(num_neurons=4, - previous_layer=dense_layer1, - activation_function="softmax") - -After the network architecture is prepared, the next step is to create a -CNN model. - -Building Model --------------- - -The CNN model is created as an instance of the ``pygad.cnn.Model`` -class. Here is an example. - -.. code:: python - - model = pygad.cnn.Model(last_layer=dense_layer2, - epochs=5, - learning_rate=0.01) - -After the model is created, a summary of the model architecture can be -printed. - -Model Summary -------------- - -The ``summary()`` method in the ``pygad.cnn.Model`` class prints a -summary of the CNN model. - -.. code:: python - - model.summary() - -.. 
code:: python - - ----------Network Architecture---------- - - - - - - - - - - - - - ---------------------------------------- - -Training the Network --------------------- - -After the model and the data are prepared, then the model can be trained -using the the ``pygad.cnn.train()`` method. - -.. code:: python - - model.train(train_inputs=train_inputs, - train_outputs=train_outputs) - -After training the network, the next step is to make predictions. - -Making Predictions ------------------- - -The ``pygad.cnn.predict()`` method uses the trained network for making -predictions. Here is an example. - -.. code:: python - - predictions = model.predict(data_inputs=train_inputs) - -It is not expected to have high accuracy in the predictions because no -training algorithm is used. - -Calculating Some Statistics ---------------------------- - -Based on the predictions the network made, some statistics can be -calculated such as the number of correct and wrong predictions in -addition to the classification accuracy. - -.. code:: python - - num_wrong = numpy.where(predictions != train_outputs)[0] - num_correct = train_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/train_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -It is very important to note that it is not expected that the -classification accuracy is high because no training algorithm is used. -Please check the documentation of the ``pygad.gacnn`` module for -training the CNN using the genetic algorithm. - -Examples -======== - -This section gives the complete code of some examples that build neural -networks using ``pygad.cnn``. Each subsection builds a different -network. - -Image Classification --------------------- - -This example is discussed in the **Steps to Build a Convolutional Neural -Network** section and its complete code is listed below. - -Remember to either download or create the -`dataset_features.npy `__ -and -`dataset_outputs.npy `__ -files before running this code. - -.. 
code:: python - - import numpy - import pygad.cnn - - """ - Convolutional neural network implementation using NumPy - A tutorial that helps to get started (Building Convolutional Neural Network using NumPy from Scratch) available in these links: - https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad - https://towardsdatascience.com/building-convolutional-neural-network-using-numpy-from-scratch-b30aac50e50a - https://www.kdnuggets.com/2018/04/building-convolutional-neural-network-numpy-scratch.html - It is also translated into Chinese: http://m.aliyun.com/yunqi/articles/585741 - """ - - train_inputs = numpy.load("dataset_inputs.npy") - train_outputs = numpy.load("dataset_outputs.npy") - - sample_shape = train_inputs.shape[1:] - num_classes = 4 - - input_layer = pygad.cnn.Input2D(input_shape=sample_shape) - conv_layer1 = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=input_layer, - activation_function=None) - relu_layer1 = pygad.cnn.Sigmoid(previous_layer=conv_layer1) - average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2, - previous_layer=relu_layer1, - stride=2) - - conv_layer2 = pygad.cnn.Conv2D(num_filters=3, - kernel_size=3, - previous_layer=average_pooling_layer, - activation_function=None) - relu_layer2 = pygad.cnn.ReLU(previous_layer=conv_layer2) - max_pooling_layer = pygad.cnn.MaxPooling2D(pool_size=2, - previous_layer=relu_layer2, - stride=2) - - conv_layer3 = pygad.cnn.Conv2D(num_filters=1, - kernel_size=3, - previous_layer=max_pooling_layer, - activation_function=None) - relu_layer3 = pygad.cnn.ReLU(previous_layer=conv_layer3) - pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2, - previous_layer=relu_layer3, - stride=2) - - flatten_layer = pygad.cnn.Flatten(previous_layer=pooling_layer) - dense_layer1 = pygad.cnn.Dense(num_neurons=100, - previous_layer=flatten_layer, - activation_function="relu") - dense_layer2 = pygad.cnn.Dense(num_neurons=num_classes, - previous_layer=dense_layer1, - activation_function="softmax") - - model = pygad.cnn.Model(last_layer=dense_layer2, - epochs=1, - learning_rate=0.01) - - model.summary() - - model.train(train_inputs=train_inputs, - train_outputs=train_outputs) - - predictions = model.predict(data_inputs=train_inputs) - print(predictions) - - num_wrong = numpy.where(predictions != train_outputs)[0] - num_correct = train_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/train_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) +.. _pygadcnn-module: + +``pygad.cnn`` Module +==================== + +This section of the PyGAD's library documentation discusses the +**pygad.cnn** module. + +Using the **pygad.cnn** module, convolutional neural networks (CNNs) are +created. The purpose of this module is to only implement the **forward +pass** of a convolutional neural network without using a training +algorithm. The **pygad.cnn** module builds the network layers, +implements the activations functions, trains the network, makes +predictions, and more. + +Later, the **pygad.gacnn** module is used to train the **pygad.cnn** +network using the genetic algorithm built in the **pygad** module. + +Supported Layers +================ + +Each layer supported by the **pygad.cnn** module has a corresponding +class. The layers and their classes are: + +1. 
**Input**: Implemented using the ``pygad.cnn.Input2D`` class.
+
+2. **Convolution**: Implemented using the ``pygad.cnn.Conv2D`` class.
+
+3. **Max Pooling**: Implemented using the ``pygad.cnn.MaxPooling2D``
+   class.
+
+4. **Average Pooling**: Implemented using the
+   ``pygad.cnn.AveragePooling2D`` class.
+
+5. **Flatten**: Implemented using the ``pygad.cnn.Flatten`` class.
+
+6. **ReLU**: Implemented using the ``pygad.cnn.ReLU`` class.
+
+7. **Sigmoid**: Implemented using the ``pygad.cnn.Sigmoid`` class.
+
+8. **Dense** (Fully Connected): Implemented using the
+   ``pygad.cnn.Dense`` class.
+
+In the future, more layers will be added.
+
+Except for the input layer, all of the listed layers have 4 instance
+attributes that serve the same purpose:
+
+1. ``previous_layer``: A reference to the previous layer in the CNN
+   architecture.
+
+2. ``layer_input_size``: The size of the input to the layer.
+
+3. ``layer_output_size``: The size of the output from the layer.
+
+4. ``layer_output``: The latest output generated from the layer. It
+   defaults to ``None``.
+
+In addition to these attributes, the layers may have some additional
+attributes. The next subsections discuss each layer.
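+
+Before the individual layers are discussed, it is worth seeing what the
+``previous_layer`` linkage gives us: starting from the last layer, the
+whole architecture can be traversed back to the input layer. The
+following is a minimal sketch of such a traversal, for illustration
+only; the ``get_layers()`` method discussed later provides this
+functionality in the module itself.
+
+.. code:: python
+
+   def collect_layers(last_layer):
+       # Walk the previous_layer references from the last layer back to
+       # the input layer, then reverse so the input layer comes first.
+       layers = []
+       layer = last_layer
+       while layer is not None:
+           layers.append(layer)
+           # The input layer is the only layer without a previous_layer.
+           layer = getattr(layer, "previous_layer", None)
+       return layers[::-1]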
+
+.. _pygadcnninput2d-class:
+
+``pygad.cnn.Input2D`` Class
+---------------------------
+
+The ``pygad.cnn.Input2D`` class creates the input layer for the
+convolutional neural network. Each network has exactly one input
+layer, and the network architecture must start with it.
+
+This class has no methods or class attributes. All it has is a
+constructor that accepts a parameter named ``input_shape`` representing
+the shape of the input.
+
+Instances of the ``Input2D`` class have the following attributes:
+
+1. ``input_shape``: The shape of the input to the network.
+
+2. ``layer_output_size``
+
+Here is an example of building an input layer with shape
+``(50, 50, 3)``.
+
+.. code:: python
+
+   input_layer = pygad.cnn.Input2D(input_shape=(50, 50, 3))
+
+Here is how to access the attributes of the ``pygad.cnn.Input2D``
+instance.
+
+.. code:: python
+
+   input_shape = input_layer.input_shape
+   layer_output_size = input_layer.layer_output_size
+
+   print("Input2D Input shape =", input_shape)
+   print("Input2D Output shape =", layer_output_size)
+
+This is everything about the input layer.
+
+.. _pygadcnnconv2d-class:
+
+``pygad.cnn.Conv2D`` Class
+--------------------------
+
+Using the ``pygad.cnn.Conv2D`` class, convolution (conv) layers can be
+created. To create a convolution layer, just create a new instance of
+the class. The constructor accepts the following parameters:
+
+- ``num_filters``: Number of filters.
+
+- ``kernel_size``: Filter kernel size.
+
+- ``previous_layer``: A reference to the previous layer. Using the
+  ``previous_layer`` attribute, a linked list is created that connects
+  all network layers. For more information about this attribute, please
+  check the **previous_layer** attribute section of the ``pygad.nn``
+  module documentation.
+
+- ``activation_function=None``: A string representing the activation
+  function to be used in this layer. It defaults to ``None``, which
+  means no activation function is applied within the convolution
+  layer. An activation layer can be added separately in this case. The
+  supported activation functions in the conv layer are ``relu`` and
+  ``sigmoid``.
+
+Within the constructor, the accepted parameters are used as instance
+attributes. Besides the parameters, some new instance attributes are
+created, which are:
+
+- ``filter_bank_size``: Size of the filter bank in this layer.
+
+- ``initial_weights``: The initial weights for the conv layer.
+
+- ``trained_weights``: The trained weights of the conv layer. This
+  attribute is initialized with the value of the ``initial_weights``
+  attribute.
+
+- ``layer_input_size``
+
+- ``layer_output_size``
+
+- ``layer_output``
+
+Here is an example of creating a conv layer with 2 filters and a kernel
+size of 3. Note that the ``previous_layer`` parameter is assigned the
+input layer ``input_layer``.
+
+.. code:: python
+
+   conv_layer = pygad.cnn.Conv2D(num_filters=2,
+                                 kernel_size=3,
+                                 previous_layer=input_layer,
+                                 activation_function=None)
+
+Here is how to access some attributes of the conv layer:
+
+.. code:: python
+
+   filter_bank_size = conv_layer.filter_bank_size
+   conv_initial_weights = conv_layer.initial_weights
+
+   print("Filter bank size =", filter_bank_size)
+   print("Initial weights of the conv layer :", conv_initial_weights)
+
+Because ``conv_layer`` holds a reference to the input layer, the shape
+of the input can be accessed.
+
+.. code:: python
+
+   input_layer = conv_layer.previous_layer
+   input_shape = input_layer.input_shape
+
+   print("Input shape =", input_shape)
+
+Here is another conv layer whose ``previous_layer`` attribute points to
+the previously created conv layer and which uses the ``relu``
+activation function.
+
+.. code:: python
+
+   conv_layer2 = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=conv_layer,
+                                  activation_function="relu")
+
+Because ``conv_layer2`` holds a reference to ``conv_layer`` in its
+``previous_layer`` attribute, the attributes of ``conv_layer`` can be
+accessed.
+
+.. code:: python
+
+   conv_layer = conv_layer2.previous_layer
+   filter_bank_size = conv_layer.filter_bank_size
+
+   print("Filter bank size =", filter_bank_size)
+
+After getting the reference to ``conv_layer``, we can use it to reach
+the input layer and its shape.
+
+.. code:: python
+
+   conv_layer = conv_layer2.previous_layer
+   input_layer = conv_layer.previous_layer
+   input_shape = input_layer.input_shape
+
+   print("Input shape =", input_shape)
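+
+As noted above, when ``activation_function=None`` the activation can be
+applied by a separate layer instead. Here is a short sketch of that
+pattern using the ``pygad.cnn.ReLU`` class, which is described below;
+the layer names ``conv_layer3`` and ``relu_layer`` are illustrative
+only.
+
+.. code:: python
+
+   # A conv layer with no built-in activation, followed by a separate
+   # ReLU layer that activates its output.
+   conv_layer3 = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=conv_layer2,
+                                  activation_function=None)
+   relu_layer = pygad.cnn.ReLU(previous_layer=conv_layer3)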
+
+.. _pygadcnnmaxpooling2d-class:
+
+``pygad.cnn.MaxPooling2D`` Class
+--------------------------------
+
+The ``pygad.cnn.MaxPooling2D`` class builds a max pooling layer for the
+CNN architecture. The constructor of this class accepts the following
+parameters:
+
+- ``pool_size``: Size of the window.
+
+- ``previous_layer``: A reference to the previous layer in the CNN
+  architecture.
+
+- ``stride=2``: A stride that defaults to 2.
+
+Within the constructor, the accepted parameters are used as instance
+attributes. Besides the parameters, some new instance attributes are
+created, which are:
+
+- ``layer_input_size``
+
+- ``layer_output_size``
+
+- ``layer_output``
+
+.. _pygadcnnaveragepooling2d-class:
+
+``pygad.cnn.AveragePooling2D`` Class
+------------------------------------
+
+The ``pygad.cnn.AveragePooling2D`` class is similar to the
+``pygad.cnn.MaxPooling2D`` class except that it applies the average
+pooling operation rather than max pooling.
+
+.. _pygadcnnflatten-class:
+
+``pygad.cnn.Flatten`` Class
+---------------------------
+
+The ``pygad.cnn.Flatten`` class implements the flatten layer, which
+converts the output of the previous layer into a 1D vector. The
+constructor accepts only the ``previous_layer`` parameter.
+
+The following instance attributes exist:
+
+- ``previous_layer``
+
+- ``layer_input_size``
+
+- ``layer_output_size``
+
+- ``layer_output``
+
+.. _pygadcnnrelu-class:
+
+``pygad.cnn.ReLU`` Class
+------------------------
+
+The ``pygad.cnn.ReLU`` class implements the ReLU layer, which applies
+the ReLU activation function to the output of the previous layer.
+
+The constructor accepts only the ``previous_layer`` parameter.
+
+The following instance attributes exist:
+
+- ``previous_layer``
+
+- ``layer_input_size``
+
+- ``layer_output_size``
+
+- ``layer_output``
+
+.. _pygadcnnsigmoid-class:
+
+``pygad.cnn.Sigmoid`` Class
+---------------------------
+
+The ``pygad.cnn.Sigmoid`` class is similar to the ``pygad.cnn.ReLU``
+class except that it applies the sigmoid function rather than the ReLU
+function.
+
+.. _pygadcnndense-class:
+
+``pygad.cnn.Dense`` Class
+-------------------------
+
+The ``pygad.cnn.Dense`` class implements the dense layer. Its
+constructor accepts the following parameters:
+
+- ``num_neurons``: Number of neurons in the dense layer.
+
+- ``previous_layer``: A reference to the previous layer.
+
+- ``activation_function``: A string representing the activation
+  function to be used in this layer. Defaults to ``"sigmoid"``.
+  Currently, the supported activation functions in the dense layer are
+  ``"sigmoid"``, ``"relu"``, and ``"softmax"``.
+
+Within the constructor, the accepted parameters are used as instance
+attributes. Besides the parameters, some new instance attributes are
+created, which are:
+
+- ``initial_weights``: The initial weights for the dense layer.
+
+- ``trained_weights``: The trained weights of the dense layer. This
+  attribute is initialized with the value of the ``initial_weights``
+  attribute.
+
+- ``layer_input_size``
+
+- ``layer_output_size``
+
+- ``layer_output``
+
+.. _pygadcnnmodel-class:
+
+``pygad.cnn.Model`` Class
+=========================
+
+An instance of the ``pygad.cnn.Model`` class represents a CNN model.
+The constructor of this class accepts the following parameters:
+
+- ``last_layer``: A reference to the last layer in the CNN architecture
+  (i.e. the dense layer).
+
+- ``epochs=10``: Number of epochs.
+
+- ``learning_rate=0.01``: Learning rate.
+
+Within the constructor, the accepted parameters are used as instance
+attributes. Besides the parameters, a new instance attribute named
+``network_layers`` is created, which holds a list with references to
+the CNN layers. Such a list is returned using the ``get_layers()``
+method in the ``pygad.cnn.Model`` class.
+
+There are a number of methods in the ``pygad.cnn.Model`` class that
+serve in training, testing, and retrieving information about the
+model. These methods are discussed in the next subsections.
+
+.. _getlayers:
+
+``get_layers()``
+----------------
+
+Creates a list of all layers in the CNN model. It accepts no
+parameters.
+
+``train()``
+-----------
+
+Trains the CNN model.
+
+Accepts the following parameters:
+
+- ``train_inputs``: Training data inputs.
+
+- ``train_outputs``: Training data outputs.
+
+This method trains the CNN model according to the number of epochs
+specified in the constructor of the ``pygad.cnn.Model`` class.
+
+It is important to note that no learning algorithm is used for
+training the network. Just the learning rate is used for making some
+changes, which is better than leaving the weights unchanged.
+
+.. _feedsample:
+
+``feed_sample()``
+-----------------
+
+Feeds a sample into the CNN layers and returns the output of the last
+layer in the network.
+
+.. _updateweights:
+
+``update_weights()``
+--------------------
+
+Updates the CNN weights using the learning rate. It is important to
+note that no learning algorithm is used for training the network. Just
+the learning rate is used for making some changes, which is better
+than leaving the weights unchanged.
+
+``predict()``
+-------------
+
+Uses the trained CNN for making predictions.
+
+Accepts the following parameter:
+
+- ``data_inputs``: The inputs whose labels are to be predicted.
+
+It returns a list holding the predictions of the samples.
+
+``summary()``
+-------------
+
+Prints a summary of the CNN architecture.
+
+Supported Activation Functions
+==============================
+
+The supported activation functions in the convolution layer are:
+
+1. Sigmoid: Implemented using the ``pygad.cnn.sigmoid()`` function.
+
+2. Rectified Linear Unit (ReLU): Implemented using the
+   ``pygad.cnn.relu()`` function.
+
+The dense layer supports these functions besides the ``softmax``
+function implemented in the ``pygad.cnn.softmax()`` function.
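+
+For intuition, the three activations can be written in a few lines of
+NumPy. This is only a rough sketch; the module's own implementations
+may differ in details such as numerical stabilization.
+
+.. code:: python
+
+   import numpy
+
+   def sigmoid(x):
+       # Squashes each value into the range (0, 1).
+       return 1.0 / (1.0 + numpy.exp(-x))
+
+   def relu(x):
+       # Keeps positive values and zeroes out negative ones.
+       return numpy.maximum(x, 0)
+
+   def softmax(x):
+       # Normalizes a vector into a probability distribution.
+       e = numpy.exp(x - numpy.max(x))
+       return e / numpy.sum(e)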
+
+Steps to Build a Neural Network
+===============================
+
+This section discusses how to use the ``pygad.cnn`` module for
+building a neural network. The summary of the steps is as follows:
+
+- Reading the Data
+
+- Building the CNN Architecture
+
+- Building the Model
+
+- Model Summary
+
+- Training the CNN
+
+- Making Predictions
+
+- Calculating Some Statistics
+
+Reading the Data
+----------------
+
+Before building the network architecture, the first thing to do is to
+prepare the data that will be used for training the network.
+
+In this example, 4 classes of the **Fruits360** dataset are used for
+preparing the training data. The 4 classes are:
+
+1. `Apple Braeburn
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/apple>`__: This
+   class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/apple
+
+2. `Lemon Meyer
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/lemon>`__: This
+   class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/lemon
+
+3. `Mango <https://github.com/ahmedfgad/NumPyANN/tree/master/mango>`__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/mango
+
+4. `Raspberry
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry>`__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry
+
+Just 20 samples from each of the 4 classes are saved into a NumPy
+array available in the `dataset_inputs.npy
+<https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy>`__
+file:
+https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy
+
+The shape of this array is ``(80, 100, 100, 3)`` where the shape of a
+single image is ``(100, 100, 3)``.
+
+The `dataset_outputs.npy
+<https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy>`__
+file
+(https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy)
+has the class labels for the 4 classes:
+
+1. `Apple Braeburn
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/apple>`__: Class
+   label is **0**
+
+2. `Lemon Meyer
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/lemon>`__: Class
+   label is **1**
+
+3. `Mango <https://github.com/ahmedfgad/NumPyANN/tree/master/mango>`__:
+   Class label is **2**
+
+4. `Raspberry
+   <https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry>`__:
+   Class label is **3**
+
+Simply download the 2 files and load the NumPy arrays according to the
+next 2 lines:
+
+.. code:: python
+
+   train_inputs = numpy.load("dataset_inputs.npy")
+   train_outputs = numpy.load("dataset_outputs.npy")
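+
+As a quick sanity check (an optional snippet, not part of the original
+example), the loaded arrays can be inspected to confirm the shapes and
+class labels described above:
+
+.. code:: python
+
+   # 80 images of shape (100, 100, 3) and 4 integer labels starting at 0.
+   print("Inputs shape  :", train_inputs.shape)           # (80, 100, 100, 3)
+   print("Unique labels :", numpy.unique(train_outputs))  # [0 1 2 3]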
+
+After the data is prepared, next is to create the network
+architecture.
+
+Building the Network Architecture
+---------------------------------
+
+The input layer is created by instantiating the ``pygad.cnn.Input2D``
+class according to the next code. A network can only have a single
+input layer.
+
+.. code:: python
+
+   import pygad.cnn
+   sample_shape = train_inputs.shape[1:]
+
+   input_layer = pygad.cnn.Input2D(input_shape=sample_shape)
+
+After the input layer is created, next is to create a number of layers
+according to the next code. Normally, the last dense layer is regarded
+as the output layer. Note that the output layer has a number of
+neurons equal to the number of classes in the dataset, which is 4.
+
+.. code:: python
+
+   conv_layer1 = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=input_layer,
+                                  activation_function=None)
+   sigmoid_layer1 = pygad.cnn.Sigmoid(previous_layer=conv_layer1)
+   average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2,
+                                                      previous_layer=sigmoid_layer1,
+                                                      stride=2)
+
+   conv_layer2 = pygad.cnn.Conv2D(num_filters=3,
+                                  kernel_size=3,
+                                  previous_layer=average_pooling_layer,
+                                  activation_function=None)
+   relu_layer2 = pygad.cnn.ReLU(previous_layer=conv_layer2)
+   max_pooling_layer = pygad.cnn.MaxPooling2D(pool_size=2,
+                                              previous_layer=relu_layer2,
+                                              stride=2)
+
+   conv_layer3 = pygad.cnn.Conv2D(num_filters=1,
+                                  kernel_size=3,
+                                  previous_layer=max_pooling_layer,
+                                  activation_function=None)
+   relu_layer3 = pygad.cnn.ReLU(previous_layer=conv_layer3)
+   pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2,
+                                              previous_layer=relu_layer3,
+                                              stride=2)
+
+   flatten_layer = pygad.cnn.Flatten(previous_layer=pooling_layer)
+   dense_layer1 = pygad.cnn.Dense(num_neurons=100,
+                                  previous_layer=flatten_layer,
+                                  activation_function="relu")
+   dense_layer2 = pygad.cnn.Dense(num_neurons=4,
+                                  previous_layer=dense_layer1,
+                                  activation_function="softmax")
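+
+The feature-map shapes produced by the layers above can be derived by
+hand. The sketch below assumes the convolution layers apply a valid
+convolution with a stride of 1 and that pooling follows the
+window/stride rule in ``pool_out()``; the values reported by
+``summary()`` and the ``layer_output_size`` attributes are the
+authoritative ones.
+
+.. code:: python
+
+   def conv_out(size, kernel):
+       # A valid convolution with stride 1 shrinks each spatial
+       # dimension by (kernel - 1).
+       return size - kernel + 1
+
+   def pool_out(size, pool, stride):
+       # A pooling window of size `pool` moved with the given stride.
+       return (size - pool) // stride + 1
+
+   size = 100                     # input is (100, 100, 3)
+   size = conv_out(size, 3)       # conv_layer1 -> 98
+   size = pool_out(size, 2, 2)    # average pooling -> 49
+   size = conv_out(size, 3)       # conv_layer2 -> 47
+   size = pool_out(size, 2, 2)    # max pooling -> 23
+   size = conv_out(size, 3)       # conv_layer3 -> 21
+   size = pool_out(size, 2, 2)    # average pooling -> 10
+   print("Flatten vector length:", size * size * 1)  # 100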
+
+After the network architecture is prepared, the next step is to create
+a CNN model.
+
+Building the Model
+------------------
+
+The CNN model is created as an instance of the ``pygad.cnn.Model``
+class. Here is an example.
+
+.. code:: python
+
+   model = pygad.cnn.Model(last_layer=dense_layer2,
+                           epochs=5,
+                           learning_rate=0.01)
+
+After the model is created, a summary of the model architecture can be
+printed.
+
+Model Summary
+-------------
+
+The ``summary()`` method in the ``pygad.cnn.Model`` class prints a
+summary of the CNN model: the class of each layer, in order, similar
+to the following.
+
+.. code:: python
+
+   model.summary()
+
+.. code:: python
+
+   ----------Network Architecture----------
+   <class 'pygad.cnn.Conv2D'>
+   <class 'pygad.cnn.Sigmoid'>
+   <class 'pygad.cnn.AveragePooling2D'>
+   <class 'pygad.cnn.Conv2D'>
+   <class 'pygad.cnn.ReLU'>
+   <class 'pygad.cnn.MaxPooling2D'>
+   <class 'pygad.cnn.Conv2D'>
+   <class 'pygad.cnn.ReLU'>
+   <class 'pygad.cnn.AveragePooling2D'>
+   <class 'pygad.cnn.Flatten'>
+   <class 'pygad.cnn.Dense'>
+   <class 'pygad.cnn.Dense'>
+   ----------------------------------------
+
+Training the Network
+--------------------
+
+After the model and the data are prepared, the model can be trained
+using the ``train()`` method.
+
+.. code:: python
+
+   model.train(train_inputs=train_inputs,
+               train_outputs=train_outputs)
+
+After training the network, the next step is to make predictions.
+
+Making Predictions
+------------------
+
+The ``predict()`` method uses the trained network for making
+predictions. Here is an example.
+
+.. code:: python
+
+   predictions = model.predict(data_inputs=train_inputs)
+
+The predictions are not expected to be highly accurate because no
+training algorithm is used.
+
+Calculating Some Statistics
+---------------------------
+
+Based on the predictions the network made, some statistics can be
+calculated, such as the number of correct and wrong predictions in
+addition to the classification accuracy.
+
+.. code:: python
+
+   num_wrong = numpy.where(predictions != train_outputs)[0]
+   num_correct = train_outputs.size - num_wrong.size
+   accuracy = 100 * (num_correct/train_outputs.size)
+   print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+   print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+   print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+It is very important to note that the classification accuracy is not
+expected to be high because no training algorithm is used. Please
+check the documentation of the ``pygad.gacnn`` module for training the
+CNN using the genetic algorithm.
+
+Examples
+========
+
+This section gives the complete code of some examples that build
+neural networks using ``pygad.cnn``. Each subsection builds a
+different network.
+
+Image Classification
+--------------------
+
+This example is discussed in the **Steps to Build a Neural Network**
+section and its complete code is listed below.
+
+Remember to either download or create the `dataset_inputs.npy
+<https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy>`__
+and `dataset_outputs.npy
+<https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy>`__
+files before running this code.
+
+.. code:: python
+
+   import numpy
+   import pygad.cnn
+
+   """
+   Convolutional neural network implementation using NumPy
+   A tutorial that helps to get started (Building Convolutional Neural Network using NumPy from Scratch) is available in these links:
+   https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad
+   https://towardsdatascience.com/building-convolutional-neural-network-using-numpy-from-scratch-b30aac50e50a
+   https://www.kdnuggets.com/2018/04/building-convolutional-neural-network-numpy-scratch.html
+   It is also translated into Chinese: http://m.aliyun.com/yunqi/articles/585741
+   """
+
+   train_inputs = numpy.load("dataset_inputs.npy")
+   train_outputs = numpy.load("dataset_outputs.npy")
+
+   sample_shape = train_inputs.shape[1:]
+   num_classes = 4
+
+   input_layer = pygad.cnn.Input2D(input_shape=sample_shape)
+   conv_layer1 = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=input_layer,
+                                  activation_function=None)
+   sigmoid_layer1 = pygad.cnn.Sigmoid(previous_layer=conv_layer1)
+   average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2,
+                                                      previous_layer=sigmoid_layer1,
+                                                      stride=2)
+
+   conv_layer2 = pygad.cnn.Conv2D(num_filters=3,
+                                  kernel_size=3,
+                                  previous_layer=average_pooling_layer,
+                                  activation_function=None)
+   relu_layer2 = pygad.cnn.ReLU(previous_layer=conv_layer2)
+   max_pooling_layer = pygad.cnn.MaxPooling2D(pool_size=2,
+                                              previous_layer=relu_layer2,
+                                              stride=2)
+
+   conv_layer3 = pygad.cnn.Conv2D(num_filters=1,
+                                  kernel_size=3,
+                                  previous_layer=max_pooling_layer,
+                                  activation_function=None)
+   relu_layer3 = pygad.cnn.ReLU(previous_layer=conv_layer3)
+   pooling_layer = pygad.cnn.AveragePooling2D(pool_size=2,
+                                              previous_layer=relu_layer3,
+                                              stride=2)
+
+   flatten_layer = pygad.cnn.Flatten(previous_layer=pooling_layer)
+   dense_layer1 = pygad.cnn.Dense(num_neurons=100,
+                                  previous_layer=flatten_layer,
+                                  activation_function="relu")
+   dense_layer2 = pygad.cnn.Dense(num_neurons=num_classes,
+                                  previous_layer=dense_layer1,
+                                  activation_function="softmax")
+
+   model = pygad.cnn.Model(last_layer=dense_layer2,
+                           epochs=1,
+                           learning_rate=0.01)
+
+   model.summary()
+
+   model.train(train_inputs=train_inputs,
+               train_outputs=train_outputs)
+
+   predictions = model.predict(data_inputs=train_inputs)
+   print(predictions)
+
+   num_wrong = numpy.where(predictions != train_outputs)[0]
+   
num_correct = train_outputs.size - num_wrong.size + accuracy = 100 * (num_correct/train_outputs.size) + print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) + print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) + print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) diff --git a/docs/source/README_pygad_gacnn_ReadTheDocs.rst b/docs/source/gacnn.rst similarity index 97% rename from docs/source/README_pygad_gacnn_ReadTheDocs.rst rename to docs/source/gacnn.rst index 39bbc6e..c9b6336 100644 --- a/docs/source/README_pygad_gacnn_ReadTheDocs.rst +++ b/docs/source/gacnn.rst @@ -1,662 +1,662 @@ -.. _pygadgacnn-module: - -``pygad.gacnn`` Module -====================== - -This section of the PyGAD's library documentation discusses the -**pygad.gacnn** module. - -The ``pygad.gacnn`` module trains convolutional neural networks using -the genetic algorithm. It makes use of the 2 modules ``pygad`` and -``pygad.cnn``. - -.. _pygadgacnngacnn-class: - -``pygad.gacnn.GACNN`` Class -=========================== - -The ``pygad.gacnn`` module has a class named ``pygad.gacnn.GACNN`` for -training convolutional neural networks (CNNs) using the genetic -algorithm. The constructor, methods, function, and attributes within the -class are discussed in this section. - -.. _init: - -``__init__()`` --------------- - -In order to train a CNN using the genetic algorithm, the first thing to -do is to create an instance of the ``pygad.gacnn.GACNN`` class. - -The ``pygad.gacnn.GACNN`` class constructor accepts the following -parameters: - -- ``model``: model: An instance of the pygad.cnn.Model class - representing the architecture of all solutions in the population. - -- ``num_solutions``: Number of CNNs (i.e. solutions) in the population. - Based on the value passed to this parameter, a number of identical - CNNs are created where their parameters are optimized using the - genetic algorithm. - -Instance Attributes -------------------- - -All the parameters in the ``pygad.gacnn.GACNN`` class constructor are -used as instance attributes. Besides such attributes, there is an extra -attribute added to the instances from the ``pygad.gacnn.GACNN`` class -which is: - -- ``population_networks``: A list holding references to all the - solutions (i.e. CNNs) used in the population. - -Methods in the GACNN Class --------------------------- - -This section discusses the methods available for instances of the -``pygad.gacnn.GACNN`` class. - -.. _createpopulation: - -``create_population()`` -~~~~~~~~~~~~~~~~~~~~~~~ - -The ``create_population()`` method creates the initial population of the -genetic algorithm as a list of CNNs (i.e. solutions). All the networks -are copied from the CNN model passed to constructor of the GACNN class. - -The list of networks is assigned to the ``population_networks`` -attribute of the instance. - -.. _updatepopulationtrainedweights: - -``update_population_trained_weights()`` -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The ``update_population_trained_weights()`` method updates the -``trained_weights`` attribute of the layers of each network (check the -documentation of the ``pygad.cnn`` module) for more information) -according to the weights passed in the ``population_trained_weights`` -parameter. - -Accepts the following parameters: - -- ``population_trained_weights``: A list holding the trained weights of - all networks as matrices. 
Such matrices are to be assigned to the - ``trained_weights`` attribute of all layers of all networks. - -.. _functions-in-the-pygadgacnn-module: - -Functions in the ``pygad.gacnn`` Module -======================================= - -This section discusses the functions in the ``pygad.gacnn`` module. - -.. _pygadgacnnpopulationasvectors: - -``pygad.gacnn.population_as_vectors()`` ----------------------------------------- - -Accepts the population as a list of references to the -``pygad.cnn.Model`` class and returns a list holding all weights of the -layers of each solution (i.e. network) in the population as a vector. - -For example, if the population has 6 solutions (i.e. networks), this -function accepts references to such networks and returns a list with 6 -vectors, one for each network (i.e. solution). Each vector holds the -weights for all layers for a single network. - -Accepts the following parameters: - -- ``population_networks``: A list holding references to the - ``pygad.cnn.Model`` class of the networks used in the population. - -Returns a list holding the weights vectors for all solutions (i.e. -networks). - -.. _pygadgacnnpopulationasmatrices: - -``pygad.gacnn.population_as_matrices()`` ----------------------------------------- - -Accepts the population as both networks and weights vectors and returns -the weights of all layers of each solution (i.e. network) in the -population as a matrix. - -For example, if the population has 6 solutions (i.e. networks), this -function returns a list with 6 matrices, one for each network holding -its weights for all layers. - -Accepts the following parameters: - -- ``population_networks``: A list holding references to the - ``pygad.cnn.Model`` class of the networks used in the population. - -- ``population_vectors``: A list holding the weights of all networks as - vectors. Such vectors are to be converted into matrices. - -Returns a list holding the weights matrices for all solutions (i.e. -networks). - -Steps to Build and Train CNN using Genetic Algorithm -==================================================== - -The steps to use this project for building and training a neural network -using the genetic algorithm are as follows: - -- Prepare the training data. - -- Create an instance of the ``pygad.gacnn.GACNN`` class. - -- Fetch the population weights as vectors. - -- Prepare the fitness function. - -- Prepare the generation callback function. - -- Create an instance of the ``pygad.GA`` class. - -- Run the created instance of the ``pygad.GA`` class. - -- Plot the Fitness Values - -- Information about the best solution. - -- Making predictions using the trained weights. - -- Calculating some statistics. - -Let's start covering all of these steps. - -Prepare the Training Data -------------------------- - -Before building and training neural networks, the training data (input -and output) is to be prepared. The inputs and the outputs of the -training data are NumPy arrays. - -The data used in this example is available as 2 files: - -1. `dataset_inputs.npy `__: - Data inputs. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy - -2. `dataset_outputs.npy `__: - Class labels. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy - -The data consists of 4 classes of images. The image shape is -``(100, 100, 3)`` and there are 20 images per class. For more -information about the dataset, check the **Reading the Data** section of -the ``pygad.cnn`` module. 
- -Simply download these 2 files and read them according to the next code. - -.. code:: python - - import numpy - - train_inputs = numpy.load("dataset_inputs.npy") - train_outputs = numpy.load("dataset_outputs.npy") - -For the output array, each element must be a single number representing -the class label of the sample. The class labels must start at ``0``. So, -if there are 80 samples, then the shape of the output array is ``(80)``. -If there are 5 classes in the data, then the values of all the 200 -elements in the output array must range from 0 to 4 inclusive. -Generally, the class labels start from ``0`` to ``N-1`` where ``N`` is -the number of classes. - -Note that the project only supports that each sample is assigned to only -one class. - -Building the Network Architecture ---------------------------------- - -Here is an example for a CNN architecture. - -.. code:: python - - import pygad.cnn - - input_layer = pygad.cnn.Input2D(input_shape=(80, 80, 3)) - conv_layer = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=input_layer, - activation_function="relu") - average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=5, - previous_layer=conv_layer, - stride=3) - - flatten_layer = pygad.cnn.Flatten(previous_layer=average_pooling_layer) - dense_layer = pygad.cnn.Dense(num_neurons=4, - previous_layer=flatten_layer, - activation_function="softmax") - -After the network architecture is prepared, the next step is to create a -CNN model. - -Building Model --------------- - -The CNN model is created as an instance of the ``pygad.cnn.Model`` -class. Here is an example. - -.. code:: python - - model = pygad.cnn.Model(last_layer=dense_layer, - epochs=5, - learning_rate=0.01) - -After the model is created, a summary of the model architecture can be -printed. - -Model Summary -------------- - -The ``summary()`` method in the ``pygad.cnn.Model`` class prints a -summary of the CNN model. - -.. code:: python - - model.summary() - -.. code:: python - - ----------Network Architecture---------- - - - - - ---------------------------------------- - -The next step is to create an instance of the ``pygad.gacnn.GACNN`` -class. - -.. _create-an-instance-of-the-pygadgacnngacnn-class: - -Create an Instance of the ``pygad.gacnn.GACNN`` Class ------------------------------------------------------ - -After preparing the input data and building the CNN model, an instance -of the ``pygad.gacnn.GACNN`` class is created by passing the appropriate -parameters. - -Here is an example where the ``num_solutions`` parameter is set to 4 -which means the genetic algorithm population will have 6 solutions (i.e. -networks). All of these 6 CNNs will have the same architectures as -specified by the ``model`` parameter. - -.. code:: python - - import pygad.gacnn - - GACNN_instance = pygad.gacnn.GACNN(model=model, - num_solutions=4) - -After creating the instance of the ``pygad.gacnn.GACNN`` class, next is -to fetch the weights of the population as a list of vectors. - -Fetch the Population Weights as Vectors ---------------------------------------- - -For the genetic algorithm, the parameters (i.e. genes) of each solution -are represented as a single vector. - -For this task, the weights of each CNN must be available as a single -vector. In other words, the weights of all layers of a CNN must be -grouped into a vector. - -To create a list holding the population weights as vectors, one for each -network, the ``pygad.gacnn.population_as_vectors()`` function is used. - -.. 
code:: python - - population_vectors = gacnn.population_as_vectors(population_networks=GACNN_instance.population_networks) - -Such population of vectors is used as the initial population. - -.. code:: python - - initial_population = population_vectors.copy() - -After preparing the population weights as a set of vectors, next is to -prepare 2 functions which are: - -1. Fitness function. - -2. Callback function after each generation. - -Prepare the Fitness Function ----------------------------- - -The PyGAD library works by allowing the users to customize the genetic -algorithm for their own problems. Because the problems differ in how the -fitness values are calculated, then PyGAD allows the user to use a -custom function as a maximization fitness function. This function must -accept 2 positional parameters representing the following: - -- The solution. - -- The solution index in the population. - -The fitness function must return a single number representing the -fitness. The higher the fitness value, the better the solution. - -Here is the implementation of the fitness function for training a CNN. - -It uses the ``pygad.cnn.predict()`` function to predict the class labels -based on the current solution's weights. The ``pygad.cnn.predict()`` -function uses the trained weights available in the ``trained_weights`` -attribute of each layer of the network for making predictions. - -Based on such predictions, the classification accuracy is calculated. -This accuracy is used as the fitness value of the solution. Finally, the -fitness value is returned. - -.. code:: python - - def fitness_func(ga_instance, solution, sol_idx): - global GACNN_instance, data_inputs, data_outputs - - predictions = GACNN_instance.population_networks[sol_idx].predict(data_inputs=data_inputs) - correct_predictions = numpy.where(predictions == data_outputs)[0].size - solution_fitness = (correct_predictions/data_outputs.size)*100 - - return solution_fitness - -Prepare the Generation Callback Function ----------------------------------------- - -After each generation of the genetic algorithm, the fitness function -will be called to calculate the fitness value of each solution. Within -the fitness function, the ``pygad.cnn.predict()`` function is used for -predicting the outputs based on the current solution's -``trained_weights`` attribute. Thus, it is required that such an -attribute is updated by weights evolved by the genetic algorithm after -each generation. - -PyGAD has a parameter accepted by the ``pygad.GA`` class constructor -named ``on_generation``. It could be assigned to a function that is -called after each generation. The function must accept a single -parameter representing the instance of the ``pygad.GA`` class. - -This callback function can be used to update the ``trained_weights`` -attribute of layers of each network in the population. - -Here is the implementation for a function that updates the -``trained_weights`` attribute of the layers of the population networks. - -It works by converting the current population from the vector form to -the matric form using the ``pygad.gacnn.population_as_matrices()`` -function. It accepts the population as vectors and returns it as -matrices. - -The population matrices are then passed to the -``update_population_trained_weights()`` method in the ``pygad.gacnn`` -module to update the ``trained_weights`` attribute of all layers for all -solutions within the population. - -.. 
code:: python - - def callback_generation(ga_instance): - global GACNN_instance, last_fitness - - population_matrices = gacnn.population_as_matrices(population_networks=GACNN_instance.population_networks, population_vectors=ga_instance.population) - GACNN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - -After preparing the fitness and callback function, next is to create an -instance of the ``pygad.GA`` class. - -.. _create-an-instance-of-the-pygadga-class: - -Create an Instance of the ``pygad.GA`` Class --------------------------------------------- - -Once the parameters of the genetic algorithm are prepared, an instance -of the ``pygad.GA`` class can be created. Here is an example where the -number of generations is 10. - -.. code:: python - - import pygad - - num_parents_mating = 4 - - num_generations = 10 - - mutation_percent_genes = 5 - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - on_generation=callback_generation) - -The last step for training the neural networks using the genetic -algorithm is calling the ``run()`` method. - -.. _run-the-created-instance-of-the-pygadga-class: - -Run the Created Instance of the ``pygad.GA`` Class --------------------------------------------------- - -By calling the ``run()`` method from the ``pygad.GA`` instance, the -genetic algorithm will iterate through the number of generations -specified in its ``num_generations`` parameter. - -.. code:: python - - ga_instance.run() - -Plot the Fitness Values ------------------------ - -After the ``run()`` method completes, the ``plot_fitness()`` method can -be called to show how the fitness values evolve by generation. - -.. code:: python - - ga_instance.plot_fitness() - -.. figure:: https://user-images.githubusercontent.com/16560492/83429675-ab744580-a434-11ea-8f21-9d3804b50d15.png - :alt: - -Information about the Best Solution ------------------------------------ - -The following information about the best solution in the last population -is returned using the ``best_solution()`` method in the ``pygad.GA`` -class. - -- Solution - -- Fitness value of the solution - -- Index of the solution within the population - -Here is how such information is returned. - -.. code:: python - - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - -.. code:: - - ... - Fitness value of the best solution = 83.75 - Index of the best solution : 0 - Best fitness value reached after 4 generations. - -Making Predictions using the Trained Weights --------------------------------------------- - -The ``pygad.cnn.predict()`` function can be used to make predictions -using the trained network. As printed, the network is able to predict -the labels correctly. - -.. 
code:: python - - predictions = pygad.cnn.predict(last_layer=GANN_instance.population_networks[solution_idx], data_inputs=data_inputs) - print("Predictions of the trained network : {predictions}".format(predictions=predictions)) - -Calculating Some Statistics ---------------------------- - -Based on the predictions the network made, some statistics can be -calculated such as the number of correct and wrong predictions in -addition to the classification accuracy. - -.. code:: python - - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -.. code:: - - Number of correct classifications : 67. - Number of wrong classifications : 13. - Classification accuracy : 83.75. - -Examples -======== - -This section gives the complete code of some examples that build and -train neural networks using the genetic algorithm. Each subsection -builds a different network. - -Image Classification --------------------- - -This example is discussed in the **Steps to Build and Train CNN using -Genetic Algorithm** section that builds the an image classifier. Its -complete code is listed below. - -.. code:: python - - import numpy - import pygad.cnn - import pygad.gacnn - import pygad - - """ - Convolutional neural network implementation using NumPy - A tutorial that helps to get started (Building Convolutional Neural Network using NumPy from Scratch) available in these links: - https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad - https://towardsdatascience.com/building-convolutional-neural-network-using-numpy-from-scratch-b30aac50e50a - https://www.kdnuggets.com/2018/04/building-convolutional-neural-network-numpy-scratch.html - It is also translated into Chinese: http://m.aliyun.com/yunqi/articles/585741 - """ - - def fitness_func(ga_instance, solution, sol_idx): - global GACNN_instance, data_inputs, data_outputs - - predictions = GACNN_instance.population_networks[sol_idx].predict(data_inputs=data_inputs) - correct_predictions = numpy.where(predictions == data_outputs)[0].size - solution_fitness = (correct_predictions/data_outputs.size)*100 - - return solution_fitness - - def callback_generation(ga_instance): - global GACNN_instance, last_fitness - - population_matrices = pygad.gacnn.population_as_matrices(population_networks=GACNN_instance.population_networks, - population_vectors=ga_instance.population) - - GACNN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solutions_fitness)) - - data_inputs = numpy.load("dataset_inputs.npy") - data_outputs = numpy.load("dataset_outputs.npy") - - sample_shape = data_inputs.shape[1:] - num_classes = 4 - - data_inputs = data_inputs - data_outputs = data_outputs - - input_layer = pygad.cnn.Input2D(input_shape=sample_shape) - conv_layer1 = pygad.cnn.Conv2D(num_filters=2, - kernel_size=3, - previous_layer=input_layer, - activation_function="relu") - average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=5, - previous_layer=conv_layer1, - stride=3) - - flatten_layer = 
pygad.cnn.Flatten(previous_layer=average_pooling_layer) - dense_layer2 = pygad.cnn.Dense(num_neurons=num_classes, - previous_layer=flatten_layer, - activation_function="softmax") - - model = pygad.cnn.Model(last_layer=dense_layer2, - epochs=1, - learning_rate=0.01) - - model.summary() - - - GACNN_instance = pygad.gacnn.GACNN(model=model, - num_solutions=4) - - # GACNN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. - # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. - population_vectors = pygad.gacnn.population_as_vectors(population_networks=GACNN_instance.population_networks) - - # To prepare the initial population, there are 2 ways: - # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population. - # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. - initial_population = population_vectors.copy() - - num_parents_mating = 2 # Number of solutions to be selected as parents in the mating pool. - - num_generations = 10 # Number of generations. - - mutation_percent_genes = 0.1 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness() - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Predicting the outputs of the data using the best solution. 
-    predictions = GACNN_instance.population_networks[solution_idx].predict(data_inputs=data_inputs)
-    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
-
-    # Calculating some statistics
-    num_wrong = numpy.where(predictions != data_outputs)[0]
-    num_correct = data_outputs.size - num_wrong.size
-    accuracy = 100 * (num_correct/data_outputs.size)
-    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
-    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
-    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+.. _pygadgacnn-module:
+
+``pygad.gacnn`` Module
+======================
+
+This section of the PyGAD library's documentation discusses the
+**pygad.gacnn** module.
+
+The ``pygad.gacnn`` module trains convolutional neural networks using
+the genetic algorithm. It makes use of the 2 modules ``pygad`` and
+``pygad.cnn``.
+
+.. _pygadgacnngacnn-class:
+
+``pygad.gacnn.GACNN`` Class
+===========================
+
+The ``pygad.gacnn`` module has a class named ``pygad.gacnn.GACNN`` for
+training convolutional neural networks (CNNs) using the genetic
+algorithm. The constructor, methods, functions, and attributes within
+the class are discussed in this section.
+
+.. _init:
+
+``__init__()``
+--------------
+
+In order to train a CNN using the genetic algorithm, the first thing to
+do is to create an instance of the ``pygad.gacnn.GACNN`` class.
+
+The ``pygad.gacnn.GACNN`` class constructor accepts the following
+parameters:
+
+- ``model``: An instance of the ``pygad.cnn.Model`` class representing
+  the architecture of all solutions in the population.
+
+- ``num_solutions``: Number of CNNs (i.e. solutions) in the population.
+  Based on the value passed to this parameter, a number of identical
+  CNNs are created where their parameters are optimized using the
+  genetic algorithm.
+
+Instance Attributes
+-------------------
+
+All the parameters in the ``pygad.gacnn.GACNN`` class constructor are
+used as instance attributes. Besides such attributes, there is an extra
+attribute added to the instances from the ``pygad.gacnn.GACNN`` class
+which is:
+
+- ``population_networks``: A list holding references to all the
+  solutions (i.e. CNNs) used in the population.
+
+Methods in the GACNN Class
+--------------------------
+
+This section discusses the methods available for instances of the
+``pygad.gacnn.GACNN`` class.
+
+.. _createpopulation:
+
+``create_population()``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``create_population()`` method creates the initial population of
+the genetic algorithm as a list of CNNs (i.e. solutions). All the
+networks are copied from the CNN model passed to the constructor of the
+GACNN class.
+
+The list of networks is assigned to the ``population_networks``
+attribute of the instance.
+
+.. _updatepopulationtrainedweights:
+
+``update_population_trained_weights()``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``update_population_trained_weights()`` method updates the
+``trained_weights`` attribute of the layers of each network (check the
+documentation of the ``pygad.cnn`` module for more information)
+according to the weights passed in the ``population_trained_weights``
+parameter.
+
+Accepts the following parameters:
+
+- ``population_trained_weights``: A list holding the trained weights of
+  all networks as matrices. Such matrices are to be assigned to the
+  ``trained_weights`` attribute of all layers of all networks.
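+
+Here is a minimal sketch of creating a ``pygad.gacnn.GACNN`` instance.
+It assumes a small ``pygad.cnn.Model``; the architecture below is
+illustrative only (the full example later on this page builds a more
+complete one).
+
+.. code:: python
+
+    import pygad.cnn
+    import pygad.gacnn
+
+    # An illustrative model. See the full example later in this page.
+    input_layer = pygad.cnn.Input2D(input_shape=(80, 80, 3))
+    conv_layer = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=input_layer,
+                                  activation_function="relu")
+    flatten_layer = pygad.cnn.Flatten(previous_layer=conv_layer)
+    dense_layer = pygad.cnn.Dense(num_neurons=4,
+                                  previous_layer=flatten_layer,
+                                  activation_function="softmax")
+    model = pygad.cnn.Model(last_layer=dense_layer,
+                            epochs=1,
+                            learning_rate=0.01)
+
+    # One identical network is created per solution.
+    GACNN_instance = pygad.gacnn.GACNN(model=model, num_solutions=4)
+    print(len(GACNN_instance.population_networks))  # 4
+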
+.. _functions-in-the-pygadgacnn-module:
+
+Functions in the ``pygad.gacnn`` Module
+=======================================
+
+This section discusses the functions in the ``pygad.gacnn`` module.
+
+.. _pygadgacnnpopulationasvectors:
+
+``pygad.gacnn.population_as_vectors()``
+----------------------------------------
+
+Accepts the population as a list of references to ``pygad.cnn.Model``
+instances and returns a list holding all weights of the layers of each
+solution (i.e. network) in the population as a vector.
+
+For example, if the population has 6 solutions (i.e. networks), this
+function accepts references to such networks and returns a list with 6
+vectors, one for each network (i.e. solution). Each vector holds the
+weights for all layers for a single network.
+
+Accepts the following parameters:
+
+- ``population_networks``: A list holding references to the
+  ``pygad.cnn.Model`` instances of the networks used in the population.
+
+Returns a list holding the weights vectors for all solutions (i.e.
+networks).
+
+.. _pygadgacnnpopulationasmatrices:
+
+``pygad.gacnn.population_as_matrices()``
+----------------------------------------
+
+Accepts the population as both networks and weights vectors and returns
+the weights of all layers of each solution (i.e. network) in the
+population as a matrix.
+
+For example, if the population has 6 solutions (i.e. networks), this
+function returns a list with 6 matrices, one for each network holding
+its weights for all layers.
+
+Accepts the following parameters:
+
+- ``population_networks``: A list holding references to the
+  ``pygad.cnn.Model`` instances of the networks used in the population.
+
+- ``population_vectors``: A list holding the weights of all networks as
+  vectors. Such vectors are to be converted into matrices.
+
+Returns a list holding the weights matrices for all solutions (i.e.
+networks).
+
+Steps to Build and Train CNN using Genetic Algorithm
+====================================================
+
+The steps to use this project for building and training a convolutional
+neural network using the genetic algorithm are as follows:
+
+- Prepare the training data.
+
+- Create an instance of the ``pygad.gacnn.GACNN`` class.
+
+- Fetch the population weights as vectors.
+
+- Prepare the fitness function.
+
+- Prepare the generation callback function.
+
+- Create an instance of the ``pygad.GA`` class.
+
+- Run the created instance of the ``pygad.GA`` class.
+
+- Plot the fitness values.
+
+- Information about the best solution.
+
+- Making predictions using the trained weights.
+
+- Calculating some statistics.
+
+Let's start covering all of these steps.
+
+Prepare the Training Data
+-------------------------
+
+Before building and training neural networks, the training data (input
+and output) is to be prepared. The inputs and the outputs of the
+training data are NumPy arrays.
+
+The data used in this example is available as 2 files:
+
+1. `dataset_inputs.npy <https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy>`__:
+   Data inputs.
+
+2. `dataset_outputs.npy <https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy>`__:
+   Class labels.
+
+The data consists of 4 classes of images. The image shape is
+``(100, 100, 3)`` and there are 20 images per class. For more
+information about the dataset, check the **Reading the Data** section
+of the ``pygad.cnn`` module.
+
+Simply download these 2 files and read them according to the next code.
+
+.. code:: python
+
+    import numpy
+
+    train_inputs = numpy.load("dataset_inputs.npy")
+    train_outputs = numpy.load("dataset_outputs.npy")
+
+For the output array, each element must be a single number representing
+the class label of the sample. The class labels must start at ``0``.
+So, if there are 80 samples, then the shape of the output array is
+``(80)``. If there are 5 classes in the data, then the values of all
+the 80 elements in the output array must range from 0 to 4 inclusive.
+Generally, the class labels start from ``0`` to ``N-1`` where ``N`` is
+the number of classes.
+
+Note that the project only supports classification problems where each
+sample is assigned to only one class.
+
+Building the Network Architecture
+---------------------------------
+
+Here is an example for a CNN architecture.
+
+.. code:: python
+
+    import pygad.cnn
+
+    input_layer = pygad.cnn.Input2D(input_shape=(80, 80, 3))
+    conv_layer = pygad.cnn.Conv2D(num_filters=2,
+                                  kernel_size=3,
+                                  previous_layer=input_layer,
+                                  activation_function="relu")
+    average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=5,
+                                                       previous_layer=conv_layer,
+                                                       stride=3)
+
+    flatten_layer = pygad.cnn.Flatten(previous_layer=average_pooling_layer)
+    dense_layer = pygad.cnn.Dense(num_neurons=4,
+                                  previous_layer=flatten_layer,
+                                  activation_function="softmax")
+
+After the network architecture is prepared, the next step is to create
+a CNN model.
+
+Building the Model
+------------------
+
+The CNN model is created as an instance of the ``pygad.cnn.Model``
+class. Here is an example.
+
+.. code:: python
+
+    model = pygad.cnn.Model(last_layer=dense_layer,
+                            epochs=5,
+                            learning_rate=0.01)
+
+After the model is created, a summary of the model architecture can be
+printed.
+
+Model Summary
+-------------
+
+The ``summary()`` method in the ``pygad.cnn.Model`` class prints a
+summary of the CNN model.
+
+.. code:: python
+
+    model.summary()
+
+.. code:: python
+
+    ----------Network Architecture----------
+    <class 'pygad.cnn.Conv2D'>
+    <class 'pygad.cnn.AveragePooling2D'>
+    <class 'pygad.cnn.Flatten'>
+    <class 'pygad.cnn.Dense'>
+    ----------------------------------------
+
+The next step is to create an instance of the ``pygad.gacnn.GACNN``
+class.
+
+.. _create-an-instance-of-the-pygadgacnngacnn-class:
+
+Create an Instance of the ``pygad.gacnn.GACNN`` Class
+-----------------------------------------------------
+
+After preparing the input data and building the CNN model, an instance
+of the ``pygad.gacnn.GACNN`` class is created by passing the
+appropriate parameters.
+
+Here is an example where the ``num_solutions`` parameter is set to 4
+which means the genetic algorithm population will have 4 solutions
+(i.e. networks). All of these 4 CNNs will have the same architecture as
+specified by the ``model`` parameter.
+
+.. code:: python
+
+    import pygad.gacnn
+
+    GACNN_instance = pygad.gacnn.GACNN(model=model,
+                                       num_solutions=4)
+
+After creating the instance of the ``pygad.gacnn.GACNN`` class, next is
+to fetch the weights of the population as a list of vectors.
+
+Fetch the Population Weights as Vectors
+---------------------------------------
+
+For the genetic algorithm, the parameters (i.e. genes) of each solution
+are represented as a single vector.
+
+For this task, the weights of each CNN must be available as a single
+vector. In other words, the weights of all layers of a CNN must be
+grouped into a vector.
+
+To create a list holding the population weights as vectors, one for
+each network, the ``pygad.gacnn.population_as_vectors()`` function is
+used.
+
+.. code:: python
+
+    population_vectors = pygad.gacnn.population_as_vectors(population_networks=GACNN_instance.population_networks)
+
+Such a population of vectors is used as the initial population.
+
+.. code:: python
+
+    initial_population = population_vectors.copy()
+
+After preparing the population weights as a set of vectors, next is to
+prepare 2 functions which are:
+
+1. Fitness function.
+
+2. Callback function after each generation.
+
+Prepare the Fitness Function
+----------------------------
+
+The PyGAD library works by allowing the users to customize the genetic
+algorithm for their own problems. Because the problems differ in how
+the fitness values are calculated, PyGAD allows the user to use a
+custom function as a maximization fitness function. This function must
+accept 3 positional parameters representing the following:
+
+- The instance of the ``pygad.GA`` class.
+
+- The solution.
+
+- The solution index in the population.
+
+The fitness function must return a single number representing the
+fitness. The higher the fitness value, the better the solution.
+
+Here is the implementation of the fitness function for training a CNN.
+
+It uses the ``predict()`` method of the network (an instance of the
+``pygad.cnn.Model`` class) to predict the class labels based on the
+current solution's weights. The ``predict()`` method uses the trained
+weights available in the ``trained_weights`` attribute of each layer of
+the network for making predictions.
+
+Based on such predictions, the classification accuracy is calculated.
+This accuracy is used as the fitness value of the solution. Finally,
+the fitness value is returned.
+
+.. code:: python
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GACNN_instance, data_inputs, data_outputs
+
+        predictions = GACNN_instance.population_networks[sol_idx].predict(data_inputs=data_inputs)
+        correct_predictions = numpy.where(predictions == data_outputs)[0].size
+        solution_fitness = (correct_predictions/data_outputs.size)*100
+
+        return solution_fitness
+
+Prepare the Generation Callback Function
+----------------------------------------
+
+After each generation of the genetic algorithm, the fitness function
+will be called to calculate the fitness value of each solution. Within
+the fitness function, the ``predict()`` method is used for predicting
+the outputs based on the current solution's ``trained_weights``
+attribute. Thus, it is required that such an attribute is updated by
+the weights evolved by the genetic algorithm after each generation.
+
+PyGAD has a parameter accepted by the ``pygad.GA`` class constructor
+named ``on_generation``. It could be assigned to a function that is
+called after each generation. The function must accept a single
+parameter representing the instance of the ``pygad.GA`` class.
+
+This callback function can be used to update the ``trained_weights``
+attribute of the layers of each network in the population.
+
+Here is the implementation for a function that updates the
+``trained_weights`` attribute of the layers of the population networks.
+
+It works by converting the current population from the vector form to
+the matrix form using the ``pygad.gacnn.population_as_matrices()``
+function. It accepts the population as vectors and returns it as
+matrices.
+
+The population matrices are then passed to the
+``update_population_trained_weights()`` method of the
+``pygad.gacnn.GACNN`` instance to update the ``trained_weights``
+attribute of all layers for all solutions within the population.
+
+.. code:: python
+
+    def callback_generation(ga_instance):
+        global GACNN_instance, last_fitness
+
+        population_matrices = pygad.gacnn.population_as_matrices(population_networks=GACNN_instance.population_networks, population_vectors=ga_instance.population)
+        GACNN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+
+After preparing the fitness and callback functions, next is to create
+an instance of the ``pygad.GA`` class.
+
+.. _create-an-instance-of-the-pygadga-class:
+
+Create an Instance of the ``pygad.GA`` Class
+--------------------------------------------
+
+Once the parameters of the genetic algorithm are prepared, an instance
+of the ``pygad.GA`` class can be created. Here is an example where the
+number of generations is 10.
+
+.. code:: python
+
+    import pygad
+
+    num_parents_mating = 4
+
+    num_generations = 10
+
+    mutation_percent_genes = 5
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           on_generation=callback_generation)
+
+The last step for training the neural networks using the genetic
+algorithm is calling the ``run()`` method.
+
+.. _run-the-created-instance-of-the-pygadga-class:
+
+Run the Created Instance of the ``pygad.GA`` Class
+--------------------------------------------------
+
+By calling the ``run()`` method from the ``pygad.GA`` instance, the
+genetic algorithm will iterate through the number of generations
+specified in its ``num_generations`` parameter.
+
+.. code:: python
+
+    ga_instance.run()
+
+Plot the Fitness Values
+-----------------------
+
+After the ``run()`` method completes, the ``plot_fitness()`` method can
+be called to show how the fitness values evolve by generation.
+
+.. code:: python
+
+    ga_instance.plot_fitness()
+
+.. figure:: https://user-images.githubusercontent.com/16560492/83429675-ab744580-a434-11ea-8f21-9d3804b50d15.png
+   :alt:
+
+Information about the Best Solution
+-----------------------------------
+
+The following information about the best solution in the last
+population is returned using the ``best_solution()`` method in the
+``pygad.GA`` class.
+
+- Solution
+
+- Fitness value of the solution
+
+- Index of the solution within the population
+
+Here is how such information is returned.
+
+.. code:: python
+
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+.. code::
+
+    ...
+    Fitness value of the best solution = 83.75
+    Index of the best solution : 0
+    Best fitness value reached after 4 generations.
+
+Making Predictions using the Trained Weights
+--------------------------------------------
+
+The ``predict()`` method of the trained network can be used to make
+predictions. As printed, the network is able to predict the labels
+correctly.
+
+.. code:: python
+
+    predictions = GACNN_instance.population_networks[solution_idx].predict(data_inputs=data_inputs)
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+Calculating Some Statistics
+---------------------------
+
+Based on the predictions the network made, some statistics can be
+calculated such as the number of correct and wrong predictions in
+addition to the classification accuracy.
+
+.. code:: python
+
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+.. code::
+
+    Number of correct classifications : 67.
+    Number of wrong classifications : 13.
+    Classification accuracy : 83.75.
+
+Examples
+========
+
+This section gives the complete code of some examples that build and
+train neural networks using the genetic algorithm. Each subsection
+builds a different network.
+
+Image Classification
+--------------------
+
+This example is discussed in the **Steps to Build and Train CNN using
+Genetic Algorithm** section that builds an image classifier. Its
+complete code is listed below.
+
+.. code:: python
+
+    import numpy
+    import pygad.cnn
+    import pygad.gacnn
+    import pygad
+
+    """
+    Convolutional neural network implementation using NumPy
+    A tutorial that helps to get started (Building Convolutional Neural Network using NumPy from Scratch) is available in these links:
+    https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad
+    https://towardsdatascience.com/building-convolutional-neural-network-using-numpy-from-scratch-b30aac50e50a
+    https://www.kdnuggets.com/2018/04/building-convolutional-neural-network-numpy-scratch.html
+    It is also translated into Chinese: http://m.aliyun.com/yunqi/articles/585741
+    """
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GACNN_instance, data_inputs, data_outputs
+
+        predictions = GACNN_instance.population_networks[sol_idx].predict(data_inputs=data_inputs)
+        correct_predictions = numpy.where(predictions == data_outputs)[0].size
+        solution_fitness = (correct_predictions/data_outputs.size)*100
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        global GACNN_instance, last_fitness
+
+        population_matrices = pygad.gacnn.population_as_matrices(population_networks=GACNN_instance.population_networks,
+                                                                 population_vectors=ga_instance.population)
+
+        GACNN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solutions_fitness))
+
+    data_inputs = numpy.load("dataset_inputs.npy")
+    data_outputs = numpy.load("dataset_outputs.npy")
+
+    sample_shape = data_inputs.shape[1:]
+    num_classes = 4
+
+    input_layer = pygad.cnn.Input2D(input_shape=sample_shape)
+    conv_layer1 = pygad.cnn.Conv2D(num_filters=2,
+                                   kernel_size=3,
+                                   previous_layer=input_layer,
+                                   activation_function="relu")
+    average_pooling_layer = pygad.cnn.AveragePooling2D(pool_size=5,
+                                                       previous_layer=conv_layer1,
+                                                       stride=3)
+
+    flatten_layer = pygad.cnn.Flatten(previous_layer=average_pooling_layer)
+    dense_layer2 = pygad.cnn.Dense(num_neurons=num_classes,
+                                   previous_layer=flatten_layer,
+                                   activation_function="softmax")
+
+    model = pygad.cnn.Model(last_layer=dense_layer2,
+                            epochs=1,
+                            learning_rate=0.01)
+
+    model.summary()
+
+    GACNN_instance = pygad.gacnn.GACNN(model=model,
+                                       num_solutions=4)
+
+    # population_networks does not hold the numerical weights of the networks. Instead, it holds a list of references to the networks (i.e. solutions) in the population. A solution and a network can be used interchangeably.
+    # If there is a population with 3 solutions (i.e. networks), then population_networks is a list with 3 elements. Each element is a reference to a network. Using such a reference, all details of the network can be accessed.
+    population_vectors = pygad.gacnn.population_as_vectors(population_networks=GACNN_instance.population_networks)
+
+    # To prepare the initial population, there are 2 ways:
+    # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population.
+    # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless.
+    initial_population = population_vectors.copy()
+
+    num_parents_mating = 2 # Number of solutions to be selected as parents in the mating pool.
+
+    num_generations = 10 # Number of generations.
+
+    mutation_percent_genes = 0.1 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists.
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           on_generation=callback_generation)
+
+    ga_instance.run()
+
+    # After the generations complete, a plot is shown that summarizes how the fitness values evolve over the generations.
+    ga_instance.plot_fitness()
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+    # Predicting the outputs of the data using the best solution.
+    predictions = GACNN_instance.population_networks[solution_idx].predict(data_inputs=data_inputs)
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+    # Calculating some statistics
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
diff --git a/docs/source/README_pygad_gann_ReadTheDocs.rst b/docs/source/gann.rst
similarity index 97%
rename from docs/source/README_pygad_gann_ReadTheDocs.rst
rename to docs/source/gann.rst
index ac95e61..8b243e7 100644
--- a/docs/source/README_pygad_gann_ReadTheDocs.rst
+++ b/docs/source/gann.rst
@@ -1,1262 +1,1262 @@
-.. _pygadgann-module:
-
-``pygad.gann`` Module
-=====================
-
-This section of the PyGAD library's documentation discusses the
-**pygad.gann** module.
-
-The ``pygad.gann`` module trains neural networks (for either
-classification or regression) using the genetic algorithm. It makes use
-of the 2 modules ``pygad`` and ``pygad.nn``.
-
-.. _pygadganngann-class:
-
-``pygad.gann.GANN`` Class
-=========================
-
-The ``pygad.gann`` module has a class named ``pygad.gann.GANN`` for
-training neural networks using the genetic algorithm. The constructor,
-methods, functions, and attributes within the class are discussed in
-this section.
-
-.. _init:
-
-``__init__()``
---------------
-
-In order to train a neural network using the genetic algorithm, the
-first thing to do is to create an instance of the ``pygad.gann.GANN``
-class.
-
-The ``pygad.gann.GANN`` class constructor accepts the following
-parameters:
-
-- ``num_solutions``: Number of neural networks (i.e. solutions) in the
-  population. Based on the value passed to this parameter, a number of
-  identical neural networks are created where their parameters are
-  optimized using the genetic algorithm.
-
-- ``num_neurons_input``: Number of neurons in the input layer.
-
-- ``num_neurons_output``: Number of neurons in the output layer.
-
-- ``num_neurons_hidden_layers=[]``: A list holding the number of
-  neurons in the hidden layer(s). If empty ``[]``, then no hidden
-  layers are used. For each ``int`` value it holds, a hidden layer is
-  created with the number of hidden neurons specified by the
-  corresponding ``int`` value. For example,
-  ``num_neurons_hidden_layers=[10]`` creates a single hidden layer with
-  **10** neurons. ``num_neurons_hidden_layers=[10, 5]`` creates 2
-  hidden layers with 10 neurons for the first and 5 neurons for the
-  second hidden layer.
-
-- ``output_activation="softmax"``: The name of the activation function
-  of the output layer which defaults to ``"softmax"``.
-
-- ``hidden_activations="relu"``: The name(s) of the activation
-  function(s) of the hidden layer(s). It defaults to ``"relu"``. If
-  passed as a string, this means the specified activation function will
-  be used across all the hidden layers. If passed as a list, then it
-  must have the same length as the ``num_neurons_hidden_layers`` list.
-  An exception is raised if their lengths are different. When
-  ``hidden_activations`` is a list, a one-to-one mapping between the
-  ``num_neurons_hidden_layers`` and ``hidden_activations`` lists
-  occurs. See the sketch after this list for an example.
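-
-As a quick illustration of the mapping between the 2 lists (a sketch
-only; the layer sizes and activation names here are arbitrary choices),
-the following call creates 2 hidden layers where the first has 10
-neurons with the ``relu`` activation function and the second has 5
-neurons with the ``sigmoid`` activation function:
-
-.. code:: python
-
-    import pygad.gann
-
-    # hidden_activations must have the same length as num_neurons_hidden_layers.
-    GANN_instance = pygad.gann.GANN(num_solutions=6,
-                                    num_neurons_input=50,
-                                    num_neurons_output=2,
-                                    num_neurons_hidden_layers=[10, 5],
-                                    hidden_activations=["relu", "sigmoid"],
-                                    output_activation="softmax")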
-
-In order to validate the parameters passed to the ``pygad.gann.GANN``
-class constructor, the ``pygad.gann.validate_network_parameters()``
-function is called.
-
-Instance Attributes
--------------------
-
-All the parameters in the ``pygad.gann.GANN`` class constructor are
-used as instance attributes. Besides such attributes, there are other
-attributes added to the instances from the ``pygad.gann.GANN`` class
-which are:
-
-- ``parameters_validated``: If ``True``, then the parameters passed to
-  the GANN class constructor are valid. Its initial value is ``False``.
-
-- ``population_networks``: A list holding references to all the
-  solutions (i.e. neural networks) used in the population.
-
-Methods in the GANN Class
--------------------------
-
-This section discusses the methods available for instances of the
-``pygad.gann.GANN`` class.
-
-.. _createpopulation:
-
-``create_population()``
-~~~~~~~~~~~~~~~~~~~~~~~
-
-The ``create_population()`` method creates the initial population of
-the genetic algorithm as a list of neural networks (i.e. solutions).
-For each network to be created, the ``pygad.gann.create_network()``
-function is called.
-
-Each element in the list holds a reference to the last (i.e. output)
-layer of the network. The method does not accept any parameters and it
-accesses all the required details from the ``pygad.gann.GANN``
-instance.
-
-The method returns the list holding the references to the networks.
-This list is later assigned to the ``population_networks`` attribute of
-the instance.
-
-.. _updatepopulationtrainedweights:
-
-``update_population_trained_weights()``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The ``update_population_trained_weights()`` method updates the
-``trained_weights`` attribute of the layers of each network (check the
-documentation of the ``pygad.nn.DenseLayer`` class for more
-information) according to the weights passed in the
-``population_trained_weights`` parameter.
-
-Accepts the following parameters:
-
-- ``population_trained_weights``: A list holding the trained weights of
-  all networks as matrices. Such matrices are to be assigned to the
-  ``trained_weights`` attribute of all layers of all networks.
-
-.. _functions-in-the-pygadgann-module:
-
-Functions in the ``pygad.gann`` Module
-======================================
-
-This section discusses the functions in the ``pygad.gann`` module.
-
-.. _pygadgannvalidatenetworkparameters:
-
-``pygad.gann.validate_network_parameters()``
---------------------------------------------
-
-Validates the parameters passed to the constructor of the
-``pygad.gann.GANN`` class. If at least one invalid parameter exists, an
-exception is raised and the execution stops.
-
-The function accepts the same parameters passed to the constructor of
-the ``pygad.gann.GANN`` class. Please check the documentation of such
-parameters in the section discussing the class constructor.
-
-The reason why this function sets a default value for the
-``num_solutions`` parameter is to differentiate whether a population of
-networks or a single network is to be created. If ``None``, then a
-single network will be created. If not ``None``, then a population of
-networks is to be created.
-
-If the value passed to the ``hidden_activations`` parameter is a
-string, not a list, then a list is created by replicating the passed
-name of the activation function a number of times equal to the number
-of hidden layers (i.e. the length of the ``num_neurons_hidden_layers``
-parameter).
-
-Returns a list holding the name(s) of the activation function(s) of the
-hidden layer(s).
-
-.. _pygadganncreatenetwork:
-
-``pygad.gann.create_network()``
--------------------------------
-
-Creates a neural network as a linked list between the input, hidden,
-and output layers where the layer at index N (which is the last/output
-layer) references the layer at index N-1 (which is a hidden layer)
-using its ``previous_layer`` attribute. The input layer does not
-reference any layer because it is the last layer in the linked list.
-
-In addition to the ``parameters_validated`` parameter, this function
-accepts the same parameters passed to the constructor of the
-``pygad.gann.GANN`` class except for the ``num_solutions`` parameter
-because only a single network is created by the ``create_network()``
-function.
-
-``parameters_validated``: If ``False``, then the parameters have not
-been validated yet and a call to the ``validate_network_parameters()``
-function is made.
-
-Returns the reference to the last layer in the network architecture
-which is the output layer. Based on such a reference, all network
-layers can be fetched.
-
-.. _pygadgannpopulationasvectors:
-
-``pygad.gann.population_as_vectors()``
---------------------------------------
-
-Accepts the population as networks and returns a list holding all
-weights of the layers of each solution (i.e. network) in the population
-as a vector.
-
-For example, if the population has 6 solutions (i.e. networks), this
-function accepts references to such networks and returns a list with 6
-vectors, one for each network (i.e. solution). Each vector holds the
-weights for all layers for a single network.
-
-Accepts the following parameters:
-
-- ``population_networks``: A list holding references to the output
-  (last) layers of the neural networks used in the population.
-
-Returns a list holding the weights vectors for all solutions (i.e.
-networks).
-
-.. _pygadgannpopulationasmatrices:
-
-``pygad.gann.population_as_matrices()``
----------------------------------------
-
-Accepts the population as both networks and weights vectors and returns
-the weights of all layers of each solution (i.e. network) in the
-population as a matrix.
-
-For example, if the population has 6 solutions (i.e. networks), this
-function returns a list with 6 matrices, one for each network holding
-its weights for all layers.
-
-Accepts the following parameters:
-
-- ``population_networks``: A list holding references to the output
-  (last) layers of the neural networks used in the population.
-
-- ``population_vectors``: A list holding the weights of all networks as
-  vectors. Such vectors are to be converted into matrices.
-
-Returns a list holding the weights matrices for all solutions (i.e.
-networks).
-
-Steps to Build and Train Neural Networks using Genetic Algorithm
-================================================================
-
-The steps to use this project for building and training a neural
-network using the genetic algorithm are as follows:
-
-- Prepare the training data.
-
-- Create an instance of the ``pygad.gann.GANN`` class.
-
-- Fetch the population weights as vectors.
-
-- Prepare the fitness function.
-
-- Prepare the generation callback function.
-
-- Create an instance of the ``pygad.GA`` class.
-
-- Run the created instance of the ``pygad.GA`` class.
-
-- Plot the fitness values.
-
-- Information about the best solution.
-
-- Making predictions using the trained weights.
-
-- Calculating some statistics.
-
-Let's start covering all of these steps.
-
-Prepare the Training Data
--------------------------
-
-Before building and training neural networks, the training data (input
-and output) is to be prepared. The inputs and the outputs of the
-training data are NumPy arrays.
-
-Here is an example of preparing the training data for the XOR problem.
-
-For the input array, each element must be a list representing the
-inputs (i.e. features) of the sample. If there are 200 samples and each
-sample has 50 features, then the shape of the inputs array is
-``(200, 50)``. The variable ``num_inputs`` holds the length of each
-sample which is 2 in this example.
-
-.. code:: python
-
-    data_inputs = numpy.array([[1, 1],
-                               [1, 0],
-                               [0, 1],
-                               [0, 0]])
-
-    data_outputs = numpy.array([0,
-                                1,
-                                1,
-                                0])
-
-    num_inputs = data_inputs.shape[1]
-
-For the output array, each element must be a single number representing
-the class label of the sample. The class labels must start at ``0``.
-So, if there are 200 samples, then the shape of the output array is
-``(200)``. If there are 5 classes in the data, then the values of all
-the 200 elements in the output array must range from 0 to 4 inclusive.
-Generally, the class labels start from ``0`` to ``N-1`` where ``N`` is
-the number of classes.
-
-For the XOR example, there are 2 classes and thus their labels are 0
-and 1. The ``num_classes`` variable is set to 2.
-
-Note that the project only supports classification problems where each
-sample is assigned to only one class.
-
-.. _create-an-instance-of-the-pygadganngann-class:
-
-Create an Instance of the ``pygad.gann.GANN`` Class
----------------------------------------------------
-
-After preparing the input data, an instance of the ``pygad.gann.GANN``
-class is created by passing the appropriate parameters.
-
-Here is an example that creates a network for the XOR problem. The
-``num_solutions`` parameter is set to 6 which means the genetic
-algorithm population will have 6 solutions (i.e. networks). All of
-these 6 neural networks will have the same architecture as specified by
-the other parameters.
-
-The output layer has 2 neurons because there are only 2 classes (0 and
-1).
-
-.. code:: python
-
-    import pygad.gann
-    import pygad.nn
-
-    num_solutions = 6
-    GANN_instance = pygad.gann.GANN(num_solutions=num_solutions,
-                                    num_neurons_input=num_inputs,
-                                    num_neurons_hidden_layers=[2],
-                                    num_neurons_output=2,
-                                    hidden_activations=["relu"],
-                                    output_activation="softmax")
-
-The architecture of the created network has the following layers:
-
-- An input layer with 2 neurons (i.e. inputs).
-
-- A single hidden layer with 2 neurons.
-
-- An output layer with 2 neurons (i.e. classes).
-
-The weights of the network are as follows:
-
-- Between the input and the hidden layer, there is a weights matrix of
-  size equal to
-  ``(number of inputs x number of hidden neurons) = (2x2)``.
-
-- Between the hidden and the output layer, there is a weights matrix of
-  size equal to
-  ``(number of hidden neurons x number of outputs) = (2x2)``.
-
-The activation function used for the output layer is ``softmax``. The
-``relu`` activation function is used for the hidden layer.
-
-After creating the instance of the ``pygad.gann.GANN`` class, next is
-to fetch the weights of the population as a list of vectors.
-
-Fetch the Population Weights as Vectors
----------------------------------------
-
-For the genetic algorithm, the parameters (i.e. genes) of each solution
-are represented as a single vector.
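-
-As a rough illustration of what such a vector looks like (the actual
-conversion is handled by the ``pygad.gann.population_as_vectors()``
-function; the weight values below are random placeholders), the 2
-weights matrices of one XOR network flatten into a single vector of 8
-genes:
-
-.. code:: python
-
-    import numpy
-
-    # Illustrative weights for one XOR network: 2 matrices of size 2x2.
-    input_to_hidden_weights = numpy.random.uniform(-1, 1, size=(2, 2))
-    hidden_to_output_weights = numpy.random.uniform(-1, 1, size=(2, 2))
-
-    # Flattening and concatenating both matrices gives one 8-gene vector.
-    solution_vector = numpy.concatenate([input_to_hidden_weights.flatten(),
-                                         hidden_to_output_weights.flatten()])
-
-    print(solution_vector.shape)  # (8,)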
-
-For the task of training the network for the XOR problem, the weights
-of each network in the population are represented not as a single
-vector but as 2 matrices, each of size 2x2.
-
-To create a list holding the population weights as vectors, one for
-each network, the ``pygad.gann.population_as_vectors()`` function is
-used.
-
-.. code:: python
-
-    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
-
-After preparing the population weights as a set of vectors, next is to
-prepare 2 functions which are:
-
-1. Fitness function.
-
-2. Callback function after each generation.
-
-Prepare the Fitness Function
-----------------------------
-
-The PyGAD library works by allowing the users to customize the genetic
-algorithm for their own problems. Because the problems differ in how
-the fitness values are calculated, PyGAD allows the user to use a
-custom function as a maximization fitness function. This function must
-accept 3 positional parameters representing the following:
-
-- The instance of the ``pygad.GA`` class.
-
-- The solution.
-
-- The solution index in the population.
-
-The fitness function must return a single number representing the
-fitness. The higher the fitness value, the better the solution.
-
-Here is the implementation of the fitness function for training a
-neural network. It uses the ``pygad.nn.predict()`` function to predict
-the class labels based on the current solution's weights. The
-``pygad.nn.predict()`` function uses the trained weights available in
-the ``trained_weights`` attribute of each layer of the network for
-making predictions.
-
-Based on such predictions, the classification accuracy is calculated.
-This accuracy is used as the fitness value of the solution. Finally,
-the fitness value is returned.
-
-.. code:: python
-
-    def fitness_func(ga_instance, solution, sol_idx):
-        global GANN_instance, data_inputs, data_outputs
-
-        predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx],
-                                       data_inputs=data_inputs)
-        correct_predictions = numpy.where(predictions == data_outputs)[0].size
-        solution_fitness = (correct_predictions/data_outputs.size)*100
-
-        return solution_fitness
-
-Prepare the Generation Callback Function
-----------------------------------------
-
-After each generation of the genetic algorithm, the fitness function
-will be called to calculate the fitness value of each solution. Within
-the fitness function, the ``pygad.nn.predict()`` function is used for
-predicting the outputs based on the current solution's
-``trained_weights`` attribute. Thus, it is required that such an
-attribute is updated by the weights evolved by the genetic algorithm
-after each generation.
-
-PyGAD 2.0.0 and higher has a parameter accepted by the ``pygad.GA``
-class constructor named ``on_generation``. It could be assigned to a
-function that is called after each generation. The function must accept
-a single parameter representing the instance of the ``pygad.GA`` class.
-
-This callback function can be used to update the ``trained_weights``
-attribute of the layers of each network in the population.
-
-Here is the implementation for a function that updates the
-``trained_weights`` attribute of the layers of the population networks.
-
-It works by converting the current population from the vector form to
-the matrix form using the ``pygad.gann.population_as_matrices()``
-function. It accepts the population as vectors and returns it as
-matrices.
-
-The population matrices are then passed to the
-``update_population_trained_weights()`` method of the
-``pygad.gann.GANN`` instance to update the ``trained_weights``
-attribute of all layers for all solutions within the population.
-
-.. code:: python
-
-    def callback_generation(ga_instance):
-        global GANN_instance
-
-        population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, population_vectors=ga_instance.population)
-        GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
-
-        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
-        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
-
-After preparing the fitness and callback functions, next is to create
-an instance of the ``pygad.GA`` class.
-
-.. _create-an-instance-of-the-pygadga-class:
-
-Create an Instance of the ``pygad.GA`` Class
---------------------------------------------
-
-Once the parameters of the genetic algorithm are prepared, an instance
-of the ``pygad.GA`` class can be created.
-
-Here is an example.
-
-.. code:: python
-
-    initial_population = population_vectors.copy()
-
-    num_parents_mating = 4
-
-    num_generations = 500
-
-    mutation_percent_genes = 5
-
-    parent_selection_type = "sss"
-
-    crossover_type = "single_point"
-
-    mutation_type = "random"
-
-    keep_parents = 1
-
-    init_range_low = -2
-    init_range_high = 5
-
-    ga_instance = pygad.GA(num_generations=num_generations,
-                           num_parents_mating=num_parents_mating,
-                           initial_population=initial_population,
-                           fitness_func=fitness_func,
-                           mutation_percent_genes=mutation_percent_genes,
-                           init_range_low=init_range_low,
-                           init_range_high=init_range_high,
-                           parent_selection_type=parent_selection_type,
-                           crossover_type=crossover_type,
-                           mutation_type=mutation_type,
-                           keep_parents=keep_parents,
-                           on_generation=callback_generation)
-
-The last step for training the neural networks using the genetic
-algorithm is calling the ``run()`` method.
-
-.. _run-the-created-instance-of-the-pygadga-class:
-
-Run the Created Instance of the ``pygad.GA`` Class
---------------------------------------------------
-
-By calling the ``run()`` method from the ``pygad.GA`` instance, the
-genetic algorithm will iterate through the number of generations
-specified in its ``num_generations`` parameter.
-
-.. code:: python
-
-    ga_instance.run()
-
-Plot the Fitness Values
------------------------
-
-After the ``run()`` method completes, the ``plot_fitness()`` method can
-be called to show how the fitness values evolve by generation. A
-fitness value (i.e. accuracy) of 100 is reached after around 180
-generations.
-
-.. code:: python
-
-    ga_instance.plot_fitness()
-
-.. figure:: https://user-images.githubusercontent.com/16560492/82078638-c11e0700-96e1-11ea-8aa9-c36761c5e9c7.png
-   :alt:
-
-By running the code again, a different initial population is created
-and thus a classification accuracy of 100 can be reached using fewer
-generations. On the other hand, a different initial population might
-cause 100% accuracy to be reached using more generations or not to be
-reached at all.
-
-Information about the Best Solution
------------------------------------
-
-The following information about the best solution in the last
-population is returned using the ``best_solution()`` method in the
-``pygad.GA`` class.
-
-- Solution
-
-- Fitness value of the solution
-
-- Index of the solution within the population
-
-Here is how such information is returned. The fitness value (i.e.
-accuracy) is 100.
-
-.. code:: python
-
-    solution, solution_fitness, solution_idx = ga_instance.best_solution()
-    print("Parameters of the best solution : {solution}".format(solution=solution))
-    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
-    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
-
-.. code::
-
-    Parameters of the best solution : [3.55081391 -3.21562011 -14.2617784 0.68044231 -1.41258145 -3.2979315 1.58136006 -7.83726169]
-    Fitness value of the best solution = 100.0
-    Index of the best solution : 0
-
-Using the ``best_solution_generation`` attribute of the instance from
-the ``pygad.GA`` class, the generation number at which the **best
-fitness** is reached could be fetched. According to the result, the
-best fitness value is reached after 182 generations.
-
-.. code:: python
-
-    if ga_instance.best_solution_generation != -1:
-        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
-
-.. code::
-
-    Best fitness value reached after 182 generations.
-
-Making Predictions using the Trained Weights
---------------------------------------------
-
-The ``pygad.nn.predict()`` function can be used to make predictions
-using the trained network. As printed, the network is able to predict
-the labels correctly.
-
-.. code:: python
-
-    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], data_inputs=data_inputs)
-    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
-
-.. code::
-
-    Predictions of the trained network : [0. 1. 1. 0.]
-
-Calculating Some Statistics
----------------------------
-
-Based on the predictions the network made, some statistics can be
-calculated such as the number of correct and wrong predictions in
-addition to the classification accuracy.
-
-.. code:: python
-
-    num_wrong = numpy.where(predictions != data_outputs)[0]
-    num_correct = data_outputs.size - num_wrong.size
-    accuracy = 100 * (num_correct/data_outputs.size)
-    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
-    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
-    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
-
-.. code::
-
-    Number of correct classifications : 4.
-    Number of wrong classifications : 0.
-    Classification accuracy : 100.0.
-
-Examples
-========
-
-This section gives the complete code of some examples that build and
-train neural networks using the genetic algorithm. Each subsection
-builds a different network.
-
-XOR Classification
-------------------
-
-This example is discussed in the **Steps to Build and Train Neural
-Networks using Genetic Algorithm** section that builds the XOR gate
-classifier, and its complete code is listed below.
code:: python - - import numpy - import pygad - import pygad.nn - import pygad.gann - - def fitness_func(ga_instance, solution, sol_idx): - global GANN_instance, data_inputs, data_outputs - - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx], - data_inputs=data_inputs) - correct_predictions = numpy.where(predictions == data_outputs)[0].size - solution_fitness = (correct_predictions/data_outputs.size)*100 - - return solution_fitness - - def callback_generation(ga_instance): - global GANN_instance, last_fitness - - population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, - population_vectors=ga_instance.population) - - GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness)) - - last_fitness = ga_instance.best_solution()[1].copy() - - # Holds the fitness value of the previous generation. - last_fitness = 0 - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.array([[1, 1], - [1, 0], - [0, 1], - [0, 0]]) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.array([0, - 1, - 1, - 0]) - - # The length of the input vector for each sample (i.e. number of neurons in the input layer). - num_inputs = data_inputs.shape[1] - # The number of neurons in the output layer (i.e. number of classes). - num_classes = 2 - - # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched. - num_solutions = 6 # A solution or a network can be used interchangeably. - GANN_instance = pygad.gann.GANN(num_solutions=num_solutions, - num_neurons_input=num_inputs, - num_neurons_hidden_layers=[2], - num_neurons_output=num_classes, - hidden_activations=["relu"], - output_activation="softmax") - - # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. - # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. - population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks) - - # To prepare the initial population, there are 2 ways: - # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population. - # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. - initial_population = population_vectors.copy() - - num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool. - - num_generations = 500 # Number of generations. - - mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. 
- - parent_selection_type = "sss" # Type of parent selection. - - crossover_type = "single_point" # Type of the crossover operator. - - mutation_type = "random" # Type of the mutation operator. - - keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing. - - init_range_low = -2 - init_range_high = 5 - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - init_range_low=init_range_low, - init_range_high=init_range_high, - parent_selection_type=parent_selection_type, - crossover_type=crossover_type, - mutation_type=mutation_type, - keep_parents=keep_parents, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness() - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Predicting the outputs of the data using the best solution. - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], - data_inputs=data_inputs) - print("Predictions of the trained network : {predictions}".format(predictions=predictions)) - - # Calculating some statistics - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -Image Classification --------------------- - -In the documentation of the ``pygad.nn`` module, a neural network is -created for classifying images from the Fruits360 dataset without being -trained using an optimization algorithm. This section discusses how to -train such a classifier using the genetic algorithm with the help of the -``pygad.gann`` module. - -Please make sure that the training data files -`dataset_features.npy `__ -and -`outputs.npy `__ -are available. For downloading them, use these links: - -1. `dataset_features.npy `__: - The features - https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - -2. `outputs.npy `__: - The class labels - https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - -After the data is available, here is the complete code that builds and -trains a neural network using the genetic algorithm for classifying -images from 4 classes of the Fruits360 dataset. - -Because there are 4 classes, the output layer is assigned has 4 neurons -according to the ``num_neurons_output`` parameter of the -``pygad.gann.GANN`` class constructor. - -.. 
code:: python - - import numpy - import pygad - import pygad.nn - import pygad.gann - - def fitness_func(ga_instance, solution, sol_idx): - global GANN_instance, data_inputs, data_outputs - - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx], - data_inputs=data_inputs) - correct_predictions = numpy.where(predictions == data_outputs)[0].size - solution_fitness = (correct_predictions/data_outputs.size)*100 - - return solution_fitness - - def callback_generation(ga_instance): - global GANN_instance, last_fitness - - population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, - population_vectors=ga_instance.population) - - GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness)) - - last_fitness = ga_instance.best_solution()[1].copy() - - # Holds the fitness value of the previous generation. - last_fitness = 0 - - # Reading the input data. - data_inputs = numpy.load("dataset_features.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - - # Optional step of filtering the input data using the standard deviation. - features_STDs = numpy.std(a=data_inputs, axis=0) - data_inputs = data_inputs[:, features_STDs>50] - - # Reading the output data. - data_outputs = numpy.load("outputs.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - - # The length of the input vector for each sample (i.e. number of neurons in the input layer). - num_inputs = data_inputs.shape[1] - # The number of neurons in the output layer (i.e. number of classes). - num_classes = 4 - - # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched. - num_solutions = 8 # A solution or a network can be used interchangeably. - GANN_instance = pygad.gann.GANN(num_solutions=num_solutions, - num_neurons_input=num_inputs, - num_neurons_hidden_layers=[150, 50], - num_neurons_output=num_classes, - hidden_activations=["relu", "relu"], - output_activation="softmax") - - # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. - # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. - population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks) - - # To prepare the initial population, there are 2 ways: - # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population. - # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. 
- initial_population = population_vectors.copy() - - num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool. - - num_generations = 500 # Number of generations. - - mutation_percent_genes = 10 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. - - parent_selection_type = "sss" # Type of parent selection. - - crossover_type = "single_point" # Type of the crossover operator. - - mutation_type = "random" # Type of the mutation operator. - - keep_parents = -1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing. - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - parent_selection_type=parent_selection_type, - crossover_type=crossover_type, - mutation_type=mutation_type, - keep_parents=keep_parents, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness() - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Predicting the outputs of the data using the best solution. - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], - data_inputs=data_inputs) - print("Predictions of the trained network : {predictions}".format(predictions=predictions)) - - # Calculating some statistics - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -After training completes, here are the outputs of the print statements. -The number of wrong classifications is only 1 and the accuracy is -99.949%. This accuracy is reached after 482 generations. - -.. code:: - - Fitness value of the best solution = 99.94903160040775 - Index of the best solution : 0 - Best fitness value reached after 482 generations. - Number of correct classifications : 1961. - Number of wrong classifications : 1. - Classification accuracy : 99.94903160040775. - -The next figure shows how fitness value evolves by generation. - -.. figure:: https://user-images.githubusercontent.com/16560492/82152993-21898180-9865-11ea-8387-b995f88b83f7.png - :alt: - -Regression Example 1 --------------------- - -To train a neural network for regression, follow these instructions: - -1. Set the ``output_activation`` parameter in the constructor of the - ``pygad.gann.GANN`` class to ``"None"``. 
It is possible to use the - ReLU function if all outputs are nonnegative. - -.. code:: python - - GANN_instance = pygad.gann.GANN(... - output_activation="None") - -1. Wherever the ``pygad.nn.predict()`` function is used, set the - ``problem_type`` parameter to ``"regression"``. - -.. code:: python - - predictions = pygad.nn.predict(..., - problem_type="regression") - -1. Design the fitness function to calculate the error (e.g. mean - absolute error). - -.. code:: python - - def fitness_func(ga_instance, solution, sol_idx): - ... - - predictions = pygad.nn.predict(..., - problem_type="regression") - - solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs)) - - return solution_fitness - -The next code builds a complete example for building a neural network -for regression. - -.. code:: python - - import numpy - import pygad - import pygad.nn - import pygad.gann - - def fitness_func(ga_instance, solution, sol_idx): - global GANN_instance, data_inputs, data_outputs - - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx], - data_inputs=data_inputs, problem_type="regression") - solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs)) - - return solution_fitness - - def callback_generation(ga_instance): - global GANN_instance, last_fitness - - population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, - population_vectors=ga_instance.population) - - GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness)) - - last_fitness = ga_instance.best_solution()[1].copy() - - # Holds the fitness value of the previous generation. - last_fitness = 0 - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.array([[2, 5, -3, 0.1], - [8, 15, 20, 13]]) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.array([0.1, - 1.5]) - - # The length of the input vector for each sample (i.e. number of neurons in the input layer). - num_inputs = data_inputs.shape[1] - - # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched. - num_solutions = 6 # A solution or a network can be used interchangeably. - GANN_instance = pygad.gann.GANN(num_solutions=num_solutions, - num_neurons_input=num_inputs, - num_neurons_hidden_layers=[2], - num_neurons_output=1, - hidden_activations=["relu"], - output_activation="None") - - # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. - # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. - population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks) - - # To prepare the initial population, there are 2 ways: - # 1) Prepare it yourself and pass it to the initial_population parameter. 
This way is useful when the user wants to start the genetic algorithm with a custom initial population. - # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. - initial_population = population_vectors.copy() - - num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool. - - num_generations = 500 # Number of generations. - - mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. - - parent_selection_type = "sss" # Type of parent selection. - - crossover_type = "single_point" # Type of the crossover operator. - - mutation_type = "random" # Type of the mutation operator. - - keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing. - - init_range_low = -1 - init_range_high = 1 - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - init_range_low=init_range_low, - init_range_high=init_range_high, - parent_selection_type=parent_selection_type, - crossover_type=crossover_type, - mutation_type=mutation_type, - keep_parents=keep_parents, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness() - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Predicting the outputs of the data using the best solution. - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], - data_inputs=data_inputs, - problem_type="regression") - print("Predictions of the trained network : {predictions}".format(predictions=predictions)) - - # Calculating some statistics - abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) - print("Absolute error : {abs_error}.".format(abs_error=abs_error)) - -The next figure shows how the fitness value changes for the generations -used. - -.. figure:: https://user-images.githubusercontent.com/16560492/92948154-3cf24b00-f459-11ea-94ea-952b66ab2145.png - :alt: - -Regression Example 2 - Fish Weight Prediction ---------------------------------------------- - -This example uses the Fish Market Dataset available at Kaggle -(https://www.kaggle.com/aungpyaeap/fish-market). Simply download the CSV -dataset from `this -link `__ -(https://www.kaggle.com/aungpyaeap/fish-market/download). The dataset is -also available at the `GitHub project of the pygad.gann -module `__: -https://github.com/ahmedfgad/NeuralGenetic - -Using the Pandas library, the dataset is read using the ``read_csv()`` -function. - -.. 
code:: python - - data = numpy.array(pandas.read_csv("Fish.csv")) - -The last 5 columns in the dataset are used as inputs and the **Weight** -column is used as output. - -.. code:: python - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight - -Note how the activation function at the last layer is set to ``"None"``. -Moreover, the ``problem_type`` parameter in the ``pygad.nn.train()`` and -``pygad.nn.predict()`` functions is set to ``"regression"``. Remember to -design an appropriate fitness function for the regression problem. In -this example, the fitness value is calculated based on the mean absolute -error. - -.. code:: python - - solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs)) - -Here is the complete code. - -.. code:: python - - import numpy - import pygad - import pygad.nn - import pygad.gann - import pandas - - def fitness_func(ga_instance, solution, sol_idx): - global GANN_instance, data_inputs, data_outputs - - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx], - data_inputs=data_inputs, problem_type="regression") - solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs)) - - return solution_fitness - - def callback_generation(ga_instance): - global GANN_instance, last_fitness - - population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, - population_vectors=ga_instance.population) - - GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices) - - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness)) - - last_fitness = ga_instance.best_solution()[1].copy() - - # Holds the fitness value of the previous generation. - last_fitness = 0 - - data = numpy.array(pandas.read_csv("Fish.csv")) - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) - - # The length of the input vector for each sample (i.e. number of neurons in the input layer). - num_inputs = data_inputs.shape[1] - - # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched. - num_solutions = 6 # A solution or a network can be used interchangeably. - GANN_instance = pygad.gann.GANN(num_solutions=num_solutions, - num_neurons_input=num_inputs, - num_neurons_hidden_layers=[2], - num_neurons_output=1, - hidden_activations=["relu"], - output_activation="None") - - # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. - # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. 
- population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks) - - # To prepare the initial population, there are 2 ways: - # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population. - # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. - initial_population = population_vectors.copy() - - num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool. - - num_generations = 500 # Number of generations. - - mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. - - parent_selection_type = "sss" # Type of parent selection. - - crossover_type = "single_point" # Type of the crossover operator. - - mutation_type = "random" # Type of the mutation operator. - - keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing. - - init_range_low = -1 - init_range_high = 1 - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - mutation_percent_genes=mutation_percent_genes, - init_range_low=init_range_low, - init_range_high=init_range_high, - parent_selection_type=parent_selection_type, - crossover_type=crossover_type, - mutation_type=mutation_type, - keep_parents=keep_parents, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness() - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Predicting the outputs of the data using the best solution. - predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], - data_inputs=data_inputs, - problem_type="regression") - print("Predictions of the trained network : {predictions}".format(predictions=predictions)) - - # Calculating some statistics - abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) - print("Absolute error : {abs_error}.".format(abs_error=abs_error)) - -The next figure shows how the fitness value changes for the 500 -generations used. - -.. figure:: https://user-images.githubusercontent.com/16560492/92948486-bbe78380-f459-11ea-9e31-0d4c7269d606.png - :alt: +.. _pygadgann-module: + +``pygad.gann`` Module +===================== + +This section of the PyGAD's library documentation discusses the +**pygad.gann** module. + +The ``pygad.gann`` module trains neural networks (for either +classification or regression) using the genetic algorithm. It makes use +of the 2 modules ``pygad`` and ``pygad.nn``. + +.. 
_pygadganngann-class:
+
+``pygad.gann.GANN`` Class
+=========================
+
+The ``pygad.gann`` module has a class named ``pygad.gann.GANN`` for
+training neural networks using the genetic algorithm. The constructor,
+methods, functions, and attributes within the class are discussed in
+this section.
+
+.. _init:
+
+``__init__()``
+--------------
+
+In order to train a neural network using the genetic algorithm, the
+first thing to do is to create an instance of the ``pygad.gann.GANN``
+class.
+
+The ``pygad.gann.GANN`` class constructor accepts the following
+parameters:
+
+- ``num_solutions``: Number of neural networks (i.e. solutions) in the
+  population. Based on the value passed to this parameter, a number of
+  identical neural networks are created whose parameters are optimized
+  using the genetic algorithm.
+
+- ``num_neurons_input``: Number of neurons in the input layer.
+
+- ``num_neurons_output``: Number of neurons in the output layer.
+
+- ``num_neurons_hidden_layers=[]``: A list holding the number of
+  neurons in the hidden layer(s). If empty ``[]``, then no hidden
+  layers are used. For each ``int`` value it holds, a hidden layer is
+  created with that number of neurons. For example,
+  ``num_neurons_hidden_layers=[10]`` creates a single hidden layer with
+  **10** neurons. ``num_neurons_hidden_layers=[10, 5]`` creates 2
+  hidden layers with 10 neurons in the first and 5 neurons in the
+  second hidden layer.
+
+- ``output_activation="softmax"``: The name of the activation function
+  of the output layer. It defaults to ``"softmax"``.
+
+- ``hidden_activations="relu"``: The name(s) of the activation
+  function(s) of the hidden layer(s). It defaults to ``"relu"``. If
+  passed as a string, the specified activation function is used across
+  all the hidden layers. If passed as a list, it must have the same
+  length as the ``num_neurons_hidden_layers`` list; an exception is
+  raised if their lengths differ. When ``hidden_activations`` is a
+  list, a one-to-one mapping between the ``num_neurons_hidden_layers``
+  and ``hidden_activations`` lists occurs.
+
+In order to validate the parameters passed to the ``pygad.gann.GANN``
+class constructor, the ``pygad.gann.validate_network_parameters()``
+function is called.
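+
+As an illustration of these parameters, the next sketch creates a
+population of 5 networks, each with 4 inputs, 2 hidden layers that use
+different activation functions, and a 3-class softmax output layer. The
+shapes and values here are made up for illustration and are not taken
+from the examples later in this page; it also assumes ``sigmoid`` is
+among the activation function names supported by ``pygad.nn``.
+
+.. code:: python
+
+    import pygad.gann
+
+    # 5 identical architectures: 4 inputs, hidden layers of 8 and 4
+    # neurons (relu then sigmoid), and 3 output neurons (i.e. 3 classes).
+    GANN_instance = pygad.gann.GANN(num_solutions=5,
+                                    num_neurons_input=4,
+                                    num_neurons_hidden_layers=[8, 4],
+                                    num_neurons_output=3,
+                                    hidden_activations=["relu", "sigmoid"],
+                                    output_activation="softmax")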
+
+Instance Attributes
+-------------------
+
+All the parameters in the ``pygad.gann.GANN`` class constructor are
+used as instance attributes. Besides such attributes, there are other
+attributes added to the instances of the ``pygad.gann.GANN`` class
+which are:
+
+- ``parameters_validated``: If ``True``, then the parameters passed to
+  the GANN class constructor are valid. Its initial value is ``False``.
+
+- ``population_networks``: A list holding references to all the
+  solutions (i.e. neural networks) used in the population.
+
+Methods in the GANN Class
+-------------------------
+
+This section discusses the methods available for instances of the
+``pygad.gann.GANN`` class.
+
+.. _createpopulation:
+
+``create_population()``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``create_population()`` method creates the initial population of
+the genetic algorithm as a list of neural networks (i.e. solutions).
+For each network to be created, the ``pygad.gann.create_network()``
+function is called.
+
+Each element in the list holds a reference to the last (i.e. output)
+layer of the network. The method accepts no parameters; it reads all
+the required details from the ``pygad.gann.GANN`` instance.
+
+The method returns the list holding the references to the networks.
+This list is later assigned to the ``population_networks`` attribute of
+the instance.
+
+.. _updatepopulationtrainedweights:
+
+``update_population_trained_weights()``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``update_population_trained_weights()`` method updates the
+``trained_weights`` attribute of the layers of each network (check the
+documentation of the ``pygad.nn.DenseLayer`` class for more
+information) according to the weights passed in the
+``population_trained_weights`` parameter.
+
+Accepts the following parameters:
+
+- ``population_trained_weights``: A list holding the trained weights of
+  all networks as matrices. Such matrices are to be assigned to the
+  ``trained_weights`` attribute of all layers of all networks.
+
+.. _functions-in-the-pygadgann-module:
+
+Functions in the ``pygad.gann`` Module
+======================================
+
+This section discusses the functions in the ``pygad.gann`` module.
+
+.. _pygadgannvalidatenetworkparameters:
+
+``pygad.gann.validate_network_parameters()``
+--------------------------------------------
+
+Validates the parameters passed to the constructor of the
+``pygad.gann.GANN`` class. If at least one invalid parameter exists, an
+exception is raised and the execution stops.
+
+The function accepts the same parameters passed to the constructor of
+the ``pygad.gann.GANN`` class. Please check the documentation of such
+parameters in the section discussing the class constructor.
+
+This function gives the ``num_solutions`` parameter a default value in
+order to differentiate between creating a population of networks and
+creating a single network. If ``None``, then a single network will be
+created. If not ``None``, then a population of networks is to be
+created.
+
+If the value passed to the ``hidden_activations`` parameter is a
+string, not a list, then a list is created by replicating the passed
+name of the activation function a number of times equal to the number
+of hidden layers (i.e. the length of the
+``num_neurons_hidden_layers`` parameter).
+
+Returns a list holding the name(s) of the activation function(s) of the
+hidden layer(s).
+
+.. _pygadganncreatenetwork:
+
+``pygad.gann.create_network()``
+-------------------------------
+
+Creates a neural network as a linked list between the input, hidden,
+and output layers, where the layer at index N (the last/output layer)
+references the layer at index N-1 (a hidden layer) using its
+``previous_layer`` attribute. Walking backward from the output layer,
+the input layer is the last node of this linked list, so it does not
+reference any layer.
+
+In addition to the ``parameters_validated`` parameter, this function
+accepts the same parameters passed to the constructor of the
+``pygad.gann.GANN`` class except for the ``num_solutions`` parameter,
+because ``create_network()`` creates only a single network.
+
+``parameters_validated``: If ``False``, the parameters have not been
+validated yet, so the ``validate_network_parameters()`` function is
+called.
+
+Returns the reference to the last layer in the network architecture,
+which is the output layer. Based on such a reference, all network
+layers can be fetched.
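+
+To make the linked-list structure concrete, here is a short sketch that
+builds one network with ``create_network()`` and walks from the
+returned output layer back to the input layer. It assumes, as described
+above, that the input layer is the only layer without a
+``previous_layer`` attribute.
+
+.. code:: python
+
+    import pygad.gann
+
+    # One network: 2 inputs, a single hidden layer of 2 neurons, 2 outputs.
+    last_layer = pygad.gann.create_network(num_neurons_input=2,
+                                           num_neurons_hidden_layers=[2],
+                                           num_neurons_output=2,
+                                           hidden_activations=["relu"],
+                                           output_activation="softmax")
+
+    # Follow the previous_layer references back to the input layer.
+    layer = last_layer
+    num_layers = 1
+    while hasattr(layer, "previous_layer"):
+        layer = layer.previous_layer
+        num_layers = num_layers + 1
+
+    print("Number of layers :", num_layers) # 3 (input, hidden, output)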
+
+.. _pygadgannpopulationasvectors:
+
+``pygad.gann.population_as_vectors()``
+---------------------------------------
+
+Accepts the population as a list of networks and returns a list holding
+the weights of all layers of each solution (i.e. network) in the
+population as a vector.
+
+For example, if the population has 6 solutions (i.e. networks), this
+function accepts references to such networks and returns a list with 6
+vectors, one for each network (i.e. solution). Each vector holds the
+weights of all layers of a single network.
+
+Accepts the following parameters:
+
+- ``population_networks``: A list holding references to the output
+  (last) layers of the neural networks used in the population.
+
+Returns a list holding the weights vectors for all solutions (i.e.
+networks).
+
+.. _pygadgannpopulationasmatrices:
+
+``pygad.gann.population_as_matrices()``
+---------------------------------------
+
+Accepts the population as both networks and weights vectors and returns
+the weights of all layers of each solution (i.e. network) in the
+population as matrices.
+
+For example, if the population has 6 solutions (i.e. networks), this
+function returns a list with 6 matrices, one for each network, holding
+its weights for all layers.
+
+Accepts the following parameters:
+
+- ``population_networks``: A list holding references to the output
+  (last) layers of the neural networks used in the population.
+
+- ``population_vectors``: A list holding the weights of all networks as
+  vectors. Such vectors are to be converted into matrices.
+
+Returns a list holding the weights matrices for all solutions (i.e.
+networks).
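+
+These 2 functions are each other's inverse within the training loop:
+``population_as_vectors()`` flattens each network's weight matrices
+into a single chromosome before the genetic algorithm starts, and
+``population_as_matrices()`` restores the evolved chromosomes into
+per-layer matrices so they can be written back with
+``update_population_trained_weights()``. Here is a minimal sketch of
+this round trip, assuming ``GANN_instance`` is an existing
+``pygad.gann.GANN`` instance:
+
+.. code:: python
+
+    # Flatten each network's weight matrices into a single 1-D chromosome.
+    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+
+    # ... the genetic algorithm evolves population_vectors here ...
+
+    # Restore the (evolved) vectors into per-layer weight matrices.
+    population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks,
+                                                            population_vectors=population_vectors)
+
+    # Write the matrices back into the trained_weights attribute of every layer.
+    GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)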
+
+Steps to Build and Train Neural Networks using Genetic Algorithm
+================================================================
+
+The steps to use this project for building and training a neural
+network using the genetic algorithm are as follows:
+
+- Prepare the training data.
+
+- Create an instance of the ``pygad.gann.GANN`` class.
+
+- Fetch the population weights as vectors.
+
+- Prepare the fitness function.
+
+- Prepare the generation callback function.
+
+- Create an instance of the ``pygad.GA`` class.
+
+- Run the created instance of the ``pygad.GA`` class.
+
+- Plot the fitness values.
+
+- Get information about the best solution.
+
+- Make predictions using the trained weights.
+
+- Calculate some statistics.
+
+Let's start covering all of these steps.
+
+Prepare the Training Data
+-------------------------
+
+Before building and training neural networks, the training data (input
+and output) must be prepared. The inputs and the outputs of the
+training data are NumPy arrays.
+
+Here is an example of preparing the training data for the XOR problem.
+
+For the input array, each element must be a list representing the
+inputs (i.e. features) of the sample. If there are 200 samples and each
+sample has 50 features, then the shape of the inputs array is
+``(200, 50)``. The variable ``num_inputs`` holds the length of each
+sample, which is 2 in this example.
+
+.. code:: python
+
+    data_inputs = numpy.array([[1, 1],
+                               [1, 0],
+                               [0, 1],
+                               [0, 0]])
+
+    data_outputs = numpy.array([0,
+                                1,
+                                1,
+                                0])
+
+    num_inputs = data_inputs.shape[1]
+
+For the output array, each element must be a single number representing
+the class label of the sample. The class labels must start at ``0``.
+So, if there are 200 samples, then the shape of the output array is
+``(200,)``. If there are 5 classes in the data, then the values of all
+the 200 elements in the output array must range from 0 to 4 inclusive.
+Generally, the class labels run from ``0`` to ``N-1`` where ``N`` is
+the number of classes.
+
+For the XOR example, there are 2 classes and thus their labels are 0
+and 1. The ``num_classes`` variable is assigned the value 2.
+
+Note that the project only supports classification problems where each
+sample is assigned to only one class.
+
+.. _create-an-instance-of-the-pygadganngann-class:
+
+Create an Instance of the ``pygad.gann.GANN`` Class
+---------------------------------------------------
+
+After preparing the input data, an instance of the ``pygad.gann.GANN``
+class is created by passing the appropriate parameters.
+
+Here is an example that creates a network for the XOR problem. The
+``num_solutions`` parameter is set to 6, which means the genetic
+algorithm population will have 6 solutions (i.e. networks). All of
+these 6 neural networks will have the same architecture as specified by
+the other parameters.
+
+The output layer has 2 neurons because there are only 2 classes (0 and
+1).
+
+.. code:: python
+
+    import pygad.gann
+    import pygad.nn
+
+    num_solutions = 6
+    GANN_instance = pygad.gann.GANN(num_solutions=num_solutions,
+                                    num_neurons_input=num_inputs,
+                                    num_neurons_hidden_layers=[2],
+                                    num_neurons_output=2,
+                                    hidden_activations=["relu"],
+                                    output_activation="softmax")
+
+The architecture of the created network has the following layers:
+
+- An input layer with 2 neurons (i.e. inputs).
+
+- A single hidden layer with 2 neurons.
+
+- An output layer with 2 neurons (i.e. classes).
+
+The weights of the network are as follows:
+
+- Between the input and the hidden layer, there is a weights matrix of
+  size ``(number of inputs x number of hidden neurons) = (2x2)``.
+
+- Between the hidden and the output layer, there is a weights matrix of
+  size ``(number of hidden neurons x number of outputs) = (2x2)``.
+
+The activation function used for the output layer is ``softmax``. The
+``relu`` activation function is used for the hidden layer.
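+
+Flattened, these two 2x2 matrices give each solution a chromosome of
+2*2 + 2*2 = 8 genes, which matches the 8-element best-solution vector
+printed later in this walkthrough. A quick sanity check, assuming the
+vectors returned by ``population_as_vectors()`` (introduced in the next
+step) support ``len()``:
+
+.. code:: python
+
+    # Each chromosome concatenates the two 2x2 weight matrices.
+    vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+    print(len(vectors[0])) # Expected: 8 genes per solution.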
+
+After creating the instance of the ``pygad.gann.GANN`` class, the next
+step is to fetch the weights of the population as a list of vectors.
+
+Fetch the Population Weights as Vectors
+---------------------------------------
+
+For the genetic algorithm, the parameters (i.e. genes) of each solution
+are represented as a single vector.
+
+For the task of training the network for the XOR problem, the weights
+of each network in the population are not represented as a vector but
+as 2 matrices, each of size 2x2.
+
+To create a list holding the population weights as vectors, one for
+each network, the ``pygad.gann.population_as_vectors()`` function is
+used.
+
+.. code:: python
+
+    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+
+After preparing the population weights as a set of vectors, next is to
+prepare 2 functions:
+
+1. The fitness function.
+
+2. The callback function called after each generation.
+
+Prepare the Fitness Function
+----------------------------
+
+The PyGAD library works by allowing the users to customize the genetic
+algorithm for their own problems. Because problems differ in how their
+fitness values are calculated, PyGAD lets the user pass a custom
+function as a maximization fitness function. As the implementation
+below shows, this function must accept 3 positional parameters
+representing the following:
+
+- The instance of the ``pygad.GA`` class.
+
+- The solution.
+
+- The solution index in the population.
+
+The fitness function must return a single number representing the
+fitness. The higher the fitness value, the better the solution.
+
+Here is the implementation of the fitness function for training a
+neural network. It uses the ``pygad.nn.predict()`` function to predict
+the class labels based on the current solution's weights. The
+``pygad.nn.predict()`` function uses the trained weights available in
+the ``trained_weights`` attribute of each layer of the network for
+making predictions.
+
+Based on such predictions, the classification accuracy is calculated.
+This accuracy is used as the fitness value of the solution. Finally,
+the fitness value is returned.
+
+.. code:: python
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GANN_instance, data_inputs, data_outputs
+
+        predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx],
+                                       data_inputs=data_inputs)
+        correct_predictions = numpy.where(predictions == data_outputs)[0].size
+        solution_fitness = (correct_predictions/data_outputs.size)*100
+
+        return solution_fitness
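+
+Note that the function reads the weights through ``GANN_instance`` and
+``sol_idx`` rather than decoding the ``solution`` vector directly; this
+is exactly why the per-generation callback prepared next must keep the
+``trained_weights`` attributes in sync with the GA's population. As a
+quick, hedged smoke test, the function can be called standalone on the
+untrained population (the ``ga_instance`` argument is unused inside the
+function, so ``None`` is acceptable here):
+
+.. code:: python
+
+    # Accuracy of the first, still untrained network (likely poor).
+    print(fitness_func(None, population_vectors[0], 0))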
+
+Prepare the Generation Callback Function
+----------------------------------------
+
+After each generation of the genetic algorithm, the fitness function
+will be called to calculate the fitness value of each solution. Within
+the fitness function, the ``pygad.nn.predict()`` function is used for
+predicting the outputs based on the current solution's
+``trained_weights`` attribute. Thus, this attribute must be updated
+after each generation with the weights evolved by the genetic
+algorithm.
+
+PyGAD 2.0.0 and higher supports a parameter of the ``pygad.GA`` class
+constructor named ``on_generation``. It can be assigned a function that
+is called after each generation. The function must accept a single
+parameter representing the instance of the ``pygad.GA`` class.
+
+This callback function can be used to update the ``trained_weights``
+attribute of the layers of each network in the population.
+
+Here is the implementation of a function that updates the
+``trained_weights`` attribute of the layers of the population networks.
+
+It works by converting the current population from the vector form to
+the matrix form using the ``pygad.gann.population_as_matrices()``
+function. It accepts the population as vectors and returns it as
+matrices.
+
+The population matrices are then passed to the
+``update_population_trained_weights()`` method of the
+``pygad.gann.GANN`` class to update the ``trained_weights`` attribute
+of all layers of all solutions within the population.
+
+.. code:: python
+
+    def callback_generation(ga_instance):
+        global GANN_instance
+
+        population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks, population_vectors=ga_instance.population)
+        GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+After preparing the fitness and callback functions, next is to create
+an instance of the ``pygad.GA`` class.
+
+.. _create-an-instance-of-the-pygadga-class:
+
+Create an Instance of the ``pygad.GA`` Class
+--------------------------------------------
+
+Once the parameters of the genetic algorithm are prepared, an instance
+of the ``pygad.GA`` class can be created.
+
+Here is an example.
+
+.. code:: python
+
+    initial_population = population_vectors.copy()
+
+    num_parents_mating = 4
+
+    num_generations = 500
+
+    mutation_percent_genes = 5
+
+    parent_selection_type = "sss"
+
+    crossover_type = "single_point"
+
+    mutation_type = "random"
+
+    keep_parents = 1
+
+    init_range_low = -2
+    init_range_high = 5
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           init_range_low=init_range_low,
+                           init_range_high=init_range_high,
+                           parent_selection_type=parent_selection_type,
+                           crossover_type=crossover_type,
+                           mutation_type=mutation_type,
+                           keep_parents=keep_parents,
+                           on_generation=callback_generation)
+
+The last step for training the neural networks using the genetic
+algorithm is calling the ``run()`` method.
+
+.. _run-the-created-instance-of-the-pygadga-class:
+
+Run the Created Instance of the ``pygad.GA`` Class
+--------------------------------------------------
+
+By calling the ``run()`` method of the ``pygad.GA`` instance, the
+genetic algorithm iterates through the number of generations specified
+in its ``num_generations`` parameter.
+
+.. code:: python
+
+    ga_instance.run()
+
+Plot the Fitness Values
+-----------------------
+
+After the ``run()`` method completes, the ``plot_fitness()`` method can
+be called to show how the fitness values evolve by generation. A
+fitness value (i.e. accuracy) of 100 is reached after around 180
+generations.
+
+.. code:: python
+
+    ga_instance.plot_fitness()
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82078638-c11e0700-96e1-11ea-8aa9-c36761c5e9c7.png
+   :alt:
+
+Because each run starts from a different random initial population,
+running the code again might reach a classification accuracy of 100 in
+fewer generations, need more generations, or not reach 100% accuracy at
+all.
+
+Information about the Best Solution
+-----------------------------------
+
+The following information about the best solution in the last
+population is returned by the ``best_solution()`` method of the
+``pygad.GA`` class:
+
+- Solution.
+
+- Fitness value of the solution.
+
+- Index of the solution within the population.
+
+Here is how such information is returned. The fitness value (i.e.
+accuracy) is 100.
+
+.. code:: python
+
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+.. code::
+
+    Parameters of the best solution : [3.55081391 -3.21562011 -14.2617784 0.68044231 -1.41258145 -3.2979315 1.58136006 -7.83726169]
+    Fitness value of the best solution = 100.0
+    Index of the best solution : 0
+
+Using the ``best_solution_generation`` attribute of the ``pygad.GA``
+instance, the generation number at which the **best fitness** was
+reached can be fetched. According to the result, the best fitness value
+was reached after 182 generations.
+
+.. code:: python
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+.. code::
+
+    Best fitness value reached after 182 generations.
+
+Making Predictions using the Trained Weights
+--------------------------------------------
+
+The ``pygad.nn.predict()`` function can be used to make predictions
+using the trained network. As printed, the network is able to predict
+the labels correctly.
+
+.. code:: python
+
+    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx], data_inputs=data_inputs)
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+.. code::
+
+    Predictions of the trained network : [0. 1. 1. 0.]
+
+Calculating Some Statistics
+---------------------------
+
+Based on the predictions the network made, some statistics can be
+calculated, such as the number of correct and wrong predictions in
+addition to the classification accuracy.
+
+.. code:: python
+
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+.. code::
+
+    Number of correct classifications : 4.
+    Number of wrong classifications : 0.
+    Classification accuracy : 100.0.
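+
+As a side note, the same accuracy can be computed with a single
+vectorized comparison; this is an equivalent rewrite of the statistics
+above, not code from the original example:
+
+.. code:: python
+
+    # The mean of the boolean comparison equals the fraction of correct
+    # predictions.
+    accuracy = 100 * numpy.mean(predictions == data_outputs)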
+
+Examples
+========
+
+This section gives the complete code of some examples that build and
+train neural networks using the genetic algorithm. Each subsection
+builds a different network.
+
+XOR Classification
+------------------
+
+This example, discussed step by step in the **Steps to Build and Train
+Neural Networks using Genetic Algorithm** section, builds the XOR gate.
+Its complete code is listed below.
+
+.. code:: python
+
+    import numpy
+    import pygad
+    import pygad.nn
+    import pygad.gann
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GANN_instance, data_inputs, data_outputs
+
+        predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx],
+                                       data_inputs=data_inputs)
+        correct_predictions = numpy.where(predictions == data_outputs)[0].size
+        solution_fitness = (correct_predictions/data_outputs.size)*100
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        global GANN_instance, last_fitness
+
+        population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks,
+                                                                population_vectors=ga_instance.population)
+
+        GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+        print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness))
+
+        last_fitness = ga_instance.best_solution()[1].copy()
+
+    # Holds the fitness value of the previous generation.
+    last_fitness = 0
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.array([[1, 1],
+                               [1, 0],
+                               [0, 1],
+                               [0, 0]])
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.array([0,
+                                1,
+                                1,
+                                0])
+
+    # The length of the input vector for each sample (i.e. number of neurons in the input layer).
+    num_inputs = data_inputs.shape[1]
+    # The number of neurons in the output layer (i.e. number of classes).
+    num_classes = 2
+
+    # Creating an initial population of neural networks. The population_networks attribute holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched.
+    num_solutions = 6 # A solution or a network can be used interchangeably.
+    GANN_instance = pygad.gann.GANN(num_solutions=num_solutions,
+                                    num_neurons_input=num_inputs,
+                                    num_neurons_hidden_layers=[2],
+                                    num_neurons_output=num_classes,
+                                    hidden_activations=["relu"],
+                                    output_activation="softmax")
+
+    # The population does not hold the numerical weights of the networks. Instead, it holds a list of references to the last layer of each network (i.e. solution) in the population.
+    # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of a network. Using such a reference, all details of the network can be accessed.
+    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+
+    # To prepare the initial population, there are 2 ways:
+    # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population.
+    # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless.
+    initial_population = population_vectors.copy()
+
+    num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool.
+
+    num_generations = 500 # Number of generations.
+
+    mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists.
+
+    parent_selection_type = "sss" # Type of parent selection.
+
+    crossover_type = "single_point" # Type of the crossover operator.
+
+    mutation_type = "random" # Type of the mutation operator.
+
+    keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing.
+
+    init_range_low = -2
+    init_range_high = 5
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           init_range_low=init_range_low,
+                           init_range_high=init_range_high,
+                           parent_selection_type=parent_selection_type,
+                           crossover_type=crossover_type,
+                           mutation_type=mutation_type,
+                           keep_parents=keep_parents,
+                           on_generation=callback_generation)
+
+    ga_instance.run()
+
+    # After the generations complete, a plot is shown that summarizes how the fitness values evolve over the generations.
+    ga_instance.plot_fitness()
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+    # Predicting the outputs of the data using the best solution.
+    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx],
+                                   data_inputs=data_inputs)
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+    # Calculating some statistics.
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
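+
+As a possible extension (not part of the original example), the trained
+GA state can be persisted between sessions. Recent PyGAD releases
+provide ``save()``/``load()`` helpers for this; check the ``pygad``
+module documentation of your installed version before relying on them:
+
+.. code:: python
+
+    # Saves to xor_gann.pkl; PyGAD appends the extension itself.
+    ga_instance.save(filename="xor_gann")
+    loaded_ga_instance = pygad.load(filename="xor_gann")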
+
+Image Classification
+--------------------
+
+In the documentation of the ``pygad.nn`` module, a neural network is
+created for classifying images from the Fruits360 dataset without being
+trained using an optimization algorithm. This section discusses how to
+train such a classifier using the genetic algorithm with the help of
+the ``pygad.gann`` module.
+
+Please make sure that the training data files ``dataset_features.npy``
+and ``outputs.npy`` are available. For downloading them, use these
+links:
+
+1. ``dataset_features.npy``: The features.
+   https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy
+
+2. ``outputs.npy``: The class labels.
+   https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy
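+
+Before running the example, a quick check that both files exist in the
+working directory can save a confusing traceback later. This snippet is
+a suggested addition, not part of the original example:
+
+.. code:: python
+
+    import os
+
+    # Fail early with a clear message if a data file was not downloaded.
+    for fname in ["dataset_features.npy", "outputs.npy"]:
+        if not os.path.exists(fname):
+            raise FileNotFoundError("Download " + fname + " before running this example.")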
number of neurons in the input layer). + num_inputs = data_inputs.shape[1] + # The number of neurons in the output layer (i.e. number of classes). + num_classes = 4 + + # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched. + num_solutions = 8 # A solution or a network can be used interchangeably. + GANN_instance = pygad.gann.GANN(num_solutions=num_solutions, + num_neurons_input=num_inputs, + num_neurons_hidden_layers=[150, 50], + num_neurons_output=num_classes, + hidden_activations=["relu", "relu"], + output_activation="softmax") + + # population does not hold the numerical weights of the network instead it holds a list of references to each last layer of each network (i.e. solution) in the population. A solution or a network can be used interchangeably. + # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of each network. Using such a reference, all details of the network can be accessed. + population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks) + + # To prepare the initial population, there are 2 ways: + # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population. + # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are useless. + initial_population = population_vectors.copy() + + num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool. + + num_generations = 500 # Number of generations. + + mutation_percent_genes = 10 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists. + + parent_selection_type = "sss" # Type of parent selection. + + crossover_type = "single_point" # Type of the crossover operator. + + mutation_type = "random" # Type of the mutation operator. + + keep_parents = -1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing. + + ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + mutation_percent_genes=mutation_percent_genes, + parent_selection_type=parent_selection_type, + crossover_type=crossover_type, + mutation_type=mutation_type, + keep_parents=keep_parents, + on_generation=callback_generation) + + ga_instance.run() + + # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. + ga_instance.plot_fitness() + + # Returning the details of the best solution. 
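+    # best_solution() returns a 3-element tuple: the weights vector of the best solution, its fitness value, and its index within the population.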
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+    # Predicting the outputs of the data using the best solution.
+    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx],
+                                   data_inputs=data_inputs)
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+    # Calculating some statistics
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+After training completes, here are the outputs of the print statements.
+The number of wrong classifications is only 1 and the accuracy is
+99.949%. This accuracy is reached after 482 generations.
+
+.. code::
+
+    Fitness value of the best solution = 99.94903160040775
+    Index of the best solution : 0
+    Best fitness value reached after 482 generations.
+    Number of correct classifications : 1961.
+    Number of wrong classifications : 1.
+    Classification accuracy : 99.94903160040775.
+
+The next figure shows how the fitness value evolves by generation.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82152993-21898180-9865-11ea-8387-b995f88b83f7.png
+   :alt:
+
+Regression Example 1
+--------------------
+
+To train a neural network for regression, follow these instructions:
+
+1. Set the ``output_activation`` parameter in the constructor of the
+   ``pygad.gann.GANN`` class to ``"None"``. It is possible to use the
+   ReLU function if all outputs are nonnegative.
+
+.. code:: python
+
+    GANN_instance = pygad.gann.GANN(...
+                                    output_activation="None")
+
+2. Wherever the ``pygad.nn.predict()`` function is used, set the
+   ``problem_type`` parameter to ``"regression"``.
+
+.. code:: python
+
+    predictions = pygad.nn.predict(...,
+                                   problem_type="regression")
+
+3. Design the fitness function to calculate the error (e.g. mean
+   absolute error).
+
+.. code:: python
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        ...
+
+        predictions = pygad.nn.predict(...,
+                                       problem_type="regression")
+
+        solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs))
+
+        return solution_fitness
+
+The next code gives a complete example that builds a neural network for
+regression.
+
+.. code:: python
+
+    import numpy
+    import pygad
+    import pygad.nn
+    import pygad.gann
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GANN_instance, data_inputs, data_outputs
+
+        predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx],
+                                       data_inputs=data_inputs, problem_type="regression")
+        solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs))
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        global GANN_instance, last_fitness
+
+        population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks,
+                                                                population_vectors=ga_instance.population)
+
+        GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+        print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness))
+
+        last_fitness = ga_instance.best_solution()[1].copy()
+
+    # Holds the fitness value of the previous generation.
+    last_fitness = 0
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.array([[2, 5, -3, 0.1],
+                               [8, 15, 20, 13]])
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.array([0.1,
+                                1.5])
+
+    # The length of the input vector for each sample (i.e. number of neurons in the input layer).
+    num_inputs = data_inputs.shape[1]
+
+    # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched.
+    num_solutions = 6 # A solution or a network can be used interchangeably.
+    GANN_instance = pygad.gann.GANN(num_solutions=num_solutions,
+                                    num_neurons_input=num_inputs,
+                                    num_neurons_hidden_layers=[2],
+                                    num_neurons_output=1,
+                                    hidden_activations=["relu"],
+                                    output_activation="None")
+
+    # The population does not hold the numerical weights of the networks. Instead, it holds a list of references to the last layer of each network (i.e. solution) in the population. A solution and a network can be used interchangeably.
+    # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of one network. Using such a reference, all details of the network can be accessed.
+    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+
+    # To prepare the initial population, there are 2 ways:
+    # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population.
+    # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are ignored.
+    initial_population = population_vectors.copy()
+
+    num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool.
+
+    num_generations = 500 # Number of generations.
+
+    mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists.
+
+    parent_selection_type = "sss" # Type of parent selection.
+
+    crossover_type = "single_point" # Type of the crossover operator.
+
+    mutation_type = "random" # Type of the mutation operator.
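+    # Other mutation types supported by pygad.GA include "swap", "inversion", "scramble", and "adaptive".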
+
+    keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing.
+
+    init_range_low = -1
+    init_range_high = 1
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           init_range_low=init_range_low,
+                           init_range_high=init_range_high,
+                           parent_selection_type=parent_selection_type,
+                           crossover_type=crossover_type,
+                           mutation_type=mutation_type,
+                           keep_parents=keep_parents,
+                           on_generation=callback_generation)
+
+    ga_instance.run()
+
+    # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+    ga_instance.plot_fitness()
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+    # Predicting the outputs of the data using the best solution.
+    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx],
+                                   data_inputs=data_inputs,
+                                   problem_type="regression")
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+    # Calculating some statistics
+    abs_error = numpy.mean(numpy.abs(predictions - data_outputs))
+    print("Absolute error : {abs_error}.".format(abs_error=abs_error))
+
+The next figure shows how the fitness value changes over the
+generations.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/92948154-3cf24b00-f459-11ea-94ea-952b66ab2145.png
+   :alt:
+
+Regression Example 2 - Fish Weight Prediction
+---------------------------------------------
+
+This example uses the Fish Market Dataset available at Kaggle
+(https://www.kaggle.com/aungpyaeap/fish-market). Simply download the CSV
+dataset from `this
+link <https://www.kaggle.com/aungpyaeap/fish-market/download>`__
+(https://www.kaggle.com/aungpyaeap/fish-market/download). The dataset is
+also available at the `GitHub project of the pygad.gann
+module <https://github.com/ahmedfgad/NeuralGenetic>`__:
+https://github.com/ahmedfgad/NeuralGenetic
+
+Using the Pandas library, the dataset is read using the ``read_csv()``
+function.
+
+.. code:: python
+
+    data = numpy.array(pandas.read_csv("Fish.csv"))
+
+The last 5 columns in the dataset are used as inputs and the **Weight**
+column is used as the output.
+
+.. code:: python
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32)
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight
+
+Note how the activation function at the last layer is set to ``"None"``.
+Moreover, the ``problem_type`` parameter in the ``pygad.nn.predict()``
+function is set to ``"regression"``. Remember to design an appropriate
+fitness function for the regression problem. In this example, the
+fitness value is calculated based on the mean absolute error.
+
+.. code:: python
+
+    solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs))
+
+Here is the complete code.
+
+.. code:: python
+
+    import numpy
+    import pygad
+    import pygad.nn
+    import pygad.gann
+    import pandas
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global GANN_instance, data_inputs, data_outputs
+
+        predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[sol_idx],
+                                       data_inputs=data_inputs, problem_type="regression")
+        solution_fitness = 1.0/numpy.mean(numpy.abs(predictions - data_outputs))
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        global GANN_instance, last_fitness
+
+        population_matrices = pygad.gann.population_as_matrices(population_networks=GANN_instance.population_networks,
+                                                                population_vectors=ga_instance.population)
+
+        GANN_instance.update_population_trained_weights(population_trained_weights=population_matrices)
+
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+        print("Change = {change}".format(change=ga_instance.best_solution()[1] - last_fitness))
+
+        last_fitness = ga_instance.best_solution()[1].copy()
+
+    # Holds the fitness value of the previous generation.
+    last_fitness = 0
+
+    data = numpy.array(pandas.read_csv("Fish.csv"))
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32)
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32)
+
+    # The length of the input vector for each sample (i.e. number of neurons in the input layer).
+    num_inputs = data_inputs.shape[1]
+
+    # Creating an initial population of neural networks. The return of the initial_population() function holds references to the networks, not their weights. Using such references, the weights of all networks can be fetched.
+    num_solutions = 6 # A solution or a network can be used interchangeably.
+    GANN_instance = pygad.gann.GANN(num_solutions=num_solutions,
+                                    num_neurons_input=num_inputs,
+                                    num_neurons_hidden_layers=[2],
+                                    num_neurons_output=1,
+                                    hidden_activations=["relu"],
+                                    output_activation="None")
+
+    # The population does not hold the numerical weights of the networks. Instead, it holds a list of references to the last layer of each network (i.e. solution) in the population. A solution and a network can be used interchangeably.
+    # If there is a population with 3 solutions (i.e. networks), then the population is a list with 3 elements. Each element is a reference to the last layer of one network. Using such a reference, all details of the network can be accessed.
+    population_vectors = pygad.gann.population_as_vectors(population_networks=GANN_instance.population_networks)
+
+    # To prepare the initial population, there are 2 ways:
+    # 1) Prepare it yourself and pass it to the initial_population parameter. This way is useful when the user wants to start the genetic algorithm with a custom initial population.
+    # 2) Assign valid integer values to the sol_per_pop and num_genes parameters. If the initial_population parameter exists, then the sol_per_pop and num_genes parameters are ignored.
+    initial_population = population_vectors.copy()
+
+    num_parents_mating = 4 # Number of solutions to be selected as parents in the mating pool.
+
+    num_generations = 500 # Number of generations.
+
+    mutation_percent_genes = 5 # Percentage of genes to mutate. This parameter has no action if the parameter mutation_num_genes exists.
+
+    parent_selection_type = "sss" # Type of parent selection.
+
+    crossover_type = "single_point" # Type of the crossover operator.
+
+    mutation_type = "random" # Type of the mutation operator.
+
+    keep_parents = 1 # Number of parents to keep in the next population. -1 means keep all parents and 0 means keep nothing.
+
+    init_range_low = -1
+    init_range_high = 1
+
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           mutation_percent_genes=mutation_percent_genes,
+                           init_range_low=init_range_low,
+                           init_range_high=init_range_high,
+                           parent_selection_type=parent_selection_type,
+                           crossover_type=crossover_type,
+                           mutation_type=mutation_type,
+                           keep_parents=keep_parents,
+                           on_generation=callback_generation)
+
+    ga_instance.run()
+
+    # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+    ga_instance.plot_fitness()
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Parameters of the best solution : {solution}".format(solution=solution))
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    if ga_instance.best_solution_generation != -1:
+        print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+    # Predicting the outputs of the data using the best solution.
+    predictions = pygad.nn.predict(last_layer=GANN_instance.population_networks[solution_idx],
+                                   data_inputs=data_inputs,
+                                   problem_type="regression")
+    print("Predictions of the trained network : {predictions}".format(predictions=predictions))
+
+    # Calculating some statistics
+    abs_error = numpy.mean(numpy.abs(predictions - data_outputs))
+    print("Absolute error : {abs_error}.".format(abs_error=abs_error))
+
+The next figure shows how the fitness value changes over the 500
+generations used.
+
+.. 
figure:: https://user-images.githubusercontent.com/16560492/92948486-bbe78380-f459-11ea-9e31-0d4c7269d606.png + :alt: diff --git a/docs/source/index.rst b/docs/source/index.rst index b346229..3bd6ad6 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -247,7 +247,7 @@ pygad Module :maxdepth: 4 :caption: pygad Module TOC - README_pygad_ReadTheDocs.rst + pygad.rst @@ -262,7 +262,7 @@ pygad.nn Module :maxdepth: 4 :caption: pygad.nn Module TOC - README_pygad_nn_ReadTheDocs.rst + nn.rst @@ -278,7 +278,7 @@ pygad.gann Module :maxdepth: 4 :caption: pygad.gann Module TOC - README_pygad_gann_ReadTheDocs.rst + gann.rst @@ -298,7 +298,7 @@ pygad.cnn Module :maxdepth: 4 :caption: pygad.cnn Module TOC - README_pygad_cnn_ReadTheDocs.rst + cnn.rst @@ -318,7 +318,7 @@ pygad.gacnn Module :maxdepth: 4 :caption: pygad.gacnn Module TOC - README_pygad_gacnn_ReadTheDocs.rst + gacnn.rst @@ -333,7 +333,7 @@ pygad.kerasga Module :maxdepth: 4 :caption: pygad.kerasga Module TOC - README_pygad_kerasga_ReadTheDocs.rst + kerasga.rst @@ -348,7 +348,7 @@ pygad.torchga Module :maxdepth: 4 :caption: pygad.torchga Module TOC - README_pygad_torchga_ReadTheDocs.rst + torchga.rst @@ -363,7 +363,7 @@ More Information :maxdepth: 4 :caption: More Information - Footer.rst + releases.rst diff --git a/docs/source/README_pygad_kerasga_ReadTheDocs.rst b/docs/source/kerasga.rst similarity index 97% rename from docs/source/README_pygad_kerasga_ReadTheDocs.rst rename to docs/source/kerasga.rst index 9b82467..a602468 100644 --- a/docs/source/README_pygad_kerasga_ReadTheDocs.rst +++ b/docs/source/kerasga.rst @@ -1,971 +1,971 @@ -.. _pygadkerasga-module: - -``pygad.kerasga`` Module -======================== - -This section of the PyGAD's library documentation discusses the -`pygad.kerasga `__ -module. - -The ``pygad.kerarsga`` module has helper a class and 2 functions to -train Keras models using the genetic algorithm (PyGAD). The Keras model -can be built either using the `Sequential -Model `__ or the `Functional -API `__. - -The contents of this module are: - -1. ``KerasGA``: A class for creating an initial population of all - parameters in the Keras model. - -2. ``model_weights_as_vector()``: A function to reshape the Keras model - weights to a single vector. - -3. ``model_weights_as_matrix()``: A function to restore the Keras model - weights from a vector. - -4. ``predict()``: A function to make predictions based on the Keras - model and a solution. - -More details are given in the next sections. - -Steps Summary -============= - -The summary of the steps used to train a Keras model using PyGAD is as -follows: - -1. Create a Keras model. - -2. Create an instance of the ``pygad.kerasga.KerasGA`` class. - -3. Prepare the training data. - -4. Build the fitness function. - -5. Create an instance of the ``pygad.GA`` class. - -6. Run the genetic algorithm. - -Create Keras Model -================== - -Before discussing training a Keras model using PyGAD, the first thing to -do is to create the Keras model. - -According to the `Keras library -documentation `__, there are 3 ways to -build a Keras model: - -1. `Sequential Model `__ - -2. `Functional API `__ - -3. `Model Subclassing `__ - -PyGAD supports training the models created either using the Sequential -Model or the Functional API. - -Here is an example of a model created using the Sequential Model. - -.. 
code:: python - - import tensorflow.keras - - input_layer = tensorflow.keras.layers.Input(3) - dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu") - output_layer = tensorflow.keras.layers.Dense(1, activation="linear") - - model = tensorflow.keras.Sequential() - model.add(input_layer) - model.add(dense_layer1) - model.add(output_layer) - -This is the same model created using the Functional API. - -.. code:: python - - input_layer = tensorflow.keras.layers.Input(3) - dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - -Feel free to add the layers of your choice. - -.. _pygadkerasgakerasga-class: - -``pygad.kerasga.KerasGA`` Class -=============================== - -The ``pygad.kerasga`` module has a class named ``KerasGA`` for creating -an initial population for the genetic algorithm based on a Keras model. -The constructor, methods, and attributes within the class are discussed -in this section. - -.. _init: - -``__init__()`` --------------- - -The ``pygad.kerasga.KerasGA`` class constructor accepts the following -parameters: - -- ``model``: An instance of the Keras model. - -- ``num_solutions``: Number of solutions in the population. Each - solution has different parameters of the model. - -Instance Attributes -------------------- - -All parameters in the ``pygad.kerasga.KerasGA`` class constructor are -used as instance attributes in addition to adding a new attribute called -``population_weights``. - -Here is a list of all instance attributes: - -- ``model`` - -- ``num_solutions`` - -- ``population_weights``: A nested list holding the weights of all - solutions in the population. - -Methods in the ``KerasGA`` Class --------------------------------- - -This section discusses the methods available for instances of the -``pygad.kerasga.KerasGA`` class. - -.. _createpopulation: - -``create_population()`` -~~~~~~~~~~~~~~~~~~~~~~~ - -The ``create_population()`` method creates the initial population of the -genetic algorithm as a list of solutions where each solution represents -different model parameters. The list of networks is assigned to the -``population_weights`` attribute of the instance. - -.. _functions-in-the-pygadkerasga-module: - -Functions in the ``pygad.kerasga`` Module -========================================= - -This section discusses the functions in the ``pygad.kerasga`` module. - -.. _pygadkerasgamodelweightsasvector: - -``pygad.kerasga.model_weights_as_vector()`` --------------------------------------------- - -The ``model_weights_as_vector()`` function accepts a single parameter -named ``model`` representing the Keras model. It returns a vector -holding all model weights. The reason for representing the model weights -as a vector is that the genetic algorithm expects all parameters of any -solution to be in a 1D vector form. - -This function filters the layers based on the ``trainable`` attribute to -see whether the layer weights are trained or not. For each layer, if its -``trainable=False``, then its weights will not be evolved using the -genetic algorithm. Otherwise, it will be represented in the chromosome -and evolved. - -The function accepts the following parameters: - -- ``model``: The Keras model. - -It returns a 1D vector holding the model weights. - -.. 
_pygadkerasgamodelweightsasmatrix: - -``pygad.kerasga.model_weights_as_matrix()`` -------------------------------------------- - -The ``model_weights_as_matrix()`` function accepts the following -parameters: - -1. ``model``: The Keras model. - -2. ``weights_vector``: The model parameters as a vector. - -It returns the restored model weights after reshaping the vector. - -.. _pygadkerasgapredict: - -``pygad.kerasga.predict()`` ---------------------------- - -The ``predict()`` function makes a prediction based on a solution. It -accepts the following parameters: - -1. ``model``: The Keras model. - -2. ``solution``: The solution evolved. - -3. ``data``: The test data inputs. - -It returns the predictions for the data samples. - -Examples -======== - -This section gives the complete code of some examples that build and -train a Keras model using PyGAD. Each subsection builds a different -network. - -Example 1: Regression Example ------------------------------ - -The next code builds a simple Keras model for regression. The next -subsections discuss each part in the code. - -.. code:: python - - import tensorflow.keras - import pygad.kerasga - import numpy - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, keras_ga, model - - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - - mae = tensorflow.keras.losses.MeanAbsoluteError() - abs_error = mae(data_outputs, predictions).numpy() + 0.00000001 - solution_fitness = 1.0/abs_error - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - input_layer = tensorflow.keras.layers.Input(3) - dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - - keras_ga = pygad.kerasga.KerasGA(model=model, - num_solutions=10) - - # Data inputs - data_inputs = numpy.array([[0.02, 0.1, 0.15], - [0.7, 0.6, 0.8], - [1.5, 1.2, 1.7], - [3.2, 2.9, 3.1]]) - - # Data outputs - data_outputs = numpy.array([[0.1], - [0.6], - [1.3], - [2.5]]) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = keras_ga.population_weights # Initial population of network weights - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. 
- solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make prediction based on the best solution. - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions) - - mae = tensorflow.keras.losses.MeanAbsoluteError() - abs_error = mae(data_outputs, predictions).numpy() - print("Absolute Error : ", abs_error) - -Create a Keras Model -~~~~~~~~~~~~~~~~~~~~ - -According to the steps mentioned previously, the first step is to create -a Keras model. Here is the code that builds the model using the -Functional API. - -.. code:: python - - import tensorflow.keras - - input_layer = tensorflow.keras.layers.Input(3) - dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - -The model can also be build using the Keras Sequential Model API. - -.. code:: python - - input_layer = tensorflow.keras.layers.Input(3) - dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu") - output_layer = tensorflow.keras.layers.Dense(1, activation="linear") - - model = tensorflow.keras.Sequential() - model.add(input_layer) - model.add(dense_layer1) - model.add(output_layer) - -.. _create-an-instance-of-the-pygadkerasgakerasga-class: - -Create an Instance of the ``pygad.kerasga.KerasGA`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The second step is to create an instance of the -``pygad.kerasga.KerasGA`` class. There are 10 solutions per population. -Change this number according to your needs. - -.. code:: python - - import pygad.kerasga - - keras_ga = pygad.kerasga.KerasGA(model=model, - num_solutions=10) - -.. _prepare-the-training-data-1: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The third step is to prepare the training data inputs and outputs. Here -is an example where there are 4 samples. Each sample has 3 inputs and 1 -output. - -.. code:: python - - import numpy - - # Data inputs - data_inputs = numpy.array([[0.02, 0.1, 0.15], - [0.7, 0.6, 0.8], - [1.5, 1.2, 1.7], - [3.2, 2.9, 3.1]]) - - # Data outputs - data_outputs = numpy.array([[0.1], - [0.6], - [1.3], - [2.5]]) - -Build the Fitness Function -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The fourth step is to build the fitness function. This function must -accept 2 parameters representing the solution and its index within the -population. - -The next fitness function returns the model predictions based on the -current solution using the ``predict()`` function. Then, it calculates -the mean absolute error (MAE) of the Keras model based on the parameters -in the solution. The reciprocal of the MAE is used as the fitness value. -Feel free to use any other loss function to calculate the fitness value. - -.. code:: python - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, keras_ga, model - - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - - mae = tensorflow.keras.losses.MeanAbsoluteError() - abs_error = mae(data_outputs, predictions).numpy() + 0.00000001 - solution_fitness = 1.0/abs_error - - return solution_fitness - -.. 
_create-an-instance-of-the-pygadga-class: - -Create an Instance of the ``pygad.GA`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The fifth step is to instantiate the ``pygad.GA`` class. Note how the -``initial_population`` parameter is assigned to the initial weights of -the Keras models. - -For more information, please check the `parameters this class -accepts `__. - -.. code:: python - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = keras_ga.population_weights # Initial population of network weights - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - -Run the Genetic Algorithm -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The sixth and last step is to run the genetic algorithm by calling the -``run()`` method. - -.. code:: python - - ga_instance.run() - -After the PyGAD completes its execution, then there is a figure that -shows how the fitness value changes by generation. Call the -``plot_fitness()`` method to show the figure. - -.. code:: python - - ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) - -Here is the figure. - -.. figure:: https://user-images.githubusercontent.com/16560492/93722638-ac261880-fb98-11ea-95d3-e773deb034f4.png - :alt: - -To get information about the best solution found by PyGAD, use the -``best_solution()`` method. - -.. code:: python - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - -.. code:: python - - Fitness value of the best solution = 72.77768757825352 - Index of the best solution : 0 - -The next code makes prediction using the ``predict()`` function to -return the model predictions based on the best solution. - -.. code:: python - - # Fetch the parameters of the best solution. - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions) - -.. code:: python - - Predictions : - [[0.09935353] - [0.63082725] - [1.2765523 ] - [2.4999595 ]] - -The next code measures the trained model error. - -.. code:: python - - mae = tensorflow.keras.losses.MeanAbsoluteError() - abs_error = mae(data_outputs, predictions).numpy() - print("Absolute Error : ", abs_error) - -.. code:: - - Absolute Error : 0.013740465 - -Example 2: XOR Binary Classification ------------------------------------- - -The next code creates a Keras model to build the XOR binary -classification problem. Let's highlight the changes compared to the -previous example. - -.. 
code:: python - - import tensorflow.keras - import pygad.kerasga - import numpy - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, keras_ga, model - - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - - bce = tensorflow.keras.losses.BinaryCrossentropy() - solution_fitness = 1.0 / (bce(data_outputs, predictions).numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Build the keras model using the functional API. - input_layer = tensorflow.keras.layers.Input(2) - dense_layer = tensorflow.keras.layers.Dense(4, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(2, activation="softmax")(dense_layer) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - - # Create an instance of the pygad.kerasga.KerasGA class to build the initial population. - keras_ga = pygad.kerasga.KerasGA(model=model, - num_solutions=10) - - # XOR problem inputs - data_inputs = numpy.array([[0, 0], - [0, 1], - [1, 0], - [1, 1]]) - - # XOR problem outputs - data_outputs = numpy.array([[1, 0], - [0, 1], - [0, 1], - [1, 0]]) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = keras_ga.population_weights # Initial population of network weights. - - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions) - - # Calculate the binary crossentropy for the trained model. - bce = tensorflow.keras.losses.BinaryCrossentropy() - print("Binary Crossentropy : ", bce(data_outputs, predictions).numpy()) - - # Calculate the classification accuracy for the trained model. - ba = tensorflow.keras.metrics.BinaryAccuracy() - ba.update_state(data_outputs, predictions) - accuracy = ba.result().numpy() - print("Accuracy : ", accuracy) - -Compared to the previous regression example, here are the changes: - -- The Keras model is changed according to the nature of the problem. - Now, it has 2 inputs and 2 outputs with an in-between hidden layer of - 4 neurons. - -.. code:: python - - # Build the keras model using the functional API. 
- input_layer = tensorflow.keras.layers.Input(2) - dense_layer = tensorflow.keras.layers.Dense(4, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(2, activation="softmax")(dense_layer) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - -- The train data is changed. Note that the output of each sample is a - 1D vector of 2 values, 1 for each class. - -.. code:: python - - # XOR problem inputs - data_inputs = numpy.array([[0, 0], - [0, 1], - [1, 0], - [1, 1]]) - - # XOR problem outputs - data_outputs = numpy.array([[1, 0], - [0, 1], - [0, 1], - [1, 0]]) - -- The fitness value is calculated based on the binary cross entropy. - -.. code:: python - - bce = tensorflow.keras.losses.BinaryCrossentropy() - solution_fitness = 1.0 / (bce(data_outputs, predictions).numpy() + 0.00000001) - -After the previous code completes, the next figure shows how the fitness -value change by generation. - -.. figure:: https://user-images.githubusercontent.com/16560492/93722639-b811da80-fb98-11ea-8951-f13a7a266c04.png - :alt: - -Here is some information about the trained model. Its fitness value is -``739.24``, loss is ``0.0013527311`` and accuracy is 100%. - -.. code:: python - - Fitness value of the best solution = 739.2397344644013 - Index of the best solution : 7 - - Predictions : - [[9.9694413e-01 3.0558957e-03] - [5.0176249e-04 9.9949825e-01] - [1.8470541e-03 9.9815291e-01] - [9.9999976e-01 2.0538971e-07]] - - Binary Crossentropy : 0.0013527311 - - Accuracy : 1.0 - -Example 3: Image Multi-Class Classification (Dense Layers) ----------------------------------------------------------- - -Here is the code. - -.. code:: python - - import tensorflow.keras - import pygad.kerasga - import numpy - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, keras_ga, model - - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - - cce = tensorflow.keras.losses.CategoricalCrossentropy() - solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Build the keras model using the functional API. - input_layer = tensorflow.keras.layers.Input(360) - dense_layer = tensorflow.keras.layers.Dense(50, activation="relu")(input_layer) - output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - - # Create an instance of the pygad.kerasga.KerasGA class to build the initial population. - keras_ga = pygad.kerasga.KerasGA(model=model, - num_solutions=10) - - # Data inputs - data_inputs = numpy.load("dataset_features.npy") - - # Data outputs - data_outputs = numpy.load("outputs.npy") - data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 100 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = keras_ga.population_weights # Initial population of network weights. 
- - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - # print("Predictions : \n", predictions) - - # Calculate the categorical crossentropy for the trained model. - cce = tensorflow.keras.losses.CategoricalCrossentropy() - print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) - - # Calculate the classification accuracy for the trained model. - ca = tensorflow.keras.metrics.CategoricalAccuracy() - ca.update_state(data_outputs, predictions) - accuracy = ca.result().numpy() - print("Accuracy : ", accuracy) - -Compared to the previous binary classification example, this example has -multiple classes (4) and thus the loss is measured using categorical -cross entropy. - -.. code:: python - - cce = tensorflow.keras.losses.CategoricalCrossentropy() - solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) - -.. _prepare-the-training-data-2: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Before building and training neural networks, the training data (input -and output) needs to be prepared. The inputs and the outputs of the -training data are NumPy arrays. - -The data used in this example is available as 2 files: - -1. `dataset_features.npy `__: - Data inputs. - https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - -2. `outputs.npy `__: - Class labels. - https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - -The data consists of 4 classes of images. The image shape is -``(100, 100, 3)``. The number of training samples is 1962. The feature -vector extracted from each image has a length 360. - -Simply download these 2 files and read them according to the next code. -Note that the class labels are one-hot encoded using the -``tensorflow.keras.utils.to_categorical()`` function. - -.. code:: python - - import numpy - - data_inputs = numpy.load("dataset_features.npy") - - data_outputs = numpy.load("outputs.npy") - data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) - -The next figure shows how the fitness value changes. - -.. figure:: https://user-images.githubusercontent.com/16560492/93722649-c2cc6f80-fb98-11ea-96e7-3f6ce3cfe1cf.png - :alt: - -Here are some statistics about the trained model. - -.. 
code:: - - Fitness value of the best solution = 4.197464252185969 - Index of the best solution : 0 - Categorical Crossentropy : 0.23823906 - Accuracy : 0.9852192 - -Example 4: Image Multi-Class Classification (Conv Layers) ---------------------------------------------------------- - -Compared to the previous example that uses only dense layers, this -example uses convolutional layers to classify the same dataset. - -Here is the complete code. - -.. code:: python - - import tensorflow.keras - import pygad.kerasga - import numpy - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, keras_ga, model - - predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - - cce = tensorflow.keras.losses.CategoricalCrossentropy() - solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Build the keras model using the functional API. - input_layer = tensorflow.keras.layers.Input(shape=(100, 100, 3)) - conv_layer1 = tensorflow.keras.layers.Conv2D(filters=5, - kernel_size=7, - activation="relu")(input_layer) - max_pool1 = tensorflow.keras.layers.MaxPooling2D(pool_size=(5,5), - strides=5)(conv_layer1) - conv_layer2 = tensorflow.keras.layers.Conv2D(filters=3, - kernel_size=3, - activation="relu")(max_pool1) - flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2) - dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer) - output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - - # Create an instance of the pygad.kerasga.KerasGA class to build the initial population. - keras_ga = pygad.kerasga.KerasGA(model=model, - num_solutions=10) - - # Data inputs - data_inputs = numpy.load("dataset_inputs.npy") - - # Data outputs - data_outputs = numpy.load("dataset_outputs.npy") - data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 200 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = keras_ga.population_weights # Initial population of network weights. - - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. 
- predictions = pygad.kerasga.predict(model=model, - solution=solution, - data=data_inputs) - # print("Predictions : \n", predictions) - - # Calculate the categorical crossentropy for the trained model. - cce = tensorflow.keras.losses.CategoricalCrossentropy() - print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) - - # Calculate the classification accuracy for the trained model. - ca = tensorflow.keras.metrics.CategoricalAccuracy() - ca.update_state(data_outputs, predictions) - accuracy = ca.result().numpy() - print("Accuracy : ", accuracy) - -Compared to the previous example, the only change is that the -architecture uses convolutional and max-pooling layers. The shape of -each input sample is 100x100x3. - -.. code:: python - - # Build the keras model using the functional API. - input_layer = tensorflow.keras.layers.Input(shape=(100, 100, 3)) - conv_layer1 = tensorflow.keras.layers.Conv2D(filters=5, - kernel_size=7, - activation="relu")(input_layer) - max_pool1 = tensorflow.keras.layers.MaxPooling2D(pool_size=(5,5), - strides=5)(conv_layer1) - conv_layer2 = tensorflow.keras.layers.Conv2D(filters=3, - kernel_size=3, - activation="relu")(max_pool1) - flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2) - dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer) - output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer) - - model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) - -.. _prepare-the-training-data-3: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The data used in this example is available as 2 files: - -1. `dataset_inputs.npy `__: - Data inputs. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy - -2. `dataset_outputs.npy `__: - Class labels. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy - -The data consists of 4 classes of images. The image shape is -``(100, 100, 3)`` and there are 20 images per class for a total of 80 -training samples. For more information about the dataset, check the -`Reading the -Data `__ -section of the ``pygad.cnn`` module. - -Simply download these 2 files and read them according to the next code. -Note that the class labels are one-hot encoded using the -``tensorflow.keras.utils.to_categorical()`` function. - -.. code:: python - - import numpy - - data_inputs = numpy.load("dataset_inputs.npy") - - data_outputs = numpy.load("dataset_outputs.npy") - data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) - -The next figure shows how the fitness value changes. - -.. figure:: https://user-images.githubusercontent.com/16560492/93722654-cc55d780-fb98-11ea-8f95-7b65dc67f5c8.png - :alt: - -Here are some statistics about the trained model. The model accuracy is -75% after the 200 generations. Note that just running the code again may -give different results. - -.. code:: - - Fitness value of the best solution = 2.7462310258668805 - Index of the best solution : 0 - Categorical Crossentropy : 0.3641354 - Accuracy : 0.75 - -To improve the model performance, you can do the following: - -- Add more layers - -- Modify the existing layers. - -- Use different parameters for the layers. - -- Use different parameters for the genetic algorithm (e.g. number of - solution, number of generations, etc) +.. _pygadkerasga-module: + +``pygad.kerasga`` Module +======================== + +This section of the PyGAD's library documentation discusses the +`pygad.kerasga `__ +module. 
+
+The ``pygad.kerasga`` module has a helper class and 2 functions to
+train Keras models using the genetic algorithm (PyGAD). The Keras model
+can be built either using the `Sequential
+Model `__ or the `Functional
+API `__.
+
+The contents of this module are:
+
+1. ``KerasGA``: A class for creating an initial population of all
+   parameters in the Keras model.
+
+2. ``model_weights_as_vector()``: A function to reshape the Keras model
+   weights to a single vector.
+
+3. ``model_weights_as_matrix()``: A function to restore the Keras model
+   weights from a vector.
+
+4. ``predict()``: A function to make predictions based on the Keras
+   model and a solution.
+
+More details are given in the next sections.
+
+Steps Summary
+=============
+
+The summary of the steps used to train a Keras model using PyGAD is as
+follows:
+
+1. Create a Keras model.
+
+2. Create an instance of the ``pygad.kerasga.KerasGA`` class.
+
+3. Prepare the training data.
+
+4. Build the fitness function.
+
+5. Create an instance of the ``pygad.GA`` class.
+
+6. Run the genetic algorithm.
+
+Create Keras Model
+==================
+
+Before discussing training a Keras model using PyGAD, the first thing to
+do is to create the Keras model.
+
+According to the `Keras library
+documentation `__, there are 3 ways to
+build a Keras model:
+
+1. `Sequential Model `__
+
+2. `Functional API `__
+
+3. `Model Subclassing `__
+
+PyGAD supports training the models created either using the Sequential
+Model or the Functional API.
+
+Here is an example of a model created using the Sequential Model.
+
+.. code:: python
+
+    import tensorflow.keras
+
+    input_layer = tensorflow.keras.layers.Input(3)
+    dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")
+    output_layer = tensorflow.keras.layers.Dense(1, activation="linear")
+
+    model = tensorflow.keras.Sequential()
+    model.add(input_layer)
+    model.add(dense_layer1)
+    model.add(output_layer)
+
+This is the same model created using the Functional API.
+
+.. code:: python
+
+    input_layer = tensorflow.keras.layers.Input(3)
+    dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer)
+    output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1)
+
+    model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+Feel free to add the layers of your choice.
+
+.. _pygadkerasgakerasga-class:
+
+``pygad.kerasga.KerasGA`` Class
+===============================
+
+The ``pygad.kerasga`` module has a class named ``KerasGA`` for creating
+an initial population for the genetic algorithm based on a Keras model.
+The constructor, methods, and attributes within the class are discussed
+in this section.
+
+.. _init:
+
+``__init__()``
+--------------
+
+The ``pygad.kerasga.KerasGA`` class constructor accepts the following
+parameters:
+
+- ``model``: An instance of the Keras model.
+
+- ``num_solutions``: Number of solutions in the population. Each
+  solution has different parameters of the model.
+
+Instance Attributes
+-------------------
+
+All parameters in the ``pygad.kerasga.KerasGA`` class constructor are
+used as instance attributes in addition to adding a new attribute called
+``population_weights``.
+
+Here is a list of all instance attributes:
+
+- ``model``
+
+- ``num_solutions``
+
+- ``population_weights``: A nested list holding the weights of all
+  solutions in the population. A short sketch after this list shows how
+  this attribute relates to ``num_solutions``.
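+
+Here is a minimal sketch (an assumption for illustration: a Keras model
+named ``model`` has already been built as in the earlier examples) that
+shows the relation between ``num_solutions`` and ``population_weights``:
+
+.. code:: python
+
+    import pygad.kerasga
+
+    keras_ga = pygad.kerasga.KerasGA(model=model,
+                                     num_solutions=10)
+
+    # population_weights holds one nested list of weights per solution.
+    print(keras_ga.num_solutions)           # 10
+    print(len(keras_ga.population_weights)) # 10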
+ +Methods in the ``KerasGA`` Class +-------------------------------- + +This section discusses the methods available for instances of the +``pygad.kerasga.KerasGA`` class. + +.. _createpopulation: + +``create_population()`` +~~~~~~~~~~~~~~~~~~~~~~~ + +The ``create_population()`` method creates the initial population of the +genetic algorithm as a list of solutions where each solution represents +different model parameters. The list of networks is assigned to the +``population_weights`` attribute of the instance. + +.. _functions-in-the-pygadkerasga-module: + +Functions in the ``pygad.kerasga`` Module +========================================= + +This section discusses the functions in the ``pygad.kerasga`` module. + +.. _pygadkerasgamodelweightsasvector: + +``pygad.kerasga.model_weights_as_vector()`` +-------------------------------------------- + +The ``model_weights_as_vector()`` function accepts a single parameter +named ``model`` representing the Keras model. It returns a vector +holding all model weights. The reason for representing the model weights +as a vector is that the genetic algorithm expects all parameters of any +solution to be in a 1D vector form. + +This function filters the layers based on the ``trainable`` attribute to +see whether the layer weights are trained or not. For each layer, if its +``trainable=False``, then its weights will not be evolved using the +genetic algorithm. Otherwise, it will be represented in the chromosome +and evolved. + +The function accepts the following parameters: + +- ``model``: The Keras model. + +It returns a 1D vector holding the model weights. + +.. _pygadkerasgamodelweightsasmatrix: + +``pygad.kerasga.model_weights_as_matrix()`` +------------------------------------------- + +The ``model_weights_as_matrix()`` function accepts the following +parameters: + +1. ``model``: The Keras model. + +2. ``weights_vector``: The model parameters as a vector. + +It returns the restored model weights after reshaping the vector. + +.. _pygadkerasgapredict: + +``pygad.kerasga.predict()`` +--------------------------- + +The ``predict()`` function makes a prediction based on a solution. It +accepts the following parameters: + +1. ``model``: The Keras model. + +2. ``solution``: The solution evolved. + +3. ``data``: The test data inputs. + +It returns the predictions for the data samples. + +Examples +======== + +This section gives the complete code of some examples that build and +train a Keras model using PyGAD. Each subsection builds a different +network. + +Example 1: Regression Example +----------------------------- + +The next code builds a simple Keras model for regression. The next +subsections discuss each part in the code. + +.. 
+
+Examples
+========
+
+This section gives the complete code of some examples that build and
+train a Keras model using PyGAD. Each subsection builds a different
+network.
+
+Example 1: Regression Example
+-----------------------------
+
+The next code builds a simple Keras model for regression. The next
+subsections discuss each part of the code.
+
+.. code:: python
+
+   import tensorflow.keras
+   import pygad.kerasga
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, sol_idx):
+       global data_inputs, data_outputs, keras_ga, model
+
+       predictions = pygad.kerasga.predict(model=model,
+                                           solution=solution,
+                                           data=data_inputs)
+
+       mae = tensorflow.keras.losses.MeanAbsoluteError()
+       abs_error = mae(data_outputs, predictions).numpy() + 0.00000001
+       solution_fitness = 1.0/abs_error
+
+       return solution_fitness
+
+   def callback_generation(ga_instance):
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+   input_layer = tensorflow.keras.layers.Input(3)
+   dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer)
+   output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+   keras_ga = pygad.kerasga.KerasGA(model=model,
+                                    num_solutions=10)
+
+   # Data inputs
+   data_inputs = numpy.array([[0.02, 0.1, 0.15],
+                              [0.7, 0.6, 0.8],
+                              [1.5, 1.2, 1.7],
+                              [3.2, 2.9, 3.1]])
+
+   # Data outputs
+   data_outputs = numpy.array([[0.1],
+                               [0.6],
+                               [1.3],
+                               [2.5]])
+
+   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+   num_generations = 250 # Number of generations.
+   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+   initial_population = keras_ga.population_weights # Initial population of network weights.
+
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          initial_population=initial_population,
+                          fitness_func=fitness_func,
+                          on_generation=callback_generation)
+
+   ga_instance.run()
+
+   # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+   ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4)
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   # Make predictions based on the best solution.
+   predictions = pygad.kerasga.predict(model=model,
+                                       solution=solution,
+                                       data=data_inputs)
+   print("Predictions : \n", predictions)
+
+   mae = tensorflow.keras.losses.MeanAbsoluteError()
+   abs_error = mae(data_outputs, predictions).numpy()
+   print("Absolute Error : ", abs_error)
+
+Create a Keras Model
+~~~~~~~~~~~~~~~~~~~~
+
+According to the steps mentioned previously, the first step is to create
+a Keras model. Here is the code that builds the model using the
+Functional API.
+
+.. code:: python
+
+   import tensorflow.keras
+
+   input_layer = tensorflow.keras.layers.Input(3)
+   dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer)
+   output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
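+
+As an optional check before handing the model to PyGAD, you can print
+its architecture using the standard Keras ``summary()`` method.
+
+.. code:: python
+
+   # Print the layer names, output shapes, and parameter counts.
+   model.summary()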
+
+The model can also be built using the Keras Sequential Model API.
+
+.. code:: python
+
+   input_layer = tensorflow.keras.layers.Input(3)
+   dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")
+   output_layer = tensorflow.keras.layers.Dense(1, activation="linear")
+
+   model = tensorflow.keras.Sequential()
+   model.add(input_layer)
+   model.add(dense_layer1)
+   model.add(output_layer)
+
+.. _create-an-instance-of-the-pygadkerasgakerasga-class:
+
+Create an Instance of the ``pygad.kerasga.KerasGA`` Class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The second step is to create an instance of the
+``pygad.kerasga.KerasGA`` class. There are 10 solutions per population.
+Change this number according to your needs.
+
+.. code:: python
+
+   import pygad.kerasga
+
+   keras_ga = pygad.kerasga.KerasGA(model=model,
+                                    num_solutions=10)
+
+.. _prepare-the-training-data-1:
+
+Prepare the Training Data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The third step is to prepare the training data inputs and outputs. Here
+is an example where there are 4 samples. Each sample has 3 inputs and 1
+output.
+
+.. code:: python
+
+   import numpy
+
+   # Data inputs
+   data_inputs = numpy.array([[0.02, 0.1, 0.15],
+                              [0.7, 0.6, 0.8],
+                              [1.5, 1.2, 1.7],
+                              [3.2, 2.9, 3.1]])
+
+   # Data outputs
+   data_outputs = numpy.array([[0.1],
+                               [0.6],
+                               [1.3],
+                               [2.5]])
+
+Build the Fitness Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The fourth step is to build the fitness function. This function must
+accept 3 parameters: the instance of the ``pygad.GA`` class, the
+solution, and the solution's index within the population.
+
+The next fitness function returns the model predictions based on the
+current solution using the ``predict()`` function. Then, it calculates
+the mean absolute error (MAE) of the Keras model based on the parameters
+in the solution. The reciprocal of the MAE is used as the fitness value.
+Feel free to use any other loss function to calculate the fitness value.
+
+.. code:: python
+
+   def fitness_func(ga_instance, solution, sol_idx):
+       global data_inputs, data_outputs, keras_ga, model
+
+       predictions = pygad.kerasga.predict(model=model,
+                                           solution=solution,
+                                           data=data_inputs)
+
+       mae = tensorflow.keras.losses.MeanAbsoluteError()
+       abs_error = mae(data_outputs, predictions).numpy() + 0.00000001
+       solution_fitness = 1.0/abs_error
+
+       return solution_fitness
+
+.. _create-an-instance-of-the-pygadga-class:
+
+Create an Instance of the ``pygad.GA`` Class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The fifth step is to instantiate the ``pygad.GA`` class. Note how the
+``initial_population`` parameter is assigned to the initial weights of
+the Keras model.
+
+For more information, please check the `parameters this class
+accepts `__.
+
+.. code:: python
+
+   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+   num_generations = 250 # Number of generations.
+   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+   initial_population = keras_ga.population_weights # Initial population of network weights.
+
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          initial_population=initial_population,
+                          fitness_func=fitness_func,
+                          on_generation=callback_generation)
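+
+For reference, the ``on_generation`` callback assigned above is the
+``callback_generation()`` function from the complete listing. It prints
+the generation number and the best solution's fitness after each
+generation.
+
+.. code:: python
+
+   def callback_generation(ga_instance):
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))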
+
+Run the Genetic Algorithm
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The sixth and last step is to run the genetic algorithm by calling the
+``run()`` method.
+
+.. code:: python
+
+   ga_instance.run()
+
+After PyGAD completes its execution, a figure that shows how the fitness
+value changes by generation can be created. Call the ``plot_fitness()``
+method to show the figure.
+
+.. code:: python
+
+   ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4)
+
+Here is the figure.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/93722638-ac261880-fb98-11ea-95d3-e773deb034f4.png
+   :alt:
+
+To get information about the best solution found by PyGAD, use the
+``best_solution()`` method.
+
+.. code:: python
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+.. code::
+
+   Fitness value of the best solution = 72.77768757825352
+   Index of the best solution : 0
+
+The next code makes predictions using the ``predict()`` function based
+on the best solution.
+
+.. code:: python
+
+   # Make predictions based on the best solution.
+   predictions = pygad.kerasga.predict(model=model,
+                                       solution=solution,
+                                       data=data_inputs)
+   print("Predictions : \n", predictions)
+
+.. code::
+
+   Predictions :
+   [[0.09935353]
+    [0.63082725]
+    [1.2765523 ]
+    [2.4999595 ]]
+
+The next code measures the trained model error.
+
+.. code:: python
+
+   mae = tensorflow.keras.losses.MeanAbsoluteError()
+   abs_error = mae(data_outputs, predictions).numpy()
+   print("Absolute Error : ", abs_error)
+
+.. code::
+
+   Absolute Error :  0.013740465
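+
+Rather than passing the best solution to ``pygad.kerasga.predict()``
+every time, you may want to load the evolved weights into the model
+permanently. Here is a minimal sketch, assuming the ``solution``
+returned by ``best_solution()`` above; it combines
+``model_weights_as_matrix()`` (described earlier) with the standard
+Keras ``set_weights()`` method.
+
+.. code:: python
+
+   # Restore the per-layer weight arrays from the best solution vector
+   # and load them into the model.
+   best_weights = pygad.kerasga.model_weights_as_matrix(model=model,
+                                                        weights_vector=solution)
+   model.set_weights(best_weights)
+
+   # The model can now be used directly, e.g. model.predict(data_inputs).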
+
+Example 2: XOR Binary Classification
+------------------------------------
+
+The next code creates a Keras model to solve the XOR binary
+classification problem. Let's highlight the changes compared to the
+previous example.
+
+.. code:: python
+
+   import tensorflow.keras
+   import pygad.kerasga
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, sol_idx):
+       global data_inputs, data_outputs, keras_ga, model
+
+       predictions = pygad.kerasga.predict(model=model,
+                                           solution=solution,
+                                           data=data_inputs)
+
+       bce = tensorflow.keras.losses.BinaryCrossentropy()
+       solution_fitness = 1.0 / (bce(data_outputs, predictions).numpy() + 0.00000001)
+
+       return solution_fitness
+
+   def callback_generation(ga_instance):
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+   # Build the keras model using the functional API.
+   input_layer = tensorflow.keras.layers.Input(2)
+   dense_layer = tensorflow.keras.layers.Dense(4, activation="relu")(input_layer)
+   output_layer = tensorflow.keras.layers.Dense(2, activation="softmax")(dense_layer)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+   # Create an instance of the pygad.kerasga.KerasGA class to build the initial population.
+   keras_ga = pygad.kerasga.KerasGA(model=model,
+                                    num_solutions=10)
+
+   # XOR problem inputs
+   data_inputs = numpy.array([[0, 0],
+                              [0, 1],
+                              [1, 0],
+                              [1, 1]])
+
+   # XOR problem outputs
+   data_outputs = numpy.array([[1, 0],
+                               [0, 1],
+                               [0, 1],
+                               [1, 0]])
+
+   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+   num_generations = 250 # Number of generations.
+   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+   initial_population = keras_ga.population_weights # Initial population of network weights.
+
+   # Create an instance of the pygad.GA class
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          initial_population=initial_population,
+                          fitness_func=fitness_func,
+                          on_generation=callback_generation)
+
+   # Start the genetic algorithm evolution.
+   ga_instance.run()
+
+   # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+   ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4)
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   # Make predictions based on the best solution.
+   predictions = pygad.kerasga.predict(model=model,
+                                       solution=solution,
+                                       data=data_inputs)
+   print("Predictions : \n", predictions)
+
+   # Calculate the binary crossentropy for the trained model.
+   bce = tensorflow.keras.losses.BinaryCrossentropy()
+   print("Binary Crossentropy : ", bce(data_outputs, predictions).numpy())
+
+   # Calculate the classification accuracy for the trained model.
+   ba = tensorflow.keras.metrics.BinaryAccuracy()
+   ba.update_state(data_outputs, predictions)
+   accuracy = ba.result().numpy()
+   print("Accuracy : ", accuracy)
+
+Compared to the previous regression example, here are the changes:
+
+- The Keras model is changed according to the nature of the problem.
+  Now, it has 2 inputs and 2 outputs with an in-between hidden layer of
+  4 neurons.
+
+.. code:: python
+
+   # Build the keras model using the functional API.
+   input_layer = tensorflow.keras.layers.Input(2)
+   dense_layer = tensorflow.keras.layers.Dense(4, activation="relu")(input_layer)
+   output_layer = tensorflow.keras.layers.Dense(2, activation="softmax")(dense_layer)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+- The training data is changed. Note that the output of each sample is a
+  1D vector of 2 values, 1 for each class.
+
+.. code:: python
+
+   # XOR problem inputs
+   data_inputs = numpy.array([[0, 0],
+                              [0, 1],
+                              [1, 0],
+                              [1, 1]])
+
+   # XOR problem outputs
+   data_outputs = numpy.array([[1, 0],
+                               [0, 1],
+                               [0, 1],
+                               [1, 0]])
+
+- The fitness value is calculated based on the binary cross-entropy.
+
+.. code:: python
+
+   bce = tensorflow.keras.losses.BinaryCrossentropy()
+   solution_fitness = 1.0 / (bce(data_outputs, predictions).numpy() + 0.00000001)
+
+After the previous code completes, the next figure shows how the fitness
+value changes by generation.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/93722639-b811da80-fb98-11ea-8951-f13a7a266c04.png
+   :alt:
+
+Here is some information about the trained model. Its fitness value is
+``739.24``, loss is ``0.0013527311``, and accuracy is 100%.
+
+.. code::
+
+   Fitness value of the best solution = 739.2397344644013
+   Index of the best solution : 7
+
+   Predictions :
+   [[9.9694413e-01 3.0558957e-03]
+    [5.0176249e-04 9.9949825e-01]
+    [1.8470541e-03 9.9815291e-01]
+    [9.9999976e-01 2.0538971e-07]]
+
+   Binary Crossentropy :  0.0013527311
+
+   Accuracy :  1.0
+
+Example 3: Image Multi-Class Classification (Dense Layers)
+----------------------------------------------------------
+
+Here is the code.
+
+.. code:: python
+
+   import tensorflow.keras
+   import pygad.kerasga
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, sol_idx):
+       global data_inputs, data_outputs, keras_ga, model
+
+       predictions = pygad.kerasga.predict(model=model,
+                                           solution=solution,
+                                           data=data_inputs)
+
+       cce = tensorflow.keras.losses.CategoricalCrossentropy()
+       solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001)
+
+       return solution_fitness
+
+   def callback_generation(ga_instance):
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+   # Build the keras model using the functional API.
+   input_layer = tensorflow.keras.layers.Input(360)
+   dense_layer = tensorflow.keras.layers.Dense(50, activation="relu")(input_layer)
+   output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+   # Create an instance of the pygad.kerasga.KerasGA class to build the initial population.
+   keras_ga = pygad.kerasga.KerasGA(model=model,
+                                    num_solutions=10)
+
+   # Data inputs
+   data_inputs = numpy.load("dataset_features.npy")
+
+   # Data outputs
+   data_outputs = numpy.load("outputs.npy")
+   data_outputs = tensorflow.keras.utils.to_categorical(data_outputs)
+
+   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+   num_generations = 100 # Number of generations.
+   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+   initial_population = keras_ga.population_weights # Initial population of network weights.
+
+   # Create an instance of the pygad.GA class
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          initial_population=initial_population,
+                          fitness_func=fitness_func,
+                          on_generation=callback_generation)
+
+   # Start the genetic algorithm evolution.
+   ga_instance.run()
+
+   # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+   ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4)
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   # Make predictions based on the best solution.
+   predictions = pygad.kerasga.predict(model=model,
+                                       solution=solution,
+                                       data=data_inputs)
+   # print("Predictions : \n", predictions)
+
+   # Calculate the categorical crossentropy for the trained model.
+   cce = tensorflow.keras.losses.CategoricalCrossentropy()
+   print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy())
+
+   # Calculate the classification accuracy for the trained model.
+   ca = tensorflow.keras.metrics.CategoricalAccuracy()
+   ca.update_state(data_outputs, predictions)
+   accuracy = ca.result().numpy()
+   print("Accuracy : ", accuracy)
+
+Compared to the previous binary classification example, this example has
+multiple classes (4) and thus the loss is measured using categorical
+cross-entropy.
+
+.. code:: python
+
+   cce = tensorflow.keras.losses.CategoricalCrossentropy()
+   solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001)
+
+.. _prepare-the-training-data-2:
+
+Prepare the Training Data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Before building and training neural networks, the training data (input
+and output) needs to be prepared. The inputs and the outputs of the
+training data are NumPy arrays.
+
+The data used in this example is available as 2 files:
+
+1. `dataset_features.npy `__:
+   Data inputs.
+   https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy
+
+2. `outputs.npy `__:
+   Class labels.
+   https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy
+
+The data consists of 4 classes of images. The image shape is
+``(100, 100, 3)``. The number of training samples is 1962. The feature
+vector extracted from each image has a length of 360.
+
+Simply download these 2 files and read them according to the next code.
+Note that the class labels are one-hot encoded using the
+``tensorflow.keras.utils.to_categorical()`` function.
+
+.. code:: python
+
+   import numpy
+
+   data_inputs = numpy.load("dataset_features.npy")
+
+   data_outputs = numpy.load("outputs.npy")
+   data_outputs = tensorflow.keras.utils.to_categorical(data_outputs)
+
+The next figure shows how the fitness value changes.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/93722649-c2cc6f80-fb98-11ea-96e7-3f6ce3cfe1cf.png
+   :alt:
+
+Here are some statistics about the trained model.
+
+.. code::
+
+   Fitness value of the best solution = 4.197464252185969
+   Index of the best solution : 0
+   Categorical Crossentropy :  0.23823906
+   Accuracy :  0.9852192
+
+Example 4: Image Multi-Class Classification (Conv Layers)
+---------------------------------------------------------
+
+Compared to the previous example that uses only dense layers, this
+example uses convolutional layers to classify the same 4 classes of
+images, this time working with the raw images rather than extracted
+features.
+
+Here is the complete code.
+
+.. code:: python
+
+   import tensorflow.keras
+   import pygad.kerasga
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, sol_idx):
+       global data_inputs, data_outputs, keras_ga, model
+
+       predictions = pygad.kerasga.predict(model=model,
+                                           solution=solution,
+                                           data=data_inputs)
+
+       cce = tensorflow.keras.losses.CategoricalCrossentropy()
+       solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001)
+
+       return solution_fitness
+
+   def callback_generation(ga_instance):
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+   # Build the keras model using the functional API.
+   input_layer = tensorflow.keras.layers.Input(shape=(100, 100, 3))
+   conv_layer1 = tensorflow.keras.layers.Conv2D(filters=5,
+                                                kernel_size=7,
+                                                activation="relu")(input_layer)
+   max_pool1 = tensorflow.keras.layers.MaxPooling2D(pool_size=(5,5),
+                                                    strides=5)(conv_layer1)
+   conv_layer2 = tensorflow.keras.layers.Conv2D(filters=3,
+                                                kernel_size=3,
+                                                activation="relu")(max_pool1)
+   flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2)
+   dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer)
+   output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+   # Create an instance of the pygad.kerasga.KerasGA class to build the initial population.
+   keras_ga = pygad.kerasga.KerasGA(model=model,
+                                    num_solutions=10)
+
+   # Data inputs
+   data_inputs = numpy.load("dataset_inputs.npy")
+
+   # Data outputs
+   data_outputs = numpy.load("dataset_outputs.npy")
+   data_outputs = tensorflow.keras.utils.to_categorical(data_outputs)
+
+   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+   num_generations = 200 # Number of generations.
+   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+   initial_population = keras_ga.population_weights # Initial population of network weights.
+
+   # Create an instance of the pygad.GA class
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          initial_population=initial_population,
+                          fitness_func=fitness_func,
+                          on_generation=callback_generation)
+
+   # Start the genetic algorithm evolution.
+   ga_instance.run()
+
+   # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+   ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4)
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   # Make predictions based on the best solution.
+   predictions = pygad.kerasga.predict(model=model,
+                                       solution=solution,
+                                       data=data_inputs)
+   # print("Predictions : \n", predictions)
+
+   # Calculate the categorical crossentropy for the trained model.
+   cce = tensorflow.keras.losses.CategoricalCrossentropy()
+   print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy())
+
+   # Calculate the classification accuracy for the trained model.
+   ca = tensorflow.keras.metrics.CategoricalAccuracy()
+   ca.update_state(data_outputs, predictions)
+   accuracy = ca.result().numpy()
+   print("Accuracy : ", accuracy)
+
+Compared to the previous example, the only change is that the
+architecture uses convolutional and max-pooling layers. The shape of
+each input sample is ``(100, 100, 3)``.
+
+.. code:: python
+
+   # Build the keras model using the functional API.
+   input_layer = tensorflow.keras.layers.Input(shape=(100, 100, 3))
+   conv_layer1 = tensorflow.keras.layers.Conv2D(filters=5,
+                                                kernel_size=7,
+                                                activation="relu")(input_layer)
+   max_pool1 = tensorflow.keras.layers.MaxPooling2D(pool_size=(5,5),
+                                                    strides=5)(conv_layer1)
+   conv_layer2 = tensorflow.keras.layers.Conv2D(filters=3,
+                                                kernel_size=3,
+                                                activation="relu")(max_pool1)
+   flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2)
+   dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer)
+   output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer)
+
+   model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+.. _prepare-the-training-data-3:
+
+Prepare the Training Data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The data used in this example is available as 2 files:
+
+1. `dataset_inputs.npy `__:
+   Data inputs.
+   https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy
+
+2. `dataset_outputs.npy `__:
+   Class labels.
+   https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy
+
+The data consists of 4 classes of images. The image shape is
+``(100, 100, 3)`` and there are 20 images per class for a total of 80
+training samples. For more information about the dataset, check the
+`Reading the
+Data `__
+section of the ``pygad.cnn`` module.
+
+Simply download these 2 files and read them according to the next code.
+Note that the class labels are one-hot encoded using the
+``tensorflow.keras.utils.to_categorical()`` function.
+
+.. code:: python
+
+   import numpy
+
+   data_inputs = numpy.load("dataset_inputs.npy")
+
+   data_outputs = numpy.load("dataset_outputs.npy")
+   data_outputs = tensorflow.keras.utils.to_categorical(data_outputs)
+
+The next figure shows how the fitness value changes.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/93722654-cc55d780-fb98-11ea-8f95-7b65dc67f5c8.png
+   :alt:
+
+Here are some statistics about the trained model. The model accuracy is
+75% after the 200 generations. Note that running the code again may give
+different results.
+
+.. code::
+
+   Fitness value of the best solution = 2.7462310258668805
+   Index of the best solution : 0
+   Categorical Crossentropy :  0.3641354
+   Accuracy :  0.75
+
+To improve the model performance, you can do the following:
+
+- Add more layers.
+
+- Modify the existing layers.
+
+- Use different parameters for the layers.
+
+- Use different parameters for the genetic algorithm (e.g. number of
+  solutions, number of generations, etc.)
diff --git a/docs/source/README_pygad_nn_ReadTheDocs.rst b/docs/source/nn.rst
similarity index 97%
rename from docs/source/README_pygad_nn_ReadTheDocs.rst
rename to docs/source/nn.rst
index 5999754..c7902ec 100644
--- a/docs/source/README_pygad_nn_ReadTheDocs.rst
+++ b/docs/source/nn.rst
@@ -1,976 +1,976 @@
-.. _pygadnn-module:
-
-``pygad.nn`` Module
-===================
-
-This section of the PyGAD's library documentation discusses the
-**pygad.nn** module.
-
-Using the **pygad.nn** module, artificial neural networks are created.
-The purpose of this module is to only implement the **forward pass** of
-a neural network without using a training algorithm. The **pygad.nn**
-module builds the network layers, implements the activations functions,
-trains the network, makes predictions, and more.
-
-Later, the **pygad.gann** module is used to train the **pygad.nn**
-network using the genetic algorithm built in the **pygad** module.
- -Starting from `PyGAD -2.7.1 `__, -the **pygad.nn** module supports both classification and regression -problems. For more information, check the ``problem_type`` parameter in -the ``pygad.nn.train()`` and ``pygad.nn.predict()`` functions. - -Supported Layers -================ - -Each layer supported by the **pygad.nn** module has a corresponding -class. The layers and their classes are: - -1. **Input**: Implemented using the ``pygad.nn.InputLayer`` class. - -2. **Dense** (Fully Connected): Implemented using the - ``pygad.nn.DenseLayer`` class. - -In the future, more layers will be added. The next subsections discuss -such layers. - -.. _pygadnninputlayer-class: - -``pygad.nn.InputLayer`` Class ------------------------------ - -The ``pygad.nn.InputLayer`` class creates the input layer for the neural -network. For each network, there is only a single input layer. The -network architecture must start with an input layer. - -This class has no methods or class attributes. All it has is a -constructor that accepts a parameter named ``num_neurons`` representing -the number of neurons in the input layer. - -An instance attribute named ``num_neurons`` is created within the -constructor to keep such a number. Here is an example of building an -input layer with 20 neurons. - -.. code:: python - - input_layer = pygad.nn.InputLayer(num_neurons=20) - -Here is how the single attribute ``num_neurons`` within the instance of -the ``pygad.nn.InputLayer`` class can be accessed. - -.. code:: python - - num_input_neurons = input_layer.num_neurons - - print("Number of input neurons =", num_input_neurons) - -This is everything about the input layer. - -.. _pygadnndenselayer-class: - -``pygad.nn.DenseLayer`` Class ------------------------------ - -Using the ``pygad.nn.DenseLayer`` class, dense (fully-connected) layers -can be created. To create a dense layer, just create a new instance of -the class. The constructor accepts the following parameters: - -- ``num_neurons``: Number of neurons in the dense layer. - -- ``previous_layer``: A reference to the previous layer. Using the - ``previous_layer`` attribute, a linked list is created that connects - all network layers. - -- ``activation_function``: A string representing the activation - function to be used in this layer. Defaults to ``"sigmoid"``. - Currently, the supported values for the activation functions are - ``"sigmoid"``, ``"relu"``, ``"softmax"`` (supported in PyGAD 2.3.0 - and higher), and ``"None"`` (supported in PyGAD 2.7.0 and higher). - When a layer has its activation function set to ``"None"``, then it - means no activation function is applied. For a **regression - problem**, set the activation function of the output (last) layer to - ``"None"``. If all outputs in the regression problem are nonnegative, - then it is possible to use the ReLU function in the output layer. - -Within the constructor, the accepted parameters are used as instance -attributes. Besides the parameters, some new instance attributes are -created which are: - -- ``initial_weights``: The initial weights for the dense layer. - -- ``trained_weights``: The trained weights of the dense layer. This - attribute is initialized by the value in the ``initial_weights`` - attribute. - -Here is an example for creating a dense layer with 12 neurons. Note that -the ``previous_layer`` parameter is assigned to the input layer -``input_layer``. - -.. 
code:: python - - dense_layer = pygad.nn.DenseLayer(num_neurons=12, - previous_layer=input_layer, - activation_function="relu") - -Here is how to access some attributes in the dense layer: - -.. code:: python - - num_dense_neurons = dense_layer.num_neurons - dense_initail_weights = dense_layer.initial_weights - - print("Number of dense layer attributes =", num_dense_neurons) - print("Initial weights of the dense layer :", dense_initail_weights) - -Because ``dense_layer`` holds a reference to the input layer, then the -number of input neurons can be accessed. - -.. code:: python - - input_layer = dense_layer.previous_layer - num_input_neurons = input_layer.num_neurons - - print("Number of input neurons =", num_input_neurons) - -Here is another dense layer. This dense layer's ``previous_layer`` -attribute points to the previously created dense layer. - -.. code:: python - - dense_layer2 = pygad.nn.DenseLayer(num_neurons=5, - previous_layer=dense_layer, - activation_function="relu") - -Because ``dense_layer2`` holds a reference to ``dense_layer`` in its -``previous_layer`` attribute, then the number of neurons in -``dense_layer`` can be accessed. - -.. code:: python - - dense_layer = dense_layer2.previous_layer - dense_layer_neurons = dense_layer.num_neurons - - print("Number of dense neurons =", num_input_neurons) - -After getting the reference to ``dense_layer``, we can use it to access -the number of input neurons. - -.. code:: python - - dense_layer = dense_layer2.previous_layer - input_layer = dense_layer.previous_layer - num_input_neurons = input_layer.num_neurons - - print("Number of input neurons =", num_input_neurons) - -Assuming that ``dense_layer2`` is the last dense layer, then it is -regarded as the output layer. - -.. _previouslayer-attribute: - -``previous_layer`` Attribute -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The ``previous_layer`` attribute in the ``pygad.nn.DenseLayer`` class -creates a one way linked list between all the layers in the network -architecture as described by the next figure. - -The last (output) layer indexed N points to layer **N-1**, layer **N-1** -points to the layer **N-2**, the layer **N-2** points to the layer -**N-3**, and so on until reaching the end of the linked list which is -layer 1 (input layer). - -.. figure:: https://user-images.githubusercontent.com/16560492/81918975-816af880-95d7-11ea-83e3-34d14c3316db.jpg - :alt: - -The one way linked list allows returning all properties of all layers in -the network architecture by just passing the last layer in the network. -The linked list moves from the output layer towards the input layer. - -Using the ``previous_layer`` attribute of layer **N**, the layer **N-1** -can be accessed. Using the ``previous_layer`` attribute of layer -**N-1**, layer **N-2** can be accessed. The process continues until -reaching a layer that does not have a ``previous_layer`` attribute -(which is the input layer). - -The properties of the layers include the weights (initial or trained), -activation functions, and more. Here is how a ``while`` loop is used to -iterate through all the layers. The ``while`` loop stops only when the -current layer does not have a ``previous_layer`` attribute. This layer -is the input layer. - -.. code:: python - - layer = dense_layer2 - - while "previous_layer" in layer.__init__.__code__.co_varnames: - print("Number of neurons =", layer.num_neurons) - - # Go to the previous layer. 
- layer = layer.previous_layer - -Functions to Manipulate Neural Networks -======================================= - -There are a number of functions existing in the ``pygad.nn`` module that -helps to manipulate the neural network. - -.. _pygadnnlayersweights: - -``pygad.nn.layers_weights()`` ------------------------------ - -Creates and returns a list holding the weights matrices of all layers in -the neural network. - -Accepts the following parameters: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -- ``initial``: When ``True`` (default), the function returns the - **initial** weights of the layers using the layers' - ``initial_weights`` attribute. When ``False``, it returns the - **trained** weights of the layers using the layers' - ``trained_weights`` attribute. The initial weights are only needed - before network training starts. The trained weights are needed to - predict the network outputs. - -The function uses a ``while`` loop to iterate through the layers using -their ``previous_layer`` attribute. For each layer, either the initial -weights or the trained weights are returned based on where the -``initial`` parameter is ``True`` or ``False``. - -.. _pygadnnlayersweightsasvector: - -``pygad.nn.layers_weights_as_vector()`` ---------------------------------------- - -Creates and returns a list holding the weights **vectors** of all layers -in the neural network. The weights array of each layer is reshaped to -get a vector. - -This function is similar to the ``layers_weights()`` function except -that it returns the weights of each layer as a vector, not as an array. - -Accepts the following parameters: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -- ``initial``: When ``True`` (default), the function returns the - **initial** weights of the layers using the layers' - ``initial_weights`` attribute. When ``False``, it returns the - **trained** weights of the layers using the layers' - ``trained_weights`` attribute. The initial weights are only needed - before network training starts. The trained weights are needed to - predict the network outputs. - -The function uses a ``while`` loop to iterate through the layers using -their ``previous_layer`` attribute. For each layer, either the initial -weights or the trained weights are returned based on where the -``initial`` parameter is ``True`` or ``False``. - -.. _pygadnnlayersweightsasmatrix: - -``pygad.nn.layers_weights_as_matrix()`` ---------------------------------------- - -Converts the network weights from vectors to matrices. - -Compared to the ``layers_weights_as_vectors()`` function that only -accepts a reference to the last layer and returns the network weights as -vectors, this function accepts a reference to the last layer in addition -to a list holding the weights as vectors. Such vectors are converted -into matrices. - -Accepts the following parameters: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -- ``vector_weights``: The network weights as vectors where the weights - of each layer form a single vector. - -The function uses a ``while`` loop to iterate through the layers using -their ``previous_layer`` attribute. For each layer, the shape of its -weights array is returned. This shape is used to reshape the weights -vector of the layer into a matrix. - -.. 
_pygadnnlayersactivations: - -``pygad.nn.layers_activations()`` ---------------------------------- - -Creates and returns a list holding the names of the activation functions -of all layers in the neural network. - -Accepts the following parameter: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -The function uses a ``while`` loop to iterate through the layers using -their ``previous_layer`` attribute. For each layer, the name of the -activation function used is returned using the layer's -``activation_function`` attribute. - -.. _pygadnnsigmoid: - -``pygad.nn.sigmoid()`` ----------------------- - -Applies the sigmoid function and returns its result. - -Accepts the following parameters: - -- ``sop``: The input to which the sigmoid function is applied. - -.. _pygadnnrelu: - -``pygad.nn.relu()`` -------------------- - -Applies the rectified linear unit (ReLU) function and returns its -result. - -Accepts the following parameters: - -- ``sop``: The input to which the relu function is applied. - -.. _pygadnnsoftmax: - -``pygad.nn.softmax()`` ----------------------- - -Applies the softmax function and returns its result. - -Accepts the following parameters: - -- ``sop``: The input to which the softmax function is applied. - -.. _pygadnntrain: - -``pygad.nn.train()`` --------------------- - -Trains the neural network. - -Accepts the following parameters: - -- ``num_epochs``: Number of epochs. - -- ``last_layer``: Reference to the last (output) layer in the network - architecture. - -- ``data_inputs``: Data features. - -- ``data_outputs``: Data outputs. - -- ``problem_type``: The type of the problem which can be either - ``"classification"`` or ``"regression"``. Added in PyGAD 2.7.0 and - higher. - -- ``learning_rate``: Learning rate. - -For each epoch, all the data samples are fed to the network to return -their predictions. After each epoch, the weights are updated using only -the learning rate. No learning algorithm is used because the purpose of -this project is to only build the forward pass of training a neural -network. - -.. _pygadnnupdateweights: - -``pygad.nn.update_weights()`` ------------------------------ - -Calculates and returns the updated weights. Even no training algorithm -is used in this project, the weights are updated using the learning -rate. It is not the best way to update the weights but it is better than -keeping it as it is by making some small changes to the weights. - -Accepts the following parameters: - -- ``weights``: The current weights of the network. - -- ``network_error``: The network error. - -- ``learning_rate``: The learning rate. - -.. _pygadnnupdatelayerstrainedweights: - -``pygad.nn.update_layers_trained_weights()`` --------------------------------------------- - -After the network weights are trained, this function updates the -``trained_weights`` attribute of each layer by the weights calculated -after passing all the epochs (such weights are passed in the -``final_weights`` parameter) - -By just passing a reference to the last layer in the network (i.e. -output layer) in addition to the final weights, this function updates -the ``trained_weights`` attribute of all layers. - -Accepts the following parameters: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -- ``final_weights``: An array of weights of all layers in the network - after passing through all the epochs. 
- -The function uses a ``while`` loop to iterate through the layers using -their ``previous_layer`` attribute. For each layer, its -``trained_weights`` attribute is assigned the weights of the layer from -the ``final_weights`` parameter. - -.. _pygadnnpredict: - -``pygad.nn.predict()`` ----------------------- - -Uses the trained weights for predicting the samples' outputs. It returns -a list of the predicted outputs for all samples. - -Accepts the following parameters: - -- ``last_layer``: A reference to the last (output) layer in the network - architecture. - -- ``data_inputs``: Data features. - -- ``problem_type``: The type of the problem which can be either - ``"classification"`` or ``"regression"``. Added in PyGAD 2.7.0 and - higher. - -All the data samples are fed to the network to return their predictions. - -Helper Functions -================ - -There are functions in the ``pygad.nn`` module that does not directly -manipulate the neural networks. - -.. _pygadnntovector: - -``pygad.nn.to_vector()`` ------------------------- - -Converts a passed NumPy array (of any dimensionality) to its ``array`` -parameter into a 1D vector and returns the vector. - -Accepts the following parameters: - -- ``array``: The NumPy array to be converted into a 1D vector. - -.. _pygadnntoarray: - -``pygad.nn.to_array()`` ------------------------ - -Converts a passed vector to its ``vector`` parameter into a NumPy array -and returns the array. - -Accepts the following parameters: - -- ``vector``: The 1D vector to be converted into an array. - -- ``shape``: The target shape of the array. - -Supported Activation Functions -============================== - -The supported activation functions are: - -1. Sigmoid: Implemented using the ``pygad.nn.sigmoid()`` function. - -2. Rectified Linear Unit (ReLU): Implemented using the - ``pygad.nn.relu()`` function. - -3. Softmax: Implemented using the ``pygad.nn.softmax()`` function. - -Steps to Build a Neural Network -=============================== - -This section discusses how to use the ``pygad.nn`` module for building a -neural network. The summary of the steps are as follows: - -- Reading the Data - -- Building the Network Architecture - -- Training the Network - -- Making Predictions - -- Calculating Some Statistics - -Reading the Data ----------------- - -Before building the network architecture, the first thing to do is to -prepare the data that will be used for training the network. - -In this example, 4 classes of the **Fruits360** dataset are used for -preparing the training data. The 4 classes are: - -1. `Apple - Braeburn `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/apple - -2. `Lemon - Meyer `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/lemon - -3. `Mango `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/mango - -4. `Raspberry `__: - This class's data is available at - https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry - -The features from such 4 classes are extracted according to the next -code. This code reads the raw images of the 4 classes of the dataset, -prepares the features and the outputs as NumPy arrays, and saves the -arrays in 2 files. - -This code extracts a feature vector from each image representing the -color histogram of the HSV space's hue channel. - -.. 
code:: python - - import numpy - import skimage.io, skimage.color, skimage.feature - import os - - fruits = ["apple", "raspberry", "mango", "lemon"] - # Number of samples in the datset used = 492+490+490+490=1,962 - # 360 is the length of the feature vector. - dataset_features = numpy.zeros(shape=(1962, 360)) - outputs = numpy.zeros(shape=(1962)) - - idx = 0 - class_label = 0 - for fruit_dir in fruits: - curr_dir = os.path.join(os.path.sep, fruit_dir) - all_imgs = os.listdir(os.getcwd()+curr_dir) - for img_file in all_imgs: - if img_file.endswith(".jpg"): # Ensures reading only JPG files. - fruit_data = skimage.io.imread(fname=os.path.sep.join([os.getcwd(), curr_dir, img_file]), as_gray=False) - fruit_data_hsv = skimage.color.rgb2hsv(rgb=fruit_data) - hist = numpy.histogram(a=fruit_data_hsv[:, :, 0], bins=360) - dataset_features[idx, :] = hist[0] - outputs[idx] = class_label - idx = idx + 1 - class_label = class_label + 1 - - # Saving the extracted features and the outputs as NumPy files. - numpy.save("dataset_features.npy", dataset_features) - numpy.save("outputs.npy", outputs) - -To save your time, the training data is already prepared and 2 files -created by the next code are available for download at these links: - -1. `dataset_features.npy `__: - The features - https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - -2. `outputs.npy `__: - The class labels - https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - -The -`outputs.npy `__ -file gives the following labels for the 4 classes: - -1. `Apple - Braeburn `__: - Class label is **0** - -2. `Lemon - Meyer `__: - Class label is **1** - -3. `Mango `__: - Class label is **2** - -4. `Raspberry `__: - Class label is **3** - -The project has 4 folders holding the images for the 4 classes. - -After the 2 files are created, then just read them to return the NumPy -arrays according to the next 2 lines: - -.. code:: python - - data_inputs = numpy.load("dataset_features.npy") - data_outputs = numpy.load("outputs.npy") - -After the data is prepared, next is to create the network architecture. - -Building the Network Architecture ---------------------------------- - -The input layer is created by instantiating the ``pygad.nn.InputLayer`` -class according to the next code. A network can only have a single input -layer. - -.. code:: python - - import pygad.nn - num_inputs = data_inputs.shape[1] - - input_layer = pygad.nn.InputLayer(num_inputs) - -After the input layer is created, next is to create a number of dense -layers according to the next code. Normally, the last dense layer is -regarded as the output layer. Note that the output layer has a number of -neurons equal to the number of classes in the dataset which is 4. - -.. code:: python - - hidden_layer = pygad.nn.DenseLayer(num_neurons=HL2_neurons, previous_layer=input_layer, activation_function="relu") - output_layer = pygad.nn.DenseLayer(num_neurons=4, previous_layer=hidden_layer2, activation_function="softmax") - -After both the data and the network architecture are prepared, the next -step is to train the network. - -Training the Network --------------------- - -Here is an example of using the ``pygad.nn.train()`` function. - -.. code:: python - - pygad.nn.train(num_epochs=10, - last_layer=output_layer, - data_inputs=data_inputs, - data_outputs=data_outputs, - learning_rate=0.01) - -After training the network, the next step is to make predictions. 
- -Making Predictions ------------------- - -The ``pygad.nn.predict()`` function uses the trained network for making -predictions. Here is an example. - -.. code:: python - - predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs) - -It is not expected to have high accuracy in the predictions because no -training algorithm is used. - -Calculating Some Statistics ---------------------------- - -Based on the predictions the network made, some statistics can be -calculated such as the number of correct and wrong predictions in -addition to the classification accuracy. - -.. code:: python - - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -It is very important to note that it is not expected that the -classification accuracy is high because no training algorithm is used. -Please check the documentation of the ``pygad.gann`` module for training -the network using the genetic algorithm. - -Examples -======== - -This section gives the complete code of some examples that build neural -networks using ``pygad.nn``. Each subsection builds a different network. - -XOR Classification ------------------- - -This is an example of building a network with 1 hidden layer with 2 -neurons for building a network that simulates the XOR logic gate. -Because the XOR problem has 2 classes (0 and 1), then the output layer -has 2 neurons, one for each class. - -.. code:: python - - import numpy - import pygad.nn - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.array([[1, 1], - [1, 0], - [0, 1], - [0, 0]]) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.array([0, - 1, - 1, - 0]) - - # The number of inputs (i.e. feature vector length) per sample - num_inputs = data_inputs.shape[1] - # Number of outputs per sample - num_outputs = 2 - - HL1_neurons = 2 - - # Building the network architecture. - input_layer = pygad.nn.InputLayer(num_inputs) - hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") - output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="softmax") - - # Training the network. - pygad.nn.train(num_epochs=10, - last_layer=output_layer, - data_inputs=data_inputs, - data_outputs=data_outputs, - learning_rate=0.01) - - # Using the trained network for predictions. - predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs) - - # Calculating some statistics - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -Image Classification --------------------- - -This example is discussed in the **Steps to Build a Neural Network** -section and its complete code is listed below. 
- -Remember to either download or create the -`dataset_features.npy `__ -and -`outputs.npy `__ -files before running this code. - -.. code:: python - - import numpy - import pygad.nn - - # Reading the data features. Check the 'extract_features.py' script for extracting the features & preparing the outputs of the dataset. - data_inputs = numpy.load("dataset_features.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - - # Optional step for filtering the features using the standard deviation. - features_STDs = numpy.std(a=data_inputs, axis=0) - data_inputs = data_inputs[:, features_STDs > 50] - - # Reading the data outputs. Check the 'extract_features.py' script for extracting the features & preparing the outputs of the dataset. - data_outputs = numpy.load("outputs.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - - # The number of inputs (i.e. feature vector length) per sample - num_inputs = data_inputs.shape[1] - # Number of outputs per sample - num_outputs = 4 - - HL1_neurons = 150 - HL2_neurons = 60 - - # Building the network architecture. - input_layer = pygad.nn.InputLayer(num_inputs) - hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") - hidden_layer2 = pygad.nn.DenseLayer(num_neurons=HL2_neurons, previous_layer=hidden_layer1, activation_function="relu") - output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer2, activation_function="softmax") - - # Training the network. - pygad.nn.train(num_epochs=10, - last_layer=output_layer, - data_inputs=data_inputs, - data_outputs=data_outputs, - learning_rate=0.01) - - # Using the trained network for predictions. - predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs) - - # Calculating some statistics - num_wrong = numpy.where(predictions != data_outputs)[0] - num_correct = data_outputs.size - num_wrong.size - accuracy = 100 * (num_correct/data_outputs.size) - print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) - print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) - print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) - -Regression Example 1 --------------------- - -The next code listing builds a neural network for regression. Here is -what to do to make the code works for regression: - -1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and - ``pygad.nn.predict()`` functions to the string ``"regression"``. - -.. code:: python - - pygad.nn.train(..., - problem_type="regression") - - predictions = pygad.nn.predict(..., - problem_type="regression") - -1. Set the activation function for the output layer to the string - ``"None"``. - -.. code:: python - - output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None") - -1. Calculate the prediction error according to your preferred error - function. Here is how the mean absolute error is calculated. - -.. code:: python - - abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) - print("Absolute error : {abs_error}.".format(abs_error=abs_error)) - -Here is the complete code. Yet, there is no algorithm used to train the -network and thus the network is expected to give bad results. Later, the -``pygad.gann`` module is used to train either a regression or -classification networks. - -.. 
code:: python - - import numpy - import pygad.nn - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.array([[2, 5, -3, 0.1], - [8, 15, 20, 13]]) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.array([0.1, - 1.5]) - - # The number of inputs (i.e. feature vector length) per sample - num_inputs = data_inputs.shape[1] - # Number of outputs per sample - num_outputs = 1 - - HL1_neurons = 2 - - # Building the network architecture. - input_layer = pygad.nn.InputLayer(num_inputs) - hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") - output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None") - - # Training the network. - pygad.nn.train(num_epochs=100, - last_layer=output_layer, - data_inputs=data_inputs, - data_outputs=data_outputs, - learning_rate=0.01, - problem_type="regression") - - # Using the trained network for predictions. - predictions = pygad.nn.predict(last_layer=output_layer, - data_inputs=data_inputs, - problem_type="regression") - - # Calculating some statistics - abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) - print("Absolute error : {abs_error}.".format(abs_error=abs_error)) - -Regression Example 2 - Fish Weight Prediction ---------------------------------------------- - -This example uses the Fish Market Dataset available at Kaggle -(https://www.kaggle.com/aungpyaeap/fish-market). Simply download the CSV -dataset from `this -link `__ -(https://www.kaggle.com/aungpyaeap/fish-market/download). The dataset is -also available at the `GitHub project of the pygad.nn -module `__: -https://github.com/ahmedfgad/NumPyANN - -Using the Pandas library, the dataset is read using the ``read_csv()`` -function. - -.. code:: python - - data = numpy.array(pandas.read_csv("Fish.csv")) - -The last 5 columns in the dataset are used as inputs and the **Weight** -column is used as output. - -.. code:: python - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight - -Note how the activation function at the last layer is set to ``"None"``. -Moreover, the ``problem_type`` parameter in the ``pygad.nn.train()`` and -``pygad.nn.predict()`` functions is set to ``"regression"``. - -After the ``pygad.nn.train()`` function completes, the mean absolute -error is calculated. - -.. code:: python - - abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) - print("Absolute error : {abs_error}.".format(abs_error=abs_error)) - -Here is the complete code. - -.. code:: python - - import numpy - import pygad.nn - import pandas - - data = numpy.array(pandas.read_csv("Fish.csv")) - - # Preparing the NumPy array of the inputs. - data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) - - # Preparing the NumPy array of the outputs. - data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight - - # The number of inputs (i.e. feature vector length) per sample - num_inputs = data_inputs.shape[1] - # Number of outputs per sample - num_outputs = 1 - - HL1_neurons = 2 - - # Building the network architecture. 
-    input_layer = pygad.nn.InputLayer(num_inputs)
-    hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu")
-    output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None")
-
-    # Training the network.
-    pygad.nn.train(num_epochs=100,
-                   last_layer=output_layer,
-                   data_inputs=data_inputs,
-                   data_outputs=data_outputs,
-                   learning_rate=0.01,
-                   problem_type="regression")
-
-    # Using the trained network for predictions.
-    predictions = pygad.nn.predict(last_layer=output_layer,
-                                   data_inputs=data_inputs,
-                                   problem_type="regression")
-
-    # Calculating some statistics
-    abs_error = numpy.mean(numpy.abs(predictions - data_outputs))
-    print("Absolute error : {abs_error}.".format(abs_error=abs_error))
+.. _pygadnn-module:
+
+``pygad.nn`` Module
+===================
+
+This section of the PyGAD library's documentation discusses the
+**pygad.nn** module.
+
+Using the **pygad.nn** module, artificial neural networks are created.
+The purpose of this module is to only implement the **forward pass** of
+a neural network without using a training algorithm. The **pygad.nn**
+module builds the network layers, implements the activation functions,
+trains the network, makes predictions, and more.
+
+Later, the **pygad.gann** module is used to train the **pygad.nn**
+network using the genetic algorithm built in the **pygad** module.
+
+Starting from `PyGAD
+2.7.1 `__,
+the **pygad.nn** module supports both classification and regression
+problems. For more information, check the ``problem_type`` parameter in
+the ``pygad.nn.train()`` and ``pygad.nn.predict()`` functions.
+
+Supported Layers
+================
+
+Each layer supported by the **pygad.nn** module has a corresponding
+class. The layers and their classes are:
+
+1. **Input**: Implemented using the ``pygad.nn.InputLayer`` class.
+
+2. **Dense** (Fully Connected): Implemented using the
+   ``pygad.nn.DenseLayer`` class.
+
+In the future, more layers will be added. The next subsections discuss
+such layers.
+
+.. _pygadnninputlayer-class:
+
+``pygad.nn.InputLayer`` Class
+-----------------------------
+
+The ``pygad.nn.InputLayer`` class creates the input layer for the neural
+network. For each network, there is only a single input layer. The
+network architecture must start with an input layer.
+
+This class has no methods or class attributes. All it has is a
+constructor that accepts a parameter named ``num_neurons`` representing
+the number of neurons in the input layer.
+
+An instance attribute named ``num_neurons`` is created within the
+constructor to keep such a number. Here is an example of building an
+input layer with 20 neurons.
+
+.. code:: python
+
+    input_layer = pygad.nn.InputLayer(num_neurons=20)
+
+Here is how the single attribute ``num_neurons`` within the instance of
+the ``pygad.nn.InputLayer`` class can be accessed.
+
+.. code:: python
+
+    num_input_neurons = input_layer.num_neurons
+
+    print("Number of input neurons =", num_input_neurons)
+
+This is everything about the input layer.
+
+.. _pygadnndenselayer-class:
+
+``pygad.nn.DenseLayer`` Class
+-----------------------------
+
+Using the ``pygad.nn.DenseLayer`` class, dense (fully-connected) layers
+can be created. To create a dense layer, just create a new instance of
+the class. The constructor accepts the following parameters:
+
+- ``num_neurons``: Number of neurons in the dense layer.
+
+- ``previous_layer``: A reference to the previous layer. Using the
+  ``previous_layer`` attribute, a linked list is created that connects
+  all network layers.
+
+- ``activation_function``: A string representing the activation
+  function to be used in this layer. Defaults to ``"sigmoid"``.
+  Currently, the supported values for the activation functions are
+  ``"sigmoid"``, ``"relu"``, ``"softmax"`` (supported in PyGAD 2.3.0
+  and higher), and ``"None"`` (supported in PyGAD 2.7.0 and higher).
+  When a layer has its activation function set to ``"None"``, it
+  means no activation function is applied. For a **regression
+  problem**, set the activation function of the output (last) layer to
+  ``"None"``. If all outputs in the regression problem are nonnegative,
+  then it is possible to use the ReLU function in the output layer.
+
+Within the constructor, the accepted parameters are used as instance
+attributes. Besides the parameters, some new instance attributes are
+created which are:
+
+- ``initial_weights``: The initial weights for the dense layer.
+
+- ``trained_weights``: The trained weights of the dense layer. This
+  attribute is initialized by the value in the ``initial_weights``
+  attribute.
+
+Here is an example of creating a dense layer with 12 neurons. Note that
+the ``previous_layer`` parameter is assigned to the input layer
+``input_layer``.
+
+.. code:: python
+
+    dense_layer = pygad.nn.DenseLayer(num_neurons=12,
+                                      previous_layer=input_layer,
+                                      activation_function="relu")
+
+Here is how to access some attributes in the dense layer:
+
+.. code:: python
+
+    num_dense_neurons = dense_layer.num_neurons
+    dense_initial_weights = dense_layer.initial_weights
+
+    print("Number of dense layer neurons =", num_dense_neurons)
+    print("Initial weights of the dense layer :", dense_initial_weights)
+
+Because ``dense_layer`` holds a reference to the input layer, the
+number of input neurons can be accessed.
+
+.. code:: python
+
+    input_layer = dense_layer.previous_layer
+    num_input_neurons = input_layer.num_neurons
+
+    print("Number of input neurons =", num_input_neurons)
+
+Here is another dense layer. This dense layer's ``previous_layer``
+attribute points to the previously created dense layer.
+
+.. code:: python
+
+    dense_layer2 = pygad.nn.DenseLayer(num_neurons=5,
+                                       previous_layer=dense_layer,
+                                       activation_function="relu")
+
+Because ``dense_layer2`` holds a reference to ``dense_layer`` in its
+``previous_layer`` attribute, the number of neurons in
+``dense_layer`` can be accessed.
+
+.. code:: python
+
+    dense_layer = dense_layer2.previous_layer
+    dense_layer_neurons = dense_layer.num_neurons
+
+    print("Number of dense neurons =", dense_layer_neurons)
+
+After getting the reference to ``dense_layer``, we can use it to access
+the number of input neurons.
+
+.. code:: python
+
+    dense_layer = dense_layer2.previous_layer
+    input_layer = dense_layer.previous_layer
+    num_input_neurons = input_layer.num_neurons
+
+    print("Number of input neurons =", num_input_neurons)
+
+Assuming that ``dense_layer2`` is the last dense layer, it is
+regarded as the output layer.
+
+.. _previouslayer-attribute:
+
+``previous_layer`` Attribute
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``previous_layer`` attribute in the ``pygad.nn.DenseLayer`` class
+creates a one-way linked list between all the layers in the network
+architecture as described by the next figure.
+
+The last (output) layer indexed **N** points to layer **N-1**, layer
+**N-1** points to layer **N-2**, layer **N-2** points to layer
+**N-3**, and so on until reaching the end of the linked list which is
+layer 1 (the input layer).
+
+.. figure:: https://user-images.githubusercontent.com/16560492/81918975-816af880-95d7-11ea-83e3-34d14c3316db.jpg
+   :alt:
+
+The one-way linked list allows returning all properties of all layers in
+the network architecture by just passing the last layer in the network.
+The linked list moves from the output layer towards the input layer.
+
+Using the ``previous_layer`` attribute of layer **N**, layer **N-1**
+can be accessed. Using the ``previous_layer`` attribute of layer
+**N-1**, layer **N-2** can be accessed. The process continues until
+reaching a layer that does not have a ``previous_layer`` attribute
+(which is the input layer).
+
+The properties of the layers include the weights (initial or trained),
+activation functions, and more. Here is how a ``while`` loop is used to
+iterate through all the layers. The ``while`` loop stops only when the
+current layer does not have a ``previous_layer`` attribute. This layer
+is the input layer.
+
+.. code:: python
+
+    layer = dense_layer2
+
+    while "previous_layer" in layer.__init__.__code__.co_varnames:
+        print("Number of neurons =", layer.num_neurons)
+
+        # Go to the previous layer.
+        layer = layer.previous_layer
+
+Functions to Manipulate Neural Networks
+=======================================
+
+There are a number of functions in the ``pygad.nn`` module that
+help to manipulate the neural network.
+
+.. _pygadnnlayersweights:
+
+``pygad.nn.layers_weights()``
+-----------------------------
+
+Creates and returns a list holding the weights matrices of all layers in
+the neural network.
+
+Accepts the following parameters:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+- ``initial``: When ``True`` (default), the function returns the
+  **initial** weights of the layers using the layers'
+  ``initial_weights`` attribute. When ``False``, it returns the
+  **trained** weights of the layers using the layers'
+  ``trained_weights`` attribute. The initial weights are only needed
+  before network training starts. The trained weights are needed to
+  predict the network outputs.
+
+The function uses a ``while`` loop to iterate through the layers using
+their ``previous_layer`` attribute. For each layer, either the initial
+weights or the trained weights are returned based on whether the
+``initial`` parameter is ``True`` or ``False``.
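+
+A rough sketch of that traversal (illustrative only; the actual
+implementation lives inside the ``pygad.nn`` module):
+
+.. code:: python
+
+    def sketch_layers_weights(last_layer, initial=True):
+        weights = []
+        layer = last_layer
+        # Walk the linked list from the output layer back to the input layer.
+        while "previous_layer" in layer.__init__.__code__.co_varnames:
+            weights.append(layer.initial_weights if initial else layer.trained_weights)
+            layer = layer.previous_layer
+        # The layers were visited output-first, so reverse to input-first order.
+        weights.reverse()
+        return weights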
+
+.. _pygadnnlayersweightsasvector:
+
+``pygad.nn.layers_weights_as_vector()``
+---------------------------------------
+
+Creates and returns a list holding the weights **vectors** of all layers
+in the neural network. The weights array of each layer is reshaped to
+get a vector.
+
+This function is similar to the ``layers_weights()`` function except
+that it returns the weights of each layer as a vector, not as an array.
+
+Accepts the following parameters:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+- ``initial``: When ``True`` (default), the function returns the
+  **initial** weights of the layers using the layers'
+  ``initial_weights`` attribute. When ``False``, it returns the
+  **trained** weights of the layers using the layers'
+  ``trained_weights`` attribute. The initial weights are only needed
+  before network training starts. The trained weights are needed to
+  predict the network outputs.
+
+The function uses a ``while`` loop to iterate through the layers using
+their ``previous_layer`` attribute. For each layer, either the initial
+weights or the trained weights are returned based on whether the
+``initial`` parameter is ``True`` or ``False``.
+
+.. _pygadnnlayersweightsasmatrix:
+
+``pygad.nn.layers_weights_as_matrix()``
+---------------------------------------
+
+Converts the network weights from vectors to matrices.
+
+Compared to the ``layers_weights_as_vector()`` function that only
+accepts a reference to the last layer and returns the network weights as
+vectors, this function accepts a reference to the last layer in addition
+to a list holding the weights as vectors. Such vectors are converted
+into matrices.
+
+Accepts the following parameters:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+- ``vector_weights``: The network weights as vectors where the weights
+  of each layer form a single vector.
+
+The function uses a ``while`` loop to iterate through the layers using
+their ``previous_layer`` attribute. For each layer, the shape of its
+weights array is retrieved. This shape is used to reshape the weights
+vector of the layer into a matrix.
+
+.. _pygadnnlayersactivations:
+
+``pygad.nn.layers_activations()``
+---------------------------------
+
+Creates and returns a list holding the names of the activation functions
+of all layers in the neural network.
+
+Accepts the following parameter:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+The function uses a ``while`` loop to iterate through the layers using
+their ``previous_layer`` attribute. For each layer, the name of the
+activation function used is returned using the layer's
+``activation_function`` attribute.
+
+.. _pygadnnsigmoid:
+
+``pygad.nn.sigmoid()``
+----------------------
+
+Applies the sigmoid function and returns its result.
+
+Accepts the following parameter:
+
+- ``sop``: The input to which the sigmoid function is applied.
+
+.. _pygadnnrelu:
+
+``pygad.nn.relu()``
+-------------------
+
+Applies the rectified linear unit (ReLU) function and returns its
+result.
+
+Accepts the following parameter:
+
+- ``sop``: The input to which the ReLU function is applied.
+
+.. _pygadnnsoftmax:
+
+``pygad.nn.softmax()``
+----------------------
+
+Applies the softmax function and returns its result.
+
+Accepts the following parameter:
+
+- ``sop``: The input to which the softmax function is applied.
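+
+For intuition, here is how the three activations behave on a small NumPy
+vector (a quick illustrative snippet, not taken from the module itself):
+
+.. code:: python
+
+    import numpy
+
+    sop = numpy.array([-1.0, 0.0, 2.0])
+
+    sigmoid = 1.0 / (1.0 + numpy.exp(-1 * sop))           # Squashes each value into (0, 1).
+    relu = numpy.maximum(sop, 0)                          # Clips negative values to 0.
+    softmax = numpy.exp(sop) / numpy.sum(numpy.exp(sop))  # Normalizes into a probability distribution.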
+
+.. _pygadnntrain:
+
+``pygad.nn.train()``
+--------------------
+
+Trains the neural network.
+
+Accepts the following parameters:
+
+- ``num_epochs``: Number of epochs.
+
+- ``last_layer``: Reference to the last (output) layer in the network
+  architecture.
+
+- ``data_inputs``: Data features.
+
+- ``data_outputs``: Data outputs.
+
+- ``problem_type``: The type of the problem which can be either
+  ``"classification"`` or ``"regression"``. Added in PyGAD 2.7.0 and
+  higher.
+
+- ``learning_rate``: Learning rate.
+
+For each epoch, all the data samples are fed to the network to return
+their predictions. After each epoch, the weights are updated using only
+the learning rate. No learning algorithm is used because the purpose of
+this project is to only build the forward pass of training a neural
+network.
+
+.. _pygadnnupdateweights:
+
+``pygad.nn.update_weights()``
+-----------------------------
+
+Calculates and returns the updated weights. Even though no training
+algorithm is used in this project, the weights are updated using the
+learning rate by making some small changes to them. This is not the
+best way to update the weights, but it is better than keeping them
+unchanged.
+
+Accepts the following parameters:
+
+- ``weights``: The current weights of the network.
+
+- ``network_error``: The network error.
+
+- ``learning_rate``: The learning rate.
+
+.. _pygadnnupdatelayerstrainedweights:
+
+``pygad.nn.update_layers_trained_weights()``
+--------------------------------------------
+
+After the network weights are trained, this function updates the
+``trained_weights`` attribute of each layer by the weights calculated
+after passing all the epochs (such weights are passed in the
+``final_weights`` parameter).
+
+By just passing a reference to the last layer in the network (i.e. the
+output layer) in addition to the final weights, this function updates
+the ``trained_weights`` attribute of all layers.
+
+Accepts the following parameters:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+- ``final_weights``: An array of weights of all layers in the network
+  after passing through all the epochs.
+
+The function uses a ``while`` loop to iterate through the layers using
+their ``previous_layer`` attribute. For each layer, its
+``trained_weights`` attribute is assigned the weights of the layer from
+the ``final_weights`` parameter.
+
+.. _pygadnnpredict:
+
+``pygad.nn.predict()``
+----------------------
+
+Uses the trained weights for predicting the samples' outputs. It returns
+a list of the predicted outputs for all samples.
+
+Accepts the following parameters:
+
+- ``last_layer``: A reference to the last (output) layer in the network
+  architecture.
+
+- ``data_inputs``: Data features.
+
+- ``problem_type``: The type of the problem which can be either
+  ``"classification"`` or ``"regression"``. Added in PyGAD 2.7.0 and
+  higher.
+
+All the data samples are fed to the network to return their predictions.
+
+Helper Functions
+================
+
+There are functions in the ``pygad.nn`` module that do not directly
+manipulate the neural networks.
+
+.. _pygadnntovector:
+
+``pygad.nn.to_vector()``
+------------------------
+
+Converts the NumPy array (of any dimensionality) passed to its ``array``
+parameter into a 1D vector and returns the vector.
+
+Accepts the following parameter:
+
+- ``array``: The NumPy array to be converted into a 1D vector.
+
+.. _pygadnntoarray:
+
+``pygad.nn.to_array()``
+-----------------------
+
+Converts the 1D vector passed to its ``vector`` parameter into a NumPy
+array of the given shape and returns the array.
+
+Accepts the following parameters:
+
+- ``vector``: The 1D vector to be converted into an array.
+
+- ``shape``: The target shape of the array.
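+
+The two helpers are effectively inverses of each other. A quick round
+trip with plain NumPy shows the idea (illustrative only):
+
+.. code:: python
+
+    import numpy
+
+    array = numpy.array([[1, 2], [3, 4]])
+
+    vector = array.reshape(-1)         # Conceptually what to_vector(array) does.
+    restored = vector.reshape((2, 2))  # Conceptually what to_array(vector, shape=(2, 2)) does.
+
+    print(numpy.array_equal(array, restored))  # True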
+
+Supported Activation Functions
+==============================
+
+The supported activation functions are:
+
+1. Sigmoid: Implemented using the ``pygad.nn.sigmoid()`` function.
+
+2. Rectified Linear Unit (ReLU): Implemented using the
+   ``pygad.nn.relu()`` function.
+
+3. Softmax: Implemented using the ``pygad.nn.softmax()`` function.
+
+Steps to Build a Neural Network
+===============================
+
+This section discusses how to use the ``pygad.nn`` module for building a
+neural network. The steps are summarized as follows:
+
+- Reading the Data
+
+- Building the Network Architecture
+
+- Training the Network
+
+- Making Predictions
+
+- Calculating Some Statistics
+
+Reading the Data
+----------------
+
+Before building the network architecture, the first thing to do is to
+prepare the data that will be used for training the network.
+
+In this example, 4 classes of the **Fruits360** dataset are used for
+preparing the training data. The 4 classes are:
+
+1. `Apple
+   Braeburn `__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/apple
+
+2. `Lemon
+   Meyer `__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/lemon
+
+3. `Mango `__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/mango
+
+4. `Raspberry `__:
+   This class's data is available at
+   https://github.com/ahmedfgad/NumPyANN/tree/master/raspberry
+
+The features from such 4 classes are extracted according to the next
+code. This code reads the raw images of the 4 classes of the dataset,
+prepares the features and the outputs as NumPy arrays, and saves the
+arrays in 2 files.
+
+This code extracts a feature vector from each image representing the
+color histogram of the HSV space's hue channel.
+
+.. code:: python
+
+    import numpy
+    import skimage.io, skimage.color, skimage.feature
+    import os
+
+    fruits = ["apple", "raspberry", "mango", "lemon"]
+    # Number of samples in the dataset used = 492+490+490+490=1,962
+    # 360 is the length of the feature vector.
+    dataset_features = numpy.zeros(shape=(1962, 360))
+    outputs = numpy.zeros(shape=(1962))
+
+    idx = 0
+    class_label = 0
+    for fruit_dir in fruits:
+        curr_dir = os.path.join(os.path.sep, fruit_dir)
+        all_imgs = os.listdir(os.getcwd()+curr_dir)
+        for img_file in all_imgs:
+            if img_file.endswith(".jpg"): # Ensures reading only JPG files.
+                fruit_data = skimage.io.imread(fname=os.path.sep.join([os.getcwd(), curr_dir, img_file]), as_gray=False)
+                fruit_data_hsv = skimage.color.rgb2hsv(rgb=fruit_data)
+                hist = numpy.histogram(a=fruit_data_hsv[:, :, 0], bins=360)
+                dataset_features[idx, :] = hist[0]
+                outputs[idx] = class_label
+                idx = idx + 1
+        class_label = class_label + 1
+
+    # Saving the extracted features and the outputs as NumPy files.
+    numpy.save("dataset_features.npy", dataset_features)
+    numpy.save("outputs.npy", outputs)
+
+To save time, the training data is already prepared and the 2 files
+created by the previous code are available for download at these links:
+
+1. `dataset_features.npy `__:
+   The features
+   https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy
+
+2. `outputs.npy `__:
+   The class labels
+   https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy
+
+The
+`outputs.npy `__
+file gives the following labels for the 4 classes:
+
+1. `Apple
+   Braeburn `__:
+   Class label is **0**
+
+2. `Lemon
+   Meyer `__:
+   Class label is **1**
+
+3. `Mango `__:
+   Class label is **2**
+
+4. `Raspberry `__:
+   Class label is **3**
+
+The project has 4 folders holding the images for the 4 classes.
+
+After the 2 files are created, just read them to return the NumPy
+arrays according to the next 2 lines:
+
+.. code:: python
+
+    data_inputs = numpy.load("dataset_features.npy")
+    data_outputs = numpy.load("outputs.npy")
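+
+As a quick optional sanity check (not part of the original scripts), the
+shapes of the loaded arrays can be inspected before moving on:
+
+.. code:: python
+
+    print("Features shape :", data_inputs.shape)   # Expected: (1962, 360)
+    print("Outputs shape  :", data_outputs.shape)  # Expected: (1962,)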
+
+After the data is prepared, next is to create the network architecture.
+
+Building the Network Architecture
+---------------------------------
+
+The input layer is created by instantiating the ``pygad.nn.InputLayer``
+class according to the next code. A network can only have a single input
+layer.
+
+.. code:: python
+
+    import pygad.nn
+    num_inputs = data_inputs.shape[1]
+
+    input_layer = pygad.nn.InputLayer(num_inputs)
+
+After the input layer is created, next is to create a number of dense
+layers according to the next code. Normally, the last dense layer is
+regarded as the output layer. Note that the output layer has a number of
+neurons equal to the number of classes in the dataset which is 4.
+
+.. code:: python
+
+    HL_neurons = 150
+    hidden_layer = pygad.nn.DenseLayer(num_neurons=HL_neurons, previous_layer=input_layer, activation_function="relu")
+    output_layer = pygad.nn.DenseLayer(num_neurons=4, previous_layer=hidden_layer, activation_function="softmax")
+
+After both the data and the network architecture are prepared, the next
+step is to train the network.
+
+Training the Network
+--------------------
+
+Here is an example of using the ``pygad.nn.train()`` function.
+
+.. code:: python
+
+    pygad.nn.train(num_epochs=10,
+                   last_layer=output_layer,
+                   data_inputs=data_inputs,
+                   data_outputs=data_outputs,
+                   learning_rate=0.01)
+
+After training the network, the next step is to make predictions.
+
+Making Predictions
+------------------
+
+The ``pygad.nn.predict()`` function uses the trained network for making
+predictions. Here is an example.
+
+.. code:: python
+
+    predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs)
+
+It is not expected to have high accuracy in the predictions because no
+training algorithm is used.
+
+Calculating Some Statistics
+---------------------------
+
+Based on the predictions the network made, some statistics can be
+calculated such as the number of correct and wrong predictions in
+addition to the classification accuracy.
+
+.. code:: python
+
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+Note that the classification accuracy is not expected to be high because
+no training algorithm is used. Please check the documentation of the
+``pygad.gann`` module for training the network using the genetic
+algorithm.
+
+Examples
+========
+
+This section gives the complete code of some examples that build neural
+networks using ``pygad.nn``. Each subsection builds a different network.
+
+XOR Classification
+------------------
+
+This example builds a network with 1 hidden layer of 2 neurons that
+simulates the XOR logic gate. Because the XOR problem has 2 classes (0
+and 1), the output layer has 2 neurons, one for each class.
+
+.. code:: python
+
+    import numpy
+    import pygad.nn
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.array([[1, 1],
+                               [1, 0],
+                               [0, 1],
+                               [0, 0]])
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.array([0,
+                                1,
+                                1,
+                                0])
+
+    # The number of inputs (i.e.
feature vector length) per sample + num_inputs = data_inputs.shape[1] + # Number of outputs per sample + num_outputs = 2 + + HL1_neurons = 2 + + # Building the network architecture. + input_layer = pygad.nn.InputLayer(num_inputs) + hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") + output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="softmax") + + # Training the network. + pygad.nn.train(num_epochs=10, + last_layer=output_layer, + data_inputs=data_inputs, + data_outputs=data_outputs, + learning_rate=0.01) + + # Using the trained network for predictions. + predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs) + + # Calculating some statistics + num_wrong = numpy.where(predictions != data_outputs)[0] + num_correct = data_outputs.size - num_wrong.size + accuracy = 100 * (num_correct/data_outputs.size) + print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct)) + print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size)) + print("Classification accuracy : {accuracy}.".format(accuracy=accuracy)) + +Image Classification +-------------------- + +This example is discussed in the **Steps to Build a Neural Network** +section and its complete code is listed below. + +Remember to either download or create the +`dataset_features.npy `__ +and +`outputs.npy `__ +files before running this code. + +.. code:: python + + import numpy + import pygad.nn + + # Reading the data features. Check the 'extract_features.py' script for extracting the features & preparing the outputs of the dataset. + data_inputs = numpy.load("dataset_features.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy + + # Optional step for filtering the features using the standard deviation. + features_STDs = numpy.std(a=data_inputs, axis=0) + data_inputs = data_inputs[:, features_STDs > 50] + + # Reading the data outputs. Check the 'extract_features.py' script for extracting the features & preparing the outputs of the dataset. + data_outputs = numpy.load("outputs.npy") # Download from https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy + + # The number of inputs (i.e. feature vector length) per sample + num_inputs = data_inputs.shape[1] + # Number of outputs per sample + num_outputs = 4 + + HL1_neurons = 150 + HL2_neurons = 60 + + # Building the network architecture. + input_layer = pygad.nn.InputLayer(num_inputs) + hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") + hidden_layer2 = pygad.nn.DenseLayer(num_neurons=HL2_neurons, previous_layer=hidden_layer1, activation_function="relu") + output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer2, activation_function="softmax") + + # Training the network. + pygad.nn.train(num_epochs=10, + last_layer=output_layer, + data_inputs=data_inputs, + data_outputs=data_outputs, + learning_rate=0.01) + + # Using the trained network for predictions. 
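+    # pygad.nn.predict() returns a list holding one predicted label per
+    # sample, so it can be compared element-wise with data_outputs below.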
+    predictions = pygad.nn.predict(last_layer=output_layer, data_inputs=data_inputs)
+
+    # Calculating some statistics
+    num_wrong = numpy.where(predictions != data_outputs)[0]
+    num_correct = data_outputs.size - num_wrong.size
+    accuracy = 100 * (num_correct/data_outputs.size)
+    print("Number of correct classifications : {num_correct}.".format(num_correct=num_correct))
+    print("Number of wrong classifications : {num_wrong}.".format(num_wrong=num_wrong.size))
+    print("Classification accuracy : {accuracy}.".format(accuracy=accuracy))
+
+Regression Example 1
+--------------------
+
+The next code listing builds a neural network for regression. Here is
+what to do to make the code work for regression:
+
+1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and
+   ``pygad.nn.predict()`` functions to the string ``"regression"``.
+
+.. code:: python
+
+    pygad.nn.train(...,
+                   problem_type="regression")
+
+    predictions = pygad.nn.predict(...,
+                                   problem_type="regression")
+
+2. Set the activation function for the output layer to the string
+   ``"None"``.
+
+.. code:: python
+
+    output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None")
+
+3. Calculate the prediction error according to your preferred error
+   function. Here is how the mean absolute error is calculated.
+
+.. code:: python
+
+    abs_error = numpy.mean(numpy.abs(predictions - data_outputs))
+    print("Absolute error : {abs_error}.".format(abs_error=abs_error))
+
+Here is the complete code. Yet, there is no algorithm used to train the
+network and thus the network is expected to give bad results. Later, the
+``pygad.gann`` module is used to train either regression or
+classification networks.
+
+.. code:: python
+
+    import numpy
+    import pygad.nn
+
+    # Preparing the NumPy array of the inputs.
+    data_inputs = numpy.array([[2, 5, -3, 0.1],
+                               [8, 15, 20, 13]])
+
+    # Preparing the NumPy array of the outputs.
+    data_outputs = numpy.array([0.1,
+                                1.5])
+
+    # The number of inputs (i.e. feature vector length) per sample
+    num_inputs = data_inputs.shape[1]
+    # Number of outputs per sample
+    num_outputs = 1
+
+    HL1_neurons = 2
+
+    # Building the network architecture.
+    input_layer = pygad.nn.InputLayer(num_inputs)
+    hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu")
+    output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None")
+
+    # Training the network.
+    pygad.nn.train(num_epochs=100,
+                   last_layer=output_layer,
+                   data_inputs=data_inputs,
+                   data_outputs=data_outputs,
+                   learning_rate=0.01,
+                   problem_type="regression")
+
+    # Using the trained network for predictions.
+    predictions = pygad.nn.predict(last_layer=output_layer,
+                                   data_inputs=data_inputs,
+                                   problem_type="regression")
+
+    # Calculating some statistics
+    abs_error = numpy.mean(numpy.abs(predictions - data_outputs))
+    print("Absolute error : {abs_error}.".format(abs_error=abs_error))
+
+Regression Example 2 - Fish Weight Prediction
+---------------------------------------------
+
+This example uses the Fish Market Dataset available at Kaggle
+(https://www.kaggle.com/aungpyaeap/fish-market). Simply download the CSV
+dataset from `this
+link `__
+(https://www.kaggle.com/aungpyaeap/fish-market/download).
The dataset is +also available at the `GitHub project of the pygad.nn +module `__: +https://github.com/ahmedfgad/NumPyANN + +Using the Pandas library, the dataset is read using the ``read_csv()`` +function. + +.. code:: python + + data = numpy.array(pandas.read_csv("Fish.csv")) + +The last 5 columns in the dataset are used as inputs and the **Weight** +column is used as output. + +.. code:: python + + # Preparing the NumPy array of the inputs. + data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) + + # Preparing the NumPy array of the outputs. + data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight + +Note how the activation function at the last layer is set to ``"None"``. +Moreover, the ``problem_type`` parameter in the ``pygad.nn.train()`` and +``pygad.nn.predict()`` functions is set to ``"regression"``. + +After the ``pygad.nn.train()`` function completes, the mean absolute +error is calculated. + +.. code:: python + + abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) + print("Absolute error : {abs_error}.".format(abs_error=abs_error)) + +Here is the complete code. + +.. code:: python + + import numpy + import pygad.nn + import pandas + + data = numpy.array(pandas.read_csv("Fish.csv")) + + # Preparing the NumPy array of the inputs. + data_inputs = numpy.asarray(data[:, 2:], dtype=numpy.float32) + + # Preparing the NumPy array of the outputs. + data_outputs = numpy.asarray(data[:, 1], dtype=numpy.float32) # Fish Weight + + # The number of inputs (i.e. feature vector length) per sample + num_inputs = data_inputs.shape[1] + # Number of outputs per sample + num_outputs = 1 + + HL1_neurons = 2 + + # Building the network architecture. + input_layer = pygad.nn.InputLayer(num_inputs) + hidden_layer1 = pygad.nn.DenseLayer(num_neurons=HL1_neurons, previous_layer=input_layer, activation_function="relu") + output_layer = pygad.nn.DenseLayer(num_neurons=num_outputs, previous_layer=hidden_layer1, activation_function="None") + + # Training the network. + pygad.nn.train(num_epochs=100, + last_layer=output_layer, + data_inputs=data_inputs, + data_outputs=data_outputs, + learning_rate=0.01, + problem_type="regression") + + # Using the trained network for predictions. + predictions = pygad.nn.predict(last_layer=output_layer, + data_inputs=data_inputs, + problem_type="regression") + + # Calculating some statistics + abs_error = numpy.mean(numpy.abs(predictions - data_outputs)) + print("Absolute error : {abs_error}.".format(abs_error=abs_error)) diff --git a/docs/source/README_pygad_ReadTheDocs.rst b/docs/source/pygad.rst similarity index 97% rename from docs/source/README_pygad_ReadTheDocs.rst rename to docs/source/pygad.rst index c1e6d2f..4e09154 100644 --- a/docs/source/README_pygad_ReadTheDocs.rst +++ b/docs/source/pygad.rst @@ -1,4626 +1,4626 @@ -``pygad`` Module -================ - -This section of the PyGAD's library documentation discusses the -``pygad`` module. - -Using the ``pygad`` module, instances of the genetic algorithm can be -created, run, saved, and loaded. - -.. _pygadga-class: - -``pygad.GA`` Class -================== - -The first module available in PyGAD is named ``pygad`` and contains a -class named ``GA`` for building the genetic algorithm. The constructor, -methods, function, and attributes within the class are discussed in this -section. - -.. 
_init: - -``__init__()`` --------------- - -For creating an instance of the ``pygad.GA`` class, the constructor -accepts several parameters that allow the user to customize the genetic -algorithm to different types of applications. - -The ``pygad.GA`` class constructor supports the following parameters: - -- ``num_generations``: Number of generations. - -- ``num_parents_mating``: Number of solutions to be selected as - parents. - -- ``fitness_func``: Accepts a function/method and returns the fitness - value of the solution. If a function is passed, then it must accept 3 - parameters (1. the instance of the ``pygad.GA`` class, 2. a single - solution, and 3. its index in the population). If method, then it - accepts a fourth parameter representing the method's class instance. - Check the `Preparing the fitness_func - Parameter `__ - section for information about creating such a function. - -- ``fitness_batch_size=None``: A new optional parameter called - ``fitness_batch_size`` is supported to calculate the fitness function - in batches. If it is assigned the value ``1`` or ``None`` (default), - then the normal flow is used where the fitness function is called for - each individual solution. If the ``fitness_batch_size`` parameter is - assigned a value satisfying this condition - ``1 < fitness_batch_size <= sol_per_pop``, then the solutions are - grouped into batches of size ``fitness_batch_size`` and the fitness - function is called once for each batch. Check the `Batch Fitness - Calculation `__ - section for more details and examples. Added in from `PyGAD - 2.19.0 `__. - -- ``initial_population``: A user-defined initial population. It is - useful when the user wants to start the generations with a custom - initial population. It defaults to ``None`` which means no initial - population is specified by the user. In this case, - `PyGAD `__ creates an initial - population using the ``sol_per_pop`` and ``num_genes`` parameters. An - exception is raised if the ``initial_population`` is ``None`` while - any of the 2 parameters (``sol_per_pop`` or ``num_genes``) is also - ``None``. Introduced in `PyGAD - 2.0.0 `__ - and higher. - -- ``sol_per_pop``: Number of solutions (i.e. chromosomes) within the - population. This parameter has no action if ``initial_population`` - parameter exists. - -- ``num_genes``: Number of genes in the solution/chromosome. This - parameter is not needed if the user feeds the initial population to - the ``initial_population`` parameter. - -- ``gene_type=float``: Controls the gene type. It can be assigned to a - single data type that is applied to all genes or can specify the data - type of each individual gene. It defaults to ``float`` which means - all genes are of ``float`` data type. Starting from `PyGAD - 2.9.0 `__, - the ``gene_type`` parameter can be assigned to a numeric value of any - of these types: ``int``, ``float``, and - ``numpy.int/uint/float(8-64)``. Starting from `PyGAD - 2.14.0 `__, - it can be assigned to a ``list``, ``tuple``, or a ``numpy.ndarray`` - which hold a data type for each gene (e.g. - ``gene_type=[int, float, numpy.int8]``). This helps to control the - data type of each individual gene. In `PyGAD - 2.15.0 `__, - a precision for the ``float`` data types can be specified (e.g. - ``gene_type=[float, 2]``. - -- ``init_range_low=-4``: The lower value of the random range from which - the gene values in the initial population are selected. - ``init_range_low`` defaults to ``-4``. Available in `PyGAD - 1.0.20 `__ - and higher. 
This parameter has no action if the - ``initial_population`` parameter exists. - -- ``init_range_high=4``: The upper value of the random range from which - the gene values in the initial population are selected. - ``init_range_high`` defaults to ``+4``. Available in `PyGAD - 1.0.20 `__ - and higher. This parameter has no action if the - ``initial_population`` parameter exists. - -- ``parent_selection_type="sss"``: The parent selection type. Supported - types are ``sss`` (for steady-state selection), ``rws`` (for roulette - wheel selection), ``sus`` (for stochastic universal selection), - ``rank`` (for rank selection), ``random`` (for random selection), and - ``tournament`` (for tournament selection). A custom parent selection - function can be passed starting from `PyGAD - 2.16.0 `__. - Check the `User-Defined Crossover, Mutation, and Parent Selection - Operators `__ - section for more details about building a user-defined parent - selection function. - -- ``keep_parents=-1``: Number of parents to keep in the current - population. ``-1`` (default) means to keep all parents in the next - population. ``0`` means keep no parents in the next population. A - value ``greater than 0`` means keeps the specified number of parents - in the next population. Note that the value assigned to - ``keep_parents`` cannot be ``< - 1`` or greater than the number of - solutions within the population ``sol_per_pop``. Starting from `PyGAD - 2.18.0 `__, - this parameter have an effect only when the ``keep_elitism`` - parameter is ``0``. Starting from `PyGAD - 2.20.0 `__, - the parents' fitness from the last generation will not be re-used if - ``keep_parents=0``. - -- ``keep_elitism=1``: Added in `PyGAD - 2.18.0 `__. - It can take the value ``0`` or a positive integer that satisfies - (``0 <= keep_elitism <= sol_per_pop``). It defaults to ``1`` which - means only the best solution in the current generation is kept in the - next generation. If assigned ``0``, this means it has no effect. If - assigned a positive integer ``K``, then the best ``K`` solutions are - kept in the next generation. It cannot be assigned a value greater - than the value assigned to the ``sol_per_pop`` parameter. If this - parameter has a value different than ``0``, then the ``keep_parents`` - parameter will have no effect. - -- ``K_tournament=3``: In case that the parent selection type is - ``tournament``, the ``K_tournament`` specifies the number of parents - participating in the tournament selection. It defaults to ``3``. - -- ``crossover_type="single_point"``: Type of the crossover operation. - Supported types are ``single_point`` (for single-point crossover), - ``two_points`` (for two points crossover), ``uniform`` (for uniform - crossover), and ``scattered`` (for scattered crossover). Scattered - crossover is supported from PyGAD - `2.9.0 `__ - and higher. It defaults to ``single_point``. A custom crossover - function can be passed starting from `PyGAD - 2.16.0 `__. - Check the `User-Defined Crossover, Mutation, and Parent Selection - Operators `__ - section for more details about creating a user-defined crossover - function. Starting from `PyGAD - 2.2.2 `__ - and higher, if ``crossover_type=None``, then the crossover step is - bypassed which means no crossover is applied and thus no offspring - will be created in the next generations. The next generation will use - the solutions in the current population. - -- ``crossover_probability=None``: The probability of selecting a parent - for applying the crossover operation. 
Its value must be between 0.0 - and 1.0 inclusive. For each parent, a random value between 0.0 and - 1.0 is generated. If this random value is less than or equal to the - value assigned to the ``crossover_probability`` parameter, then the - parent is selected. Added in `PyGAD - 2.5.0 `__ - and higher. - -- ``mutation_type="random"``: Type of the mutation operation. Supported - types are ``random`` (for random mutation), ``swap`` (for swap - mutation), ``inversion`` (for inversion mutation), ``scramble`` (for - scramble mutation), and ``adaptive`` (for adaptive mutation). It - defaults to ``random``. A custom mutation function can be passed - starting from `PyGAD - 2.16.0 `__. - Check the `User-Defined Crossover, Mutation, and Parent Selection - Operators `__ - section for more details about creating a user-defined mutation - function. Starting from `PyGAD - 2.2.2 `__ - and higher, if ``mutation_type=None``, then the mutation step is - bypassed which means no mutation is applied and thus no changes are - applied to the offspring created using the crossover operation. The - offspring will be used unchanged in the next generation. ``Adaptive`` - mutation is supported starting from `PyGAD - 2.10.0 `__. - For more information about adaptive mutation, go the the `Adaptive - Mutation `__ - section. For example about using adaptive mutation, check the `Use - Adaptive Mutation in - PyGAD `__ - section. - -- ``mutation_probability=None``: The probability of selecting a gene - for applying the mutation operation. Its value must be between 0.0 - and 1.0 inclusive. For each gene in a solution, a random value - between 0.0 and 1.0 is generated. If this random value is less than - or equal to the value assigned to the ``mutation_probability`` - parameter, then the gene is selected. If this parameter exists, then - there is no need for the 2 parameters ``mutation_percent_genes`` and - ``mutation_num_genes``. Added in `PyGAD - 2.5.0 `__ - and higher. - -- ``mutation_by_replacement=False``: An optional bool parameter. It - works only when the selected type of mutation is random - (``mutation_type="random"``). In this case, - ``mutation_by_replacement=True`` means replace the gene by the - randomly generated value. If False, then it has no effect and random - mutation works by adding the random value to the gene. Supported in - `PyGAD - 2.2.2 `__ - and higher. Check the changes in `PyGAD - 2.2.2 `__ - under the Release History section for an example. - -- ``mutation_percent_genes="default"``: Percentage of genes to mutate. - It defaults to the string ``"default"`` which is later translated - into the integer ``10`` which means 10% of the genes will be mutated. - It must be ``>0`` and ``<=100``. Out of this percentage, the number - of genes to mutate is deduced which is assigned to the - ``mutation_num_genes`` parameter. The ``mutation_percent_genes`` - parameter has no action if ``mutation_probability`` or - ``mutation_num_genes`` exist. Starting from `PyGAD - 2.2.2 `__ - and higher, this parameter has no action if ``mutation_type`` is - ``None``. - -- ``mutation_num_genes=None``: Number of genes to mutate which defaults - to ``None`` meaning that no number is specified. The - ``mutation_num_genes`` parameter has no action if the parameter - ``mutation_probability`` exists. Starting from `PyGAD - 2.2.2 `__ - and higher, this parameter has no action if ``mutation_type`` is - ``None``. 
- -- ``random_mutation_min_val=-1.0``: For ``random`` mutation, the - ``random_mutation_min_val`` parameter specifies the start value of - the range from which a random value is selected to be added to the - gene. It defaults to ``-1``. Starting from `PyGAD - 2.2.2 `__ - and higher, this parameter has no action if ``mutation_type`` is - ``None``. - -- ``random_mutation_max_val=1.0``: For ``random`` mutation, the - ``random_mutation_max_val`` parameter specifies the end value of the - range from which a random value is selected to be added to the gene. - It defaults to ``+1``. Starting from `PyGAD - 2.2.2 `__ - and higher, this parameter has no action if ``mutation_type`` is - ``None``. - -- ``gene_space=None``: It is used to specify the possible values for - each gene in case the user wants to restrict the gene values. It is - useful if the gene space is restricted to a certain range or to - discrete values. It accepts a ``list``, ``tuple``, ``range``, or - ``numpy.ndarray``. When all genes have the same global space, specify - their values as a ``list``/``tuple``/``range``/``numpy.ndarray``. For - example, ``gene_space = [0.3, 5.2, -4, 8]`` restricts the gene values - to the 4 specified values. If each gene has its own space, then the - ``gene_space`` parameter can be nested like - ``[[0.4, -5], [0.5, -3.2, 8.2, -9], ...]`` where the first sublist - determines the values for the first gene, the second sublist for the - second gene, and so on. If the nested list/tuple has a ``None`` - value, then the gene's initial value is selected randomly from the - range specified by the 2 parameters ``init_range_low`` and - ``init_range_high`` and its mutation value is selected randomly from - the range specified by the 2 parameters ``random_mutation_min_val`` - and ``random_mutation_max_val``. ``gene_space`` is added in `PyGAD - 2.5.0 `__. - Check the `Release History of PyGAD - 2.5.0 `__ - section of the documentation for more details. In `PyGAD - 2.9.0 `__, - NumPy arrays can be assigned to the ``gene_space`` parameter. In - `PyGAD - 2.11.0 `__, - the ``gene_space`` parameter itself or any of its elements can be - assigned to a dictionary to specify the lower and upper limits of the - genes. For example, ``{'low': 2, 'high': 4}`` means the minimum and - maximum values are 2 and 4, respectively. In `PyGAD - 2.15.0 `__, - a new key called ``"step"`` is supported to specify the step of - moving from the start to the end of the range specified by the 2 - existing keys ``"low"`` and ``"high"``. - -- ``on_start=None``: Accepts a function/method to be called only once - before the genetic algorithm starts its evolution. If function, then - it must accept a single parameter representing the instance of the - genetic algorithm. If method, then it must accept 2 parameters where - the second one refers to the method's object. Added in `PyGAD - 2.6.0 `__. - -- ``on_fitness=None``: Accepts a function/method to be called after - calculating the fitness values of all solutions in the population. If - function, then it must accept 2 parameters: 1) a list of all - solutions' fitness values 2) the instance of the genetic algorithm. - If method, then it must accept 3 parameters where the third one - refers to the method's object. Added in `PyGAD - 2.6.0 `__. - -- ``on_parents=None``: Accepts a function/method to be called after - selecting the parents that mates. 
If function, then it must accept 2 - parameters: 1) the selected parents 2) the instance of the genetic - algorithm If method, then it must accept 3 parameters where the third - one refers to the method's object. Added in `PyGAD - 2.6.0 `__. - -- ``on_crossover=None``: Accepts a function to be called each time the - crossover operation is applied. This function must accept 2 - parameters: the first one represents the instance of the genetic - algorithm and the second one represents the offspring generated using - crossover. Added in `PyGAD - 2.6.0 `__. - -- ``on_mutation=None``: Accepts a function to be called each time the - mutation operation is applied. This function must accept 2 - parameters: the first one represents the instance of the genetic - algorithm and the second one represents the offspring after applying - the mutation. Added in `PyGAD - 2.6.0 `__. - -- ``on_generation=None``: Accepts a function to be called after each - generation. This function must accept a single parameter representing - the instance of the genetic algorithm. If the function returned the - string ``stop``, then the ``run()`` method stops without completing - the other generations. Added in `PyGAD - 2.6.0 `__. - -- ``on_stop=None``: Accepts a function to be called only once exactly - before the genetic algorithm stops or when it completes all the - generations. This function must accept 2 parameters: the first one - represents the instance of the genetic algorithm and the second one - is a list of fitness values of the last population's solutions. Added - in `PyGAD - 2.6.0 `__. - -- ``delay_after_gen=0.0``: It accepts a non-negative number specifying - the time in seconds to wait after a generation completes and before - going to the next generation. It defaults to ``0.0`` which means no - delay after the generation. Available in `PyGAD - 2.4.0 `__ - and higher. - -- ``save_best_solutions=False``: When ``True``, then the best solution - after each generation is saved into an attribute named - ``best_solutions``. If ``False`` (default), then no solutions are - saved and the ``best_solutions`` attribute will be empty. Supported - in `PyGAD - 2.9.0 `__. - -- ``save_solutions=False``: If ``True``, then all solutions in each - generation are appended into an attribute called ``solutions`` which - is NumPy array. Supported in `PyGAD - 2.15.0 `__. - -- ``suppress_warnings=False``: A bool parameter to control whether the - warning messages are printed or not. It defaults to ``False``. - -- ``allow_duplicate_genes=True``: Added in `PyGAD - 2.13.0 `__. - If ``True``, then a solution/chromosome may have duplicate gene - values. If ``False``, then each gene will have a unique value in its - solution. - -- ``stop_criteria=None``: Some criteria to stop the evolution. Added in - `PyGAD - 2.15.0 `__. - Each criterion is passed as ``str`` which has a stop word. The - current 2 supported words are ``reach`` and ``saturate``. ``reach`` - stops the ``run()`` method if the fitness value is equal to or - greater than a given fitness value. An example for ``reach`` is - ``"reach_40"`` which stops the evolution if the fitness is >= 40. - ``saturate`` means stop the evolution if the fitness saturates for a - given number of consecutive generations. An example for ``saturate`` - is ``"saturate_7"`` which means stop the ``run()`` method if the - fitness does not change for 7 consecutive generations. - -- ``parallel_processing=None``: Added in `PyGAD - 2.17.0 `__. 
-  If ``None`` (default), this means no parallel processing is applied.
-  It can accept a list/tuple of 2 elements [1) Can be either
-  ``'process'`` or ``'thread'`` to indicate whether processes or
-  threads are used, respectively., 2) The number of processes or
-  threads to use.]. For example,
-  ``parallel_processing=['process', 10]`` applies parallel processing
-  with 10 processes. If a positive integer is assigned, then it is used
-  as the number of threads. For example, ``parallel_processing=5`` uses
-  5 threads which is equivalent to
-  ``parallel_processing=["thread", 5]``. For more information, check
-  the `Parallel Processing in
-  PyGAD `__
-  section.
-
-- ``random_seed=None``: Added in `PyGAD
-  2.18.0 `__.
-  It defines the random seed to be used by the random function
-  generators (we use random functions in the NumPy and random modules).
-  This helps to reproduce the same results by setting the same random
-  seed (e.g. ``random_seed=2``). If given the value ``None``, then it
-  has no effect.
-
-- ``logger=None``: Accepts an instance of the ``logging.Logger`` class
-  to log the outputs. Any message is no longer printed using
-  ``print()`` but logged. If ``logger=None``, then a logger is created
-  that uses ``StreamHandler`` to log the messages to the console.
-  Added in `PyGAD
-  3.0.0 `__.
-  Check the `Logging
-  Outputs `__
-  section for more information.
-
-The user doesn't have to specify all of such parameters while creating
-an instance of the GA class. A very important parameter you must care
-about is ``fitness_func`` which defines the fitness function.
-
-It is OK to set the value of any of the 2 parameters ``init_range_low``
-and ``init_range_high`` to be equal, higher, or lower than the other
-parameter (i.e. ``init_range_low`` does not need to be lower than
-``init_range_high``). The same holds for the ``random_mutation_min_val``
-and ``random_mutation_max_val`` parameters.
-
-If the 2 parameters ``mutation_type`` and ``crossover_type`` are
-``None``, this disables any type of evolution the genetic algorithm can
-make. As a result, the genetic algorithm cannot find a better solution
-than the best solution in the initial population.
-
-The parameters are validated within the constructor. If at least one
-parameter is not correct, an exception is raised.
-
-.. _plotting-methods-in-pygadga-class:
-
-Plotting Methods in ``pygad.GA`` Class
---------------------------------------
-
-- ``plot_fitness()``: Shows how the fitness evolves by generation.
-
-- ``plot_genes()``: Shows how the gene value changes for each
-  generation.
-
-- ``plot_new_solution_rate()``: Shows the number of new solutions
-  explored in each generation.
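-
-These plotting methods are typically called after ``run()`` completes.
-A minimal usage sketch (assuming a ``pygad.GA`` instance named
-``ga_instance`` whose ``run()`` method has already returned):
-
-.. code:: python
-
-    ga_instance.plot_fitness()            # Fitness of the best solution per generation.
-    ga_instance.plot_genes()              # Gene values across generations.
-    ga_instance.plot_new_solution_rate()  # Newly explored solutions per generation.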
-
-Class Attributes
-----------------
-
-- ``supported_int_types``: A list of the supported types for the
-  integer numbers.
-
-- ``supported_float_types``: A list of the supported types for the
-  floating-point numbers.
-
-- ``supported_int_float_types``: A list of the supported types for all
-  numbers. It just concatenates the previous 2 lists.
-
-.. _other-instance-attributes--methods:
-
-Other Instance Attributes & Methods
------------------------------------
-
-All the parameters and functions passed to the ``pygad.GA`` class
-constructor are used as class attributes and methods in the instances of
-the ``pygad.GA`` class. In addition to such attributes, there are other
-attributes and methods added to the instances of the ``pygad.GA`` class:
-
-The next 2 subsections list such attributes and methods.
-
-Other Attributes
-~~~~~~~~~~~~~~~~
-
-- ``generations_completed``: Holds the number of the last completed
-  generation.
-
-- ``population``: A NumPy array holding the initial population.
-
-- ``valid_parameters``: Set to ``True`` when all the parameters passed
-  in the ``GA`` class constructor are valid.
-
-- ``run_completed``: Set to ``True`` only after the ``run()`` method
-  completes gracefully.
-
-- ``pop_size``: The population size.
-
-- ``best_solutions_fitness``: A list holding the fitness values of the
-  best solutions for all generations.
-
-- ``best_solution_generation``: The generation number at which the best
-  fitness value is reached. It is only assigned the generation number
-  after the ``run()`` method completes. Otherwise, its value is -1.
-
-- ``best_solutions``: A NumPy array holding the best solution per each
-  generation. It only exists when the ``save_best_solutions`` parameter
-  in the ``pygad.GA`` class constructor is set to ``True``.
-
-- ``last_generation_fitness``: The fitness values of the solutions in
-  the last generation. `Added in PyGAD
-  2.12.0 `__.
-
-- ``previous_generation_fitness``: At the end of each generation, the
-  fitness of the most recent population is saved in the
-  ``last_generation_fitness`` attribute. The fitness of the population
-  exactly preceding this most recent population is saved in the
-  ``previous_generation_fitness`` attribute. This
-  ``previous_generation_fitness`` attribute is used to fetch the
-  pre-calculated fitness instead of calling the fitness function for
-  already explored solutions. `Added in PyGAD
-  2.16.2 `__.
-
-- ``last_generation_parents``: The parents selected from the last
-  generation. `Added in PyGAD
-  2.12.0 `__.
-
-- ``last_generation_offspring_crossover``: The offspring generated
-  after applying the crossover in the last generation. `Added in PyGAD
-  2.12.0 `__.
-
-- ``last_generation_offspring_mutation``: The offspring generated after
-  applying the mutation in the last generation. `Added in PyGAD
-  2.12.0 `__.
-
-- ``gene_type_single``: A flag that is set to ``True`` if the
-  ``gene_type`` parameter is assigned to a single data type that is
-  applied to all genes. If ``gene_type`` is assigned a ``list``,
-  ``tuple``, or ``numpy.ndarray``, then the value of
-  ``gene_type_single`` will be ``False``. `Added in PyGAD
-  2.14.0 `__.
-
-- ``last_generation_parents_indices``: This attribute holds the indices
-  of the selected parents in the last generation. Supported in `PyGAD
-  2.15.0 `__.
-
-- ``last_generation_elitism``: This attribute holds the elitism of the
-  last generation. It is effective only if the ``keep_elitism``
-  parameter has a non-zero value. Supported in `PyGAD
-  2.18.0 `__.
-
-- ``last_generation_elitism_indices``: This attribute holds the indices
-  of the elitism of the last generation. It is effective only if the
-  ``keep_elitism`` parameter has a non-zero value. Supported in `PyGAD
-  2.19.0 `__.
-
-- ``logger``: This attribute holds the logger from the ``logging``
-  module. Supported in `PyGAD
-  3.0.0 `__.
-
-Note that the attributes whose names start with ``last_generation_``
-are updated after each generation.
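-
-For example, after ``run()`` completes, such attributes can be read
-directly off the instance (a short illustrative snippet; ``ga_instance``
-is an assumed ``pygad.GA`` instance):
-
-.. code:: python
-
-    print("Generations completed :", ga_instance.generations_completed)
-    print("Best fitness per generation :", ga_instance.best_solutions_fitness)
-    print("Parents of the last generation :", ga_instance.last_generation_parents)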
Other Methods
~~~~~~~~~~~~~

- ``cal_pop_fitness()``: A method that calculates the fitness values for all solutions within the population by calling the function passed to the ``fitness_func`` parameter for each solution.

- ``crossover()``: Refers to the method that applies the crossover operator based on the selected type of crossover in the ``crossover_type`` property.

- ``mutation()``: Refers to the method that applies the mutation operator based on the selected type of mutation in the ``mutation_type`` property.

- ``select_parents()``: Refers to a method that selects the parents based on the parent selection type specified in the ``parent_selection_type`` attribute.

- ``adaptive_mutation_population_fitness()``: Returns the average fitness value used in the adaptive mutation to filter the solutions.

- ``solve_duplicate_genes_randomly()``: Solves the duplicates in a solution by randomly selecting new values for the duplicating genes.

- ``solve_duplicate_genes_by_space()``: Solves the duplicates in a solution by selecting values for the duplicating genes from the gene space.

- ``unique_int_gene_from_range()``: Finds a unique integer value for the gene.

- ``unique_genes_by_space()``: Loops through all the duplicating genes to find unique values from their gene spaces to solve the duplicates. For each duplicating gene, a call to ``unique_gene_by_space()`` is made.

- ``unique_gene_by_space()``: Returns a unique gene value for a single gene based on its value space to solve the duplicates.

- ``summary()``: Prints a Keras-like summary of the PyGAD lifecycle. This helps to have an overview of the architecture. Supported in `PyGAD 2.19.0 `__. Check the `Print Lifecycle Summary `__ section for more details and examples.

The next sections discuss the methods available in the ``pygad.GA`` class.

.. _initializepopulation:

``initialize_population()``
---------------------------

It creates an initial population randomly as a NumPy array. The array is saved in the instance attribute named ``population``.

Accepts the following parameters:

- ``low``: The lower value of the random range from which the gene values in the initial population are selected. It defaults to -4. Available in PyGAD 1.0.20 and higher.

- ``high``: The upper value of the random range from which the gene values in the initial population are selected. It defaults to 4. Available in PyGAD 1.0.20 and higher.

This method assigns the values of the following 3 instance attributes:

1. ``pop_size``: Size of the population.

2. ``population``: Initially, it holds the initial population, which is later updated after each generation.

3. ``initial_population``: Keeps the initial population.

.. _calpopfitness:

``cal_pop_fitness()``
---------------------

The ``cal_pop_fitness()`` method calculates and returns the fitness values of the solutions in the current population.

This method is optimized to save time by making fewer calls to the fitness function. It follows this process:

1. If the ``save_solutions`` parameter is set to ``True``, then it checks if the solution is already explored and saved in the ``solutions`` instance attribute. If so, then it just retrieves its fitness from the ``solutions_fitness`` instance attribute without calling the fitness function.

2. If ``save_solutions`` is set to ``False``, or if it is ``True`` but the solution was not explored yet, then the ``cal_pop_fitness()`` method checks if the ``keep_elitism`` parameter is set to a positive integer. If so, then it checks if the solution is saved into the ``last_generation_elitism`` instance attribute.
If so, then it retrieves its fitness from the ``previous_generation_fitness`` instance attribute.

3. If the fitness is still not retrieved by the above 2 steps, then the ``cal_pop_fitness()`` method checks if the ``keep_parents`` parameter is set to ``-1`` or a positive integer. If so, then it checks if the solution is saved into the ``last_generation_parents`` instance attribute. If so, then it retrieves its fitness from the ``previous_generation_fitness`` instance attribute.

4. If the fitness could not be retrieved by any of the above steps, then the fitness function has to be called to calculate the fitness for the solution. This is done by calling the function assigned to the ``fitness_func`` parameter.

This method takes into consideration:

1. The ``parallel_processing`` parameter to check whether parallel processing is in effect.

2. The ``fitness_batch_size`` parameter to check if the fitness should be calculated in batches of solutions.

It returns a vector of the solutions' fitness values.

``run()``
---------

Runs the genetic algorithm. This is the main method in which the genetic algorithm is evolved through some generations. It accepts no parameters as it uses the instance to access all of its requirements.

For each generation, the fitness values of all solutions within the population are calculated according to the ``cal_pop_fitness()`` method, which internally just calls the function assigned to the ``fitness_func`` parameter in the ``pygad.GA`` class constructor for each solution.

According to the fitness values of all solutions, the parents are selected using the ``select_parents()`` method. This method's behaviour is determined according to the parent selection type in the ``parent_selection_type`` parameter in the ``pygad.GA`` class constructor.

Based on the selected parents, offspring are generated by applying the crossover and mutation operations using the ``crossover()`` and ``mutation()`` methods. The behaviour of such 2 methods is defined according to the ``crossover_type`` and ``mutation_type`` parameters in the ``pygad.GA`` class constructor.

After each generation completes, the following takes place:

- The ``population`` attribute is updated by the new population.

- The ``generations_completed`` attribute is assigned the number of the last completed generation.

- If there is a callback function assigned to the ``on_generation`` attribute, then it will be called.

After the ``run()`` method completes, the following takes place:

- The ``best_solution_generation`` attribute is assigned the generation number at which the best fitness value is reached.

- The ``run_completed`` attribute is set to ``True``.

Parent Selection Methods
------------------------

The ``ParentSelection`` class in the ``pygad.utils.parent_selection`` module has several methods for selecting the parents that will mate to produce the offspring. All of such methods accept the same parameters, which are:

- ``fitness``: The fitness values of the solutions in the current population.

- ``num_parents``: The number of parents to be selected.

All of such methods return an array of the selected parents.
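As a rough sketch of how such a method can be called directly (an illustrative assumption: the ``pygad.GA`` class inherits these methods from the ``ParentSelection`` class, and recent PyGAD versions also return the indices of the selected parents):

.. code:: python

    import numpy

    # Assumes ga_instance is an existing pygad.GA instance.
    fitness = numpy.array(ga_instance.cal_pop_fitness())

    # Select 2 parents using the steady-state selection technique.
    parents, parents_indices = ga_instance.steady_state_selection(fitness, num_parents=2)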
The next subsections list the supported methods for parent selection.

.. _steadystateselection:

``steady_state_selection()``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selects the parents using the steady-state selection technique.

.. _rankselection:

``rank_selection()``
~~~~~~~~~~~~~~~~~~~~

Selects the parents using the rank selection technique.

.. _randomselection:

``random_selection()``
~~~~~~~~~~~~~~~~~~~~~~

Selects the parents randomly.

.. _tournamentselection:

``tournament_selection()``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Selects the parents using the tournament selection technique.

.. _roulettewheelselection:

``roulette_wheel_selection()``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selects the parents using the roulette wheel selection technique.

.. _stochasticuniversalselection:

``stochastic_universal_selection()``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selects the parents using the stochastic universal selection technique.

Crossover Methods
-----------------

The ``Crossover`` class in the ``pygad.utils.crossover`` module supports several methods for applying crossover between the selected parents. All of these methods accept the same parameters, which are:

- ``parents``: The parents to mate for producing the offspring.

- ``offspring_size``: The size of the offspring to produce.

All of such methods return an array of the produced offspring.

The next subsections list the supported methods for crossover.

.. _singlepointcrossover:

``single_point_crossover()``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Applies the single-point crossover. It selects a point randomly at which crossover takes place between the pairs of parents.

.. _twopointscrossover:

``two_points_crossover()``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Applies the two-points crossover. It selects the 2 points randomly at which crossover takes place between the pairs of parents.

.. _uniformcrossover:

``uniform_crossover()``
~~~~~~~~~~~~~~~~~~~~~~~

Applies the uniform crossover. For each gene, a parent out of the 2 mating parents is selected randomly and the gene is copied from it.

.. _scatteredcrossover:

``scattered_crossover()``
~~~~~~~~~~~~~~~~~~~~~~~~~

Applies the scattered crossover. It randomly selects each gene from one of the 2 parents.
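Here is a rough sketch of calling one of these methods directly (an illustrative assumption: the ``pygad.GA`` class inherits these methods from the ``Crossover`` class):

.. code:: python

    # Assumes ga_instance is an existing pygad.GA instance and parents is a
    # NumPy array of shape (number of parents, number of genes).
    offspring = ga_instance.single_point_crossover(parents,
                                                   offspring_size=(4, parents.shape[1]))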
Mutation Methods
----------------

The ``Mutation`` class in the ``pygad.utils.mutation`` module supports several methods for applying mutation. All of these methods accept the same parameter, which is:

- ``offspring``: The offspring to mutate.

All of such methods return an array of the mutated offspring.

The next subsections list the supported methods for mutation.

.. _randommutation:

``random_mutation()``
~~~~~~~~~~~~~~~~~~~~~

Applies the random mutation which changes the values of some genes randomly. The number of genes is specified according to either the ``mutation_num_genes`` or the ``mutation_percent_genes`` attributes.

For each gene, a random value is selected according to the range specified by the 2 attributes ``random_mutation_min_val`` and ``random_mutation_max_val``. The random value is added to the selected gene.

.. _swapmutation:

``swap_mutation()``
~~~~~~~~~~~~~~~~~~~

Applies the swap mutation which interchanges the values of 2 randomly selected genes.

.. _inversionmutation:

``inversion_mutation()``
~~~~~~~~~~~~~~~~~~~~~~~~

Applies the inversion mutation which selects a subset of genes and inverts them.

.. _scramblemutation:

``scramble_mutation()``
~~~~~~~~~~~~~~~~~~~~~~~

Applies the scramble mutation which selects a subset of genes and shuffles their order randomly.

.. _adaptivemutation:

``adaptive_mutation()``
~~~~~~~~~~~~~~~~~~~~~~~

Applies the adaptive mutation which adapts the mutation rate to the quality of each solution: low-quality solutions are mutated at a higher rate than high-quality solutions. Check the Adaptive Mutation section for more details.

.. _bestsolution:

``best_solution()``
-------------------

Returns information about the best solution found by the genetic algorithm.

It accepts the following parameters:

- ``pop_fitness=None``: An optional parameter that accepts a list of the fitness values of the solutions in the population. If ``None``, then the ``cal_pop_fitness()`` method is called to calculate the fitness values of the population.

It returns the following:

- ``best_solution``: Best solution in the current population.

- ``best_solution_fitness``: Fitness value of the best solution.

- ``best_match_idx``: Index of the best solution in the current population.

.. _plotfitness:

``plot_fitness()``
------------------

Previously named ``plot_result()``, this method creates, shows, and returns a figure that summarizes how the fitness value evolves by generation. It works only after completing at least 1 generation; otherwise, an exception is raised.

Starting from `PyGAD 2.15.0 `__, this method accepts the following parameters (a usage sketch follows the list):

1. ``title``: Title of the figure.

2. ``xlabel``: X-axis label.

3. ``ylabel``: Y-axis label.

4. ``linewidth``: Line width of the plot. Defaults to ``3``.

5. ``font_size``: Font size for the labels and title. Defaults to ``14``.

6. ``plot_type``: Type of the plot which can be either ``"plot"`` (default), ``"scatter"``, or ``"bar"``.

7. ``color``: Color of the plot which defaults to ``"#3870FF"``.

8. ``save_dir``: Directory to save the figure.
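For example, the figure can be customized as follows (a minimal sketch, assuming ``ga_instance`` has already completed ``run()``):

.. code:: python

    ga_instance.plot_fitness(title="Fitness per Generation",
                             xlabel="Generation",
                             ylabel="Fitness",
                             linewidth=2,
                             plot_type="scatter")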
.. _plotnewsolutionrate:

``plot_new_solution_rate()``
----------------------------

The ``plot_new_solution_rate()`` method creates, shows, and returns a figure that shows the number of new solutions explored in each generation. This method works only when ``save_solutions=True`` in the constructor of the ``pygad.GA`` class. It also works only after completing at least 1 generation; otherwise, an exception is raised.

This method accepts the following parameters:

1. ``title``: Title of the figure.

2. ``xlabel``: X-axis label.

3. ``ylabel``: Y-axis label.

4. ``linewidth``: Line width of the plot. Defaults to ``3``.

5. ``font_size``: Font size for the labels and title. Defaults to ``14``.

6. ``plot_type``: Type of the plot which can be either ``"plot"`` (default), ``"scatter"``, or ``"bar"``.

7. ``color``: Color of the plot which defaults to ``"#3870FF"``.

8. ``save_dir``: Directory to save the figure.

.. _plotgenes:

``plot_genes()``
----------------

The ``plot_genes()`` method creates, shows, and returns a figure that describes each gene. It has different options to create the figures which help to:

1. Explore the gene value for each generation by creating a normal plot.

2. Create a histogram for each gene.

3. Create a boxplot.

This is controlled by the ``graph_type`` parameter.

It works only after completing at least 1 generation. If no generation is completed, an exception is raised.

This method accepts the following parameters:

1. ``title``: Title of the figure.

2. ``xlabel``: X-axis label.

3. ``ylabel``: Y-axis label.

4. ``linewidth``: Line width of the plot. Defaults to ``3``.

5. ``font_size``: Font size for the labels and title. Defaults to ``14``.

6. ``plot_type``: Type of the plot which can be either ``"plot"`` (default), ``"scatter"``, or ``"bar"``.

7. ``graph_type``: Type of the graph which can be either ``"plot"`` (default), ``"boxplot"``, or ``"histogram"``.

8. ``fill_color``: Fill color of the graph which defaults to ``"#3870FF"``. This has no effect if ``graph_type="plot"``.

9. ``color``: Color of the plot which defaults to ``"#3870FF"``.

10. ``solutions``: Defaults to ``"all"`` which means use all solutions. If ``"best"``, then only the best solutions are used.

11. ``save_dir``: Directory to save the figure.

An exception is raised if:

- ``solutions="all"`` while ``save_solutions=False`` in the constructor of the ``pygad.GA`` class.

- ``solutions="best"`` while ``save_best_solutions=False`` in the constructor of the ``pygad.GA`` class.

``save()``
----------

Saves the genetic algorithm instance.

Accepts the following parameter:

- ``filename``: Name of the file to save the instance. No extension is needed.

Functions in ``pygad``
======================

Besides the methods available in the ``pygad.GA`` class, this section discusses the functions available in ``pygad``. Up to this time, there is only a single function named ``load()``.

.. _pygadload:

``pygad.load()``
----------------

Reads a saved instance of the genetic algorithm. This is not a method but a function defined directly in the ``pygad`` module. So, it could be called from the ``pygad`` module as follows: ``pygad.load(filename)``.

Accepts the following parameter:

- ``filename``: Name of the file holding the saved instance of the genetic algorithm. No extension is needed.

Returns the genetic algorithm instance.
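Here is a short save/load round trip (a minimal sketch, assuming an existing ``ga_instance`` and an already imported ``pygad`` module):

.. code:: python

    # Saves the instance to genetic.pkl in the current directory.
    ga_instance.save(filename="genetic")

    # Reads the saved instance back. Note that load() is a module-level function.
    loaded_ga_instance = pygad.load(filename="genetic")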
Steps to Use ``pygad``
======================

To use the ``pygad`` module, here is a summary of the required steps:

1. Preparing the ``fitness_func`` parameter.

2. Preparing other parameters.

3. Import ``pygad``.

4. Create an instance of the ``pygad.GA`` class.

5. Run the genetic algorithm.

6. Plotting results.

7. Information about the best solution.

8. Saving & loading the results.

Let's discuss how to do each of these steps.

.. _preparing-the-fitnessfunc-parameter:

Preparing the ``fitness_func`` Parameter
----------------------------------------

Even though there are some steps in the genetic algorithm pipeline that can work the same regardless of the problem being solved, one critical step is the calculation of the fitness value. There is no unique way of calculating the fitness value, and it changes from one problem to another.

PyGAD has a parameter called ``fitness_func`` that allows the user to specify a custom function/method to use when calculating the fitness. This function/method must be a maximization function/method so that a solution with a higher fitness value is preferred over a solution with a lower value. Doing that allows the user to freely use PyGAD to solve any problem by passing the appropriate fitness function/method. To create it, it is very important to understand the problem well.

Let's discuss an example:

   | Given the following function:
   | y = f(w1:w6) = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
   | where (x1,x2,x3,x4,x5,x6)=(4, -2, 3.5, 5, -11, -4.7) and y=44
   | What are the best values for the 6 weights (w1 to w6)? We are going to use the genetic algorithm to optimize this function.

So, the task is about using the genetic algorithm to find the best values for the 6 weights ``W1`` to ``W6``. Thinking of the problem, it is clear that the best solution is the one returning an output that is close to the desired output ``y=44``. So, the fitness function/method should return a value that gets higher when the solution's output is closer to ``y=44``. Here is a function that does that:

.. code:: python

    import numpy

    function_inputs = [4, -2, 3.5, 5, -11, -4.7] # Function inputs.
    desired_output = 44 # Function output.

    def fitness_func(ga_instance, solution, solution_idx):
        output = numpy.sum(solution*function_inputs)
        fitness = 1.0 / numpy.abs(output - desired_output)
        return fitness

Such a user-defined function must accept 3 parameters:

1. The instance of the ``pygad.GA`` class. This helps the user to fetch any property that helps when calculating the fitness.

2. The solution(s) to calculate the fitness value(s). Note that the fitness function can accept multiple solutions only if the ``fitness_batch_size`` parameter is given a value greater than 1.

3. The indices of the solutions in the population. The number of indices also depends on the ``fitness_batch_size`` parameter.

If a method is passed to the ``fitness_func`` parameter, then it accepts a fourth parameter representing the method's instance.

The ``__code__`` object is used to check if this function accepts the required number of parameters. If more or fewer parameters are passed, an exception is thrown.

By creating this function, you did a very important step towards using PyGAD.

Preparing Other Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is an example for preparing the other parameters:

.. code:: python

    num_generations = 50
    num_parents_mating = 4

    fitness_function = fitness_func

    sol_per_pop = 8
    num_genes = len(function_inputs)

    init_range_low = -2
    init_range_high = 5

    parent_selection_type = "sss"
    keep_parents = 1

    crossover_type = "single_point"

    mutation_type = "random"
    mutation_percent_genes = 10

.. _the-ongeneration-parameter:

The ``on_generation`` Parameter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An optional parameter named ``on_generation`` is supported which allows the user to call a function (with a single parameter) after each generation. Here is a simple function that just prints the current generation number and the fitness value of the best solution in the current generation. The ``generations_completed`` attribute of the GA class returns the number of the last completed generation.

.. code:: python

    def on_gen(ga_instance):
        print("Generation : ", ga_instance.generations_completed)
        print("Fitness of the best solution :", ga_instance.best_solution()[1])

After being defined, the function is assigned to the ``on_generation`` parameter of the GA class constructor. By doing that, the ``on_gen()`` function will be called after each generation.

.. code:: python

    ga_instance = pygad.GA(...,
                           on_generation=on_gen,
                           ...)

After the parameters are prepared, we can import PyGAD and build an instance of the ``pygad.GA`` class.
- -Import ``pygad`` ----------------- - -The next step is to import PyGAD as follows: - -.. code:: python - - import pygad - -The ``pygad.GA`` class holds the implementation of all methods for -running the genetic algorithm. - -.. _create-an-instance-of-the-pygadga-class: - -Create an Instance of the ``pygad.GA`` Class --------------------------------------------- - -The ``pygad.GA`` class is instantiated where the previously prepared -parameters are fed to its constructor. The constructor is responsible -for creating the initial population. - -.. code:: python - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - fitness_func=fitness_function, - sol_per_pop=sol_per_pop, - num_genes=num_genes, - init_range_low=init_range_low, - init_range_high=init_range_high, - parent_selection_type=parent_selection_type, - keep_parents=keep_parents, - crossover_type=crossover_type, - mutation_type=mutation_type, - mutation_percent_genes=mutation_percent_genes) - -Run the Genetic Algorithm -------------------------- - -After an instance of the ``pygad.GA`` class is created, the next step is -to call the ``run()`` method as follows: - -.. code:: python - - ga_instance.run() - -Inside this method, the genetic algorithm evolves over some generations -by doing the following tasks: - -1. Calculating the fitness values of the solutions within the current - population. - -2. Select the best solutions as parents in the mating pool. - -3. Apply the crossover & mutation operation - -4. Repeat the process for the specified number of generations. - -Plotting Results ----------------- - -There is a method named ``plot_fitness()`` which creates a figure -summarizing how the fitness values of the solutions change with the -generations. - -.. code:: python - - ga_instance.plot_fitness() - -.. figure:: https://user-images.githubusercontent.com/16560492/78830005-93111d00-79e7-11ea-9d8e-a8d8325a6101.png - :alt: - -Information about the Best Solution ------------------------------------ - -The following information about the best solution in the last population -is returned using the ``best_solution()`` method. - -- Solution - -- Fitness value of the solution - -- Index of the solution within the population - -.. code:: python - - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - -Using the ``best_solution_generation`` attribute of the instance from -the ``pygad.GA`` class, the generation number at which the -``best fitness`` is reached could be fetched. - -.. code:: python - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - -.. _saving--loading-the-results: - -Saving & Loading the Results ----------------------------- - -After the ``run()`` method completes, it is possible to save the current -instance of the genetic algorithm to avoid losing the progress made. The -``save()`` method is available for that purpose. Just pass the file name -to it without an extension. According to the next code, a file named -``genetic.pkl`` will be created and saved in the current directory. - -.. 
code:: python - - filename = 'genetic' - ga_instance.save(filename=filename) - -You can also load the saved model using the ``load()`` function and -continue using it. For example, you might run the genetic algorithm for -some generations, save its current state using the ``save()`` method, -load the model using the ``load()`` function, and then call the -``run()`` method again. - -.. code:: python - - loaded_ga_instance = pygad.load(filename=filename) - -After the instance is loaded, you can use it to run any method or access -any property. - -.. code:: python - - print(loaded_ga_instance.best_solution()) - -Crossover, Mutation, and Parent Selection -========================================= - -PyGAD supports different types for selecting the parents and applying -the crossover & mutation operators. More features will be added in the -future. To ask for a new feature, please check the ``Ask for Feature`` -section. - -Supported Crossover Operations ------------------------------- - -The supported crossover operations at this time are: - -1. Single point: Implemented using the ``single_point_crossover()`` - method. - -2. Two points: Implemented using the ``two_points_crossover()`` method. - -3. Uniform: Implemented using the ``uniform_crossover()`` method. - -Supported Mutation Operations ------------------------------ - -The supported mutation operations at this time are: - -1. Random: Implemented using the ``random_mutation()`` method. - -2. Swap: Implemented using the ``swap_mutation()`` method. - -3. Inversion: Implemented using the ``inversion_mutation()`` method. - -4. Scramble: Implemented using the ``scramble_mutation()`` method. - -Supported Parent Selection Operations -------------------------------------- - -The supported parent selection techniques at this time are: - -1. Steady-state: Implemented using the ``steady_state_selection()`` - method. - -2. Roulette wheel: Implemented using the ``roulette_wheel_selection()`` - method. - -3. Stochastic universal: Implemented using the - ``stochastic_universal_selection()``\ method. - -4. Rank: Implemented using the ``rank_selection()`` method. - -5. Random: Implemented using the ``random_selection()`` method. - -6. Tournament: Implemented using the ``tournament_selection()`` method. - -Life Cycle of PyGAD -=================== - -The next figure lists the different stages in the lifecycle of an -instance of the ``pygad.GA`` class. Note that PyGAD stops when either -all generations are completed or when the function passed to the -``on_generation`` parameter returns the string ``stop``. - -.. figure:: https://user-images.githubusercontent.com/16560492/220486073-c5b6089d-81e4-44d9-a53c-385f479a7273.jpg - :alt: - -The next code implements all the callback functions to trace the -execution of the genetic algorithm. Each callback function prints its -name. - -.. 
code:: python - - import pygad - import numpy - - function_inputs = [4,-2,3.5,5,-11,-4.7] - desired_output = 44 - - def fitness_func(ga_instance, solution, solution_idx): - output = numpy.sum(solution*function_inputs) - fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001) - return fitness - - fitness_function = fitness_func - - def on_start(ga_instance): - print("on_start()") - - def on_fitness(ga_instance, population_fitness): - print("on_fitness()") - - def on_parents(ga_instance, selected_parents): - print("on_parents()") - - def on_crossover(ga_instance, offspring_crossover): - print("on_crossover()") - - def on_mutation(ga_instance, offspring_mutation): - print("on_mutation()") - - def on_generation(ga_instance): - print("on_generation()") - - def on_stop(ga_instance, last_population_fitness): - print("on_stop()") - - ga_instance = pygad.GA(num_generations=3, - num_parents_mating=5, - fitness_func=fitness_function, - sol_per_pop=10, - num_genes=len(function_inputs), - on_start=on_start, - on_fitness=on_fitness, - on_parents=on_parents, - on_crossover=on_crossover, - on_mutation=on_mutation, - on_generation=on_generation, - on_stop=on_stop) - - ga_instance.run() - -Based on the used 3 generations as assigned to the ``num_generations`` -argument, here is the output. - -.. code:: - - on_start() - - on_fitness() - on_parents() - on_crossover() - on_mutation() - on_generation() - - on_fitness() - on_parents() - on_crossover() - on_mutation() - on_generation() - - on_fitness() - on_parents() - on_crossover() - on_mutation() - on_generation() - - on_stop() - -Adaptive Mutation -================= - -In the regular genetic algorithm, the mutation works by selecting a -single fixed mutation rate for all solutions regardless of their fitness -values. So, regardless on whether this solution has high or low quality, -the same number of genes are mutated all the time. - -The pitfalls of using a constant mutation rate for all solutions are -summarized in this paper `Libelli, S. Marsili, and P. Alba. "Adaptive -mutation in genetic algorithms." Soft computing 4.2 (2000): -76-80 `__ -as follows: - - The weak point of "classical" GAs is the total randomness of - mutation, which is applied equally to all chromosomes, irrespective - of their fitness. Thus a very good chromosome is equally likely to be - disrupted by mutation as a bad one. - - On the other hand, bad chromosomes are less likely to produce good - ones through crossover, because of their lack of building blocks, - until they remain unchanged. They would benefit the most from - mutation and could be used to spread throughout the parameter space - to increase the search thoroughness. So there are two conflicting - needs in determining the best probability of mutation. - - Usually, a reasonable compromise in the case of a constant mutation - is to keep the probability low to avoid disruption of good - chromosomes, but this would prevent a high mutation rate of - low-fitness chromosomes. Thus a constant probability of mutation - would probably miss both goals and result in a slow improvement of - the population. - -According to `Libelli, S. Marsili, and P. -Alba. `__ -work, the adaptive mutation solves the problems of constant mutation. - -Adaptive mutation works as follows: - -1. Calculate the average fitness value of the population (``f_avg``). - -2. For each chromosome, calculate its fitness value (``f``). - -3. 
If ``f < f_avg``, then this solution is regarded as a low-quality solution, and thus the mutation rate should be kept high because this would increase the quality of this solution.

4. If ``f > f_avg``, then this solution is regarded as a high-quality solution, and thus the mutation rate should be kept low to avoid disrupting this high-quality solution.

In PyGAD, if ``f = f_avg``, then the solution is regarded as a high-quality solution.

The next figure summarizes the previous steps.

.. figure:: https://user-images.githubusercontent.com/16560492/103468973-e3c26600-4d2c-11eb-8af3-b3bb39b50540.jpg
   :alt:

This strategy is applied in PyGAD.

Use Adaptive Mutation in PyGAD
------------------------------

In PyGAD 2.10.0, adaptive mutation is supported. To use it, just follow these 2 simple steps:

1. In the constructor of the ``pygad.GA`` class, set ``mutation_type="adaptive"`` to specify that the type of mutation is adaptive.

2. Specify the mutation rates for the low-quality and high-quality solutions using one of these 3 parameters according to your preference: ``mutation_probability``, ``mutation_num_genes``, or ``mutation_percent_genes``. Please check the `documentation of each of these parameters `__ for more information.

When adaptive mutation is used, the value assigned to any of the 3 parameters can be of any of these data types:

1. ``list``

2. ``tuple``

3. ``numpy.ndarray``

Whatever the data type used, the length of the ``list``, ``tuple``, or ``numpy.ndarray`` must be exactly 2. That is, there are just 2 values:

1. The first value is the mutation rate for the low-quality solutions.

2. The second value is the mutation rate for the high-quality solutions.

PyGAD expects the first value to be higher than the second value, and thus a warning is printed in case the first value is lower than the second one.

Here are some examples of feeding the mutation rates:

.. code:: python

    # mutation_probability
    mutation_probability = [0.25, 0.1]
    mutation_probability = (0.35, 0.17)
    mutation_probability = numpy.array([0.15, 0.05])

    # mutation_num_genes
    mutation_num_genes = [4, 2]
    mutation_num_genes = (3, 1)
    mutation_num_genes = numpy.array([7, 2])

    # mutation_percent_genes
    mutation_percent_genes = [25, 12]
    mutation_percent_genes = (15, 8)
    mutation_percent_genes = numpy.array([21, 13])

Assume that the average fitness is 12 and the fitness values of 2 solutions are 15 and 7. If the mutation probabilities are specified as follows:

.. code:: python

    mutation_probability = [0.25, 0.1]

Then the mutation probability of the first solution is 0.1 because its fitness is 15, which is higher than the average fitness of 12. The mutation probability of the second solution is 0.25 because its fitness is 7, which is lower than the average fitness of 12.
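The same rule can be stated in code (a tiny sketch of the selection logic only, not PyGAD's internal implementation):

.. code:: python

    f_avg = 12
    mutation_probability = [0.25, 0.1]

    for f in [15, 7]:
        # Low-quality solutions (f < f_avg) get the first (higher) rate;
        # solutions at or above the average get the second (lower) rate.
        rate = mutation_probability[0] if f < f_avg else mutation_probability[1]
        print(f, rate)  # Prints 15 -> 0.1, then 7 -> 0.25.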
Here is an example that uses adaptive mutation.

.. code:: python

    import pygad
    import numpy

    function_inputs = [4,-2,3.5,5,-11,-4.7] # Function inputs.
    desired_output = 44 # Function output.

    def fitness_func(ga_instance, solution, solution_idx):
        # The fitness function calculates the sum of products between each input and its corresponding weight.
        output = numpy.sum(solution*function_inputs)
        # The value 0.000001 is used to avoid the Inf value when the denominator numpy.abs(output - desired_output) is 0.0.
        fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
        return fitness

    # Creating an instance of the GA class inside the ga module. Some parameters are initialized within the constructor.
    ga_instance = pygad.GA(num_generations=200,
                           fitness_func=fitness_func,
                           num_parents_mating=10,
                           sol_per_pop=20,
                           num_genes=len(function_inputs),
                           mutation_type="adaptive",
                           mutation_num_genes=(3, 1))

    # Running the GA to optimize the parameters of the function.
    ga_instance.run()

    ga_instance.plot_fitness(title="PyGAD with Adaptive Mutation", linewidth=5)

Limit the Gene Value Range
==========================

In `PyGAD 2.11.0 `__, the ``gene_space`` parameter supported a new feature to allow customizing the range of accepted values for each gene. Let's take a quick review of the ``gene_space`` parameter to build over it.

The ``gene_space`` parameter allows the user to feed the space of values of each gene. This way the accepted values for each gene are restricted to the user-defined values. Assume there is a problem that has 3 genes where each gene has a different set of values, as follows:

1. Gene 1: ``[0.4, 12, -5, 21.2]``

2. Gene 2: ``[-2, 0.3]``

3. Gene 3: ``[1.2, 63.2, 7.4]``

Then, the ``gene_space`` for this problem is as given below. Note that the order is very important.

.. code:: python

    gene_space = [[0.4, 12, -5, 21.2],
                  [-2, 0.3],
                  [1.2, 63.2, 7.4]]

In case all genes share the same set of values, then simply feed a single list to the ``gene_space`` parameter as follows. In this case, all genes can only take values from this list of 6 values.

.. code:: python

    gene_space = [33, 7, 0.5, 95, 6.3, 0.74]

The previous example restricts the gene values to just a fixed set of discrete values. In case you want to use a range of discrete values for the gene, then you can use the ``range()`` function. For example, ``range(1, 7)`` means the set of allowed values for the gene is ``1, 2, 3, 4, 5, and 6``. You can also use the ``numpy.arange()`` or ``numpy.linspace()`` functions for the same purpose.

The previous discussion only works with a range of discrete values, not continuous values. In `PyGAD 2.11.0 `__, the ``gene_space`` parameter can be assigned a dictionary that allows the gene to have values from a continuous range.

Assume you want to restrict the gene within this half-open range [1, 5) where 1 is included and 5 is not. Then simply create a dictionary with 2 items where the keys of the 2 items are:

1. ``'low'``: The minimum value in the range, which is 1 in the example.

2. ``'high'``: The maximum value in the range, which is 5 in the example.

The dictionary will look like this:

.. code:: python

    {'low': 1,
     'high': 5}

It is not acceptable to add more than 2 items to the dictionary or to use keys other than ``'low'`` and ``'high'``.

For a 3-gene problem, the next code creates a dictionary for each gene to restrict its values to a continuous range. For the first gene, it can take any floating-point value from the range that starts from 1 (inclusive) and ends at 5 (exclusive).

.. code:: python

    gene_space = [{'low': 1, 'high': 5}, {'low': 0.3, 'high': 1.4}, {'low': -0.2, 'high': 4.5}]
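Discrete and continuous spaces can also be mixed in the same ``gene_space``. Here is a short sketch combining the options discussed above:

.. code:: python

    import numpy

    gene_space = [range(1, 7),                 # Discrete: integers 1 to 6.
                  numpy.linspace(0, 1, 5),     # Discrete: 5 evenly spaced values between 0 and 1.
                  {'low': -0.2, 'high': 4.5}]  # Continuous: any float in the range [-0.2, 4.5).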
Stop at Any Generation
======================

In `PyGAD 2.4.0 `__, it is possible to stop the genetic algorithm after any generation. All you need to do is to return the string ``"stop"`` in the callback function ``on_generation``. When this callback function is implemented and assigned to the ``on_generation`` parameter in the constructor of the ``pygad.GA`` class, the algorithm immediately stops after completing its current generation. Let's discuss an example.

Assume that the user wants to stop the algorithm either after 100 generations or if a condition is met. The user may assign a value of 100 to the ``num_generations`` parameter of the ``pygad.GA`` class constructor.

The condition that stops the algorithm is written in a callback function like the one in the next code. If the fitness value of the best solution exceeds 70, then the string ``"stop"`` is returned.

.. code:: python

    def func_generation(ga_instance):
        if ga_instance.best_solution()[1] >= 70:
            return "stop"

Stop Criteria
=============

In `PyGAD 2.15.0 `__, a new parameter named ``stop_criteria`` is added to the constructor of the ``pygad.GA`` class. It helps to stop the evolution based on some criteria. It can be assigned one or more criteria.

Each criterion is passed as a ``str`` that consists of 2 parts:

1. Stop word.

2. Number.

It takes this form:

.. code:: python

    "word_num"

The current 2 supported words are ``reach`` and ``saturate``.

The ``reach`` word stops the ``run()`` method if the fitness value is equal to or greater than a given fitness value. An example for ``reach`` is ``"reach_40"``, which stops the evolution if the fitness is >= 40.

``saturate`` stops the evolution if the fitness saturates for a given number of consecutive generations. An example for ``saturate`` is ``"saturate_7"``, which means stop the ``run()`` method if the fitness does not change for 7 consecutive generations.

Here is an example that stops the evolution if either the fitness value reaches ``127.4`` or if the fitness saturates for ``15`` generations.

.. code:: python

    import pygad
    import numpy

    equation_inputs = [4, -2, 3.5, 8, 9, 4]
    desired_output = 44

    def fitness_func(ga_instance, solution, solution_idx):
        output = numpy.sum(solution * equation_inputs)

        fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)

        return fitness

    ga_instance = pygad.GA(num_generations=200,
                           sol_per_pop=10,
                           num_parents_mating=4,
                           num_genes=len(equation_inputs),
                           fitness_func=fitness_func,
                           stop_criteria=["reach_127.4", "saturate_15"])

    ga_instance.run()
    print("Number of generations passed is {generations_completed}".format(generations_completed=ga_instance.generations_completed))

Elitism Selection
=================

In `PyGAD 2.18.0 `__, a new parameter called ``keep_elitism`` is supported. It accepts an integer to define the number of elitism (i.e. best solutions) to keep in the next generation. This parameter defaults to ``1``, which means only the best solution is kept in the next generation.

In the next example, the ``keep_elitism`` parameter in the constructor of the ``pygad.GA`` class is set to 2. Thus, the best 2 solutions in each generation are kept in the next generation.

.. code:: python

    import numpy
    import pygad

    function_inputs = [4,-2,3.5,5,-11,-4.7]
    desired_output = 44

    def fitness_func(ga_instance, solution, solution_idx):
        output = numpy.sum(solution*function_inputs)
        fitness = 1.0 / numpy.abs(output - desired_output)
        return fitness

    ga_instance = pygad.GA(num_generations=2,
                           num_parents_mating=3,
                           fitness_func=fitness_func,
                           num_genes=6,
                           sol_per_pop=5,
                           keep_elitism=2)

    ga_instance.run()

The value passed to the ``keep_elitism`` parameter must satisfy 2 conditions:

1. It must be ``>= 0``.

2. It must be ``<= sol_per_pop``. That is, its value cannot exceed the number of solutions in the current population.
In the previous example, if the ``keep_elitism`` parameter is set equal to the value passed to the ``sol_per_pop`` parameter, which is 5, then there will be no evolution at all, as in the next figure. This is because all the 5 solutions are used as elitism in the next generation and no offspring will be created.

.. code:: python

    ...

    ga_instance = pygad.GA(...,
                           sol_per_pop=5,
                           keep_elitism=5)

    ga_instance.run()

.. figure:: https://user-images.githubusercontent.com/16560492/189273225-67ffad41-97ab-45e1-9324-429705e17b20.png
   :alt:

Note that if the ``keep_elitism`` parameter is effective (i.e. is assigned a positive integer, not zero), then the ``keep_parents`` parameter will have no effect. Because the default value of the ``keep_elitism`` parameter is 1, the ``keep_parents`` parameter has no effect by default. The ``keep_parents`` parameter is only effective when ``keep_elitism=0``.

Random Seed
===========

In `PyGAD 2.18.0 `__, a new parameter called ``random_seed`` is supported. Its value is used as a seed for the random function generators.

PyGAD uses random functions in these 2 libraries:

1. NumPy

2. random

The ``random_seed`` parameter defaults to ``None``, which means no seed is used. As a result, different random numbers are generated for each run of PyGAD.

If this parameter is assigned a proper seed, then the results will be reproducible. In the next example, the integer 2 is used as a random seed.

.. code:: python

    import numpy
    import pygad

    function_inputs = [4,-2,3.5,5,-11,-4.7]
    desired_output = 44

    def fitness_func(ga_instance, solution, solution_idx):
        output = numpy.sum(solution*function_inputs)
        fitness = 1.0 / numpy.abs(output - desired_output)
        return fitness

    ga_instance = pygad.GA(num_generations=2,
                           num_parents_mating=3,
                           fitness_func=fitness_func,
                           sol_per_pop=5,
                           num_genes=6,
                           random_seed=2)

    ga_instance.run()
    best_solution, best_solution_fitness, best_match_idx = ga_instance.best_solution()
    print(best_solution)
    print(best_solution_fitness)

This is the best solution found and its fitness value.

.. code::

    [ 2.77249188 -4.06570662  0.04196872 -3.47770796 -0.57502138 -3.22775267]
    0.04872203136549972

After running the code again, it will find the same result.

.. code::

    [ 2.77249188 -4.06570662  0.04196872 -3.47770796 -0.57502138 -3.22775267]
    0.04872203136549972

Continue without Losing Progress
================================

In `PyGAD 2.18.0 `__, and thanks to `Felix Bernhard `__ for opening `this GitHub issue `__, the values of these 4 instance attributes are no longer reset after each call to the ``run()`` method.

1. ``self.best_solutions``

2. ``self.best_solutions_fitness``

3. ``self.solutions``

4. ``self.solutions_fitness``

This helps the user to continue where the last run stopped without losing the values of these 4 attributes.

Now, the user can save the model by calling the ``save()`` method.

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        ...
        return fitness

    ga_instance = pygad.GA(...)

    ga_instance.run()

    ga_instance.plot_fitness()

    ga_instance.save("pygad_GA")

Then the saved model is loaded by calling the ``load()`` function. After calling the ``run()`` method over the loaded instance, the data in the previous 4 attributes are not reset but extended with the new data.
.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        ...
        return fitness

    loaded_ga_instance = pygad.load("pygad_GA")

    loaded_ga_instance.run()

    loaded_ga_instance.plot_fitness()

The plot created by the ``plot_fitness()`` method will show the data collected from both runs.

Note that the 2 attributes (``self.best_solutions`` and ``self.best_solutions_fitness``) only work if the ``save_best_solutions`` parameter is set to ``True``. Also, the 2 attributes (``self.solutions`` and ``self.solutions_fitness``) only work if the ``save_solutions`` parameter is ``True``.

Prevent Duplicates in Gene Values
=================================

In `PyGAD 2.13.0 `__, a new bool parameter called ``allow_duplicate_genes`` is supported to control whether duplicates are allowed in the chromosome or not. In other words, it controls whether 2 or more genes might have the same exact value.

If ``allow_duplicate_genes=True`` (which is the default case), genes may have the same value. If ``allow_duplicate_genes=False``, then no 2 genes will have the same value, given that there are enough unique values for the genes.

The next code gives an example of using the ``allow_duplicate_genes`` parameter. A callback generation function is implemented to print the population after each generation.

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        return 0

    def on_generation(ga):
        print("Generation", ga.generations_completed)
        print(ga.population)

    ga_instance = pygad.GA(num_generations=5,
                           sol_per_pop=5,
                           num_genes=4,
                           mutation_num_genes=3,
                           random_mutation_min_val=-5,
                           random_mutation_max_val=5,
                           num_parents_mating=2,
                           fitness_func=fitness_func,
                           gene_type=int,
                           on_generation=on_generation,
                           allow_duplicate_genes=False)
    ga_instance.run()

Here are the populations after each of the 5 generations. Note how there are no duplicate values.

.. code:: python

    Generation 1
    [[ 2 -2 -3  3]
     [ 0  1  2  3]
     [ 5 -3  6  3]
     [-3  1 -2  4]
     [-1  0 -2  3]]
    Generation 2
    [[-1  0 -2  3]
     [-3  1 -2  4]
     [ 0 -3 -2  6]
     [-3  0 -2  3]
     [ 1 -4  2  4]]
    Generation 3
    [[ 1 -4  2  4]
     [-3  0 -2  3]
     [ 4  0 -2  1]
     [-4  0 -2 -3]
     [-4  2  0  3]]
    Generation 4
    [[-4  2  0  3]
     [-4  0 -2 -3]
     [-2  5  4 -3]
     [-1  2 -4  4]
     [-4  2  0 -3]]
    Generation 5
    [[-4  2  0 -3]
     [-1  2 -4  4]
     [ 3  4 -4  0]
     [-1  0  2 -2]
     [-4  2 -1  1]]

The ``allow_duplicate_genes`` parameter can be used together with the ``gene_space`` parameter. Here is an example where each of the 4 genes has the same space of values that consists of 4 values (1, 2, 3, and 4).

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        return 0

    def on_generation(ga):
        print("Generation", ga.generations_completed)
        print(ga.population)

    ga_instance = pygad.GA(num_generations=1,
                           sol_per_pop=5,
                           num_genes=4,
                           num_parents_mating=2,
                           fitness_func=fitness_func,
                           gene_type=int,
                           gene_space=[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]],
                           on_generation=on_generation,
                           allow_duplicate_genes=False)
    ga_instance.run()

Even though all the genes share the same space of values, no 2 genes duplicate their values, as shown in the next output.
.. code:: python

    Generation 1
    [[2 3 1 4]
     [2 3 1 4]
     [2 4 1 3]
     [2 3 1 4]
     [1 3 2 4]]
    Generation 2
    [[1 3 2 4]
     [2 3 1 4]
     [1 3 2 4]
     [2 3 4 1]
     [1 3 4 2]]
    Generation 3
    [[1 3 4 2]
     [2 3 4 1]
     [1 3 4 2]
     [3 1 4 2]
     [3 2 4 1]]
    Generation 4
    [[3 2 4 1]
     [3 1 4 2]
     [3 2 4 1]
     [1 2 4 3]
     [1 3 4 2]]
    Generation 5
    [[1 3 4 2]
     [1 2 4 3]
     [2 1 4 3]
     [1 2 4 3]
     [1 2 4 3]]

You should take care to give enough values for the genes so that PyGAD is able to find alternatives for a gene value in case it duplicates another gene's value.

There might be 2 duplicate genes where changing either of them will not solve the problem. For example, if ``gene_space=[[3, 0, 1], [4, 1, 2], [0, 2], [3, 2, 0]]`` and the solution is ``[3 2 0 0]``, then the values of the last 2 genes duplicate. There are no possible changes in the last 2 genes to solve the problem.

This problem can be solved by randomly changing one of the non-duplicating genes to make room for a unique value in one of the 2 duplicating genes. For example, by changing the second gene from 2 to 4, any of the last 2 genes can take the value 2 and solve the duplicates. The resultant solution is then ``[3 4 2 0]``. But this option is not yet supported in PyGAD.

User-Defined Crossover, Mutation, and Parent Selection Operators
================================================================

Previously, the user could select the type of the crossover, mutation, and parent selection operators by assigning the name of the operator to the following parameters of the ``pygad.GA`` class's constructor:

1. ``crossover_type``

2. ``mutation_type``

3. ``parent_selection_type``

This way, the user can only use the built-in functions for each of these operators.

Starting from `PyGAD 2.16.0 `__, the user can create custom crossover, mutation, and parent selection operators and assign these functions to the above parameters. Thus, a new operator can be plugged easily into the `PyGAD Lifecycle `__.

This is a sample code that does not use any custom function.

.. code:: python

    import pygad
    import numpy

    equation_inputs = [4,-2,3.5]
    desired_output = 44

    def fitness_func(ga_instance, solution, solution_idx):
        output = numpy.sum(solution * equation_inputs)
        fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
        return fitness

    ga_instance = pygad.GA(num_generations=10,
                           sol_per_pop=5,
                           num_parents_mating=2,
                           num_genes=len(equation_inputs),
                           fitness_func=fitness_func)

    ga_instance.run()
    ga_instance.plot_fitness()

This section describes the expected input parameters and outputs. For simplicity, all of these custom functions accept the instance of the ``pygad.GA`` class as the last parameter.

User-Defined Crossover Operator
-------------------------------

The user-defined crossover function is a Python function that accepts 3 parameters:

1. The selected parents.

2. The size of the offspring as a tuple of 2 numbers: (the offspring size, number of genes).

3. The instance from the ``pygad.GA`` class. This instance helps to retrieve any property like ``population``, ``gene_type``, ``gene_space``, etc.

This function should return a NumPy array of shape equal to the value passed to the second parameter.

The next code creates a template for the user-defined crossover operator. You can use any names for the parameters. Note how a NumPy array is returned.

..
code:: python - - def crossover_func(parents, offspring_size, ga_instance): - offspring = ... - ... - return numpy.array(offspring) - -As an example, the next code creates a single-point crossover function. -By randomly generating a random point (i.e. index of a gene), the -function simply uses 2 parents to produce an offspring by copying the -genes before the point from the first parent and the remaining from the -second parent. - -.. code:: python - - def crossover_func(parents, offspring_size, ga_instance): - offspring = [] - idx = 0 - while len(offspring) != offspring_size[0]: - parent1 = parents[idx % parents.shape[0], :].copy() - parent2 = parents[(idx + 1) % parents.shape[0], :].copy() - - random_split_point = numpy.random.choice(range(offspring_size[1])) - - parent1[random_split_point:] = parent2[random_split_point:] - - offspring.append(parent1) - - idx += 1 - - return numpy.array(offspring) - -To use this user-defined function, simply assign its name to the -``crossover_type`` parameter in the constructor of the ``pygad.GA`` -class. The next code gives an example. In this case, the custom function -will be called in each generation rather than calling the built-in -crossover functions defined in PyGAD. - -.. code:: python - - ga_instance = pygad.GA(num_generations=10, - sol_per_pop=5, - num_parents_mating=2, - num_genes=len(equation_inputs), - fitness_func=fitness_func, - crossover_type=crossover_func) - -User-Defined Mutation Operator ------------------------------- - -A user-defined mutation function/operator can be created the same way a -custom crossover operator/function is created. Simply, it is a Python -function that accepts 2 parameters: - -1. The offspring to be mutated. - -2. The instance from the ``pygad.GA`` class. This instance helps to - retrieve any property like ``population``, ``gene_type``, - ``gene_space``, etc. - -The template for the user-defined mutation function is given in the next -code. According to the user preference, the function should make some -random changes to the genes. - -.. code:: python - - def mutation_func(offspring, ga_instance): - ... - return offspring - -The next code builds the random mutation where a single gene from each -chromosome is mutated by adding a random number between 0 and 1 to the -gene's value. - -.. code:: python - - def mutation_func(offspring, ga_instance): - - for chromosome_idx in range(offspring.shape[0]): - random_gene_idx = numpy.random.choice(range(offspring.shape[1])) - - offspring[chromosome_idx, random_gene_idx] += numpy.random.random() - - return offspring - -Here is how this function is assigned to the ``mutation_type`` -parameter. - -.. code:: python - - ga_instance = pygad.GA(num_generations=10, - sol_per_pop=5, - num_parents_mating=2, - num_genes=len(equation_inputs), - fitness_func=fitness_func, - crossover_type=crossover_func, - mutation_type=mutation_func) - -Note that there are other things to take into consideration like: - -- Making sure that each gene conforms to the data type(s) listed in the - ``gene_type`` parameter. - -- If the ``gene_space`` parameter is used, then the new value for the - gene should conform to the values/ranges listed. - -- Mutating a number of genes that conforms to the parameters - ``mutation_percent_genes``, ``mutation_probability``, and - ``mutation_num_genes``. - -- Whether mutation happens with or without replacement based on the - ``mutation_by_replacement`` parameter. 
- The minimum and maximum values from which a random value is generated, based on the ``random_mutation_min_val`` and ``random_mutation_max_val`` parameters.

- Whether duplicates are allowed or not in the chromosome, based on the ``allow_duplicate_genes`` parameter.

and more.

It all depends on your objective from building the mutation function. You may neglect or consider some of these considerations according to your objective.

User-Defined Parent Selection Operator
--------------------------------------

There is not much to mention about building a user-defined parent selection function, as things are similar to building a crossover or mutation function. Just create a Python function that accepts 3 parameters:

1. The fitness values of the current population.

2. The number of parents needed.

3. The instance from the ``pygad.GA`` class. This instance helps to retrieve any property like ``population``, ``gene_type``, ``gene_space``, etc.

The function should return 2 outputs:

1. The selected parents as a NumPy array. Its shape is equal to (the number of selected parents, ``num_genes``). Note that the number of selected parents is equal to the value assigned to the second input parameter.

2. The indices of the selected parents inside the population. It is a 1D list with a length equal to the number of selected parents.

The outputs must be of type ``numpy.ndarray``.

Here is a template for building a custom parent selection function.

.. code:: python

    def parent_selection_func(fitness, num_parents, ga_instance):
        ...
        return parents, fitness_sorted[:num_parents]

The next code builds the steady-state parent selection where the best parents are selected. The number of parents is equal to the value in the ``num_parents`` parameter.

.. code:: python

    def parent_selection_func(fitness, num_parents, ga_instance):

        fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
        fitness_sorted.reverse()

        parents = numpy.empty((num_parents, ga_instance.population.shape[1]))

        for parent_num in range(num_parents):
            parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()

        return parents, numpy.array(fitness_sorted[:num_parents])

Finally, the defined function is assigned to the ``parent_selection_type`` parameter as in the next code.

.. code:: python

    ga_instance = pygad.GA(num_generations=10,
                           sol_per_pop=5,
                           num_parents_mating=2,
                           num_genes=len(equation_inputs),
                           fitness_func=fitness_func,
                           crossover_type=crossover_func,
                           mutation_type=mutation_func,
                           parent_selection_type=parent_selection_func)
-Example
--------
-
-Having discussed how to customize the 3 operators, the next code uses
-the previous 3 user-defined functions instead of the built-in
-functions.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4,-2,3.5]
-   desired_output = 44
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-
-       return fitness
-
-   def parent_selection_func(fitness, num_parents, ga_instance):
-
-       fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
-       fitness_sorted.reverse()
-
-       parents = numpy.empty((num_parents, ga_instance.population.shape[1]))
-
-       for parent_num in range(num_parents):
-           parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()
-
-       return parents, numpy.array(fitness_sorted[:num_parents])
-
-   def crossover_func(parents, offspring_size, ga_instance):
-
-       offspring = []
-       idx = 0
-       while len(offspring) != offspring_size[0]:
-           parent1 = parents[idx % parents.shape[0], :].copy()
-           parent2 = parents[(idx + 1) % parents.shape[0], :].copy()
-
-           random_split_point = numpy.random.choice(range(offspring_size[1]))
-
-           parent1[random_split_point:] = parent2[random_split_point:]
-
-           offspring.append(parent1)
-
-           idx += 1
-
-       return numpy.array(offspring)
-
-   def mutation_func(offspring, ga_instance):
-
-       for chromosome_idx in range(offspring.shape[0]):
-           # Select a random gene (column), not a random chromosome (row).
-           random_gene_idx = numpy.random.choice(range(offspring.shape[1]))
-
-           offspring[chromosome_idx, random_gene_idx] += numpy.random.random()
-
-       return offspring
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          crossover_type=crossover_func,
-                          mutation_type=mutation_func,
-                          parent_selection_type=parent_selection_func)
-
-   ga_instance.run()
-   ga_instance.plot_fitness()
-
-This is the same example but using methods instead of functions.
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4,-2,3.5]
-   desired_output = 44
-
-   class Test:
-       def fitness_func(self, ga_instance, solution, solution_idx):
-           output = numpy.sum(solution * equation_inputs)
-
-           fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-
-           return fitness
-
-       def parent_selection_func(self, fitness, num_parents, ga_instance):
-
-           fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
-           fitness_sorted.reverse()
-
-           parents = numpy.empty((num_parents, ga_instance.population.shape[1]))
-
-           for parent_num in range(num_parents):
-               parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()
-
-           return parents, numpy.array(fitness_sorted[:num_parents])
-
-       def crossover_func(self, parents, offspring_size, ga_instance):
-
-           offspring = []
-           idx = 0
-           while len(offspring) != offspring_size[0]:
-               parent1 = parents[idx % parents.shape[0], :].copy()
-               parent2 = parents[(idx + 1) % parents.shape[0], :].copy()
-
-               # Split at a random gene index (offspring_size[1] is the
-               # number of genes).
-               random_split_point = numpy.random.choice(range(offspring_size[1]))
-
-               parent1[random_split_point:] = parent2[random_split_point:]
-
-               offspring.append(parent1)
-
-               idx += 1
-
-           return numpy.array(offspring)
-
-       def mutation_func(self, offspring, ga_instance):
-
-           for chromosome_idx in range(offspring.shape[0]):
-               random_gene_idx = numpy.random.choice(range(offspring.shape[1]))
-
-               offspring[chromosome_idx, random_gene_idx] += numpy.random.random()
-
-           return offspring
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=Test().fitness_func,
-                          parent_selection_type=Test().parent_selection_func,
-                          crossover_type=Test().crossover_func,
-                          mutation_type=Test().mutation_func)
-
-   ga_instance.run()
-   ga_instance.plot_fitness()
-
-.. _more-about-the-genespace-parameter:
-
-More about the ``gene_space`` Parameter
-=======================================
-
-The ``gene_space`` parameter customizes the space of values of each
-gene.
-
-Assuming that all genes have the same global space which includes the
-values 0.3, 5.2, -4, and 8, then those values can be assigned to the
-``gene_space`` parameter as a list, tuple, or range. Here is a list
-assigned to this parameter. By doing that, the gene values are
-restricted to those assigned to the ``gene_space`` parameter.
-
-.. code:: python
-
-   gene_space = [0.3, 5.2, -4, 8]
-
-If some genes have different spaces, then ``gene_space`` should accept
-a nested list or tuple. In this case, the elements could be:
-
-1. Number (of ``int``, ``float``, or ``NumPy`` data types): A single
-   value to be assigned to the gene. This means this gene will have
-   the same value across all generations.
-
-2. ``list``, ``tuple``, ``numpy.ndarray``, or any range like
-   ``range``, ``numpy.arange()``, or ``numpy.linspace()``: It holds
-   the space for each individual gene. But this space is usually
-   discrete. That is there is a set of finite values to select from.
-
-3. ``dict``: To sample a value for a gene from a continuous range. The
-   dictionary must have 2 mandatory keys which are ``"low"`` and
-   ``"high"`` in addition to an optional key which is ``"step"``. A
-   random value is returned between the values assigned to the items
-   with ``"low"`` and ``"high"`` keys. If the ``"step"`` exists, then
-   this works like the previous option (i.e. a discrete set of
-   values).
-4. ``None``: A gene with its space set to ``None`` is initialized
-   randomly from the range specified by the 2 parameters
-   ``init_range_low`` and ``init_range_high``. For mutation, its value
-   is mutated based on a random value from the range specified by the
-   2 parameters ``random_mutation_min_val`` and
-   ``random_mutation_max_val``. If all elements in the ``gene_space``
-   parameter are ``None``, the parameter will not have any effect.
-
-Assuming that a chromosome has 2 genes where each gene has a different
-value space, the ``gene_space`` could be assigned a nested list/tuple
-where each element determines the space of a gene.
-
-According to the next code, the space of the first gene is
-``[0.4, -5]`` which has 2 values and the space for the second gene is
-``[0.5, -3.2, 8.2, -9]`` which has 4 values.
-
-.. code:: python
-
-   gene_space = [[0.4, -5], [0.5, -3.2, 8.2, -9]]
-
-For a 2-gene chromosome, if the first gene space is restricted to the
-discrete values from 0 to 4 and the second gene is restricted to the
-values from 10 to 19, then it could be specified according to the next
-code.
-
-.. code:: python
-
-   gene_space = [range(5), range(10, 20)]
-
-The ``gene_space`` can also be assigned a single range, as given
-below, where the values of all genes are sampled from the same range.
-
-.. code:: python
-
-   gene_space = numpy.arange(15)
-
-The ``gene_space`` can be assigned a dictionary to sample a value from
-a continuous range.
-
-.. code:: python
-
-   gene_space = {"low": 4, "high": 30}
-
-A step can also be assigned to the dictionary. This works as if a
-discrete range is used.
-
-.. code:: python
-
-   gene_space = {"low": 4, "high": 30, "step": 2.5}
-
-If ``None`` is assigned to only a single gene, then its value will be
-randomly generated initially using the ``init_range_low`` and
-``init_range_high`` parameters in the ``pygad.GA`` class's
-constructor. During mutation, the value is sampled from the range
-defined by the 2 parameters ``random_mutation_min_val`` and
-``random_mutation_max_val``. This is an example where the second gene
-is given a ``None`` value.
-
-.. code:: python
-
-   gene_space = [range(5), None, numpy.linspace(10, 20, 300)]
-
-If the user did not assign the initial population to the
-``initial_population`` parameter, the initial population is created
-randomly based on the ``gene_space`` parameter. Moreover, the mutation
-is applied based on this parameter.
-
-.. _more-about-the-genetype-parameter:
-
-More about the ``gene_type`` Parameter
-======================================
-
-The ``gene_type`` parameter allows the user to control the data type
-for all genes at once or for each individual gene. In `PyGAD
-2.15.0 `__,
-the ``gene_type`` parameter also supports customizing the precision
-for ``float`` data types. As a result, the ``gene_type`` parameter
-helps to:
-
-1. Select a data type for all genes with or without precision.
-
-2. Select a data type for each individual gene with or without
-   precision.
-
-Let's discuss things by examples.
-
-Data Type for All Genes without Precision
------------------------------------------
-
-The data type for all genes can be specified by assigning the numeric
-data type directly to the ``gene_type`` parameter. This is an example
-to make all genes of the ``int`` data type.
-
-.. code:: python
-
-   gene_type=int
-
-Given that the supported numeric data types of PyGAD include Python's
-``int`` and ``float`` in addition to all numeric types of ``NumPy``,
-any of these types can be assigned to the ``gene_type`` parameter.
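-For example, a NumPy type can be assigned directly (a minimal
-illustration; ``numpy.uint8`` here is just one of the NumPy numeric
-types, assuming it is among the supported types):
-
-.. code:: python
-
-   gene_type=numpy.uint8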
-
-If no precision is specified for a ``float`` data type, then the
-complete floating-point number is kept.
-
-The next code uses an ``int`` data type for all genes where the genes
-in the initial and final population are only integers.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4, -2, 3.5, 8, -2]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          gene_type=int)
-
-   print("Initial Population")
-   print(ga_instance.initial_population)
-
-   ga_instance.run()
-
-   print("Final Population")
-   print(ga_instance.population)
-
-.. code:: python
-
-   Initial Population
-   [[ 1 -1  2  0 -3]
-    [ 0 -2  0 -3 -1]
-    [ 0 -1 -1  2  0]
-    [-2  3 -2  3  3]
-    [ 0  0  2 -2 -2]]
-
-   Final Population
-   [[ 1 -1  2  2  0]
-    [ 1 -1  2  2  0]
-    [ 1 -1  2  2  0]
-    [ 1 -1  2  2  0]
-    [ 1 -1  2  2  0]]
-
-Data Type for All Genes with Precision
---------------------------------------
-
-A precision can only be specified for a ``float`` data type and cannot
-be specified for integers. Here is an example to use a precision of 3
-for the ``float`` data type. In this case, all genes are of type
-``float`` and their maximum precision is 3.
-
-.. code:: python
-
-   gene_type=[float, 3]
-
-The next code prints the initial and final population where the genes
-are of type ``float`` with precision 3.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4, -2, 3.5, 8, -2]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          gene_type=[float, 3])
-
-   print("Initial Population")
-   print(ga_instance.initial_population)
-
-   ga_instance.run()
-
-   print("Final Population")
-   print(ga_instance.population)
-
-.. code:: python
-
-   Initial Population
-   [[-2.417 -0.487  3.623  2.457 -2.362]
-    [-1.231  0.079 -1.63   1.629 -2.637]
-    [ 0.692 -2.098  0.705  0.914 -3.633]
-    [ 2.637 -1.339 -1.107 -0.781 -3.896]
-    [-1.495  1.378 -1.026  3.522  2.379]]
-
-   Final Population
-   [[ 1.714 -1.024  3.623  3.185 -2.362]
-    [ 0.692 -1.024  3.623  3.185 -2.362]
-    [ 0.692 -1.024  3.623  3.375 -2.362]
-    [ 0.692 -1.024  4.041  3.185 -2.362]
-    [ 1.714 -0.644  3.623  3.185 -2.362]]
-
-Data Type for each Individual Gene without Precision
-----------------------------------------------------
-
-In `PyGAD
-2.14.0 `__,
-the ``gene_type`` parameter allows customizing the gene type for each
-individual gene. This is by using a ``list``/``tuple``/``numpy.ndarray``
-with a number of elements equal to the number of genes. For each
-element, a type is specified for the corresponding gene.
-
-This is an example for a 5-gene problem where different types are
-assigned to the genes.
-
-.. code:: python
-
-   gene_type=[int, float, numpy.float16, numpy.int8, float]
-
-This is a complete code that prints the initial and final population
-for a custom-gene data type.
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4, -2, 3.5, 8, -2]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          gene_type=[int, float, numpy.float16, numpy.int8, float])
-
-   print("Initial Population")
-   print(ga_instance.initial_population)
-
-   ga_instance.run()
-
-   print("Final Population")
-   print(ga_instance.population)
-
-.. code:: python
-
-   Initial Population
-   [[0 0.8615522360026828 0.7021484375 -2 3.5301821368185866]
-    [-3 2.648189378595294 -3.830078125 1 -0.9586271572917742]
-    [3 3.7729827570110714 1.2529296875 -3 1.395741994211889]
-    [0 1.0490687178053282 1.51953125 -2 0.7243617940450235]
-    [0 -0.6550158436937226 -2.861328125 -2 1.8212734549263097]]
-
-   Final Population
-   [[3 3.7729827570110714 2.055 0 0.7243617940450235]
-    [3 3.7729827570110714 1.458 0 -0.14638754050305036]
-    [3 3.7729827570110714 1.458 0 0.0869406120516778]
-    [3 3.7729827570110714 1.458 0 0.7243617940450235]
-    [3 3.7729827570110714 1.458 0 -0.14638754050305036]]
-
-Data Type for each Individual Gene with Precision
--------------------------------------------------
-
-The precision can also be specified for the ``float`` data types as in
-the next line where the precision of the second gene is 2 and the
-precision of the last gene is 1.
-
-.. code:: python
-
-   gene_type=[int, [float, 2], numpy.float16, numpy.int8, [float, 1]]
-
-This is a complete example where the initial and final populations are
-printed. The genes comply with the data types and precisions
-specified.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4, -2, 3.5, 8, -2]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=5,
-                          num_parents_mating=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          gene_type=[int, [float, 2], numpy.float16, numpy.int8, [float, 1]])
-
-   print("Initial Population")
-   print(ga_instance.initial_population)
-
-   ga_instance.run()
-
-   print("Final Population")
-   print(ga_instance.population)
-
-.. code:: python
-
-   Initial Population
-   [[-2 -1.22 1.716796875 -1 0.2]
-    [-1 -1.58 -3.091796875 0 -1.3]
-    [3 3.35 -0.107421875 1 -3.3]
-    [-2 -3.58 -1.779296875 0 0.6]
-    [2 -3.73 2.65234375 3 -0.5]]
-
-   Final Population
-   [[2 -4.22 3.47 3 -1.3]
-    [2 -3.73 3.47 3 -1.3]
-    [2 -4.22 3.47 2 -1.3]
-    [2 -4.58 3.47 3 -1.3]
-    [2 -3.73 3.47 3 -1.3]]
-
-Visualization in PyGAD
-======================
-
-This section discusses the different options to visualize the results
-in PyGAD through these methods:
-
-1. ``plot_fitness()``
-
-2. ``plot_genes()``
-
-3. ``plot_new_solution_rate()``
-
-In the following code, the ``save_solutions`` flag is set to ``True``
-which means all solutions are saved in the ``solutions`` attribute.
-The code runs for only 10 generations.
-.. code:: python
-
-   import pygad
-   import numpy
-
-   equation_inputs = [4, -2, 3.5, 8, -2, 3.5, 8]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=10,
-                          num_parents_mating=5,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          gene_space=[range(1, 10), range(10, 20), range(15, 30), range(20, 40), range(25, 50), range(10, 30), range(20, 50)],
-                          gene_type=int,
-                          save_solutions=True)
-
-   ga_instance.run()
-
-Let's explore how to visualize the results by the above-mentioned
-methods.
-
-.. _plotfitness-2:
-
-``plot_fitness()``
-------------------
-
-The ``plot_fitness()`` method shows the fitness value for each
-generation.
-
-.. _plottypeplot:
-
-``plot_type="plot"``
-~~~~~~~~~~~~~~~~~~~~
-
-The simplest way to call this method is as follows, leaving the
-``plot_type`` with its default value ``"plot"`` to create a continuous
-line connecting the fitness values across all generations:
-
-.. code:: python
-
-   ga_instance.plot_fitness()
-   # ga_instance.plot_fitness(plot_type="plot")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122472609-d02f5280-cf8e-11eb-88a7-f9366ff6e7c6.png
-   :alt:
-
-.. _plottypescatter:
-
-``plot_type="scatter"``
-~~~~~~~~~~~~~~~~~~~~~~~
-
-The ``plot_type`` can also be set to ``"scatter"`` to create a scatter
-graph with each individual fitness represented as a dot. The size of
-these dots can be changed using the ``linewidth`` parameter.
-
-.. code:: python
-
-   ga_instance.plot_fitness(plot_type="scatter")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122473159-75e2c180-cf8f-11eb-942d-31279b286dbd.png
-   :alt:
-
-.. _plottypebar:
-
-``plot_type="bar"``
-~~~~~~~~~~~~~~~~~~~
-
-The third value for the ``plot_type`` parameter is ``"bar"`` to create
-a bar graph with each individual fitness represented as a bar.
-
-.. code:: python
-
-   ga_instance.plot_fitness(plot_type="bar")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122473340-b7736c80-cf8f-11eb-89c5-4f7db3b653cc.png
-   :alt:
-
-.. _plotnewsolutionrate-2:
-
-``plot_new_solution_rate()``
-----------------------------
-
-The ``plot_new_solution_rate()`` method presents the number of new
-solutions explored in each generation. This helps to figure out if the
-genetic algorithm is able to find new solutions as an indication of
-more possible evolution. If no new solutions are explored, this is an
-indication that no further evolution is possible.
-
-The ``plot_new_solution_rate()`` method accepts the same parameters as
-the ``plot_fitness()`` method with 3 possible values for the
-``plot_type`` parameter.
-
-.. _plottypeplot-2:
-
-``plot_type="plot"``
-~~~~~~~~~~~~~~~~~~~~
-
-The default value for the ``plot_type`` parameter is ``"plot"``.
-
-.. code:: python
-
-   ga_instance.plot_new_solution_rate()
-   # ga_instance.plot_new_solution_rate(plot_type="plot")
-
-The next figure shows that, for example, generation 6 has the least
-number of new solutions which is 4. The number of new solutions in the
-first generation is always equal to the number of solutions in the
-population (i.e. the value assigned to the ``sol_per_pop`` parameter
-in the constructor of the ``pygad.GA`` class) which is 10 in this
-example.
-.. figure:: https://user-images.githubusercontent.com/16560492/122475815-3322e880-cf93-11eb-9648-bf66f823234b.png
-   :alt:
-
-.. _plottypescatter-2:
-
-``plot_type="scatter"``
-~~~~~~~~~~~~~~~~~~~~~~~
-
-The previous graph can be represented as scattered points by setting
-``plot_type="scatter"``.
-
-.. code:: python
-
-   ga_instance.plot_new_solution_rate(plot_type="scatter")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122476108-adec0380-cf93-11eb-80ac-7588bf90492f.png
-   :alt:
-
-.. _plottypebar-2:
-
-``plot_type="bar"``
-~~~~~~~~~~~~~~~~~~~
-
-By setting ``plot_type="bar"``, each value is represented as a
-vertical bar.
-
-.. code:: python
-
-   ga_instance.plot_new_solution_rate(plot_type="bar")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122476173-c2c89700-cf93-11eb-9e77-d39737cd3a96.png
-   :alt:
-
-.. _plotgenes-2:
-
-``plot_genes()``
-----------------
-
-The ``plot_genes()`` method is the third option to visualize the PyGAD
-results. This method has 3 control variables:
-
-1. ``graph_type="plot"``: Can be ``"plot"`` (default), ``"boxplot"``,
-   or ``"histogram"``.
-
-2. ``plot_type="plot"``: Identical to the ``plot_type`` parameter
-   explored in the ``plot_fitness()`` and ``plot_new_solution_rate()``
-   methods.
-
-3. ``solutions="all"``: Can be ``"all"`` (default) or ``"best"``.
-
-These 3 parameters control the style of the output figure.
-
-The ``graph_type`` parameter selects the type of the graph which helps
-to explore the gene values as:
-
-1. A normal plot.
-
-2. A histogram.
-
-3. A box and whisker plot.
-
-The ``plot_type`` parameter works only when the type of the graph is
-set to ``"plot"``.
-
-The ``solutions`` parameter selects whether the genes come from all
-solutions in the population or from just the best solutions.
-
-.. _graphtypeplot:
-
-``graph_type="plot"``
-~~~~~~~~~~~~~~~~~~~~~
-
-When ``graph_type="plot"``, then the figure creates a normal graph
-where the relationship between the gene values and the generation
-numbers is represented as a continuous plot, scattered points, or
-bars.
-
-.. _plottypeplot-3:
-
-``plot_type="plot"``
-^^^^^^^^^^^^^^^^^^^^
-
-Because the default value for both ``graph_type`` and ``plot_type`` is
-``"plot"``, all of the lines below create the same figure. This figure
-is helpful to know whether a gene value lasts for more generations as
-an indication of the best value for this gene. For example, the value
-16 for the gene with index 5 (at column 2 and row 2 of the next graph)
-lasted for 83 generations.
-
-.. code:: python
-
-   ga_instance.plot_genes()
-
-   ga_instance.plot_genes(graph_type="plot")
-
-   ga_instance.plot_genes(plot_type="plot")
-
-   ga_instance.plot_genes(graph_type="plot",
-                          plot_type="plot")
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122477158-4a62d580-cf95-11eb-8c93-9b6e74cb814c.png
-   :alt:
-
-As the default value for the ``solutions`` parameter is ``"all"``,
-the following method calls generate the same plot.
-
-.. code:: python
-
-   ga_instance.plot_genes(solutions="all")
-
-   ga_instance.plot_genes(graph_type="plot",
-                          solutions="all")
-
-   ga_instance.plot_genes(plot_type="plot",
-                          solutions="all")
-
-   ga_instance.plot_genes(graph_type="plot",
-                          plot_type="plot",
-                          solutions="all")
-
-.. _plottypescatter-3:
-
-``plot_type="scatter"``
-^^^^^^^^^^^^^^^^^^^^^^^
-
-The following calls of the ``plot_genes()`` method create the same
-scatter plot.
-.. code:: python
-
-   ga_instance.plot_genes(plot_type="scatter")
-
-   ga_instance.plot_genes(graph_type="plot",
-                          plot_type="scatter",
-                          solutions='all')
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122477273-73836600-cf95-11eb-828f-f357c7b0f815.png
-   :alt:
-
-.. _plottypebar-3:
-
-``plot_type="bar"``
-^^^^^^^^^^^^^^^^^^^
-
-.. code:: python
-
-   ga_instance.plot_genes(plot_type="bar")
-
-   ga_instance.plot_genes(graph_type="plot",
-                          plot_type="bar",
-                          solutions='all')
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122477370-99106f80-cf95-11eb-8643-865b55e6b844.png
-   :alt:
-
-.. _graphtypeboxplot:
-
-``graph_type="boxplot"``
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-By setting ``graph_type`` to ``"boxplot"``, a box and whisker graph is
-created. In this case, the ``plot_type`` parameter has no effect.
-
-The following 2 calls of the ``plot_genes()`` method create the same
-figure as the default value for the ``solutions`` parameter is
-``"all"``.
-
-.. code:: python
-
-   ga_instance.plot_genes(graph_type="boxplot")
-
-   ga_instance.plot_genes(graph_type="boxplot",
-                          solutions='all')
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122479260-beeb4380-cf98-11eb-8f08-23707929b12c.png
-   :alt:
-
-.. _graphtypehistogram:
-
-``graph_type="histogram"``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-For ``graph_type="histogram"``, a histogram is created for each gene.
-Similar to ``graph_type="boxplot"``, the ``plot_type`` parameter has
-no effect.
-
-The following 2 calls of the ``plot_genes()`` method create the same
-figure as the default value for the ``solutions`` parameter is
-``"all"``.
-
-.. code:: python
-
-   ga_instance.plot_genes(graph_type="histogram")
-
-   ga_instance.plot_genes(graph_type="histogram",
-                          solutions='all')
-
-.. figure:: https://user-images.githubusercontent.com/16560492/122477314-8007be80-cf95-11eb-9c95-da3f49204151.png
-   :alt:
-
-All the previous figures can be created for only the best solutions by
-setting ``solutions="best"``.
-
-Parallel Processing in PyGAD
-============================
-
-Starting from `PyGAD
-2.17.0 `__,
-parallel processing becomes supported. This section explains how to
-use parallel processing in PyGAD.
-
-According to the `PyGAD
-lifecycle `__,
-only 2 operations can be parallelized:
-
-1. Population fitness calculation.
-
-2. Mutation.
-
-The reason is that the calculations in these 2 operations are
-independent (i.e. each solution/chromosome is handled independently
-from the others) and can be distributed across different processes or
-threads.
-
-The mutation operation does not do intensive calculations on the CPU.
-Its calculations are simple, like flipping the values of some genes
-from 0 to 1 or adding a random value to some genes. So, it does not
-take much CPU processing time. Experiments proved that parallelizing
-the mutation operation across the solutions increases the time instead
-of reducing it. This is because running multiple processes or threads
-adds overhead to manage them. Thus, parallel processing is not applied
-to the mutation operation.
-
-For the population fitness calculation, parallel processing can help
-make a difference and reduce the processing time. But this is
-conditional on the type of calculations done in the fitness function.
-If the fitness function makes intensive calculations and takes much
-processing time from the CPU, then it is probable that parallel
-processing will help to cut down the overall time.
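-As a rough illustration (a dummy sketch, not a benchmark), a fitness
-function doing heavy numeric work like the one below is the kind of
-workload where parallel processing may pay off; the loop count is an
-arbitrary placeholder.
-
-.. code:: python
-
-   import numpy
-
-   def cpu_heavy_fitness(ga_instance, solution, solution_idx):
-       # Simulate an expensive evaluation with repeated numeric work.
-       acc = 0.0
-       for _ in range(1000):
-           acc += numpy.sum(numpy.square(solution))
-       return float(acc)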
-
-This section explains how parallel processing works in PyGAD and how
-to use it.
-
-How to Use Parallel Processing in PyGAD
----------------------------------------
-
-Starting from `PyGAD
-2.17.0 `__,
-a new parameter called ``parallel_processing`` was added to the
-constructor of the ``pygad.GA`` class.
-
-.. code:: python
-
-   import pygad
-   ...
-   ga_instance = pygad.GA(...,
-                          parallel_processing=...)
-   ...
-
-This parameter allows the user to do the following:
-
-1. Enable parallel processing.
-
-2. Select whether processes or threads are used.
-
-3. Specify the number of processes or threads to be used.
-
-There are 3 possible values for the ``parallel_processing`` parameter:
-
-1. ``None``: (Default) It means no parallel processing is used.
-
-2. A positive integer referring to the number of threads to be used
-   (i.e. threads, not processes, are used).
-
-3. ``list``/``tuple``: If a list or a tuple of exactly 2 elements is
-   assigned, then:
-
-   1. The first element can be either ``'process'`` or ``'thread'`` to
-      specify whether processes or threads are used, respectively.
-
-   2. The second element can be:
-
-      1. A positive integer to select the maximum number of processes
-         or threads to be used.
-
-      2. ``0`` to indicate that 0 processes or threads are used. It
-         means no parallel processing. This is identical to setting
-         ``parallel_processing=None``.
-
-      3. ``None`` to use the default value as calculated by the
-         ``concurrent.futures`` module.
-
-These are examples of the values assigned to the
-``parallel_processing`` parameter:
-
-- ``parallel_processing=4``: Because the parameter is assigned a
-  positive integer, this means parallel processing is activated where
-  4 threads are used.
-
-- ``parallel_processing=["thread", 5]``: Use parallel processing with
-  5 threads. This is identical to ``parallel_processing=5``.
-
-- ``parallel_processing=["process", 8]``: Use parallel processing with
-  8 processes.
-
-- ``parallel_processing=["process", 0]``: As the second element is
-  given the value 0, this means do not use parallel processing. This
-  is identical to ``parallel_processing=None``.
-
-Examples
---------
-
-The examples will help you know the difference between using processes
-and threads. Moreover, they will give an idea of when parallel
-processing would make a difference and reduce the time. These are
-dummy examples where the fitness function is made to always return 0.
-
-The first example uses 10 genes, 5 solutions in the population where
-only 3 solutions mate, and 9999 generations. The fitness function uses
-a ``for`` loop of 99 iterations just to have some calculations. In the
-constructor of the ``pygad.GA`` class, ``parallel_processing=None``
-means no parallel processing is used.
-
-.. code:: python
-
-   import pygad
-   import time
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       for _ in range(99):
-           pass
-       return 0
-
-   ga_instance = pygad.GA(num_generations=9999,
-                          num_parents_mating=3,
-                          sol_per_pop=5,
-                          num_genes=10,
-                          fitness_func=fitness_func,
-                          suppress_warnings=True,
-                          parallel_processing=None)
-
-   if __name__ == '__main__':
-       t1 = time.time()
-
-       ga_instance.run()
-
-       t2 = time.time()
-       print("Time is", t2-t1)
-
-When parallel processing is not used, the time it takes to run the
-genetic algorithm is ``1.5`` seconds.
-
-For comparison, let's do a second experiment where parallel processing
-is used with 5 threads. In this case, it takes ``5`` seconds.
-
-.. code:: python
-
-   ...
- ga_instance = pygad.GA(..., - parallel_processing=5) - ... - -For the third experiment, processes instead of threads are used. Also, -only 99 generations are used instead of 9999. The time it takes is -``99`` seconds. - -.. code:: python - - ... - ga_instance = pygad.GA(num_generations=99, - ..., - parallel_processing=["process", 5]) - ... - -This is the summary of the 3 experiments: - -1. No parallel processing & 9999 generations: 1.5 seconds. - -2. Parallel processing with 5 threads & 9999 generations: 5 seconds - -3. Parallel processing with 5 processes & 99 generations: 99 seconds - -Because the fitness function does not need much CPU time, the normal -processing takes the least time. Running processes for this simple -problem takes 99 compared to only 5 seconds for threads because managing -processes is much heavier than managing threads. Thus, most of the CPU -time is for swapping the processes instead of executing the code. - -In the second example, the loop makes 99999999 iterations and only 5 -generations are used. With no parallelization, it takes 22 seconds. - -.. code:: python - - import pygad - import time - - def fitness_func(ga_instance, solution, solution_idx): - for _ in range(99999999): - pass - return 0 - - ga_instance = pygad.GA(num_generations=5, - num_parents_mating=3, - sol_per_pop=5, - num_genes=10, - fitness_func=fitness_func, - suppress_warnings=True, - parallel_processing=None) - - if __name__ == '__main__': - t1 = time.time() - ga_instance.run() - t2 = time.time() - print("Time is", t2-t1) - -It takes 15 seconds when 10 processes are used. - -.. code:: python - - ... - ga_instance = pygad.GA(..., - parallel_processing=["process", 10]) - ... - -This is compared to 20 seconds when 10 threads are used. - -.. code:: python - - ... - ga_instance = pygad.GA(..., - parallel_processing=["thread", 10]) - ... - -Based on the second example, using parallel processing with 10 processes -takes the least time because there is much CPU work done. Generally, -processes are preferred over threads when most of the work in on the -CPU. Threads are preferred over processes in some situations like doing -input/output operations. - -*Before releasing* `PyGAD -2.17.0 `__\ *,* -`László -Fazekas `__ -*wrote an article to parallelize the fitness function with PyGAD. Check -it:* `How Genetic Algorithms Can Compete with Gradient Descent and -Backprop `__. - -Print Lifecycle Summary -======================= - -In `PyGAD -2.19.0 `__, -a new method called ``summary()`` is supported. It prints a Keras-like -summary of the PyGAD lifecycle showing the steps, callback functions, -parameters, etc. - -This method accepts the following parameters: - -- ``line_length=70``: An integer representing the length of the single - line in characters. - -- ``fill_character=" "``: A character to fill the lines. - -- ``line_character="-"``: A character for creating a line separator. - -- ``line_character2="="``: A secondary character to create a line - separator. - -- ``columns_equal_len=False``: The table rows are split into - equal-sized columns or split subjective to the width needed. - -- ``print_step_parameters=True``: Whether to print extra parameters - about each step inside the step. If ``print_step_parameters=False`` - and ``print_parameters_summary=True``, then the parameters of each - step are printed at the end of the table. - -- ``print_parameters_summary=True``: Whether to print parameters - summary at the end of the table. 
-
-This is a quick example that creates an instance of the ``pygad.GA``
-class.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   function_inputs = [4,-2,3.5,5,-11,-4.7]
-   desired_output = 44
-
-   def genetic_fitness(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution*function_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   def on_gen(ga):
-       pass
-
-   def on_crossover_callback(a, b):
-       pass
-
-   ga_instance = pygad.GA(num_generations=100,
-                          num_parents_mating=10,
-                          sol_per_pop=20,
-                          num_genes=len(function_inputs),
-                          on_crossover=on_crossover_callback,
-                          on_generation=on_gen,
-                          parallel_processing=2,
-                          stop_criteria="reach_10",
-                          fitness_batch_size=4,
-                          crossover_probability=0.4,
-                          fitness_func=genetic_fitness)
-
-Then call the ``summary()`` method to print the summary with the
-default parameters. Note that entries for the crossover and generation
-steps are created because their callbacks are implemented through the
-``on_crossover_callback()`` and ``on_gen()`` functions, respectively.
-
-.. code:: python
-
-   ga_instance.summary()
-
-.. code:: bash
-
-   ----------------------------------------------------------------------
-                              PyGAD Lifecycle
-   ======================================================================
-   Step                   Handler                          Output Shape
-   ======================================================================
-   Fitness Function       genetic_fitness()                (1)
-   Fitness batch size: 4
-   ----------------------------------------------------------------------
-   Parent Selection       steady_state_selection()         (10, 6)
-   Number of Parents: 10
-   ----------------------------------------------------------------------
-   Crossover              single_point_crossover()         (10, 6)
-   Crossover probability: 0.4
-   ----------------------------------------------------------------------
-   On Crossover           on_crossover_callback()          None
-   ----------------------------------------------------------------------
-   Mutation               random_mutation()                (10, 6)
-   Mutation Genes: 1
-   Random Mutation Range: (-1.0, 1.0)
-   Mutation by Replacement: False
-   Allow Duplicated Genes: True
-   ----------------------------------------------------------------------
-   On Generation          on_gen()                         None
-   Stop Criteria: [['reach', 10.0]]
-   ----------------------------------------------------------------------
-   ======================================================================
-   Population Size: (20, 6)
-   Number of Generations: 100
-   Initial Population Range: (-4, 4)
-   Keep Elitism: 1
-   Gene DType: [<class 'float'>, None]
-   Parallel Processing: ['thread', 2]
-   Save Best Solutions: False
-   Save Solutions: False
-   ======================================================================
-
-We can set the ``print_step_parameters`` and
-``print_parameters_summary`` parameters to ``False`` to not print the
-parameters.
-
-.. code:: python
-
-   ga_instance.summary(print_step_parameters=False,
-                       print_parameters_summary=False)
-.. code:: bash
-
-   ----------------------------------------------------------------------
-                              PyGAD Lifecycle
-   ======================================================================
-   Step                   Handler                          Output Shape
-   ======================================================================
-   Fitness Function       genetic_fitness()                (1)
-   ----------------------------------------------------------------------
-   Parent Selection       steady_state_selection()         (10, 6)
-   ----------------------------------------------------------------------
-   Crossover              single_point_crossover()         (10, 6)
-   ----------------------------------------------------------------------
-   On Crossover           on_crossover_callback()          None
-   ----------------------------------------------------------------------
-   Mutation               random_mutation()                (10, 6)
-   ----------------------------------------------------------------------
-   On Generation          on_gen()                         None
-   ----------------------------------------------------------------------
-   ======================================================================
-
-Logging Outputs
-===============
-
-In `PyGAD
-3.0.0 `__,
-the ``print()`` statement is no longer used and the outputs are
-printed using the `logging `__
-module. A new parameter called ``logger`` is supported to accept a
-user-defined logger.
-
-.. code:: python
-
-   import logging
-
-   logger = ...
-
-   ga_instance = pygad.GA(...,
-                          logger=logger,
-                          ...)
-
-The default value for this parameter is ``None``. If there is no
-logger passed (i.e. ``logger=None``), then a default logger is created
-to log the messages to the console exactly like how the ``print()``
-statement works.
-
-Some advantages of using the
-`logging `__ module
-instead of the ``print()`` statement are:
-
-1. The user has more control over the printed messages especially if
-   there is a project that uses multiple modules where each module
-   prints its messages. A logger can organize the outputs.
-
-2. Using the proper ``Handler``, the user can log the output messages
-   to files instead of being restricted to printing them to the
-   console. So, it is much easier to record the outputs.
-
-3. The format of the printed messages can be changed by customizing
-   the ``Formatter`` assigned to the Logger.
-
-This section gives some quick examples to use the ``logging`` module
-and then gives an example to use the logger with PyGAD.
-
-Logging to the Console
-----------------------
-
-This is an example to create a logger to log the messages to the
-console.
-
-.. code:: python
-
-   import logging
-
-   # Create a logger
-   logger = logging.getLogger(__name__)
-
-   # Set the logger level to debug so that all the messages are printed.
-   logger.setLevel(logging.DEBUG)
-
-   # Create a stream handler to log the messages to the console.
-   stream_handler = logging.StreamHandler()
-
-   # Set the handler level to debug.
-   stream_handler.setLevel(logging.DEBUG)
-
-   # Create a formatter
-   formatter = logging.Formatter('%(message)s')
-
-   # Add the formatter to handler.
-   stream_handler.setFormatter(formatter)
-
-   # Add the stream handler to the logger
-   logger.addHandler(stream_handler)
-
-Now, we can log messages to the console with the format specified in
-the ``Formatter``.
-
-.. code:: python
-
-   logger.debug('Debug message.')
-   logger.info('Info message.')
-   logger.warning('Warn message.')
-   logger.error('Error message.')
-   logger.critical('Critical message.')
-
-The outputs are identical to those returned using the ``print()``
-statement.
-
-.. code::
-
-   Debug message.
-   Info message.
-   Warn message.
-   Error message.
-   Critical message.
- Critical message. - -By changing the format of the output messages, we can have more -information about each message. - -.. code:: python - - formatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s', datefmt='%Y-%m-%d %H:%M:%S') - -This is a sample output. - -.. code:: python - - 2023-04-03 18:46:27 DEBUG: Debug message. - 2023-04-03 18:46:27 INFO: Info message. - 2023-04-03 18:46:27 WARNING: Warn message. - 2023-04-03 18:46:27 ERROR: Error message. - 2023-04-03 18:46:27 CRITICAL: Critical message. - -Note that you may need to clear the handlers after finishing the -execution. This is to make sure no cached handlers are used in the next -run. If the cached handlers are not cleared, then the single output -message may be repeated. - -.. code:: python - - logger.handlers.clear() - -Logging to a File ------------------ - -This is another example to log the messages to a file named -``logfile.txt``. The formatter prints the following about each message: - -1. The date and time at which the message is logged. - -2. The log level. - -3. The message. - -4. The path of the file. - -5. The lone number of the log message. - -.. code:: python - - import logging - - level = logging.DEBUG - name = 'logfile.txt' - - logger = logging.getLogger(name) - logger.setLevel(level) - - file_handler = logging.FileHandler(name, 'a+', 'utf-8') - file_handler.setLevel(logging.DEBUG) - file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s - %(pathname)s:%(lineno)d', datefmt='%Y-%m-%d %H:%M:%S') - file_handler.setFormatter(file_format) - logger.addHandler(file_handler) - -This is how the outputs look like. - -.. code:: python - - 2023-04-03 18:54:03 DEBUG: Debug message. - c:\users\agad069\desktop\logger\example2.py:46 - 2023-04-03 18:54:03 INFO: Info message. - c:\users\agad069\desktop\logger\example2.py:47 - 2023-04-03 18:54:03 WARNING: Warn message. - c:\users\agad069\desktop\logger\example2.py:48 - 2023-04-03 18:54:03 ERROR: Error message. - c:\users\agad069\desktop\logger\example2.py:49 - 2023-04-03 18:54:03 CRITICAL: Critical message. - c:\users\agad069\desktop\logger\example2.py:50 - -Consider clearing the handlers if necessary. - -.. code:: python - - logger.handlers.clear() - -Log to Both the Console and a File ----------------------------------- - -This is an example to create a single Logger associated with 2 handlers: - -1. A file handler. - -2. A stream handler. - -.. code:: python - - import logging - - level = logging.DEBUG - name = 'logfile.txt' - - logger = logging.getLogger(name) - logger.setLevel(level) - - file_handler = logging.FileHandler(name,'a+','utf-8') - file_handler.setLevel(logging.DEBUG) - file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s - %(pathname)s:%(lineno)d', datefmt='%Y-%m-%d %H:%M:%S') - file_handler.setFormatter(file_format) - logger.addHandler(file_handler) - - console_handler = logging.StreamHandler() - console_handler.setLevel(logging.INFO) - console_format = logging.Formatter('%(message)s') - console_handler.setFormatter(console_format) - logger.addHandler(console_handler) - -When a log message is executed, then it is both printed to the console -and saved in the ``logfile.txt``. - -Consider clearing the handlers if necessary. - -.. code:: python - - logger.handlers.clear() - -PyGAD Example -------------- - -To use the logger in PyGAD, just create your custom logger and pass it -to the ``logger`` parameter. - -.. 
-.. code:: python
-
-   import logging
-   import pygad
-   import numpy
-
-   level = logging.DEBUG
-   name = 'logfile.txt'
-
-   logger = logging.getLogger(name)
-   logger.setLevel(level)
-
-   file_handler = logging.FileHandler(name,'a+','utf-8')
-   file_handler.setLevel(logging.DEBUG)
-   file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
-   file_handler.setFormatter(file_format)
-   logger.addHandler(file_handler)
-
-   console_handler = logging.StreamHandler()
-   console_handler.setLevel(logging.INFO)
-   console_format = logging.Formatter('%(message)s')
-   console_handler.setFormatter(console_format)
-   logger.addHandler(console_handler)
-
-   equation_inputs = [4, -2, 8]
-   desired_output = 2671.1234
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution * equation_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   def on_generation(ga_instance):
-       ga_instance.logger.info("Generation = {generation}".format(generation=ga_instance.generations_completed))
-       ga_instance.logger.info("Fitness    = {fitness}".format(fitness=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]))
-
-   ga_instance = pygad.GA(num_generations=10,
-                          sol_per_pop=40,
-                          num_parents_mating=2,
-                          keep_parents=2,
-                          num_genes=len(equation_inputs),
-                          fitness_func=fitness_func,
-                          on_generation=on_generation,
-                          logger=logger)
-   ga_instance.run()
-
-   logger.handlers.clear()
-
-By executing this code, the logged messages are printed to the console
-and also saved in the text file.
-
-.. code:: python
-
-   2023-04-03 19:04:27 INFO: Generation = 1
-   2023-04-03 19:04:27 INFO: Fitness    = 0.00038086960368076276
-   2023-04-03 19:04:27 INFO: Generation = 2
-   2023-04-03 19:04:27 INFO: Fitness    = 0.00038214871408010853
-   2023-04-03 19:04:27 INFO: Generation = 3
-   2023-04-03 19:04:27 INFO: Fitness    = 0.0003832795907974678
-   2023-04-03 19:04:27 INFO: Generation = 4
-   2023-04-03 19:04:27 INFO: Fitness    = 0.00038398612055017196
-   2023-04-03 19:04:27 INFO: Generation = 5
-   2023-04-03 19:04:27 INFO: Fitness    = 0.00038442348890867516
-   2023-04-03 19:04:27 INFO: Generation = 6
-   2023-04-03 19:04:27 INFO: Fitness    = 0.0003854406039137763
-   2023-04-03 19:04:27 INFO: Generation = 7
-   2023-04-03 19:04:27 INFO: Fitness    = 0.00038646083174063284
-   2023-04-03 19:04:27 INFO: Generation = 8
-   2023-04-03 19:04:27 INFO: Fitness    = 0.0003875169193024936
-   2023-04-03 19:04:27 INFO: Generation = 9
-   2023-04-03 19:04:27 INFO: Fitness    = 0.0003888816727311021
-   2023-04-03 19:04:27 INFO: Generation = 10
-   2023-04-03 19:04:27 INFO: Fitness    = 0.000389832593101348
-
-Batch Fitness Calculation
-=========================
-
-In `PyGAD
-2.19.0 `__,
-a new optional parameter called ``fitness_batch_size`` is supported to
-calculate the fitness function in batches. Thanks to `Linan
-Qiu `__ for opening the `GitHub issue
-#136 `__.
-
-Its values can be:
-
-- ``1`` or ``None``: If the ``fitness_batch_size`` parameter is
-  assigned the value ``1`` or ``None`` (default), then the normal flow
-  is used where the fitness function is called for each individual
-  solution. That is if there are 15 solutions, then the fitness
-  function is called 15 times.
-
-- ``1 < fitness_batch_size <= sol_per_pop``: If the
-  ``fitness_batch_size`` parameter is assigned a value satisfying this
-  condition ``1 < fitness_batch_size <= sol_per_pop``, then the
-  solutions are grouped into batches of size ``fitness_batch_size``
-  and the fitness function is called once for each batch. In this
-  case, the fitness function must return a
-  list/tuple/numpy.ndarray with a length equal to the number of
-  solutions passed.
-
-.. _example-without-fitnessbatchsize-parameter:
-
-Example without ``fitness_batch_size`` Parameter
-------------------------------------------------
-
-This is an example where the ``fitness_batch_size`` parameter is given
-the value ``None`` (which is the default value). This is equivalent to
-using the value ``1``. In this case, the fitness function will be
-called for each solution. This means the fitness function
-``fitness_func`` will receive only a single solution. This is an
-example of the passed arguments to the fitness function:
-
-.. code::
-
-   solution: [ 2.52860734, -0.94178795, 2.97545704, 0.84131987, -3.78447118, 2.41008358]
-   solution_idx: 3
-
-The fitness function also must return a single numeric value as the
-fitness for the passed solution.
-
-As we have a population of ``20`` solutions, the fitness function is
-called 20 times per generation. For 5 generations, the fitness
-function is called ``20*5 = 100`` times. In PyGAD, the fitness
-function is called after the last generation too and this adds an
-additional 20 times. So, the total number of calls to the fitness
-function is ``20*5 + 20 = 120``.
-
-Note that the ``keep_elitism`` and ``keep_parents`` parameters are set
-to ``0`` to make sure no fitness values are reused and to force
-calling the fitness function for each individual solution.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   function_inputs = [4,-2,3.5,5,-11,-4.7]
-   desired_output = 44
-
-   number_of_calls = 0
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       global number_of_calls
-       number_of_calls = number_of_calls + 1
-       output = numpy.sum(solution*function_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   ga_instance = pygad.GA(num_generations=5,
-                          num_parents_mating=10,
-                          sol_per_pop=20,
-                          fitness_func=fitness_func,
-                          fitness_batch_size=None,
-                          # fitness_batch_size=1,
-                          num_genes=len(function_inputs),
-                          keep_elitism=0,
-                          keep_parents=0)
-
-   ga_instance.run()
-   print(number_of_calls)
-
-.. code::
-
-   120
-
-.. _example-with-fitnessbatchsize-parameter:
-
-Example with ``fitness_batch_size`` Parameter
----------------------------------------------
-
-This is an example where the ``fitness_batch_size`` parameter is used
-and assigned the value ``4``. This means the solutions will be grouped
-into batches of ``4`` solutions. The fitness function will be called
-once for each batch (i.e. called once for each 4 solutions).
-
-This is an example of the arguments passed to it:
-
-.. code:: python
-
-   solutions:
-       [[ 3.1129432  -0.69123589  1.93792414  2.23772968 -1.54616001 -0.53930799]
-        [ 3.38508121  0.19890812  1.93792414  2.23095014 -3.08955597  3.10194128]
-        [ 2.37079504 -0.88819803  2.97545704  1.41742256 -3.95594055  2.45028256]
-        [ 2.52860734 -0.94178795  2.97545704  0.84131987 -3.78447118  2.41008358]]
-   solutions_indices:
-       [16, 17, 18, 19]
-
-As we have 20 solutions, there are ``20/4 = 5`` batches. As a result,
-the fitness function is called only 5 times per generation instead of
-20. For each call to the fitness function, it receives a batch of 4
-solutions.
-
-As we have 5 generations, the function will be called ``5*5 = 25``
-times. Given the call to the fitness function after the last
-generation, the total number of calls is ``5*5 + 5 = 30``.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   function_inputs = [4,-2,3.5,5,-11,-4.7]
-   desired_output = 44
-
-   number_of_calls = 0
-
-   def fitness_func_batch(ga_instance, solutions, solutions_indices):
-       global number_of_calls
-       number_of_calls = number_of_calls + 1
-       batch_fitness = []
-       for solution in solutions:
-           output = numpy.sum(solution*function_inputs)
-           fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-           batch_fitness.append(fitness)
-       return batch_fitness
-
-   ga_instance = pygad.GA(num_generations=5,
-                          num_parents_mating=10,
-                          sol_per_pop=20,
-                          fitness_func=fitness_func_batch,
-                          fitness_batch_size=4,
-                          num_genes=len(function_inputs),
-                          keep_elitism=0,
-                          keep_parents=0)
-
-   ga_instance.run()
-   print(number_of_calls)
-
-.. code::
-
-   30
-
-When batch fitness calculation is used, we saved ``120 - 30 = 90``
-calls to the fitness function.
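-Because the whole batch arrives as a single NumPy array, the fitness
-of all solutions in the batch can also be computed without a loop.
-This is a small sketch of a vectorized variant of the same fitness
-function (same inputs and outputs as ``fitness_func_batch`` above;
-it assumes ``solutions`` is a 2D array with one solution per row).
-
-.. code:: python
-
-   def fitness_func_batch_vectorized(ga_instance, solutions, solutions_indices):
-       # One output value per row (solution) in the batch.
-       outputs = numpy.sum(solutions * function_inputs, axis=1)
-       return 1.0 / (numpy.abs(outputs - desired_output) + 0.000001)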
-Use Functions and Methods to Build Fitness and Callbacks
-========================================================
-
-In PyGAD 2.19.0, it is possible to pass user-defined functions or
-methods to the following parameters:
-
-1. ``fitness_func``
-
-2. ``on_start``
-
-3. ``on_fitness``
-
-4. ``on_parents``
-
-5. ``on_crossover``
-
-6. ``on_mutation``
-
-7. ``on_generation``
-
-8. ``on_stop``
-
-This section gives 2 examples of assigning these parameters
-user-defined:
-
-1. Functions.
-
-2. Methods.
-
-Assign Functions
-----------------
-
-This is a dummy example where the fitness function returns a random
-value. Note that the instance of the ``pygad.GA`` class is passed as
-the last parameter of all functions.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       return numpy.random.rand()
-
-   def on_start(ga_instance):
-       print("on_start")
-
-   def on_fitness(ga_instance, last_gen_fitness):
-       print("on_fitness")
-
-   def on_parents(ga_instance, last_gen_parents):
-       print("on_parents")
-
-   def on_crossover(ga_instance, last_gen_offspring):
-       print("on_crossover")
-
-   def on_mutation(ga_instance, last_gen_offspring):
-       print("on_mutation")
-
-   def on_generation(ga_instance):
-       print("on_generation\n")
-
-   def on_stop(ga_instance, last_gen_fitness):
-       print("on_stop")
-
-   ga_instance = pygad.GA(num_generations=5,
-                          num_parents_mating=4,
-                          sol_per_pop=10,
-                          num_genes=2,
-                          on_start=on_start,
-                          on_fitness=on_fitness,
-                          on_parents=on_parents,
-                          on_crossover=on_crossover,
-                          on_mutation=on_mutation,
-                          on_generation=on_generation,
-                          on_stop=on_stop,
-                          fitness_func=fitness_func)
-
-   ga_instance.run()
-
-Assign Methods
---------------
-
-The next example has all the methods defined inside the class
-``Test``. All methods accept ``self`` as the first parameter,
-representing the instance of the class ``Test``, and the instance of
-the ``pygad.GA`` class as the last parameter.
-.. code:: python
-
-   import pygad
-   import numpy
-
-   class Test:
-       def fitness_func(self, ga_instance, solution, solution_idx):
-           return numpy.random.rand()
-
-       def on_start(self, ga_instance):
-           print("on_start")
-
-       def on_fitness(self, ga_instance, last_gen_fitness):
-           print("on_fitness")
-
-       def on_parents(self, ga_instance, last_gen_parents):
-           print("on_parents")
-
-       def on_crossover(self, ga_instance, last_gen_offspring):
-           print("on_crossover")
-
-       def on_mutation(self, ga_instance, last_gen_offspring):
-           print("on_mutation")
-
-       def on_generation(self, ga_instance):
-           print("on_generation\n")
-
-       def on_stop(self, ga_instance, last_gen_fitness):
-           print("on_stop")
-
-   ga_instance = pygad.GA(num_generations=5,
-                          num_parents_mating=4,
-                          sol_per_pop=10,
-                          num_genes=2,
-                          on_start=Test().on_start,
-                          on_fitness=Test().on_fitness,
-                          on_parents=Test().on_parents,
-                          on_crossover=Test().on_crossover,
-                          on_mutation=Test().on_mutation,
-                          on_generation=Test().on_generation,
-                          on_stop=Test().on_stop,
-                          fitness_func=Test().fitness_func)
-
-   ga_instance.run()
-
-.. _examples-2:
-
-Examples
-========
-
-This section gives the complete code of some examples that use
-``pygad``. Each subsection builds a different example.
-
-Linear Model Optimization
--------------------------
-
-This example is discussed in the `Steps to Use
-PyGAD `__
-section which optimizes a linear model. Its complete code is listed
-below.
-
-.. code:: python
-
-   import pygad
-   import numpy
-
-   """
-   Given the following function:
-       y = f(w1:w6) = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
-       where (x1,x2,x3,x4,x5,x6)=(4,-2,3.5,5,-11,-4.7) and y=44
-   What are the best values for the 6 weights (w1 to w6)? We are going to use the genetic algorithm to optimize this function.
-   """
-
-   function_inputs = [4,-2,3.5,5,-11,-4.7] # Function inputs.
-   desired_output = 44 # Function output.
-
-   def fitness_func(ga_instance, solution, solution_idx):
-       output = numpy.sum(solution*function_inputs)
-       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-       return fitness
-
-   num_generations = 100 # Number of generations.
-   num_parents_mating = 10 # Number of solutions to be selected as parents in the mating pool.
-
-   sol_per_pop = 20 # Number of solutions in the population.
-   num_genes = len(function_inputs)
-
-   last_fitness = 0
-   def on_generation(ga_instance):
-       global last_fitness
-       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
-       print("Fitness    = {fitness}".format(fitness=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]))
-       print("Change     = {change}".format(change=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1] - last_fitness))
-       last_fitness = ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]
-
-   ga_instance = pygad.GA(num_generations=num_generations,
-                          num_parents_mating=num_parents_mating,
-                          sol_per_pop=sol_per_pop,
-                          num_genes=num_genes,
-                          fitness_func=fitness_func,
-                          on_generation=on_generation)
-
-   # Running the GA to optimize the parameters of the function.
-   ga_instance.run()
-
-   ga_instance.plot_fitness()
-
-   # Returning the details of the best solution.
- solution, solution_fitness, solution_idx = ga_instance.best_solution(ga_instance.last_generation_fitness) - print("Parameters of the best solution : {solution}".format(solution=solution)) - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - prediction = numpy.sum(numpy.array(function_inputs)*solution) - print("Predicted output based on the best solution : {prediction}".format(prediction=prediction)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - # Saving the GA instance. - filename = 'genetic' # The filename to which the instance is saved. The name is without extension. - ga_instance.save(filename=filename) - - # Loading the saved GA instance. - loaded_ga_instance = pygad.load(filename=filename) - loaded_ga_instance.plot_fitness() - -Reproducing Images ------------------- - -This project reproduces a single image using PyGAD by evolving pixel -values. This project works with both color and gray images. Check this -project at `GitHub `__: -https://github.com/ahmedfgad/GARI. - -For more information about this project, read this tutorial titled -`Reproducing Images using a Genetic Algorithm with -Python `__ -available at these links: - -- `Heartbeat `__: - https://heartbeat.fritz.ai/reproducing-images-using-a-genetic-algorithm-with-python-91fc701ff84 - -- `LinkedIn `__: - https://www.linkedin.com/pulse/reproducing-images-using-genetic-algorithm-python-ahmed-gad - -Project Steps -~~~~~~~~~~~~~ - -The steps to follow in order to reproduce an image are as follows: - -- Read an image - -- Prepare the fitness function - -- Create an instance of the pygad.GA class with the appropriate - parameters - -- Run PyGAD - -- Plot results - -- Calculate some statistics - -The next sections discusses the code of each of these steps. - -Read an Image -~~~~~~~~~~~~~ - -There is an image named ``fruit.jpg`` in the `GARI -project `__ which is read according -to the next code. - -.. code:: python - - import imageio - import numpy - - target_im = imageio.imread('fruit.jpg') - target_im = numpy.asarray(target_im/255, dtype=float) - -Here is the read image. - -.. figure:: https://user-images.githubusercontent.com/16560492/36948808-f0ac882e-1fe8-11e8-8d07-1307e3477fd0.jpg - :alt: - -Based on the chromosome representation used in the example, the pixel -values can be either in the 0-255, 0-1, or any other ranges. - -Note that the range of pixel values affect other parameters like the -range from which the random values are selected during mutation and also -the range of the values used in the initial population. So, be -consistent. - -Prepare the Fitness Function -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The next code creates a function that will be used as a fitness function -for calculating the fitness value for each solution in the population. -This function must be a maximization function that accepts 3 parameters -representing the instance of the ``pygad.GA`` class, a solution, and its -index. It returns a value representing the fitness value. - -.. 
code:: python - - import gari - - target_chromosome = gari.img2chromosome(target_im) - - def fitness_fun(ga_instance, solution, solution_idx): - fitness = numpy.sum(numpy.abs(target_chromosome-solution)) - - # Negating the fitness value to make it increasing rather than decreasing. - fitness = numpy.sum(target_chromosome) - fitness - return fitness - -The fitness value is calculated using the sum of absolute difference -between genes values in the original and reproduced chromosomes. The -``gari.img2chromosome()`` function is called before the fitness function -to represent the image as a vector because the genetic algorithm can -work with 1D chromosomes. - -The implementation of the ``gari`` module is available at the `GARI -GitHub -project `__ and -its code is listed below. - -.. code:: python - - import numpy - import functools - import operator - - def img2chromosome(img_arr): - return numpy.reshape(a=img_arr, newshape=(functools.reduce(operator.mul, img_arr.shape))) - - def chromosome2img(vector, shape): - if len(vector) != functools.reduce(operator.mul, shape): - raise ValueError("A vector of length {vector_length} into an array of shape {shape}.".format(vector_length=len(vector), shape=shape)) - - return numpy.reshape(a=vector, newshape=shape) - -.. _create-an-instance-of-the-pygadga-class-2: - -Create an Instance of the ``pygad.GA`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -It is very important to use random mutation and set the -``mutation_by_replacement`` to ``True``. Based on the range of pixel -values, the values assigned to the ``init_range_low``, -``init_range_high``, ``random_mutation_min_val``, and -``random_mutation_max_val`` parameters should be changed. - -If the image pixel values range from 0 to 255, then set -``init_range_low`` and ``random_mutation_min_val`` to 0 as they are but -change ``init_range_high`` and ``random_mutation_max_val`` to 255. - -Feel free to change the other parameters or add other parameters. Please -check the `PyGAD's documentation `__ for -the full list of parameters. - -.. code:: python - - import pygad - - ga_instance = pygad.GA(num_generations=20000, - num_parents_mating=10, - fitness_func=fitness_fun, - sol_per_pop=20, - num_genes=target_im.size, - init_range_low=0.0, - init_range_high=1.0, - mutation_percent_genes=0.01, - mutation_type="random", - mutation_by_replacement=True, - random_mutation_min_val=0.0, - random_mutation_max_val=1.0) - -Run PyGAD -~~~~~~~~~ - -Simply, call the ``run()`` method to run PyGAD. - -.. code:: python - - ga_instance.run() - -Plot Results -~~~~~~~~~~~~ - -After the ``run()`` method completes, the fitness values of all -generations can be viewed in a plot using the ``plot_fitness()`` method. - -.. code:: python - - ga_instance.plot_fitness() - -Here is the plot after 20,000 generations. - -.. figure:: https://user-images.githubusercontent.com/16560492/82232124-77762c00-992e-11ea-9fc6-14a1cd7a04ff.png - :alt: - -Calculate Some Statistics -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Here is some information about the best solution. - -.. code:: python - - # Returning the details of the best solution. 
- solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - if ga_instance.best_solution_generation != -1: - print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) - - result = gari.chromosome2img(solution, target_im.shape) - matplotlib.pyplot.imshow(result) - matplotlib.pyplot.title("PyGAD & GARI for Reproducing Images") - matplotlib.pyplot.show() - -Evolution by Generation -~~~~~~~~~~~~~~~~~~~~~~~ - -The solution reached after the 20,000 generations is shown below. - -.. figure:: https://user-images.githubusercontent.com/16560492/82232405-e0f63a80-992e-11ea-984f-b6ed76465bd1.png - :alt: - -After more generations, the result can be enhanced like what shown -below. - -.. figure:: https://user-images.githubusercontent.com/16560492/82232345-cf149780-992e-11ea-8390-bf1a57a19de7.png - :alt: - -The results can also be enhanced by changing the parameters passed to -the constructor of the ``pygad.GA`` class. - -Here is how the image is evolved from generation 0 to generation -20,000s. - -Generation 0 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948589-b47276f0-1fe5-11e8-8efe-0cd1a225ea3a.png - :alt: - -Generation 1,000 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948823-16f490ee-1fe9-11e8-97db-3e8905ad5440.png - :alt: - -Generation 2,500 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948832-3f314b60-1fe9-11e8-8f4a-4d9a53b99f3d.png - :alt: - -Generation 4,500 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948837-53d1849a-1fe9-11e8-9b36-e9e9291e347b.png - :alt: - -Generation 7,000 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948852-66f1b176-1fe9-11e8-9f9b-460804e94004.png - :alt: - -Generation 8,000 - -.. figure:: https://user-images.githubusercontent.com/16560492/36948865-7fbb5158-1fe9-11e8-8c04-8ac3c1f7b1b1.png - :alt: - -Generation 20,000 - -.. figure:: https://user-images.githubusercontent.com/16560492/82232405-e0f63a80-992e-11ea-984f-b6ed76465bd1.png - :alt: - -Clustering ----------- - -For a 2-cluster problem, the code is available -`here `__. -For a 3-cluster problem, the code is -`here `__. -The 2 examples are using artificial samples. - -Soon a tutorial will be published at -`Paperspace `__ to explain how -clustering works using the genetic algorithm with examples in PyGAD. - -CoinTex Game Playing using PyGAD --------------------------------- - -The code is available the `CoinTex GitHub -project `__. -CoinTex is an Android game written in Python using the Kivy framework. -Find CoinTex at `Google -Play `__: -https://play.google.com/store/apps/details?id=coin.tex.cointexreactfast - -Check this `Paperspace -tutorial `__ -for how the genetic algorithm plays CoinTex: -https://blog.paperspace.com/building-agent-for-cointex-using-genetic-algorithm. -Check also this `YouTube video `__ showing -the genetic algorithm while playing CoinTex. +``pygad`` Module +================ + +This section of the PyGAD's library documentation discusses the +``pygad`` module. + +Using the ``pygad`` module, instances of the genetic algorithm can be +created, run, saved, and loaded. + +.. 
_pygadga-class: + +``pygad.GA`` Class +================== + +The first module available in PyGAD is named ``pygad`` and contains a +class named ``GA`` for building the genetic algorithm. The constructor, +methods, function, and attributes within the class are discussed in this +section. + +.. _init: + +``__init__()`` +-------------- + +For creating an instance of the ``pygad.GA`` class, the constructor +accepts several parameters that allow the user to customize the genetic +algorithm to different types of applications. + +The ``pygad.GA`` class constructor supports the following parameters: + +- ``num_generations``: Number of generations. + +- ``num_parents_mating``: Number of solutions to be selected as + parents. + +- ``fitness_func``: Accepts a function/method and returns the fitness + value of the solution. If a function is passed, then it must accept 3 + parameters (1. the instance of the ``pygad.GA`` class, 2. a single + solution, and 3. its index in the population). If method, then it + accepts a fourth parameter representing the method's class instance. + Check the `Preparing the fitness_func + Parameter `__ + section for information about creating such a function. + +- ``fitness_batch_size=None``: A new optional parameter called + ``fitness_batch_size`` is supported to calculate the fitness function + in batches. If it is assigned the value ``1`` or ``None`` (default), + then the normal flow is used where the fitness function is called for + each individual solution. If the ``fitness_batch_size`` parameter is + assigned a value satisfying this condition + ``1 < fitness_batch_size <= sol_per_pop``, then the solutions are + grouped into batches of size ``fitness_batch_size`` and the fitness + function is called once for each batch. Check the `Batch Fitness + Calculation `__ + section for more details and examples. Added in from `PyGAD + 2.19.0 `__. + +- ``initial_population``: A user-defined initial population. It is + useful when the user wants to start the generations with a custom + initial population. It defaults to ``None`` which means no initial + population is specified by the user. In this case, + `PyGAD `__ creates an initial + population using the ``sol_per_pop`` and ``num_genes`` parameters. An + exception is raised if the ``initial_population`` is ``None`` while + any of the 2 parameters (``sol_per_pop`` or ``num_genes``) is also + ``None``. Introduced in `PyGAD + 2.0.0 `__ + and higher. + +- ``sol_per_pop``: Number of solutions (i.e. chromosomes) within the + population. This parameter has no action if ``initial_population`` + parameter exists. + +- ``num_genes``: Number of genes in the solution/chromosome. This + parameter is not needed if the user feeds the initial population to + the ``initial_population`` parameter. + +- ``gene_type=float``: Controls the gene type. It can be assigned to a + single data type that is applied to all genes or can specify the data + type of each individual gene. It defaults to ``float`` which means + all genes are of ``float`` data type. Starting from `PyGAD + 2.9.0 `__, + the ``gene_type`` parameter can be assigned to a numeric value of any + of these types: ``int``, ``float``, and + ``numpy.int/uint/float(8-64)``. Starting from `PyGAD + 2.14.0 `__, + it can be assigned to a ``list``, ``tuple``, or a ``numpy.ndarray`` + which hold a data type for each gene (e.g. + ``gene_type=[int, float, numpy.int8]``). This helps to control the + data type of each individual gene. 
In `PyGAD
+  2.15.0 `__,
+  a precision for the ``float`` data types can be specified (e.g.
+  ``gene_type=[float, 2]``).
+
+- ``init_range_low=-4``: The lower value of the random range from which
+  the gene values in the initial population are selected.
+  ``init_range_low`` defaults to ``-4``. Available in `PyGAD
+  1.0.20 `__
+  and higher. This parameter has no action if the
+  ``initial_population`` parameter exists.
+
+- ``init_range_high=4``: The upper value of the random range from which
+  the gene values in the initial population are selected.
+  ``init_range_high`` defaults to ``+4``. Available in `PyGAD
+  1.0.20 `__
+  and higher. This parameter has no action if the
+  ``initial_population`` parameter exists.
+
+- ``parent_selection_type="sss"``: The parent selection type. Supported
+  types are ``sss`` (for steady-state selection), ``rws`` (for roulette
+  wheel selection), ``sus`` (for stochastic universal selection),
+  ``rank`` (for rank selection), ``random`` (for random selection), and
+  ``tournament`` (for tournament selection). A custom parent selection
+  function can be passed starting from `PyGAD
+  2.16.0 `__.
+  Check the `User-Defined Crossover, Mutation, and Parent Selection
+  Operators `__
+  section for more details about building a user-defined parent
+  selection function.
+
+- ``keep_parents=-1``: Number of parents to keep in the current
+  population. ``-1`` (default) means to keep all parents in the next
+  population. ``0`` means keep no parents in the next population. A
+  value ``greater than 0`` means keeping the specified number of
+  parents in the next population. Note that the value assigned to
+  ``keep_parents`` cannot be ``< -1`` or greater than the number of
+  solutions within the population ``sol_per_pop``. Starting from `PyGAD
+  2.18.0 `__,
+  this parameter has an effect only when the ``keep_elitism``
+  parameter is ``0``. Starting from `PyGAD
+  2.20.0 `__,
+  the parents' fitness from the last generation will not be re-used if
+  ``keep_parents=0``.
+
+- ``keep_elitism=1``: Added in `PyGAD
+  2.18.0 `__.
+  It can take the value ``0`` or a positive integer that satisfies
+  (``0 <= keep_elitism <= sol_per_pop``). It defaults to ``1`` which
+  means only the best solution in the current generation is kept in the
+  next generation. If assigned ``0``, this means it has no effect. If
+  assigned a positive integer ``K``, then the best ``K`` solutions are
+  kept in the next generation. It cannot be assigned a value greater
+  than the value assigned to the ``sol_per_pop`` parameter. If this
+  parameter has a value different than ``0``, then the ``keep_parents``
+  parameter will have no effect.
+
+- ``K_tournament=3``: In case that the parent selection type is
+  ``tournament``, the ``K_tournament`` specifies the number of parents
+  participating in the tournament selection. It defaults to ``3``.
+
+- ``crossover_type="single_point"``: Type of the crossover operation.
+  Supported types are ``single_point`` (for single-point crossover),
+  ``two_points`` (for two points crossover), ``uniform`` (for uniform
+  crossover), and ``scattered`` (for scattered crossover). Scattered
+  crossover is supported from PyGAD
+  `2.9.0 `__
+  and higher. It defaults to ``single_point``. A custom crossover
+  function can be passed starting from `PyGAD
+  2.16.0 `__.
+  Check the `User-Defined Crossover, Mutation, and Parent Selection
+  Operators `__
+  section for more details about creating a user-defined crossover
+  function. Starting from `PyGAD
+  2.2.2 `__
+  and higher, if ``crossover_type=None``, then the crossover step is
+  bypassed which means no crossover is applied and thus no offspring
+  will be created in the next generations. The next generation will use
+  the solutions in the current population.
+
+- ``crossover_probability=None``: The probability of selecting a parent
+  for applying the crossover operation. Its value must be between 0.0
+  and 1.0 inclusive. For each parent, a random value between 0.0 and
+  1.0 is generated. If this random value is less than or equal to the
+  value assigned to the ``crossover_probability`` parameter, then the
+  parent is selected. Added in `PyGAD
+  2.5.0 `__
+  and higher.
+
+- ``mutation_type="random"``: Type of the mutation operation. Supported
+  types are ``random`` (for random mutation), ``swap`` (for swap
+  mutation), ``inversion`` (for inversion mutation), ``scramble`` (for
+  scramble mutation), and ``adaptive`` (for adaptive mutation). It
+  defaults to ``random``. A custom mutation function can be passed
+  starting from `PyGAD
+  2.16.0 `__.
+  Check the `User-Defined Crossover, Mutation, and Parent Selection
+  Operators `__
+  section for more details about creating a user-defined mutation
+  function. Starting from `PyGAD
+  2.2.2 `__
+  and higher, if ``mutation_type=None``, then the mutation step is
+  bypassed which means no mutation is applied and thus no changes are
+  applied to the offspring created using the crossover operation. The
+  offspring will be used unchanged in the next generation. ``Adaptive``
+  mutation is supported starting from `PyGAD
+  2.10.0 `__.
+  For more information about adaptive mutation, go to the `Adaptive
+  Mutation `__
+  section. For an example of using adaptive mutation, check the `Use
+  Adaptive Mutation in
+  PyGAD `__
+  section.
+
+- ``mutation_probability=None``: The probability of selecting a gene
+  for applying the mutation operation. Its value must be between 0.0
+  and 1.0 inclusive. For each gene in a solution, a random value
+  between 0.0 and 1.0 is generated. If this random value is less than
+  or equal to the value assigned to the ``mutation_probability``
+  parameter, then the gene is selected. If this parameter exists, then
+  there is no need for the 2 parameters ``mutation_percent_genes`` and
+  ``mutation_num_genes``. Added in `PyGAD
+  2.5.0 `__
+  and higher.
+
+- ``mutation_by_replacement=False``: An optional bool parameter. It
+  works only when the selected type of mutation is random
+  (``mutation_type="random"``). In this case,
+  ``mutation_by_replacement=True`` means replace the gene by the
+  randomly generated value. If False, then it has no effect and random
+  mutation works by adding the random value to the gene. Supported in
+  `PyGAD
+  2.2.2 `__
+  and higher. Check the changes in `PyGAD
+  2.2.2 `__
+  under the Release History section for an example.
+
+- ``mutation_percent_genes="default"``: Percentage of genes to mutate.
+  It defaults to the string ``"default"`` which is later translated
+  into the integer ``10`` which means 10% of the genes will be mutated.
+  It must be ``>0`` and ``<=100``. Out of this percentage, the number
+  of genes to mutate is deduced which is assigned to the
+  ``mutation_num_genes`` parameter. The ``mutation_percent_genes``
+  parameter has no action if ``mutation_probability`` or
+  ``mutation_num_genes`` exist. Starting from `PyGAD
+  2.2.2 `__
+  and higher, this parameter has no action if ``mutation_type`` is
+  ``None``.
+
+- ``mutation_num_genes=None``: Number of genes to mutate which defaults
+  to ``None`` meaning that no number is specified. The
+  ``mutation_num_genes`` parameter has no action if the parameter
+  ``mutation_probability`` exists. Starting from `PyGAD
+  2.2.2 `__
+  and higher, this parameter has no action if ``mutation_type`` is
+  ``None``.
+
+- ``random_mutation_min_val=-1.0``: For ``random`` mutation, the
+  ``random_mutation_min_val`` parameter specifies the start value of
+  the range from which a random value is selected to be added to the
+  gene. It defaults to ``-1``. Starting from `PyGAD
+  2.2.2 `__
+  and higher, this parameter has no action if ``mutation_type`` is
+  ``None``.
+
+- ``random_mutation_max_val=1.0``: For ``random`` mutation, the
+  ``random_mutation_max_val`` parameter specifies the end value of the
+  range from which a random value is selected to be added to the gene.
+  It defaults to ``+1``. Starting from `PyGAD
+  2.2.2 `__
+  and higher, this parameter has no action if ``mutation_type`` is
+  ``None``.
+
+- ``gene_space=None``: It is used to specify the possible values for
+  each gene in case the user wants to restrict the gene values. It is
+  useful if the gene space is restricted to a certain range or to
+  discrete values. It accepts a ``list``, ``tuple``, ``range``, or
+  ``numpy.ndarray``. When all genes have the same global space, specify
+  their values as a ``list``/``tuple``/``range``/``numpy.ndarray``. For
+  example, ``gene_space = [0.3, 5.2, -4, 8]`` restricts the gene values
+  to the 4 specified values. If each gene has its own space, then the
+  ``gene_space`` parameter can be nested like
+  ``[[0.4, -5], [0.5, -3.2, 8.2, -9], ...]`` where the first sublist
+  determines the values for the first gene, the second sublist for the
+  second gene, and so on. If the nested list/tuple has a ``None``
+  value, then the gene's initial value is selected randomly from the
+  range specified by the 2 parameters ``init_range_low`` and
+  ``init_range_high`` and its mutation value is selected randomly from
+  the range specified by the 2 parameters ``random_mutation_min_val``
+  and ``random_mutation_max_val``. ``gene_space`` is added in `PyGAD
+  2.5.0 `__.
+  Check the `Release History of PyGAD
+  2.5.0 `__
+  section of the documentation for more details. In `PyGAD
+  2.9.0 `__,
+  NumPy arrays can be assigned to the ``gene_space`` parameter. In
+  `PyGAD
+  2.11.0 `__,
+  the ``gene_space`` parameter itself or any of its elements can be
+  assigned to a dictionary to specify the lower and upper limits of the
+  genes. For example, ``{'low': 2, 'high': 4}`` means the minimum and
+  maximum values are 2 and 4, respectively. In `PyGAD
+  2.15.0 `__,
+  a new key called ``"step"`` is supported to specify the step of
+  moving from the start to the end of the range specified by the 2
+  existing keys ``"low"`` and ``"high"``.
+
+- ``on_start=None``: Accepts a function/method to be called only once
+  before the genetic algorithm starts its evolution. If function, then
+  it must accept a single parameter representing the instance of the
+  genetic algorithm. If method, then it must accept 2 parameters where
+  the second one refers to the method's object. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_fitness=None``: Accepts a function/method to be called after
+  calculating the fitness values of all solutions in the population. If
+  function, then it must accept 2 parameters: 1) the instance of the
+  genetic algorithm and 2) a list of all solutions' fitness values.
+  If method, then it must accept 3 parameters where the third one
+  refers to the method's object. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_parents=None``: Accepts a function/method to be called after
+  selecting the parents that mate. If function, then it must accept 2
+  parameters: 1) the instance of the genetic algorithm and 2) the
+  selected parents. If method, then it must accept 3 parameters where
+  the third one refers to the method's object. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_crossover=None``: Accepts a function to be called each time the
+  crossover operation is applied. This function must accept 2
+  parameters: the first one represents the instance of the genetic
+  algorithm and the second one represents the offspring generated using
+  crossover. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_mutation=None``: Accepts a function to be called each time the
+  mutation operation is applied. This function must accept 2
+  parameters: the first one represents the instance of the genetic
+  algorithm and the second one represents the offspring after applying
+  the mutation. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_generation=None``: Accepts a function to be called after each
+  generation. This function must accept a single parameter representing
+  the instance of the genetic algorithm. If the function returns the
+  string ``stop``, then the ``run()`` method stops without completing
+  the other generations. Added in `PyGAD
+  2.6.0 `__.
+
+- ``on_stop=None``: Accepts a function to be called only once exactly
+  before the genetic algorithm stops or when it completes all the
+  generations. This function must accept 2 parameters: the first one
+  represents the instance of the genetic algorithm and the second one
+  is a list of fitness values of the last population's solutions. Added
+  in `PyGAD
+  2.6.0 `__.
+
+- ``delay_after_gen=0.0``: It accepts a non-negative number specifying
+  the time in seconds to wait after a generation completes and before
+  going to the next generation. It defaults to ``0.0`` which means no
+  delay after the generation. Available in `PyGAD
+  2.4.0 `__
+  and higher.
+
+- ``save_best_solutions=False``: When ``True``, then the best solution
+  after each generation is saved into an attribute named
+  ``best_solutions``. If ``False`` (default), then no solutions are
+  saved and the ``best_solutions`` attribute will be empty. Supported
+  in `PyGAD
+  2.9.0 `__.
+
+- ``save_solutions=False``: If ``True``, then all solutions in each
+  generation are appended into an attribute called ``solutions`` which
+  is a NumPy array. Supported in `PyGAD
+  2.15.0 `__.
+
+- ``suppress_warnings=False``: A bool parameter to control whether the
+  warning messages are printed or not. It defaults to ``False``.
+
+- ``allow_duplicate_genes=True``: Added in `PyGAD
+  2.13.0 `__.
+  If ``True``, then a solution/chromosome may have duplicate gene
+  values. If ``False``, then each gene will have a unique value in its
+  solution.
+
+- ``stop_criteria=None``: Some criteria to stop the evolution. Added in
+  `PyGAD
+  2.15.0 `__.
+  Each criterion is passed as ``str`` which has a stop word. The
+  current 2 supported words are ``reach`` and ``saturate``. ``reach``
+  stops the ``run()`` method if the fitness value is equal to or
+  greater than a given fitness value. An example for ``reach`` is
+  ``"reach_40"`` which stops the evolution if the fitness is >= 40.
+  ``saturate`` means stop the evolution if the fitness saturates for a
+  given number of consecutive generations. 
An example for ``saturate``
+  is ``"saturate_7"`` which means stop the ``run()`` method if the
+  fitness does not change for 7 consecutive generations.
+
+- ``parallel_processing=None``: Added in `PyGAD
+  2.17.0 `__.
+  If ``None`` (default), this means no parallel processing is applied.
+  It can accept a list/tuple of 2 elements [1) Can be either
+  ``'process'`` or ``'thread'`` to indicate whether processes or
+  threads are used, respectively., 2) The number of processes or
+  threads to use.]. For example,
+  ``parallel_processing=['process', 10]`` applies parallel processing
+  with 10 processes. If a positive integer is assigned, then it is used
+  as the number of threads. For example, ``parallel_processing=5`` uses
+  5 threads which is equivalent to
+  ``parallel_processing=["thread", 5]``. For more information, check
+  the `Parallel Processing in
+  PyGAD `__
+  section.
+
+- ``random_seed=None``: Added in `PyGAD
+  2.18.0 `__.
+  It defines the random seed to be used by the random function
+  generators (we use random functions in the NumPy and random modules).
+  This helps to reproduce the same results by setting the same random
+  seed (e.g. ``random_seed=2``). If given the value ``None``, then it
+  has no effect.
+
+- ``logger=None``: Accepts an instance of the ``logging.Logger`` class
+  to log the outputs. Any message is no longer printed using
+  ``print()`` but logged. If ``logger=None``, then a logger is created
+  that uses ``StreamHandler`` to log the messages to the console.
+  Added in `PyGAD
+  3.0.0 `__.
+  Check the `Logging
+  Outputs `__
+  for more information.
+
+The user doesn't have to specify all of such parameters while creating
+an instance of the GA class. A very important parameter you must care
+about is ``fitness_func`` which defines the fitness function.
+
+It is OK to set the value of any of the 2 parameters ``init_range_low``
+and ``init_range_high`` to be equal, higher, or lower than the other
+parameter (i.e. ``init_range_low`` does not need to be lower than
+``init_range_high``). The same holds for the ``random_mutation_min_val``
+and ``random_mutation_max_val`` parameters.
+
+If the 2 parameters ``mutation_type`` and ``crossover_type`` are
+``None``, this disables any type of evolution the genetic algorithm can
+make. As a result, the genetic algorithm cannot find a better solution
+than the best solution in the initial population.
+
+The parameters are validated within the constructor. If at least one
+parameter is not correct, an exception is thrown.
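+
+To tie a few of these parameters together, here is a minimal sketch of
+a constructor call. It is not from the original document; the fitness
+function is a placeholder and the chosen values are arbitrary, but
+every parameter used below is documented in the list above.
+
+.. code:: python
+
+    import numpy
+    import pygad
+
+    def fitness_func(ga_instance, solution, solution_idx):
+        # Placeholder fitness: reward solutions with a small sum of squares.
+        return 1.0 / (numpy.sum(solution**2) + 0.000001)
+
+    ga_instance = pygad.GA(num_generations=100,
+                           num_parents_mating=5,
+                           sol_per_pop=10,
+                           num_genes=4,
+                           fitness_func=fitness_func,
+                           # Restrict every gene to the range 0.0-10.0 with step 0.5.
+                           gene_space={'low': 0.0, 'high': 10.0, 'step': 0.5},
+                           # Keep the best 2 solutions in each new generation.
+                           keep_elitism=2,
+                           # Stop early if fitness saturates for 10 generations.
+                           stop_criteria="saturate_10",
+                           random_seed=2)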
+
+.. _plotting-methods-in-pygadga-class:
+
+Plotting Methods in ``pygad.GA`` Class
+--------------------------------------
+
+- ``plot_fitness()``: Shows how the fitness evolves by generation.
+
+- ``plot_genes()``: Shows how the gene value changes for each
+  generation.
+
+- ``plot_new_solution_rate()``: Shows the number of new solutions
+  explored in each generation.
+
+Class Attributes
+----------------
+
+- ``supported_int_types``: A list of the supported types for the
+  integer numbers.
+
+- ``supported_float_types``: A list of the supported types for the
+  floating-point numbers.
+
+- ``supported_int_float_types``: A list of the supported types for all
+  numbers. It just concatenates the previous 2 lists.
+
+.. _other-instance-attributes--methods:
+
+Other Instance Attributes & Methods
+-----------------------------------
+
+All the parameters and functions passed to the ``pygad.GA`` class
+constructor are used as class attributes and methods in the instances
+of the ``pygad.GA`` class. In addition to such attributes, there are
+other attributes and methods added to the instances of the ``pygad.GA``
+class. The next 2 subsections list such attributes and methods.
+
+Other Attributes
+~~~~~~~~~~~~~~~~
+
+- ``generations_completed``: Holds the number of the last completed
+  generation.
+
+- ``population``: A NumPy array holding the initial population.
+
+- ``valid_parameters``: Set to ``True`` when all the parameters passed
+  in the ``GA`` class constructor are valid.
+
+- ``run_completed``: Set to ``True`` only after the ``run()`` method
+  completes gracefully.
+
+- ``pop_size``: The population size.
+
+- ``best_solutions_fitness``: A list holding the fitness values of the
+  best solutions for all generations.
+
+- ``best_solution_generation``: The generation number at which the best
+  fitness value is reached. It is only assigned the generation number
+  after the ``run()`` method completes. Otherwise, its value is -1.
+
+- ``best_solutions``: A NumPy array holding the best solution in each
+  generation. It only exists when the ``save_best_solutions`` parameter
+  in the ``pygad.GA`` class constructor is set to ``True``.
+
+- ``last_generation_fitness``: The fitness values of the solutions in
+  the last generation. `Added in PyGAD
+  2.12.0 `__.
+
+- ``previous_generation_fitness``: At the end of each generation, the
+  fitness of the most recent population is saved in the
+  ``last_generation_fitness`` attribute. The fitness of the population
+  exactly preceding this most recent population is saved in the
+  ``previous_generation_fitness`` attribute. This
+  ``previous_generation_fitness`` attribute is used to fetch the
+  pre-calculated fitness instead of calling the fitness function for
+  already explored solutions. `Added in PyGAD
+  2.16.2 `__.
+
+- ``last_generation_parents``: The parents selected from the last
+  generation. `Added in PyGAD
+  2.12.0 `__.
+
+- ``last_generation_offspring_crossover``: The offspring generated
+  after applying the crossover in the last generation. `Added in PyGAD
+  2.12.0 `__.
+
+- ``last_generation_offspring_mutation``: The offspring generated after
+  applying the mutation in the last generation. `Added in PyGAD
+  2.12.0 `__.
+
+- ``gene_type_single``: A flag that is set to ``True`` if the
+  ``gene_type`` parameter is assigned to a single data type that is
+  applied to all genes. If ``gene_type`` is assigned a ``list``,
+  ``tuple``, or ``numpy.ndarray``, then the value of
+  ``gene_type_single`` will be ``False``. `Added in PyGAD
+  2.14.0 `__.
+
+- ``last_generation_parents_indices``: This attribute holds the indices
+  of the selected parents in the last generation. Supported in `PyGAD
+  2.15.0 `__.
+
+- ``last_generation_elitism``: This attribute holds the elitism of the
+  last generation. It is effective only if the ``keep_elitism``
+  parameter has a non-zero value. Supported in `PyGAD
+  2.18.0 `__.
+
+- ``last_generation_elitism_indices``: This attribute holds the indices
+  of the elitism of the last generation. It is effective only if the
+  ``keep_elitism`` parameter has a non-zero value. Supported in `PyGAD
+  2.19.0 `__.
+
+- ``logger``: This attribute holds the logger from the ``logging``
+  module. Supported in `PyGAD
+  3.0.0 `__.
+
+Note that the attributes whose names start with ``last_generation_``
+are updated after each generation.
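+
+As a quick illustration (not from the original document), the
+attributes above can be inspected after ``run()`` completes. The
+snippet below assumes ``ga_instance`` was created with a valid
+``fitness_func`` as in the earlier sketch:
+
+.. code:: python
+
+    ga_instance.run()
+
+    # Number of generations that actually ran (may be fewer than
+    # num_generations if a stop criterion was met).
+    print(ga_instance.generations_completed)
+
+    # Fitness of the best solution in every generation.
+    print(ga_instance.best_solutions_fitness)
+
+    # Fitness values of the last generation's solutions.
+    print(ga_instance.last_generation_fitness)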
+
+Other Methods
+~~~~~~~~~~~~~
+
+- ``cal_pop_fitness()``: A method that calculates the fitness values
+  for all solutions within the population by calling the function
+  passed to the ``fitness_func`` parameter for each solution.
+
+- ``crossover()``: Refers to the method that applies the crossover
+  operator based on the selected type of crossover in the
+  ``crossover_type`` property.
+
+- ``mutation()``: Refers to the method that applies the mutation
+  operator based on the selected type of mutation in the
+  ``mutation_type`` property.
+
+- ``select_parents()``: Refers to a method that selects the parents
+  based on the parent selection type specified in the
+  ``parent_selection_type`` attribute.
+
+- ``adaptive_mutation_population_fitness()``: Returns the average
+  fitness value used in the adaptive mutation to filter the solutions.
+
+- ``solve_duplicate_genes_randomly()``: Solves the duplicates in a
+  solution by randomly selecting new values for the duplicating genes.
+
+- ``solve_duplicate_genes_by_space()``: Solves the duplicates in a
+  solution by selecting values for the duplicating genes from the gene
+  space.
+
+- ``unique_int_gene_from_range()``: Finds a unique integer value for
+  the gene.
+
+- ``unique_genes_by_space()``: Loops through all the duplicating genes
+  to find unique values from their gene spaces to solve the duplicates.
+  For each duplicating gene, a call to the ``unique_gene_by_space()``
+  method is made.
+
+- ``unique_gene_by_space()``: Returns a unique gene value for a single
+  gene based on its value space to solve the duplicates.
+
+- ``summary()``: Prints a Keras-like summary of the PyGAD lifecycle.
+  This helps to have an overview of the architecture. Supported in
+  `PyGAD
+  2.19.0 `__.
+  Check the `Print Lifecycle
+  Summary `__
+  section for more details and examples.
+
+The next sections discuss the methods available in the ``pygad.GA``
+class.
+
+.. _initializepopulation:
+
+``initialize_population()``
+---------------------------
+
+It creates an initial population randomly as a NumPy array. The array is
+saved in the instance attribute named ``population``.
+
+Accepts the following parameters:
+
+- ``low``: The lower value of the random range from which the gene
+  values in the initial population are selected. It defaults to -4.
+  Available in PyGAD 1.0.20 and higher.
+
+- ``high``: The upper value of the random range from which the gene
+  values in the initial population are selected. It defaults to +4.
+  Available in PyGAD 1.0.20 and higher.
+
+This method assigns the values of the following 3 instance attributes:
+
+1. ``pop_size``: Size of the population.
+
+2. ``population``: Initially, it holds the initial population and later
+   updated after each generation.
+
+3. ``initial_population``: Keeping the initial population.
+
+.. _calpopfitness:
+
+``cal_pop_fitness()``
+---------------------
+
+The ``cal_pop_fitness()`` method calculates and returns the fitness
+values of the solutions in the current population.
+
+This function is optimized to save time by making fewer calls to the
+fitness function. It follows this process:
+
+1. If the ``save_solutions`` parameter is set to ``True``, then it
+   checks if the solution is already explored and saved in the
+   ``solutions`` instance attribute. If so, then it just retrieves its
+   fitness from the ``solutions_fitness`` instance attribute without
+   calling the fitness function.
+
+2. 
If ``save_solutions`` is set to ``False`` or if it is ``True`` but + the solution was not explored yet, then the ``cal_pop_fitness()`` + method checks if the ``keep_elitism`` parameter is set to a positive + integer. If so, then it checks if the solution is saved into the + ``last_generation_elitism`` instance attribute. If so, then it + retrieves its fitness from the ``previous_generation_fitness`` + instance attribute. + +3. If neither of the above 3 conditions apply (1. ``save_solutions`` is + set to ``False`` or 2. if it is ``True`` but the solution was not + explored yet or 3. ``keep_elitism`` is set to zero), then the + ``cal_pop_fitness()`` method checks if the ``keep_parents`` parameter + is set to ``-1`` or a positive integer. If so, then it checks if the + solution is saved into the ``last_generation_parents`` instance + attribute. If so, then it retrieves its fitness from the + ``previous_generation_fitness`` instance attribute. + +4. If neither of the above 4 conditions apply, then we have to call the + fitness function to calculate the fitness for the solution. This is + by calling the function assigned to the ``fitness_func`` parameter. + +This function takes into consideration: + +1. The ``parallel_processing`` parameter to check whether parallel + processing is in effect. + +2. The ``fitness_batch_size`` parameter to check if the fitness should + be calculated in batches of solutions. + +It returns a vector of the solutions' fitness values. + +``run()`` +--------- + +Runs the genetic algorithm. This is the main method in which the genetic +algorithm is evolved through some generations. It accepts no parameters +as it uses the instance to access all of its requirements. + +For each generation, the fitness values of all solutions within the +population are calculated according to the ``cal_pop_fitness()`` method +which internally just calls the function assigned to the +``fitness_func`` parameter in the ``pygad.GA`` class constructor for +each solution. + +According to the fitness values of all solutions, the parents are +selected using the ``select_parents()`` method. This method behaviour is +determined according to the parent selection type in the +``parent_selection_type`` parameter in the ``pygad.GA`` class +constructor + +Based on the selected parents, offspring are generated by applying the +crossover and mutation operations using the ``crossover()`` and +``mutation()`` methods. The behaviour of such 2 methods is defined +according to the ``crossover_type`` and ``mutation_type`` parameters in +the ``pygad.GA`` class constructor. + +After the generation completes, the following takes place: + +- The ``population`` attribute is updated by the new population. + +- The ``generations_completed`` attribute is assigned by the number of + the last completed generation. + +- If there is a callback function assigned to the ``on_generation`` + attribute, then it will be called. + +After the ``run()`` method completes, the following takes place: + +- The ``best_solution_generation`` is assigned the generation number at + which the best fitness value is reached. + +- The ``run_completed`` attribute is set to ``True``. + +Parent Selection Methods +------------------------ + +The ``ParentSelection`` class in the ``pygad.utils.parent_selection`` +module has several methods for selecting the parents that will mate to +produce the offspring. All of such methods accept the same parameters +which are: + +- ``fitness``: The fitness values of the solutions in the current + population. 
+ +- ``num_parents``: The number of parents to be selected. + +All of such methods return an array of the selected parents. + +The next subsections list the supported methods for parent selection. + +.. _steadystateselection: + +``steady_state_selection()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Selects the parents using the steady-state selection technique. + +.. _rankselection: + +``rank_selection()`` +~~~~~~~~~~~~~~~~~~~~ + +Selects the parents using the rank selection technique. + +.. _randomselection: + +``random_selection()`` +~~~~~~~~~~~~~~~~~~~~~~ + +Selects the parents randomly. + +.. _tournamentselection: + +``tournament_selection()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Selects the parents using the tournament selection technique. + +.. _roulettewheelselection: + +``roulette_wheel_selection()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Selects the parents using the roulette wheel selection technique. + +.. _stochasticuniversalselection: + +``stochastic_universal_selection()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Selects the parents using the stochastic universal selection technique. + +Crossover Methods +----------------- + +The ``Crossover`` class in the ``pygad.utils.crossover`` module supports +several methods for applying crossover between the selected parents. All +of these methods accept the same parameters which are: + +- ``parents``: The parents to mate for producing the offspring. + +- ``offspring_size``: The size of the offspring to produce. + +All of such methods return an array of the produced offspring. + +The next subsections list the supported methods for crossover. + +.. _singlepointcrossover: + +``single_point_crossover()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Applies the single-point crossover. It selects a point randomly at which +crossover takes place between the pairs of parents. + +.. _twopointscrossover: + +``two_points_crossover()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Applies the 2 points crossover. It selects the 2 points randomly at +which crossover takes place between the pairs of parents. + +.. _uniformcrossover: + +``uniform_crossover()`` +~~~~~~~~~~~~~~~~~~~~~~~ + +Applies the uniform crossover. For each gene, a parent out of the 2 +mating parents is selected randomly and the gene is copied from it. + +.. _scatteredcrossover: + +``scattered_crossover()`` +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Applies the scattered crossover. It randomly selects the gene from one +of the 2 parents. + +Mutation Methods +---------------- + +The ``Mutation`` class in the ``pygad.utils.mutation`` module supports +several methods for applying mutation. All of these methods accept the +same parameter which is: + +- ``offspring``: The offspring to mutate. + +All of such methods return an array of the mutated offspring. + +The next subsections list the supported methods for mutation. + +.. _randommutation: + +``random_mutation()`` +~~~~~~~~~~~~~~~~~~~~~ + +Applies the random mutation which changes the values of some genes +randomly. The number of genes is specified according to either the +``mutation_num_genes`` or the ``mutation_percent_genes`` attributes. + +For each gene, a random value is selected according to the range +specified by the 2 attributes ``random_mutation_min_val`` and +``random_mutation_max_val``. The random value is added to the selected +gene. + +.. _swapmutation: + +``swap_mutation()`` +~~~~~~~~~~~~~~~~~~~ + +Applies the swap mutation which interchanges the values of 2 randomly +selected genes. + +.. 
_inversionmutation:
+
+``inversion_mutation()``
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Applies the inversion mutation which selects a subset of genes and
+inverts them.
+
+.. _scramblemutation:
+
+``scramble_mutation()``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Applies the scramble mutation which selects a subset of genes and
+shuffles their order randomly.
+
+.. _adaptivemutation:
+
+``adaptive_mutation()``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Applies the adaptive mutation where the number of mutated genes depends
+on the quality of the solution: low-quality solutions are mutated at a
+higher rate than high-quality solutions. Check the `Adaptive
+Mutation `__ section for more details.
+
+.. _bestsolution:
+
+``best_solution()``
+-------------------
+
+Returns information about the best solution found by the genetic
+algorithm.
+
+It accepts the following parameters:
+
+- ``pop_fitness=None``: An optional parameter that accepts a list of
+  the fitness values of the solutions in the population. If ``None``,
+  then the ``cal_pop_fitness()`` method is called to calculate the
+  fitness values of the population.
+
+It returns the following:
+
+- ``best_solution``: Best solution in the current population.
+
+- ``best_solution_fitness``: Fitness value of the best solution.
+
+- ``best_match_idx``: Index of the best solution in the current
+  population.
+
+.. _plotfitness:
+
+``plot_fitness()``
+------------------
+
+Previously named ``plot_result()``, this method creates, shows, and
+returns a figure that summarizes how the fitness value evolves by
+generation. It works only after completing at least 1 generation; if no
+generation is completed, an exception is raised.
+
+Starting from `PyGAD
+2.15.0 `__
+and higher, this method accepts the following parameters:
+
+1. ``title``: Title of the figure.
+
+2. ``xlabel``: X-axis label.
+
+3. ``ylabel``: Y-axis label.
+
+4. ``linewidth``: Line width of the plot. Defaults to ``3``.
+
+5. ``font_size``: Font size for the labels and title. Defaults to
+   ``14``.
+
+6. ``plot_type``: Type of the plot which can be either ``"plot"``
+   (default), ``"scatter"``, or ``"bar"``.
+
+7. ``color``: Color of the plot which defaults to ``"#3870FF"``.
+
+8. ``save_dir``: Directory to save the figure.
+
+.. _plotnewsolutionrate:
+
+``plot_new_solution_rate()``
+----------------------------
+
+The ``plot_new_solution_rate()`` method creates, shows, and returns a
+figure that shows the number of new solutions explored in each
+generation. This method works only when ``save_solutions=True`` in the
+constructor of the ``pygad.GA`` class. It also works only after
+completing at least 1 generation; if no generation is completed, an
+exception is raised.
+
+This method accepts the following parameters:
+
+1. ``title``: Title of the figure.
+
+2. ``xlabel``: X-axis label.
+
+3. ``ylabel``: Y-axis label.
+
+4. ``linewidth``: Line width of the plot. Defaults to ``3``.
+
+5. ``font_size``: Font size for the labels and title. Defaults to
+   ``14``.
+
+6. ``plot_type``: Type of the plot which can be either ``"plot"``
+   (default), ``"scatter"``, or ``"bar"``.
+
+7. ``color``: Color of the plot which defaults to ``"#3870FF"``.
+
+8. ``save_dir``: Directory to save the figure.
+
+.. _plotgenes:
+
+``plot_genes()``
+----------------
+
+The ``plot_genes()`` method creates, shows, and returns a figure that
+describes each gene. It has different options to create the figures
+which helps to:
+
+1. Explore the gene value for each generation by creating a normal plot.
+
+2. Create a histogram for each gene.
+
+3. Create a boxplot.
+
+This is controlled by the ``graph_type`` parameter.
+
+It works only after completing at least 1 generation. If no generation
+is completed, an exception is raised.
+
+This method accepts the following parameters:
+
+1. ``title``: Title of the figure.
+
+2. ``xlabel``: X-axis label.
+
+3. ``ylabel``: Y-axis label.
+
+4. ``linewidth``: Line width of the plot. Defaults to ``3``.
+
+5. ``font_size``: Font size for the labels and title. Defaults to
+   ``14``.
+
+6. ``plot_type``: Type of the plot which can be either ``"plot"``
+   (default), ``"scatter"``, or ``"bar"``.
+
+7. ``graph_type``: Type of the graph which can be either ``"plot"``
+   (default), ``"boxplot"``, or ``"histogram"``.
+
+8. ``fill_color``: Fill color of the graph which defaults to
+   ``"#3870FF"``. This has no effect if ``graph_type="plot"``.
+
+9. ``color``: Color of the plot which defaults to ``"#3870FF"``.
+
+10. ``solutions``: Defaults to ``"all"`` which means use all solutions.
+    If ``"best"`` then only the best solutions are used.
+
+11. ``save_dir``: Directory to save the figure.
+
+An exception is raised if:
+
+- ``solutions="all"`` while ``save_solutions=False`` in the constructor
+  of the ``pygad.GA`` class.
+
+- ``solutions="best"`` while ``save_best_solutions=False`` in the
+  constructor of the ``pygad.GA`` class.
+
+``save()``
+----------
+
+Saves the genetic algorithm instance.
+
+Accepts the following parameter:
+
+- ``filename``: Name of the file to save the instance. No extension is
+  needed.
+
+Functions in ``pygad``
+======================
+
+Besides the methods available in the ``pygad.GA`` class, this section
+discusses the functions available in ``pygad``. Up to this time, there
+is only a single function named ``load()``.
+
+.. _pygadload:
+
+``pygad.load()``
+----------------
+
+Reads a saved instance of the genetic algorithm. This is not a method
+but a function defined at the level of the ``pygad`` module. So, it
+could be called as follows: ``pygad.load(filename)``.
+
+Accepts the following parameter:
+
+- ``filename``: Name of the file holding the saved instance of the
+  genetic algorithm. No extension is needed.
+
+Returns the genetic algorithm instance.
+
+Steps to Use ``pygad``
+======================
+
+To use the ``pygad`` module, here is a summary of the required steps:
+
+1. Preparing the ``fitness_func`` parameter.
+
+2. Preparing Other Parameters.
+
+3. Import ``pygad``.
+
+4. Create an Instance of the ``pygad.GA`` Class.
+
+5. Run the Genetic Algorithm.
+
+6. Plotting Results.
+
+7. Information about the Best Solution.
+
+8. Saving & Loading the Results.
+
+Let's discuss how to do each of these steps.
+
+.. _preparing-the-fitnessfunc-parameter:
+
+Preparing the ``fitness_func`` Parameter
+----------------------------------------
+
+Even though there are some steps in the genetic algorithm pipeline that
+can work the same regardless of the problem being solved, one critical
+step is the calculation of the fitness value. There is no unique way of
+calculating the fitness value and it changes from one problem to
+another.
+
+PyGAD has a parameter called ``fitness_func`` that allows the user to
+specify a custom function/method to use when calculating the fitness.
+This function/method must be a maximization function/method so that
+solutions with higher fitness values are preferred over solutions with
+lower fitness values. Doing that allows the user to freely use PyGAD to
+solve any problem by passing the appropriate fitness function/method.
+It is very important to understand the problem well before creating its
+fitness function/method.
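+
+Because the fitness is maximized, a problem that is naturally phrased
+as minimizing an error must be inverted before the value is returned.
+A common transform (and the one used by the example below) is to return
+the reciprocal of the error; this sketch is illustrative and not from
+the original document:
+
+.. code:: python
+
+    import numpy
+
+    def error_to_fitness(error):
+        # Small errors map to large fitness values; the epsilon avoids
+        # division by zero when the error is exactly 0.0.
+        return 1.0 / (numpy.abs(error) + 0.000001)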
+
+Let's discuss an example:
+
+   | Given the following function:
+   | y = f(w1:w6) = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
+   | where (x1,x2,x3,x4,x5,x6)=(4, -2, 3.5, 5, -11, -4.7) and y=44
+   | What are the best values for the 6 weights (w1 to w6)? We are going
+     to use the genetic algorithm to optimize this function.
+
+So, the task is about using the genetic algorithm to find the best
+values for the 6 weights ``W1`` to ``W6``. Thinking of the problem, it
+is clear that the best solution is that returning an output that is
+close to the desired output ``y=44``. So, the fitness function/method
+should return a value that gets higher when the solution's output is
+closer to ``y=44``. Here is a function that does that:
+
+.. code:: python
+
+    import numpy
+
+    function_inputs = [4, -2, 3.5, 5, -11, -4.7] # Function inputs.
+    desired_output = 44 # Function output.
+
+    def fitness_func(ga_instance, solution, solution_idx):
+        output = numpy.sum(solution*function_inputs)
+        fitness = 1.0 / numpy.abs(output - desired_output)
+        return fitness
+
+Such a user-defined function must accept 3 parameters:
+
+1. The instance of the ``pygad.GA`` class. This helps the user to fetch
+   any property that helps when calculating the fitness.
+
+2. The solution(s) to calculate the fitness value(s). Note that the
+   fitness function can accept multiple solutions only if the
+   ``fitness_batch_size`` is given a value greater than 1.
+
+3. The indices of the solutions in the population. The number of indices
+   also depends on the ``fitness_batch_size`` parameter.
+
+If a method is passed to the ``fitness_func`` parameter, then it accepts
+a fourth parameter representing the method's instance.
+
+The ``__code__`` object is used to check if this function accepts the
+required number of parameters. If more or fewer parameters are passed,
+an exception is thrown.
+
+By creating this function, you have taken a very important step towards
+using PyGAD.
+
+Preparing Other Parameters
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here is an example for preparing the other parameters:
+
+.. code:: python
+
+    num_generations = 50
+    num_parents_mating = 4
+
+    fitness_function = fitness_func
+
+    sol_per_pop = 8
+    num_genes = len(function_inputs)
+
+    init_range_low = -2
+    init_range_high = 5
+
+    parent_selection_type = "sss"
+    keep_parents = 1
+
+    crossover_type = "single_point"
+
+    mutation_type = "random"
+    mutation_percent_genes = 10
+
+.. _the-ongeneration-parameter:
+
+The ``on_generation`` Parameter
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An optional parameter named ``on_generation`` is supported which allows
+the user to call a function (with a single parameter) after each
+generation. Here is a simple function that just prints the current
+generation number and the fitness value of the best solution in the
+current generation. The ``generations_completed`` attribute of the GA
+class returns the number of the last completed generation.
+
+.. code:: python
+
+    def on_gen(ga_instance):
+        print("Generation : ", ga_instance.generations_completed)
+        print("Fitness of the best solution :", ga_instance.best_solution()[1])
+
+After being defined, the function is assigned to the ``on_generation``
+parameter of the GA class constructor. By doing that, the ``on_gen()``
+function will be called after each generation.
+
+.. code:: python
+
+    ga_instance = pygad.GA(...,
+                           on_generation=on_gen,
+                           ...)
+
+After the parameters are prepared, we can import PyGAD and build an
+instance of the ``pygad.GA`` class.
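+
+As noted in the parameter list earlier, ``on_generation`` can also end
+the run early: if the callback returns the string ``stop``, the
+``run()`` method stops without completing the remaining generations.
+Here is a small sketch of that idea (not from the original document;
+the threshold of 1000 is arbitrary):
+
+.. code:: python
+
+    def on_gen_early_stop(ga_instance):
+        # Stop evolving once the best fitness crosses a chosen threshold.
+        if ga_instance.best_solution()[1] >= 1000:
+            return "stop"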
+ +Import ``pygad`` +---------------- + +The next step is to import PyGAD as follows: + +.. code:: python + + import pygad + +The ``pygad.GA`` class holds the implementation of all methods for +running the genetic algorithm. + +.. _create-an-instance-of-the-pygadga-class: + +Create an Instance of the ``pygad.GA`` Class +-------------------------------------------- + +The ``pygad.GA`` class is instantiated where the previously prepared +parameters are fed to its constructor. The constructor is responsible +for creating the initial population. + +.. code:: python + + ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + fitness_func=fitness_function, + sol_per_pop=sol_per_pop, + num_genes=num_genes, + init_range_low=init_range_low, + init_range_high=init_range_high, + parent_selection_type=parent_selection_type, + keep_parents=keep_parents, + crossover_type=crossover_type, + mutation_type=mutation_type, + mutation_percent_genes=mutation_percent_genes) + +Run the Genetic Algorithm +------------------------- + +After an instance of the ``pygad.GA`` class is created, the next step is +to call the ``run()`` method as follows: + +.. code:: python + + ga_instance.run() + +Inside this method, the genetic algorithm evolves over some generations +by doing the following tasks: + +1. Calculating the fitness values of the solutions within the current + population. + +2. Select the best solutions as parents in the mating pool. + +3. Apply the crossover & mutation operation + +4. Repeat the process for the specified number of generations. + +Plotting Results +---------------- + +There is a method named ``plot_fitness()`` which creates a figure +summarizing how the fitness values of the solutions change with the +generations. + +.. code:: python + + ga_instance.plot_fitness() + +.. figure:: https://user-images.githubusercontent.com/16560492/78830005-93111d00-79e7-11ea-9d8e-a8d8325a6101.png + :alt: + +Information about the Best Solution +----------------------------------- + +The following information about the best solution in the last population +is returned using the ``best_solution()`` method. + +- Solution + +- Fitness value of the solution + +- Index of the solution within the population + +.. code:: python + + solution, solution_fitness, solution_idx = ga_instance.best_solution() + print("Parameters of the best solution : {solution}".format(solution=solution)) + print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) + print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +Using the ``best_solution_generation`` attribute of the instance from +the ``pygad.GA`` class, the generation number at which the +``best fitness`` is reached could be fetched. + +.. code:: python + + if ga_instance.best_solution_generation != -1: + print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation)) + +.. _saving--loading-the-results: + +Saving & Loading the Results +---------------------------- + +After the ``run()`` method completes, it is possible to save the current +instance of the genetic algorithm to avoid losing the progress made. The +``save()`` method is available for that purpose. Just pass the file name +to it without an extension. According to the next code, a file named +``genetic.pkl`` will be created and saved in the current directory. + +.. 
code:: python
+
+    filename = 'genetic'
+    ga_instance.save(filename=filename)
+
+You can also load the saved model using the ``load()`` function and
+continue using it. For example, you might run the genetic algorithm for
+some generations, save its current state using the ``save()`` method,
+load the model using the ``load()`` function, and then call the
+``run()`` method again.
+
+.. code:: python
+
+    loaded_ga_instance = pygad.load(filename=filename)
+
+After the instance is loaded, you can use it to run any method or access
+any property.
+
+.. code:: python
+
+    print(loaded_ga_instance.best_solution())
+
+Crossover, Mutation, and Parent Selection
+=========================================
+
+PyGAD supports different types for selecting the parents and applying
+the crossover & mutation operators. More features will be added in the
+future. To ask for a new feature, please check the ``Ask for Feature``
+section.
+
+Supported Crossover Operations
+------------------------------
+
+The supported crossover operations at this time are:
+
+1. Single point: Implemented using the ``single_point_crossover()``
+   method.
+
+2. Two points: Implemented using the ``two_points_crossover()`` method.
+
+3. Uniform: Implemented using the ``uniform_crossover()`` method.
+
+4. Scattered: Implemented using the ``scattered_crossover()`` method.
+
+Supported Mutation Operations
+-----------------------------
+
+The supported mutation operations at this time are:
+
+1. Random: Implemented using the ``random_mutation()`` method.
+
+2. Swap: Implemented using the ``swap_mutation()`` method.
+
+3. Inversion: Implemented using the ``inversion_mutation()`` method.
+
+4. Scramble: Implemented using the ``scramble_mutation()`` method.
+
+5. Adaptive: Implemented using the ``adaptive_mutation()`` method.
+
+Supported Parent Selection Operations
+-------------------------------------
+
+The supported parent selection techniques at this time are:
+
+1. Steady-state: Implemented using the ``steady_state_selection()``
+   method.
+
+2. Roulette wheel: Implemented using the ``roulette_wheel_selection()``
+   method.
+
+3. Stochastic universal: Implemented using the
+   ``stochastic_universal_selection()`` method.
+
+4. Rank: Implemented using the ``rank_selection()`` method.
+
+5. Random: Implemented using the ``random_selection()`` method.
+
+6. Tournament: Implemented using the ``tournament_selection()`` method.
+
+Life Cycle of PyGAD
+===================
+
+The next figure lists the different stages in the lifecycle of an
+instance of the ``pygad.GA`` class. Note that PyGAD stops when either
+all generations are completed or when the function passed to the
+``on_generation`` parameter returns the string ``stop``.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/220486073-c5b6089d-81e4-44d9-a53c-385f479a7273.jpg
+   :alt:
+
+The next code implements all the callback functions to trace the
+execution of the genetic algorithm. Each callback function prints its
+name.
+
+.. 
+
+Life Cycle of PyGAD
+===================
+
+The next figure lists the different stages in the lifecycle of an
+instance of the ``pygad.GA`` class. Note that PyGAD stops when either
+all generations are completed or when the function passed to the
+``on_generation`` parameter returns the string ``stop``.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/220486073-c5b6089d-81e4-44d9-a53c-385f479a7273.jpg
+   :alt: 
+
+The next code implements all the callback functions to trace the
+execution of the genetic algorithm. Each callback function prints its
+name.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   fitness_function = fitness_func
+
+   def on_start(ga_instance):
+       print("on_start()")
+
+   def on_fitness(ga_instance, population_fitness):
+       print("on_fitness()")
+
+   def on_parents(ga_instance, selected_parents):
+       print("on_parents()")
+
+   def on_crossover(ga_instance, offspring_crossover):
+       print("on_crossover()")
+
+   def on_mutation(ga_instance, offspring_mutation):
+       print("on_mutation()")
+
+   def on_generation(ga_instance):
+       print("on_generation()")
+
+   def on_stop(ga_instance, last_population_fitness):
+       print("on_stop()")
+
+   ga_instance = pygad.GA(num_generations=3,
+                          num_parents_mating=5,
+                          fitness_func=fitness_function,
+                          sol_per_pop=10,
+                          num_genes=len(function_inputs),
+                          on_start=on_start,
+                          on_fitness=on_fitness,
+                          on_parents=on_parents,
+                          on_crossover=on_crossover,
+                          on_mutation=on_mutation,
+                          on_generation=on_generation,
+                          on_stop=on_stop)
+
+   ga_instance.run()
+
+Because 3 generations are used (as assigned to the ``num_generations``
+argument), here is the output.
+
+.. code::
+
+   on_start()
+
+   on_fitness()
+   on_parents()
+   on_crossover()
+   on_mutation()
+   on_generation()
+
+   on_fitness()
+   on_parents()
+   on_crossover()
+   on_mutation()
+   on_generation()
+
+   on_fitness()
+   on_parents()
+   on_crossover()
+   on_mutation()
+   on_generation()
+
+   on_stop()
+
+Adaptive Mutation
+=================
+
+In the regular genetic algorithm, mutation works by selecting a single
+fixed mutation rate for all solutions regardless of their fitness
+values. So, regardless of whether a solution has high or low quality,
+the same number of genes is mutated all the time.
+
+The pitfalls of using a constant mutation rate for all solutions are
+summarized in the paper Libelli, S. Marsili, and P. Alba. "Adaptive
+mutation in genetic algorithms." Soft Computing 4.2 (2000): 76-80 as
+follows:
+
+   The weak point of "classical" GAs is the total randomness of
+   mutation, which is applied equally to all chromosomes, irrespective
+   of their fitness. Thus a very good chromosome is equally likely to be
+   disrupted by mutation as a bad one.
+
+   On the other hand, bad chromosomes are less likely to produce good
+   ones through crossover, because of their lack of building blocks,
+   until they remain unchanged. They would benefit the most from
+   mutation and could be used to spread throughout the parameter space
+   to increase the search thoroughness. So there are two conflicting
+   needs in determining the best probability of mutation.
+
+   Usually, a reasonable compromise in the case of a constant mutation
+   is to keep the probability low to avoid disruption of good
+   chromosomes, but this would prevent a high mutation rate of
+   low-fitness chromosomes. Thus a constant probability of mutation
+   would probably miss both goals and result in a slow improvement of
+   the population.
+
+According to the work of Libelli, S. Marsili, and P. Alba, adaptive
+mutation solves the problems of constant mutation.
+
+Adaptive mutation works as follows:
+
+1. Calculate the average fitness value of the population (``f_avg``).
+
+2. For each chromosome, calculate its fitness value (``f``).
+
+3. If ``f < f_avg``, then this solution is regarded as a low-quality
+   solution and thus the mutation rate should be kept high because a
+   high mutation rate increases its chance of improving.
+
+4. If ``f > f_avg``, then this solution is regarded as a high-quality
+   solution and thus the mutation rate should be kept low to avoid
+   disrupting this high-quality solution.
+
+In PyGAD, if ``f = f_avg``, then the solution is regarded as a
+high-quality solution.
+
+The next figure summarizes the previous steps.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/103468973-e3c26600-4d2c-11eb-8af3-b3bb39b50540.jpg
+   :alt: 
+
+This strategy is applied in PyGAD.
+
+Use Adaptive Mutation in PyGAD
+------------------------------
+
+In PyGAD 2.10.0, adaptive mutation is supported. To use it, just follow
+these 2 simple steps:
+
+1. In the constructor of the ``pygad.GA`` class, set
+   ``mutation_type="adaptive"`` to specify that the type of mutation is
+   adaptive.
+
+2. Specify the mutation rates for the low and high quality solutions
+   using one of these 3 parameters according to your preference:
+   ``mutation_probability``, ``mutation_num_genes``, and
+   ``mutation_percent_genes``. Please check the documentation of each
+   of these parameters for more information.
+
+When adaptive mutation is used, the value assigned to any of the 3
+parameters can be of any of these data types:
+
+1. ``list``
+
+2. ``tuple``
+
+3. ``numpy.ndarray``
+
+Whatever the data type used, the length of the ``list``, ``tuple``, or
+the ``numpy.ndarray`` must be exactly 2. That is, there are just 2
+values:
+
+1. The first value is the mutation rate for the low-quality solutions.
+
+2. The second value is the mutation rate for the high-quality solutions.
+
+PyGAD expects that the first value is higher than the second value and
+thus a warning is printed in case the first value is lower than the
+second one.
+
+Here are some examples to feed the mutation rates:
+
+.. code:: python
+
+   # mutation_probability
+   mutation_probability = [0.25, 0.1]
+   mutation_probability = (0.35, 0.17)
+   mutation_probability = numpy.array([0.15, 0.05])
+
+   # mutation_num_genes
+   mutation_num_genes = [4, 2]
+   mutation_num_genes = (3, 1)
+   mutation_num_genes = numpy.array([7, 2])
+
+   # mutation_percent_genes
+   mutation_percent_genes = [25, 12]
+   mutation_percent_genes = (15, 8)
+   mutation_percent_genes = numpy.array([21, 13])
+
+Assume that the average fitness is 12 and the fitness values of 2
+solutions are 15 and 7. If the mutation probabilities are specified as
+follows:
+
+.. code:: python
+
+   mutation_probability = [0.25, 0.1]
+
+Then the mutation probability of the first solution is 0.1 because its
+fitness (15) is higher than the average fitness (12). The mutation
+probability of the second solution is 0.25 because its fitness (7) is
+lower than the average fitness (12).
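+
+To make the selection rule concrete, here is a small illustrative sketch
+(not PyGAD's internal code) that picks the mutation rate for each
+solution by comparing its fitness to the population average:
+
+.. code:: python
+
+   import numpy
+
+   # First value: rate for low-quality solutions.
+   # Second value: rate for high-quality solutions.
+   mutation_probability = [0.25, 0.1]
+
+   # Example fitness values; their average is exactly 12.0.
+   population_fitness = numpy.array([15.0, 7.0, 14.0])
+   f_avg = numpy.mean(population_fitness)
+
+   for f in population_fitness:
+       # Solutions at or above the average count as high quality.
+       rate = mutation_probability[1] if f >= f_avg else mutation_probability[0]
+       print("fitness =", f, "-> mutation rate =", rate)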
+
+Here is an example that uses adaptive mutation.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7] # Function inputs.
+   desired_output = 44 # Function output.
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # The fitness function calculates the sum of products between each input and its corresponding weight.
+       output = numpy.sum(solution*function_inputs)
+       # The value 0.000001 is used to avoid the Inf value when the denominator numpy.abs(output - desired_output) is 0.0.
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   # Creating an instance of the GA class inside the ga module. Some parameters are initialized within the constructor.
+   ga_instance = pygad.GA(num_generations=200,
+                          fitness_func=fitness_func,
+                          num_parents_mating=10,
+                          sol_per_pop=20,
+                          num_genes=len(function_inputs),
+                          mutation_type="adaptive",
+                          mutation_num_genes=(3, 1))
+
+   # Running the GA to optimize the parameters of the function.
+   ga_instance.run()
+
+   ga_instance.plot_fitness(title="PyGAD with Adaptive Mutation", linewidth=5)
+
+Limit the Gene Value Range
+==========================
+
+In PyGAD 2.11.0, the ``gene_space`` parameter supported a new feature to
+allow customizing the range of accepted values for each gene. Let's take
+a quick review of the ``gene_space`` parameter to build over it.
+
+The ``gene_space`` parameter allows the user to feed the space of values
+of each gene. This way, the accepted values for each gene are restricted
+to the user-defined values. Assume there is a problem that has 3 genes
+where each gene has a different set of values as follows:
+
+1. Gene 1: ``[0.4, 12, -5, 21.2]``
+
+2. Gene 2: ``[-2, 0.3]``
+
+3. Gene 3: ``[1.2, 63.2, 7.4]``
+
+Then, the ``gene_space`` for this problem is as given below. Note that
+the order is very important.
+
+.. code:: python
+
+   gene_space = [[0.4, 12, -5, 21.2],
+                 [-2, 0.3],
+                 [1.2, 63.2, 7.4]]
+
+In case all genes share the same set of values, then simply feed a
+single list to the ``gene_space`` parameter as follows. In this case,
+all genes can only take values from this list of 6 values.
+
+.. code:: python
+
+   gene_space = [33, 7, 0.5, 95, 6.3, 0.74]
+
+The previous example restricts the gene values to a fixed set of
+discrete values. In case you want to use a range of discrete values for
+the gene, then you can use the ``range()`` function. For example,
+``range(1, 7)`` means the set of allowed values for the gene is
+``1, 2, 3, 4, 5, and 6``. You can also use the ``numpy.arange()`` or
+``numpy.linspace()`` functions for the same purpose.
+
+The previous discussion only covers ranges of discrete values, not
+continuous values. In PyGAD 2.11.0, the ``gene_space`` parameter can be
+assigned a dictionary that allows the gene to have values from a
+continuous range.
+
+Assume you want to restrict the gene to the half-open range [1, 5)
+where 1 is included and 5 is not. Then simply create a dictionary with 2
+items where the keys of the 2 items are:
+
+1. ``'low'``: The minimum value in the range, which is 1 in the example.
+
+2. ``'high'``: The maximum value in the range, which is 5 in the
+   example.
+
+The dictionary will look like this:
+
+.. code:: python
+
+   {'low': 1,
+    'high': 5}
+
+No keys other than ``'low'`` and ``'high'`` are accepted here, except
+the optional ``'step'`` key that is discussed later.
+
+For a 3-gene problem, the next code creates a dictionary for each gene
+to restrict its values to a continuous range. For the first gene, it can
+take any floating-point value from the range that starts from 1
+(inclusive) and ends at 5 (exclusive).
+
+.. code:: python
+
+   gene_space = [{'low': 1, 'high': 5}, {'low': 0.3, 'high': 1.4}, {'low': -0.2, 'high': 4.5}]
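+
+To see what such a space means in practice, the next illustrative sketch
+(not PyGAD's internal code) samples a single random solution from a
+mixed ``gene_space`` where the first gene uses a discrete list and the
+second gene uses a continuous dictionary range:
+
+.. code:: python
+
+   import numpy
+
+   gene_space = [[0.4, 12, -5, 21.2],
+                 {'low': 1, 'high': 5}]
+
+   solution = []
+   for space in gene_space:
+       if isinstance(space, dict):
+           # Continuous range: any float in the half-open range [low, high).
+           solution.append(numpy.random.uniform(low=space['low'], high=space['high']))
+       else:
+           # Discrete set: pick one of the listed values.
+           solution.append(numpy.random.choice(space))
+   print(solution)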
+
+Stop at Any Generation
+======================
+
+In PyGAD 2.4.0, it is possible to stop the genetic algorithm after any
+generation. All you need to do is to return the string ``"stop"`` in the
+callback function ``on_generation``. When this callback function is
+implemented and assigned to the ``on_generation`` parameter in the
+constructor of the ``pygad.GA`` class, then the algorithm immediately
+stops after completing its current generation. Let's discuss an example.
+
+Assume that the user wants to stop the algorithm either after 100
+generations or if a condition is met. The user may assign a value of 100
+to the ``num_generations`` parameter of the ``pygad.GA`` class
+constructor.
+
+The condition that stops the algorithm is written in a callback function
+like the one in the next code. If the fitness value of the best solution
+reaches 70 or higher, then the string ``"stop"`` is returned.
+
+.. code:: python
+
+   def func_generation(ga_instance):
+       if ga_instance.best_solution()[1] >= 70:
+           return "stop"
+
+Stop Criteria
+=============
+
+In PyGAD 2.15.0, a new parameter named ``stop_criteria`` is added to the
+constructor of the ``pygad.GA`` class. It helps to stop the evolution
+based on some criteria. One or more criteria can be assigned.
+
+Each criterion is passed as a ``str`` that consists of 2 parts:
+
+1. Stop word.
+
+2. Number.
+
+It takes this form:
+
+.. code:: python
+
+   "word_num"
+
+The current 2 supported words are ``reach`` and ``saturate``.
+
+The ``reach`` word stops the ``run()`` method if the fitness value is
+equal to or greater than a given fitness value. An example for ``reach``
+is ``"reach_40"`` which stops the evolution if the fitness is >= 40.
+
+``saturate`` stops the evolution if the fitness saturates for a given
+number of consecutive generations. An example for ``saturate`` is
+``"saturate_7"`` which means stop the ``run()`` method if the fitness
+does not change for 7 consecutive generations.
+
+Here is an example that stops the evolution if either the fitness value
+reaches ``127.4`` or if the fitness saturates for ``15`` generations.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4, -2, 3.5, 8, 9, 4]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=200,
+                          sol_per_pop=10,
+                          num_parents_mating=4,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          stop_criteria=["reach_127.4", "saturate_15"])
+
+   ga_instance.run()
+   print("Number of generations passed is {generations_completed}".format(generations_completed=ga_instance.generations_completed))
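+
+For intuition, the ``saturate`` criterion behaves roughly like the
+following illustrative ``on_generation`` callback (a sketch, not PyGAD's
+implementation), which stops the run when the best fitness has not
+changed for 15 consecutive generations:
+
+.. code:: python
+
+   # Illustrative sketch of what "saturate_15" does; PyGAD implements
+   # this internally when stop_criteria=["saturate_15"] is used.
+   last_fitness = None
+   stagnant_generations = 0
+
+   def on_generation(ga_instance):
+       global last_fitness, stagnant_generations
+       best_fitness = ga_instance.best_solution()[1]
+       if best_fitness == last_fitness:
+           stagnant_generations += 1
+       else:
+           stagnant_generations = 0
+           last_fitness = best_fitness
+       if stagnant_generations >= 15:
+           return "stop"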
+
+Elitism Selection
+=================
+
+In PyGAD 2.18.0, a new parameter called ``keep_elitism`` is supported.
+It accepts an integer to define the number of elite solutions (i.e. best
+solutions) to keep in the next generation. This parameter defaults to
+``1`` which means only the best solution is kept in the next generation.
+
+In the next example, the ``keep_elitism`` parameter in the constructor
+of the ``pygad.GA`` class is set to 2. Thus, the best 2 solutions in
+each generation are kept in the next generation.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / numpy.abs(output - desired_output)
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=2,
+                          num_parents_mating=3,
+                          fitness_func=fitness_func,
+                          num_genes=6,
+                          sol_per_pop=5,
+                          keep_elitism=2)
+
+   ga_instance.run()
+
+The value passed to the ``keep_elitism`` parameter must satisfy 2
+conditions:
+
+1. It must be ``>= 0``.
+
+2. It must be ``<= sol_per_pop``. That is, its value cannot exceed the
+   number of solutions in the current population.
+
+In the previous example, if the ``keep_elitism`` parameter is set equal
+to the value passed to the ``sol_per_pop`` parameter, which is 5, then
+there will be no evolution at all, as the next figure shows. This is
+because all 5 solutions are kept as elites in the next generation and no
+offspring will be created.
+
+.. code:: python
+
+   ...
+
+   ga_instance = pygad.GA(...,
+                          sol_per_pop=5,
+                          keep_elitism=5)
+
+   ga_instance.run()
+
+.. figure:: https://user-images.githubusercontent.com/16560492/189273225-67ffad41-97ab-45e1-9324-429705e17b20.png
+   :alt: 
+
+Note that if the ``keep_elitism`` parameter is effective (i.e. is
+assigned a positive integer, not zero), then the ``keep_parents``
+parameter will have no effect. Because the default value of the
+``keep_elitism`` parameter is 1, the ``keep_parents`` parameter has no
+effect by default. The ``keep_parents`` parameter is only effective when
+``keep_elitism=0``.
+
+Random Seed
+===========
+
+In PyGAD 2.18.0, a new parameter called ``random_seed`` is supported.
+Its value is used as a seed for the random number generators.
+
+PyGAD uses random functions in these 2 libraries:
+
+1. NumPy
+
+2. random
+
+The ``random_seed`` parameter defaults to ``None`` which means no seed
+is used. As a result, different random numbers are generated for each
+run of PyGAD.
+
+If this parameter is assigned a proper seed, then the results will be
+reproducible. In the next example, the integer 2 is used as a random
+seed.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / numpy.abs(output - desired_output)
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=2,
+                          num_parents_mating=3,
+                          fitness_func=fitness_func,
+                          sol_per_pop=5,
+                          num_genes=6,
+                          random_seed=2)
+
+   ga_instance.run()
+   best_solution, best_solution_fitness, best_match_idx = ga_instance.best_solution()
+   print(best_solution)
+   print(best_solution_fitness)
+
+This is the best solution found and its fitness value.
+
+.. code::
+
+   [ 2.77249188 -4.06570662  0.04196872 -3.47770796 -0.57502138 -3.22775267]
+   0.04872203136549972
+
+After running the code again, it will find the same result.
+
+.. code::
+
+   [ 2.77249188 -4.06570662  0.04196872 -3.47770796 -0.57502138 -3.22775267]
+   0.04872203136549972
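+
+As a quick sanity check of reproducibility, two instances created with
+the same seed should start from identical initial populations. This is
+an illustrative sketch reusing the imports and ``fitness_func`` from the
+previous example:
+
+.. code:: python
+
+   # Both instances share random_seed=2, so their randomly created
+   # initial populations are expected to be identical.
+   ga_instance1 = pygad.GA(num_generations=2,
+                           num_parents_mating=3,
+                           fitness_func=fitness_func,
+                           sol_per_pop=5,
+                           num_genes=6,
+                           random_seed=2)
+   ga_instance2 = pygad.GA(num_generations=2,
+                           num_parents_mating=3,
+                           fitness_func=fitness_func,
+                           sol_per_pop=5,
+                           num_genes=6,
+                           random_seed=2)
+
+   # Expected to print True.
+   print(numpy.array_equal(ga_instance1.initial_population,
+                           ga_instance2.initial_population))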
+
+Continue without Losing Progress
+================================
+
+In PyGAD 2.18.0, and thanks to Felix Bernhard for opening a GitHub
+issue, the values of these 4 instance attributes are no longer reset
+after each call to the ``run()`` method.
+
+1. ``self.best_solutions``
+
+2. ``self.best_solutions_fitness``
+
+3. ``self.solutions``
+
+4. ``self.solutions_fitness``
+
+This helps the user to continue where the last run stopped without
+losing the values of these 4 attributes.
+
+Now, the user can save the model by calling the ``save()`` method.
+
+.. code:: python
+
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       ...
+       return fitness
+
+   ga_instance = pygad.GA(...)
+
+   ga_instance.run()
+
+   ga_instance.plot_fitness()
+
+   ga_instance.save("pygad_GA")
+
+Then the saved model is loaded by calling the ``load()`` function. After
+calling the ``run()`` method over the loaded instance, the data in the
+previous 4 attributes is not reset but extended with the new data.
+
+.. code:: python
+
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       ...
+       return fitness
+
+   loaded_ga_instance = pygad.load("pygad_GA")
+
+   loaded_ga_instance.run()
+
+   loaded_ga_instance.plot_fitness()
+
+The plot created by the ``plot_fitness()`` method will show the data
+collected from both runs.
+
+Note that the 2 attributes (``self.best_solutions`` and
+``self.best_solutions_fitness``) only work if the
+``save_best_solutions`` parameter is set to ``True``. Also, the 2
+attributes (``self.solutions`` and ``self.solutions_fitness``) only work
+if the ``save_solutions`` parameter is ``True``.
+
+Prevent Duplicates in Gene Values
+=================================
+
+In PyGAD 2.13.0, a new bool parameter called ``allow_duplicate_genes``
+is supported to control whether duplicate gene values are allowed in the
+chromosome or not. In other words, it controls whether 2 or more genes
+might have the same exact value.
+
+If ``allow_duplicate_genes=True`` (which is the default case), genes may
+have the same value. If ``allow_duplicate_genes=False``, then no 2 genes
+will have the same value, given that there are enough unique values for
+the genes.
+
+The next code gives an example to use the ``allow_duplicate_genes``
+parameter. A callback generation function is implemented to print the
+population after each generation.
+
+.. code:: python
+
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       return 0
+
+   def on_generation(ga):
+       print("Generation", ga.generations_completed)
+       print(ga.population)
+
+   ga_instance = pygad.GA(num_generations=5,
+                          sol_per_pop=5,
+                          num_genes=4,
+                          mutation_num_genes=3,
+                          random_mutation_min_val=-5,
+                          random_mutation_max_val=5,
+                          num_parents_mating=2,
+                          fitness_func=fitness_func,
+                          gene_type=int,
+                          on_generation=on_generation,
+                          allow_duplicate_genes=False)
+   ga_instance.run()
+
+Here are the populations after each of the 5 generations. Note how no
+solution has duplicate gene values.
+
+.. code:: python
+
+   Generation 1
+   [[ 2 -2 -3  3]
+    [ 0  1  2  3]
+    [ 5 -3  6  3]
+    [-3  1 -2  4]
+    [-1  0 -2  3]]
+   Generation 2
+   [[-1  0 -2  3]
+    [-3  1 -2  4]
+    [ 0 -3 -2  6]
+    [-3  0 -2  3]
+    [ 1 -4  2  4]]
+   Generation 3
+   [[ 1 -4  2  4]
+    [-3  0 -2  3]
+    [ 4  0 -2  1]
+    [-4  0 -2 -3]
+    [-4  2  0  3]]
+   Generation 4
+   [[-4  2  0  3]
+    [-4  0 -2 -3]
+    [-2  5  4 -3]
+    [-1  2 -4  4]
+    [-4  2  0 -3]]
+   Generation 5
+   [[-4  2  0 -3]
+    [-1  2 -4  4]
+    [ 3  4 -4  0]
+    [-1  0  2 -2]
+    [-4  2 -1  1]]
+
+The ``allow_duplicate_genes`` parameter can also be used with the
+``gene_space`` parameter. Here is an example where each of the 4 genes
+has the same space of values that consists of 4 values (1, 2, 3, and 4).
+
+.. code:: python
+
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       return 0
+
+   def on_generation(ga):
+       print("Generation", ga.generations_completed)
+       print(ga.population)
+
+   ga_instance = pygad.GA(num_generations=5,
+                          sol_per_pop=5,
+                          num_genes=4,
+                          num_parents_mating=2,
+                          fitness_func=fitness_func,
+                          gene_type=int,
+                          gene_space=[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]],
+                          on_generation=on_generation,
+                          allow_duplicate_genes=False)
+   ga_instance.run()
+
+Even though all the genes share the same space of values, no 2 genes in
+a solution have duplicate values, as the next output shows.
+
+.. code:: python
+
+   Generation 1
+   [[2 3 1 4]
+    [2 3 1 4]
+    [2 4 1 3]
+    [2 3 1 4]
+    [1 3 2 4]]
+   Generation 2
+   [[1 3 2 4]
+    [2 3 1 4]
+    [1 3 2 4]
+    [2 3 4 1]
+    [1 3 4 2]]
+   Generation 3
+   [[1 3 4 2]
+    [2 3 4 1]
+    [1 3 4 2]
+    [3 1 4 2]
+    [3 2 4 1]]
+   Generation 4
+   [[3 2 4 1]
+    [3 1 4 2]
+    [3 2 4 1]
+    [1 2 4 3]
+    [1 3 4 2]]
+   Generation 5
+   [[1 3 4 2]
+    [1 2 4 3]
+    [2 1 4 3]
+    [1 2 4 3]
+    [1 2 4 3]]
+
+Take care to provide enough values for the genes so that PyGAD is able
+to find an alternative for a gene value in case it duplicates with
+another gene.
+
+There might be 2 duplicate genes where changing either of the 2
+duplicating genes will not solve the problem. For example, if
+``gene_space=[[3, 0, 1], [4, 1, 2], [0, 2], [3, 2, 0]]`` and the
+solution is ``[3 2 0 0]``, then the values of the last 2 genes
+duplicate. There are no possible changes in the last 2 genes to solve
+the problem.
+
+This problem could be solved by randomly changing one of the
+non-duplicating genes to make room for a unique value in one of the 2
+duplicating genes. For example, by changing the second gene from 2 to 4,
+any of the last 2 genes can take the value 2 and solve the duplicates.
+The resultant solution is then ``[3 4 2 0]``. But this option is not yet
+supported in PyGAD.
+
+User-Defined Crossover, Mutation, and Parent Selection Operators
+================================================================
+
+Previously, the user could select the type of the crossover, mutation,
+and parent selection operators by assigning the name of the operator to
+the following parameters of the ``pygad.GA`` class's constructor:
+
+1. ``crossover_type``
+
+2. ``mutation_type``
+
+3. ``parent_selection_type``
+
+This way, the user can only use the built-in functions for each of these
+operators.
+
+Starting from PyGAD 2.16.0, the user can create custom crossover,
+mutation, and parent selection operators and assign these functions to
+the above parameters. Thus, a new operator can be plugged easily into
+the PyGAD lifecycle.
+
+This is a sample code that does not use any custom function.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4,-2,3.5]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func)
+
+   ga_instance.run()
+   ga_instance.plot_fitness()
+
+This section describes the expected input parameters and outputs. For
+simplicity, all of these custom functions accept the instance of the
+``pygad.GA`` class as the last parameter.
+
+User-Defined Crossover Operator
+-------------------------------
+
+The user-defined crossover function is a Python function that accepts 3
+parameters:
+
+1. The selected parents.
+
+2. The size of the offspring as a tuple of 2 numbers: (the offspring
+   size, number of genes).
+
+3. The instance from the ``pygad.GA`` class. This instance helps to
+   retrieve any property like ``population``, ``gene_type``,
+   ``gene_space``, etc.
+
+This function should return a NumPy array of shape equal to the value
+passed to the second parameter.
+
+The next code creates a template for the user-defined crossover
+operator. You can use any names for the parameters. Note how a NumPy
+array is returned.
+
+.. 
code:: python + + def crossover_func(parents, offspring_size, ga_instance): + offspring = ... + ... + return numpy.array(offspring) + +As an example, the next code creates a single-point crossover function. +By randomly generating a random point (i.e. index of a gene), the +function simply uses 2 parents to produce an offspring by copying the +genes before the point from the first parent and the remaining from the +second parent. + +.. code:: python + + def crossover_func(parents, offspring_size, ga_instance): + offspring = [] + idx = 0 + while len(offspring) != offspring_size[0]: + parent1 = parents[idx % parents.shape[0], :].copy() + parent2 = parents[(idx + 1) % parents.shape[0], :].copy() + + random_split_point = numpy.random.choice(range(offspring_size[1])) + + parent1[random_split_point:] = parent2[random_split_point:] + + offspring.append(parent1) + + idx += 1 + + return numpy.array(offspring) + +To use this user-defined function, simply assign its name to the +``crossover_type`` parameter in the constructor of the ``pygad.GA`` +class. The next code gives an example. In this case, the custom function +will be called in each generation rather than calling the built-in +crossover functions defined in PyGAD. + +.. code:: python + + ga_instance = pygad.GA(num_generations=10, + sol_per_pop=5, + num_parents_mating=2, + num_genes=len(equation_inputs), + fitness_func=fitness_func, + crossover_type=crossover_func) + +User-Defined Mutation Operator +------------------------------ + +A user-defined mutation function/operator can be created the same way a +custom crossover operator/function is created. Simply, it is a Python +function that accepts 2 parameters: + +1. The offspring to be mutated. + +2. The instance from the ``pygad.GA`` class. This instance helps to + retrieve any property like ``population``, ``gene_type``, + ``gene_space``, etc. + +The template for the user-defined mutation function is given in the next +code. According to the user preference, the function should make some +random changes to the genes. + +.. code:: python + + def mutation_func(offspring, ga_instance): + ... + return offspring + +The next code builds the random mutation where a single gene from each +chromosome is mutated by adding a random number between 0 and 1 to the +gene's value. + +.. code:: python + + def mutation_func(offspring, ga_instance): + + for chromosome_idx in range(offspring.shape[0]): + random_gene_idx = numpy.random.choice(range(offspring.shape[1])) + + offspring[chromosome_idx, random_gene_idx] += numpy.random.random() + + return offspring + +Here is how this function is assigned to the ``mutation_type`` +parameter. + +.. code:: python + + ga_instance = pygad.GA(num_generations=10, + sol_per_pop=5, + num_parents_mating=2, + num_genes=len(equation_inputs), + fitness_func=fitness_func, + crossover_type=crossover_func, + mutation_type=mutation_func) + +Note that there are other things to take into consideration like: + +- Making sure that each gene conforms to the data type(s) listed in the + ``gene_type`` parameter. + +- If the ``gene_space`` parameter is used, then the new value for the + gene should conform to the values/ranges listed. + +- Mutating a number of genes that conforms to the parameters + ``mutation_percent_genes``, ``mutation_probability``, and + ``mutation_num_genes``. + +- Whether mutation happens with or without replacement based on the + ``mutation_by_replacement`` parameter. 
+
+- The minimum and maximum values from which a random value is generated
+  based on the ``random_mutation_min_val`` and
+  ``random_mutation_max_val`` parameters.
+
+- Whether duplicates are allowed or not in the chromosome based on the
+  ``allow_duplicate_genes`` parameter.
+
+and more.
+
+It all depends on your objective from building the mutation function.
+You may neglect or consider some of the considerations according to your
+objective. As an example, the ``gene_space`` consideration is handled in
+the sketch below.
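+
+If the genes must stay inside a discrete ``gene_space``, a custom
+mutation could resample the mutated gene from its own space instead of
+adding a random offset. This is an illustrative sketch under the
+assumption that ``gene_space`` is a nested list with one list of allowed
+values per gene:
+
+.. code:: python
+
+   import numpy
+
+   def mutation_func(offspring, ga_instance):
+       for chromosome_idx in range(offspring.shape[0]):
+           # Pick one random gene per chromosome.
+           random_gene_idx = numpy.random.choice(range(offspring.shape[1]))
+           # Resample the gene from its own discrete space.
+           allowed_values = ga_instance.gene_space[random_gene_idx]
+           offspring[chromosome_idx, random_gene_idx] = numpy.random.choice(allowed_values)
+       return offspring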
+
+User-Defined Parent Selection Operator
+--------------------------------------
+
+Not much needs to be said about building a user-defined parent selection
+function, as things are similar to building a crossover or mutation
+function. Just create a Python function that accepts 3 parameters:
+
+1. The fitness values of the current population.
+
+2. The number of parents needed.
+
+3. The instance from the ``pygad.GA`` class. This instance helps to
+   retrieve any property like ``population``, ``gene_type``,
+   ``gene_space``, etc.
+
+The function should return 2 outputs:
+
+1. The selected parents as a NumPy array. Its shape is equal to (the
+   number of selected parents, ``num_genes``). Note that the number of
+   selected parents is equal to the value assigned to the second input
+   parameter.
+
+2. The indices of the selected parents inside the population. It is a 1D
+   array with length equal to the number of selected parents.
+
+The outputs must be of type ``numpy.ndarray``.
+
+Here is a template for building a custom parent selection function.
+
+.. code:: python
+
+   def parent_selection_func(fitness, num_parents, ga_instance):
+       ...
+       return parents, fitness_sorted[:num_parents]
+
+The next code builds the steady-state parent selection where the best
+parents are selected. The number of parents is equal to the value in the
+``num_parents`` parameter.
+
+.. code:: python
+
+   def parent_selection_func(fitness, num_parents, ga_instance):
+
+       fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
+       fitness_sorted.reverse()
+
+       parents = numpy.empty((num_parents, ga_instance.population.shape[1]))
+
+       for parent_num in range(num_parents):
+           parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()
+
+       return parents, numpy.array(fitness_sorted[:num_parents])
+
+Finally, the defined function is assigned to the
+``parent_selection_type`` parameter as in the next code.
+
+.. code:: python
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          crossover_type=crossover_func,
+                          mutation_type=mutation_func,
+                          parent_selection_type=parent_selection_func)
+
+Example
+-------
+
+After discussing how to customize the 3 operators, the next code uses
+the previous 3 user-defined functions instead of the built-in functions.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4,-2,3.5]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+
+       return fitness
+
+   def parent_selection_func(fitness, num_parents, ga_instance):
+
+       fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
+       fitness_sorted.reverse()
+
+       parents = numpy.empty((num_parents, ga_instance.population.shape[1]))
+
+       for parent_num in range(num_parents):
+           parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()
+
+       return parents, numpy.array(fitness_sorted[:num_parents])
+
+   def crossover_func(parents, offspring_size, ga_instance):
+
+       offspring = []
+       idx = 0
+       while len(offspring) != offspring_size[0]:
+           parent1 = parents[idx % parents.shape[0], :].copy()
+           parent2 = parents[(idx + 1) % parents.shape[0], :].copy()
+
+           random_split_point = numpy.random.choice(range(offspring_size[1]))
+
+           parent1[random_split_point:] = parent2[random_split_point:]
+
+           offspring.append(parent1)
+
+           idx += 1
+
+       return numpy.array(offspring)
+
+   def mutation_func(offspring, ga_instance):
+
+       for chromosome_idx in range(offspring.shape[0]):
+           # Select a random gene index (the second axis holds the genes).
+           random_gene_idx = numpy.random.choice(range(offspring.shape[1]))
+
+           offspring[chromosome_idx, random_gene_idx] += numpy.random.random()
+
+       return offspring
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          crossover_type=crossover_func,
+                          mutation_type=mutation_func,
+                          parent_selection_type=parent_selection_func)
+
+   ga_instance.run()
+   ga_instance.plot_fitness()
+
+This is the same example but using methods instead of functions.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4,-2,3.5]
+   desired_output = 44
+
+   class Test:
+       def fitness_func(self, ga_instance, solution, solution_idx):
+           output = numpy.sum(solution * equation_inputs)
+
+           fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+
+           return fitness
+
+       def parent_selection_func(self, fitness, num_parents, ga_instance):
+
+           fitness_sorted = sorted(range(len(fitness)), key=lambda k: fitness[k])
+           fitness_sorted.reverse()
+
+           parents = numpy.empty((num_parents, ga_instance.population.shape[1]))
+
+           for parent_num in range(num_parents):
+               parents[parent_num, :] = ga_instance.population[fitness_sorted[parent_num], :].copy()
+
+           return parents, numpy.array(fitness_sorted[:num_parents])
+
+       def crossover_func(self, parents, offspring_size, ga_instance):
+
+           offspring = []
+           idx = 0
+           while len(offspring) != offspring_size[0]:
+               parent1 = parents[idx % parents.shape[0], :].copy()
+               parent2 = parents[(idx + 1) % parents.shape[0], :].copy()
+
+               # The split point is a random gene index (offspring_size[1] is the number of genes).
+               random_split_point = numpy.random.choice(range(offspring_size[1]))
+
+               parent1[random_split_point:] = parent2[random_split_point:]
+
+               offspring.append(parent1)
+
+               idx += 1
+
+           return numpy.array(offspring)
+
+       def mutation_func(self, offspring, ga_instance):
+
+           for chromosome_idx in range(offspring.shape[0]):
+               random_gene_idx = numpy.random.choice(range(offspring.shape[1]))
+
+               offspring[chromosome_idx, random_gene_idx] += numpy.random.random()
+
+           return offspring
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=Test().fitness_func,
+                          parent_selection_type=Test().parent_selection_func,
+                          crossover_type=Test().crossover_func,
+                          mutation_type=Test().mutation_func)
+
+   ga_instance.run()
+   ga_instance.plot_fitness()
+
+.. _more-about-the-genespace-parameter:
+
+More about the ``gene_space`` Parameter
+=======================================
+
+The ``gene_space`` parameter customizes the space of values of each
+gene.
+
+Assuming that all genes have the same global space which includes the
+values 0.3, 5.2, -4, and 8, then those values can be assigned to the
+``gene_space`` parameter as a list, tuple, or range. Here is a list
+assigned to this parameter. By doing that, the gene values are
+restricted to those assigned to the ``gene_space`` parameter.
+
+.. code:: python
+
+   gene_space = [0.3, 5.2, -4, 8]
+
+If some genes have different spaces, then ``gene_space`` should accept a
+nested list or tuple. In this case, the elements could be:
+
+1. A number (of ``int``, ``float``, or NumPy data types): A single
+   value to be assigned to the gene. This means this gene will have the
+   same value across all generations.
+
+2. ``list``, ``tuple``, ``numpy.ndarray``, or any range like ``range``,
+   ``numpy.arange()``, or ``numpy.linspace``: It holds the space for
+   each individual gene. But this space is usually discrete. That is,
+   there is a finite set of values to select from.
+
+3. ``dict``: To sample a value for a gene from a continuous range. The
+   dictionary must have 2 mandatory keys which are ``"low"`` and
+   ``"high"`` in addition to an optional key which is ``"step"``. A
+   random value is returned between the values assigned to the items
+   with ``"low"`` and ``"high"`` keys. If the ``"step"`` key exists,
+   then the range is discretized with that step (i.e. it works like the
+   previous discrete options).
+
+4. ``None``: A gene with its space set to ``None`` is initialized
+   randomly from the range specified by the 2 parameters
+   ``init_range_low`` and ``init_range_high``. For mutation, its value
+   is mutated based on a random value from the range specified by the 2
+   parameters ``random_mutation_min_val`` and
+   ``random_mutation_max_val``. If all elements in the ``gene_space``
+   parameter are ``None``, the parameter will not have any effect.
+
+Assume that a chromosome has 2 genes, each with a different value space.
+Then the ``gene_space`` could be assigned a nested list/tuple where each
+element determines the space of a gene.
+
+According to the next code, the space of the first gene is ``[0.4, -5]``
+which has 2 values and the space for the second gene is
+``[0.5, -3.2, 8.2, -9]`` which has 4 values.
+
+.. code:: python
+
+   gene_space = [[0.4, -5], [0.5, -3.2, 8.2, -9]]
+
+For a 2-gene chromosome, if the first gene space is restricted to the
+discrete values from 0 to 4 and the second gene is restricted to the
+values from 10 to 19, then it could be specified according to the next
+code.
+
+.. code:: python
+
+   gene_space = [range(5), range(10, 20)]
+
+The ``gene_space`` can also be assigned a single range, as given below,
+where the values of all genes are sampled from the same range.
+
+.. code:: python
+
+   gene_space = numpy.arange(15)
+
+The ``gene_space`` can be assigned a dictionary to sample a value from a
+continuous range.
+
+.. code:: python
+
+   gene_space = {"low": 4, "high": 30}
+
+A step also can be assigned to the dictionary. This works as if a range
+is used.
+
+.. code:: python
+
+   gene_space = {"low": 4, "high": 30, "step": 2.5}
+
+If ``None`` is assigned to only a single gene, then its value will be
+randomly generated initially using the ``init_range_low`` and
+``init_range_high`` parameters in the ``pygad.GA`` class's constructor.
+During mutation, its value is sampled from the range defined by the 2
+parameters ``random_mutation_min_val`` and ``random_mutation_max_val``.
+This is an example where the second gene is given a ``None`` value.
+
+.. code:: python
+
+   gene_space = [range(5), None, numpy.linspace(10, 20, 300)]
+
+If the user did not assign the initial population to the
+``initial_population`` parameter, the initial population is created
+randomly based on the ``gene_space`` parameter. Moreover, the mutation
+is applied based on this parameter.
+
+.. _more-about-the-genetype-parameter:
+
+More about the ``gene_type`` Parameter
+======================================
+
+The ``gene_type`` parameter allows the user to control the data type for
+all genes at once or for each individual gene. In PyGAD 2.15.0, the
+``gene_type`` parameter also supports customizing the precision for
+``float`` data types. As a result, the ``gene_type`` parameter helps to:
+
+1. Select a data type for all genes with or without precision.
+
+2. Select a data type for each individual gene with or without
+   precision.
+
+Let's discuss things with examples.
+
+Data Type for All Genes without Precision
+-----------------------------------------
+
+The data type for all genes can be specified by assigning the numeric
+data type directly to the ``gene_type`` parameter. This is an example to
+make all genes of the ``int`` data type.
+
+.. code:: python
+
+   gene_type=int
+
+Given that the supported numeric data types of PyGAD include Python's
+``int`` and ``float`` in addition to all numeric types of NumPy, any of
+these types can be assigned to the ``gene_type`` parameter.
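+
+For instance, assuming a NumPy type is preferred, the next line makes
+all genes 8-bit integers the same way.
+
+.. code:: python
+
+   gene_type=numpy.int8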
+
+If no precision is specified for a ``float`` data type, then the
+complete floating-point number is kept.
+
+The next code uses an ``int`` data type for all genes where the genes in
+the initial and final population are only integers.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4, -2, 3.5, 8, -2]
+   desired_output = 2671.1234
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          gene_type=int)
+
+   print("Initial Population")
+   print(ga_instance.initial_population)
+
+   ga_instance.run()
+
+   print("Final Population")
+   print(ga_instance.population)
+
+.. code:: python
+
+   Initial Population
+   [[ 1 -1  2  0 -3]
+    [ 0 -2  0 -3 -1]
+    [ 0 -1 -1  2  0]
+    [-2  3 -2  3  3]
+    [ 0  0  2 -2 -2]]
+
+   Final Population
+   [[ 1 -1  2  2  0]
+    [ 1 -1  2  2  0]
+    [ 1 -1  2  2  0]
+    [ 1 -1  2  2  0]
+    [ 1 -1  2  2  0]]
+
+Data Type for All Genes with Precision
+--------------------------------------
+
+A precision can only be specified for a ``float`` data type and cannot
+be specified for integers. Here is an example to use a precision of 3
+for the ``float`` data type. In this case, all genes are of type
+``float`` and their maximum precision is 3.
+
+.. code:: python
+
+   gene_type=[float, 3]
+
+The next code prints the initial and final population where the genes
+are of type ``float`` with precision 3.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   equation_inputs = [4, -2, 3.5, 8, -2]
+   desired_output = 2671.1234
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=5,
+                          num_parents_mating=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          gene_type=[float, 3])
+
+   print("Initial Population")
+   print(ga_instance.initial_population)
+
+   ga_instance.run()
+
+   print("Final Population")
+   print(ga_instance.population)
+
+.. code:: python
+
+   Initial Population
+   [[-2.417 -0.487  3.623  2.457 -2.362]
+    [-1.231  0.079 -1.63   1.629 -2.637]
+    [ 0.692 -2.098  0.705  0.914 -3.633]
+    [ 2.637 -1.339 -1.107 -0.781 -3.896]
+    [-1.495  1.378 -1.026  3.522  2.379]]
+
+   Final Population
+   [[ 1.714 -1.024  3.623  3.185 -2.362]
+    [ 0.692 -1.024  3.623  3.185 -2.362]
+    [ 0.692 -1.024  3.623  3.375 -2.362]
+    [ 0.692 -1.024  4.041  3.185 -2.362]
+    [ 1.714 -0.644  3.623  3.185 -2.362]]
+
+Data Type for each Individual Gene without Precision
+----------------------------------------------------
+
+In PyGAD 2.14.0, the ``gene_type`` parameter allows customizing the gene
+type for each individual gene. This is by using a
+``list``/``tuple``/``numpy.ndarray`` with a number of elements equal to
+the number of genes. For each element, a type is specified for the
+corresponding gene.
+
+This is an example for a 5-gene problem where different types are
+assigned to the genes.
+
+.. code:: python
+
+   gene_type=[int, float, numpy.float16, numpy.int8, float]
+
+This is a complete example that prints the initial and final populations
+for a custom gene data type.
+
+.. 
code:: python + + import pygad + import numpy + + equation_inputs = [4, -2, 3.5, 8, -2] + desired_output = 2671.1234 + + def fitness_func(ga_instance, solution, solution_idx): + output = numpy.sum(solution * equation_inputs) + fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001) + return fitness + + ga_instance = pygad.GA(num_generations=10, + sol_per_pop=5, + num_parents_mating=2, + num_genes=len(equation_inputs), + fitness_func=fitness_func, + gene_type=[int, float, numpy.float16, numpy.int8, float]) + + print("Initial Population") + print(ga_instance.initial_population) + + ga_instance.run() + + print("Final Population") + print(ga_instance.population) + +.. code:: python + + Initial Population + [[0 0.8615522360026828 0.7021484375 -2 3.5301821368185866] + [-3 2.648189378595294 -3.830078125 1 -0.9586271572917742] + [3 3.7729827570110714 1.2529296875 -3 1.395741994211889] + [0 1.0490687178053282 1.51953125 -2 0.7243617940450235] + [0 -0.6550158436937226 -2.861328125 -2 1.8212734549263097]] + + Final Population + [[3 3.7729827570110714 2.055 0 0.7243617940450235] + [3 3.7729827570110714 1.458 0 -0.14638754050305036] + [3 3.7729827570110714 1.458 0 0.0869406120516778] + [3 3.7729827570110714 1.458 0 0.7243617940450235] + [3 3.7729827570110714 1.458 0 -0.14638754050305036]] + +Data Type for each Individual Gene with Precision +------------------------------------------------- + +The precision can also be specified for the ``float`` data types as in +the next line where the second gene precision is 2 and last gene +precision is 1. + +.. code:: python + + gene_type=[int, [float, 2], numpy.float16, numpy.int8, [float, 1]] + +This is a complete example where the initial and final populations are +printed where the genes comply with the data types and precisions +specified. + +.. code:: python + + import pygad + import numpy + + equation_inputs = [4, -2, 3.5, 8, -2] + desired_output = 2671.1234 + + def fitness_func(ga_instance, solution, solution_idx): + output = numpy.sum(solution * equation_inputs) + fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001) + return fitness + + ga_instance = pygad.GA(num_generations=10, + sol_per_pop=5, + num_parents_mating=2, + num_genes=len(equation_inputs), + fitness_func=fitness_func, + gene_type=[int, [float, 2], numpy.float16, numpy.int8, [float, 1]]) + + print("Initial Population") + print(ga_instance.initial_population) + + ga_instance.run() + + print("Final Population") + print(ga_instance.population) + +.. code:: python + + Initial Population + [[-2 -1.22 1.716796875 -1 0.2] + [-1 -1.58 -3.091796875 0 -1.3] + [3 3.35 -0.107421875 1 -3.3] + [-2 -3.58 -1.779296875 0 0.6] + [2 -3.73 2.65234375 3 -0.5]] + + Final Population + [[2 -4.22 3.47 3 -1.3] + [2 -3.73 3.47 3 -1.3] + [2 -4.22 3.47 2 -1.3] + [2 -4.58 3.47 3 -1.3] + [2 -3.73 3.47 3 -1.3]] + +Visualization in PyGAD +====================== + +This section discusses the different options to visualize the results in +PyGAD through these methods: + +1. ``plot_fitness()`` + +2. ``plot_genes()`` + +3. ``plot_new_solution_rate()`` + +In the following code, the ``save_solutions`` flag is set to ``True`` +which means all solutions are saved in the ``solutions`` attribute. The +code runs for only 10 generations. + +.. 
code:: python + + import pygad + import numpy + + equation_inputs = [4, -2, 3.5, 8, -2, 3.5, 8] + desired_output = 2671.1234 + + def fitness_func(ga_instance, solution, solution_idx): + output = numpy.sum(solution * equation_inputs) + fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001) + return fitness + + ga_instance = pygad.GA(num_generations=10, + sol_per_pop=10, + num_parents_mating=5, + num_genes=len(equation_inputs), + fitness_func=fitness_func, + gene_space=[range(1, 10), range(10, 20), range(15, 30), range(20, 40), range(25, 50), range(10, 30), range(20, 50)], + gene_type=int, + save_solutions=True) + + ga_instance.run() + +Let's explore how to visualize the results by the above mentioned +methods. + +.. _plotfitness-2: + +``plot_fitness()`` +------------------ + +The ``plot_fitness()`` method shows the fitness value for each +generation. + +.. _plottypeplot: + +``plot_type="plot"`` +~~~~~~~~~~~~~~~~~~~~ + +The simplest way to call this method is as follows leaving the +``plot_type`` with its default value ``"plot"`` to create a continuous +line connecting the fitness values across all generations: + +.. code:: python + + ga_instance.plot_fitness() + # ga_instance.plot_fitness(plot_type="plot") + +.. figure:: https://user-images.githubusercontent.com/16560492/122472609-d02f5280-cf8e-11eb-88a7-f9366ff6e7c6.png + :alt: + +.. _plottypescatter: + +``plot_type="scatter"`` +~~~~~~~~~~~~~~~~~~~~~~~ + +The ``plot_type`` can also be set to ``"scatter"`` to create a scatter +graph with each individual fitness represented as a dot. The size of +these dots can be changed using the ``linewidth`` parameter. + +.. code:: python + + ga_instance.plot_fitness(plot_type="scatter") + +.. figure:: https://user-images.githubusercontent.com/16560492/122473159-75e2c180-cf8f-11eb-942d-31279b286dbd.png + :alt: + +.. _plottypebar: + +``plot_type="bar"`` +~~~~~~~~~~~~~~~~~~~ + +The third value for the ``plot_type`` parameter is ``"bar"`` to create a +bar graph with each individual fitness represented as a bar. + +.. code:: python + + ga_instance.plot_fitness(plot_type="bar") + +.. figure:: https://user-images.githubusercontent.com/16560492/122473340-b7736c80-cf8f-11eb-89c5-4f7db3b653cc.png + :alt: + +.. _plotnewsolutionrate-2: + +``plot_new_solution_rate()`` +---------------------------- + +The ``plot_new_solution_rate()`` method presents the number of new +solutions explored in each generation. This helps to figure out if the +genetic algorithm is able to find new solutions as an indication of more +possible evolution. If no new solutions are explored, this is an +indication that no further evolution is possible. + +The ``plot_new_solution_rate()`` method accepts the same parameters as +in the ``plot_fitness()`` method with 3 possible values for +``plot_type`` parameter. + +.. _plottypeplot-2: + +``plot_type="plot"`` +~~~~~~~~~~~~~~~~~~~~ + +The default value for the ``plot_type`` parameter is ``"plot"``. + +.. code:: python + + ga_instance.plot_new_solution_rate() + # ga_instance.plot_new_solution_rate(plot_type="plot") + +The next figure shows that, for example, generation 6 has the least +number of new solutions which is 4. The number of new solutions in the +first generation is always equal to the number of solutions in the +population (i.e. the value assigned to the ``sol_per_pop`` parameter in +the constructor of the ``pygad.GA`` class) which is 10 in this example. + +.. 
figure:: https://user-images.githubusercontent.com/16560492/122475815-3322e880-cf93-11eb-9648-bf66f823234b.png
+   :alt: 
+
+.. _plottypescatter-2:
+
+``plot_type="scatter"``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The previous graph can be represented as scattered points by setting
+``plot_type="scatter"``.
+
+.. code:: python
+
+   ga_instance.plot_new_solution_rate(plot_type="scatter")
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122476108-adec0380-cf93-11eb-80ac-7588bf90492f.png
+   :alt: 
+
+.. _plottypebar-2:
+
+``plot_type="bar"``
+~~~~~~~~~~~~~~~~~~~
+
+By setting ``plot_type="bar"``, each value is represented as a vertical
+bar.
+
+.. code:: python
+
+   ga_instance.plot_new_solution_rate(plot_type="bar")
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122476173-c2c89700-cf93-11eb-9e77-d39737cd3a96.png
+   :alt: 
+
+.. _plotgenes-2:
+
+``plot_genes()``
+----------------
+
+The ``plot_genes()`` method is the third option to visualize the PyGAD
+results. This method has 3 control parameters:
+
+1. ``graph_type="plot"``: Can be ``"plot"`` (default), ``"boxplot"``, or
+   ``"histogram"``.
+
+2. ``plot_type="plot"``: Identical to the ``plot_type`` parameter
+   explored in the ``plot_fitness()`` and ``plot_new_solution_rate()``
+   methods.
+
+3. ``solutions="all"``: Can be ``"all"`` (default) or ``"best"``.
+
+These 3 parameters control the style of the output figure.
+
+The ``graph_type`` parameter selects the type of the graph which helps
+to explore the gene values as:
+
+1. A normal plot.
+
+2. A histogram.
+
+3. A box and whisker plot.
+
+The ``plot_type`` parameter works only when the type of the graph is set
+to ``"plot"``.
+
+The ``solutions`` parameter selects whether the genes come from all
+solutions in the population or from just the best solutions.
+
+.. _graphtypeplot:
+
+``graph_type="plot"``
+~~~~~~~~~~~~~~~~~~~~~
+
+When ``graph_type="plot"``, the figure creates a normal graph where the
+relationship between the gene values and the generation numbers is
+represented as a continuous plot, scattered points, or bars.
+
+.. _plottypeplot-3:
+
+``plot_type="plot"``
+^^^^^^^^^^^^^^^^^^^^
+
+Because the default value for both ``graph_type`` and ``plot_type`` is
+``"plot"``, all of the calls below create the same figure. This figure
+is helpful to know whether a gene value lasts for more generations as an
+indication of the best value for this gene. For example, the value 16
+for the gene with index 5 (at column 2 and row 2 of the next graph)
+lasted for 83 generations.
+
+.. code:: python
+
+   ga_instance.plot_genes()
+
+   ga_instance.plot_genes(graph_type="plot")
+
+   ga_instance.plot_genes(plot_type="plot")
+
+   ga_instance.plot_genes(graph_type="plot",
+                          plot_type="plot")
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122477158-4a62d580-cf95-11eb-8c93-9b6e74cb814c.png
+   :alt: 
+
+As the default value for the ``solutions`` parameter is ``"all"``, the
+following method calls generate the same plot.
+
+.. code:: python
+
+   ga_instance.plot_genes(solutions="all")
+
+   ga_instance.plot_genes(graph_type="plot",
+                          solutions="all")
+
+   ga_instance.plot_genes(plot_type="plot",
+                          solutions="all")
+
+   ga_instance.plot_genes(graph_type="plot",
+                          plot_type="plot",
+                          solutions="all")
+
+.. _plottypescatter-3:
+
+``plot_type="scatter"``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The following calls of the ``plot_genes()`` method create the same
+scatter plot.
+
+.. code:: python
+
+   ga_instance.plot_genes(plot_type="scatter")
+
+   ga_instance.plot_genes(graph_type="plot",
+                          plot_type="scatter",
+                          solutions='all')
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122477273-73836600-cf95-11eb-828f-f357c7b0f815.png
+   :alt: 
+
+.. _plottypebar-3:
+
+``plot_type="bar"``
+^^^^^^^^^^^^^^^^^^^
+
+.. code:: python
+
+   ga_instance.plot_genes(plot_type="bar")
+
+   ga_instance.plot_genes(graph_type="plot",
+                          plot_type="bar",
+                          solutions='all')
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122477370-99106f80-cf95-11eb-8643-865b55e6b844.png
+   :alt: 
+
+.. _graphtypeboxplot:
+
+``graph_type="boxplot"``
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+By setting ``graph_type`` to ``"boxplot"``, a box and whisker graph is
+created. In this case, the ``plot_type`` parameter has no effect.
+
+The following 2 calls of the ``plot_genes()`` method create the same
+figure, as the default value for the ``solutions`` parameter is
+``"all"``.
+
+.. code:: python
+
+   ga_instance.plot_genes(graph_type="boxplot")
+
+   ga_instance.plot_genes(graph_type="boxplot",
+                          solutions='all')
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122479260-beeb4380-cf98-11eb-8f08-23707929b12c.png
+   :alt: 
+
+.. _graphtypehistogram:
+
+``graph_type="histogram"``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For ``graph_type="histogram"``, a histogram is created for each gene.
+Similar to ``graph_type="boxplot"``, the ``plot_type`` parameter has no
+effect.
+
+The following 2 calls of the ``plot_genes()`` method create the same
+figure, as the default value for the ``solutions`` parameter is
+``"all"``.
+
+.. code:: python
+
+   ga_instance.plot_genes(graph_type="histogram")
+
+   ga_instance.plot_genes(graph_type="histogram",
+                          solutions='all')
+
+.. figure:: https://user-images.githubusercontent.com/16560492/122477314-8007be80-cf95-11eb-9c95-da3f49204151.png
+   :alt: 
+
+All the previous figures can be created for only the best solutions by
+setting ``solutions="best"``.
+
+Parallel Processing in PyGAD
+============================
+
+Starting from PyGAD 2.17.0, parallel processing is supported. This
+section explains how to use parallel processing in PyGAD.
+
+According to the PyGAD lifecycle, only 2 operations can be parallelized:
+
+1. Population fitness calculation.
+
+2. Mutation.
+
+The reason is that the calculations in these 2 operations are
+independent (i.e. each solution/chromosome is handled independently from
+the others) and can be distributed across different processes or
+threads.
+
+The mutation operation does not do intensive calculations on the CPU.
+Its calculations are simple, like flipping the values of some genes
+from 0 to 1 or adding a random value to some genes. So, it does not take
+much CPU processing time. Experiments proved that parallelizing the
+mutation operation across the solutions increases the time instead of
+reducing it. This is because running multiple processes or threads adds
+overhead to manage them. Thus, parallel processing is not applied to the
+mutation operation.
+
+For the population fitness calculation, parallel processing can help
+make a difference and reduce the processing time. But this is
+conditional on the type of calculations done in the fitness function. If
+the fitness function makes intensive calculations and takes much
+processing time from the CPU, then it is probable that parallel
+processing will help to cut down the overall time.
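+
+Conversely, when the fitness function spends most of its time waiting
+(e.g. on input/output), threads may still pay off even though little CPU
+work is done. The following is an illustrative sketch under that
+assumption, with the ``sleep`` call standing in for an I/O wait such as
+reading a file or calling a web service:
+
+.. code:: python
+
+   import pygad
+   import time
+
+   # A sketch of an I/O-bound fitness function. For such workloads,
+   # thread-based parallelism may reduce the total run time.
+   def fitness_func(ga_instance, solution, solution_idx):
+       time.sleep(0.01)   # simulated I/O wait
+       return 0
+
+   ga_instance = pygad.GA(num_generations=10,
+                          num_parents_mating=3,
+                          sol_per_pop=5,
+                          num_genes=10,
+                          fitness_func=fitness_func,
+                          suppress_warnings=True,
+                          parallel_processing=["thread", 5])
+
+   if __name__ == '__main__':
+       ga_instance.run()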
+
+This section explains how parallel processing works in PyGAD and how to
+use it.
+
+How to Use Parallel Processing in PyGAD
+---------------------------------------
+
+Starting from PyGAD 2.17.0, a new parameter called
+``parallel_processing`` was added to the constructor of the ``pygad.GA``
+class.
+
+.. code:: python
+
+   import pygad
+   ...
+   ga_instance = pygad.GA(...,
+                          parallel_processing=...)
+   ...
+
+This parameter allows the user to do the following:
+
+1. Enable parallel processing.
+
+2. Select whether processes or threads are used.
+
+3. Specify the number of processes or threads to be used.
+
+These are the 3 possible values for the ``parallel_processing``
+parameter:
+
+1. ``None``: (Default) It means no parallel processing is used.
+
+2. A positive integer referring to the number of threads to be used
+   (i.e. threads, not processes, are used).
+
+3. ``list``/``tuple``: If a list or a tuple of exactly 2 elements is
+   assigned, then:
+
+   1. The first element can be either ``'process'`` or ``'thread'`` to
+      specify whether processes or threads are used, respectively.
+
+   2. The second element can be:
+
+      1. A positive integer to select the maximum number of processes or
+         threads to be used.
+
+      2. ``0`` to indicate that 0 processes or threads are used. It
+         means no parallel processing. This is identical to setting
+         ``parallel_processing=None``.
+
+      3. ``None`` to use the default value as calculated by the
+         ``concurrent.futures`` module.
+
+These are examples of the values assigned to the ``parallel_processing``
+parameter:
+
+- ``parallel_processing=4``: Because the parameter is assigned a
+  positive integer, this means parallel processing is activated where 4
+  threads are used.
+
+- ``parallel_processing=["thread", 5]``: Use parallel processing with 5
+  threads. This is identical to ``parallel_processing=5``.
+
+- ``parallel_processing=["process", 8]``: Use parallel processing with
+  8 processes.
+
+- ``parallel_processing=["process", 0]``: As the second element is
+  given the value 0, this means do not use parallel processing. This is
+  identical to ``parallel_processing=None``.
+
+Examples
+--------
+
+The examples will help you know the difference between using processes
+and threads. Moreover, they will give an idea of when parallel
+processing would make a difference and reduce the time. These are dummy
+examples where the fitness function is made to always return 0.
+
+The first example uses 10 genes, 5 solutions in the population where
+only 3 solutions mate, and 9999 generations. The fitness function uses a
+``for`` loop of 99 iterations just to do some work. In the constructor
+of the ``pygad.GA`` class, ``parallel_processing=None`` means no
+parallel processing is used.
+
+.. code:: python
+
+   import pygad
+   import time
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       for _ in range(99):
+           pass
+       return 0
+
+   ga_instance = pygad.GA(num_generations=9999,
+                          num_parents_mating=3,
+                          sol_per_pop=5,
+                          num_genes=10,
+                          fitness_func=fitness_func,
+                          suppress_warnings=True,
+                          parallel_processing=None)
+
+   if __name__ == '__main__':
+       t1 = time.time()
+
+       ga_instance.run()
+
+       t2 = time.time()
+       print("Time is", t2-t1)
+
+When parallel processing is not used, the time it takes to run the
+genetic algorithm is ``1.5`` seconds.
+
+For comparison, let's do a second experiment where parallel processing
+is used with 5 threads. In this case, it takes ``5`` seconds.
+
+.. code:: python
+
+   ...
+   ga_instance = pygad.GA(...,
+                          parallel_processing=5)
+   ...
+
+Examples
+--------
+
+The examples will help you know the difference between using processes
+and threads. Moreover, they will give an idea of when parallel
+processing would make a difference and reduce the time. These are
+dummy examples where the fitness function is made to always return 0.
+
+The first example uses 10 genes, 5 solutions in the population where
+only 3 solutions mate, and 9999 generations. The fitness function uses
+a short ``for`` loop just to do some calculations. In the constructor
+of the ``pygad.GA`` class, ``parallel_processing=None`` means no
+parallel processing is used.
+
+.. code:: python
+
+   import pygad
+   import time
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       for _ in range(99):
+           pass
+       return 0
+
+   ga_instance = pygad.GA(num_generations=9999,
+                          num_parents_mating=3,
+                          sol_per_pop=5,
+                          num_genes=10,
+                          fitness_func=fitness_func,
+                          suppress_warnings=True,
+                          parallel_processing=None)
+
+   if __name__ == '__main__':
+       t1 = time.time()
+
+       ga_instance.run()
+
+       t2 = time.time()
+       print("Time is", t2-t1)
+
+When parallel processing is not used, the time it takes to run the
+genetic algorithm is ``1.5`` seconds.
+
+For comparison, let's do a second experiment where parallel processing
+is used with 5 threads. In this case, it takes ``5`` seconds.
+
+.. code:: python
+
+   ...
+   ga_instance = pygad.GA(...,
+                          parallel_processing=5)
+   ...
+
+For the third experiment, processes instead of threads are used. Also,
+only 99 generations are used instead of 9999. The time it takes is
+``99`` seconds.
+
+.. code:: python
+
+   ...
+   ga_instance = pygad.GA(num_generations=99,
+                          ...,
+                          parallel_processing=["process", 5])
+   ...
+
+This is the summary of the 3 experiments:
+
+1. No parallel processing & 9999 generations: 1.5 seconds.
+
+2. Parallel processing with 5 threads & 9999 generations: 5 seconds.
+
+3. Parallel processing with 5 processes & 99 generations: 99 seconds.
+
+Because the fitness function does not need much CPU time, the normal
+processing takes the least time. Running processes for this simple
+problem takes 99 seconds compared to only 5 seconds for threads,
+because managing processes is much heavier than managing threads.
+Thus, most of the CPU time is spent swapping the processes instead of
+executing the code.
+
+In the second example, the loop makes 99999999 iterations and only 5
+generations are used. With no parallelization, it takes 22 seconds.
+
+.. code:: python
+
+   import pygad
+   import time
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       for _ in range(99999999):
+           pass
+       return 0
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=3,
+                          sol_per_pop=5,
+                          num_genes=10,
+                          fitness_func=fitness_func,
+                          suppress_warnings=True,
+                          parallel_processing=None)
+
+   if __name__ == '__main__':
+       t1 = time.time()
+       ga_instance.run()
+       t2 = time.time()
+       print("Time is", t2-t1)
+
+It takes 15 seconds when 10 processes are used.
+
+.. code:: python
+
+   ...
+   ga_instance = pygad.GA(...,
+                          parallel_processing=["process", 10])
+   ...
+
+This is compared to 20 seconds when 10 threads are used.
+
+.. code:: python
+
+   ...
+   ga_instance = pygad.GA(...,
+                          parallel_processing=["thread", 10])
+   ...
+
+Based on the second example, using parallel processing with 10
+processes takes the least time because there is a lot of CPU work
+done. Generally, processes are preferred over threads when most of the
+work is on the CPU. Threads are preferred over processes in some
+situations, like doing input/output operations.
+
+*Before releasing* `PyGAD
+2.17.0 `__\ *,*
+`László
+Fazekas `__
+*wrote an article to parallelize the fitness function with PyGAD. Check
+it:* `How Genetic Algorithms Can Compete with Gradient Descent and
+Backprop `__.
+
+Print Lifecycle Summary
+=======================
+
+In `PyGAD
+2.19.0 `__,
+a new method called ``summary()`` is supported. It prints a Keras-like
+summary of the PyGAD lifecycle showing the steps, callback functions,
+parameters, etc.
+
+This method accepts the following parameters:
+
+- ``line_length=70``: An integer representing the length of a single
+  line in characters.
+
+- ``fill_character=" "``: A character to fill the lines.
+
+- ``line_character="-"``: A character for creating a line separator.
+
+- ``line_character2="="``: A secondary character to create a line
+  separator.
+
+- ``columns_equal_len=False``: Whether the table columns have equal
+  widths or are sized according to the width needed.
+
+- ``print_step_parameters=True``: Whether to print extra parameters
+  about each step inside the step. If ``print_step_parameters=False``
+  and ``print_parameters_summary=True``, then the parameters of each
+  step are printed at the end of the table.
+
+- ``print_parameters_summary=True``: Whether to print a parameters
+  summary at the end of the table. If ``print_step_parameters=False``,
+  then the parameters of each step are printed at the end of the table
+  too.
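+
+For example, assuming a ``ga_instance`` already exists, a call like the
+following sketch (hypothetical values) prints a narrower summary drawn
+with ``*`` separators:
+
+.. code:: python
+
+   ga_instance.summary(line_length=60,
+                       line_character="*")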
+
+Here is a complete example.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   def genetic_fitness(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   def on_gen(ga):
+       pass
+
+   def on_crossover_callback(a, b):
+       pass
+
+   ga_instance = pygad.GA(num_generations=100,
+                          num_parents_mating=10,
+                          sol_per_pop=20,
+                          num_genes=len(function_inputs),
+                          on_crossover=on_crossover_callback,
+                          on_generation=on_gen,
+                          parallel_processing=2,
+                          stop_criteria="reach_10",
+                          fitness_batch_size=4,
+                          crossover_probability=0.4,
+                          fitness_func=genetic_fitness)
+
+Then call the ``summary()`` method to print the summary with the
+default parameters. Note that entries for the crossover and generation
+callbacks are created because they are implemented through the
+``on_crossover_callback()`` and ``on_gen()`` functions, respectively.
+
+.. code:: python
+
+   ga_instance.summary()
+
+.. code:: bash
+
+   ----------------------------------------------------------------------
+                              PyGAD Lifecycle
+   ======================================================================
+   Step                   Handler                            Output Shape
+   ======================================================================
+   Fitness Function       genetic_fitness()                  (1)
+   Fitness batch size: 4
+   ----------------------------------------------------------------------
+   Parent Selection       steady_state_selection()           (10, 6)
+   Number of Parents: 10
+   ----------------------------------------------------------------------
+   Crossover              single_point_crossover()           (10, 6)
+   Crossover probability: 0.4
+   ----------------------------------------------------------------------
+   On Crossover           on_crossover_callback()            None
+   ----------------------------------------------------------------------
+   Mutation               random_mutation()                  (10, 6)
+   Mutation Genes: 1
+   Random Mutation Range: (-1.0, 1.0)
+   Mutation by Replacement: False
+   Allow Duplicated Genes: True
+   ----------------------------------------------------------------------
+   On Generation          on_gen()                           None
+   Stop Criteria: [['reach', 10.0]]
+   ----------------------------------------------------------------------
+   ======================================================================
+   Population Size: (20, 6)
+   Number of Generations: 100
+   Initial Population Range: (-4, 4)
+   Keep Elitism: 1
+   Gene DType: [<class 'float'>, None]
+   Parallel Processing: ['thread', 2]
+   Save Best Solutions: False
+   Save Solutions: False
+   ======================================================================
+
+We can set the ``print_step_parameters`` and
+``print_parameters_summary`` parameters to ``False`` to not print the
+parameters.
+
+.. code:: python
+
+   ga_instance.summary(print_step_parameters=False,
+                       print_parameters_summary=False)
+
+.. code:: bash
+
+   ----------------------------------------------------------------------
+                              PyGAD Lifecycle
+   ======================================================================
+   Step                   Handler                            Output Shape
+   ======================================================================
+   Fitness Function       genetic_fitness()                  (1)
+   ----------------------------------------------------------------------
+   Parent Selection       steady_state_selection()           (10, 6)
+   ----------------------------------------------------------------------
+   Crossover              single_point_crossover()           (10, 6)
+   ----------------------------------------------------------------------
+   On Crossover           on_crossover_callback()            None
+   ----------------------------------------------------------------------
+   Mutation               random_mutation()                  (10, 6)
+   ----------------------------------------------------------------------
+   On Generation          on_gen()                           None
+   ----------------------------------------------------------------------
+   ======================================================================
+
+Logging Outputs
+===============
+
+In `PyGAD
+3.0.0 `__,
+the ``print()`` statement is no longer used and the outputs are printed
+using the `logging `__
+module. A new parameter called ``logger`` is supported to accept a
+user-defined logger.
+
+.. code:: python
+
+   import logging
+
+   logger = ...
+
+   ga_instance = pygad.GA(...,
+                          logger=logger,
+                          ...)
+
+The default value for this parameter is ``None``. If no logger is
+passed (i.e. ``logger=None``), then a default logger is created to log
+the messages to the console, exactly like how the ``print()`` statement
+works.
+
+Some advantages of using the
+`logging `__ module
+instead of the ``print()`` statement are:
+
+1. The user has more control over the printed messages, especially if
+   there is a project that uses multiple modules where each module
+   prints its messages. A logger can organize the outputs.
+
+2. Using the proper ``Handler``, the user can log the output messages
+   to files instead of being restricted to printing them to the
+   console. So, it is much easier to record the outputs.
+
+3. The format of the printed messages can be changed by customizing the
+   ``Formatter`` assigned to the Logger.
+
+This section gives some quick examples of using the ``logging`` module
+and then gives an example of using a logger with PyGAD.
+
+Logging to the Console
+----------------------
+
+This is an example to create a logger to log the messages to the
+console.
+
+.. code:: python
+
+   import logging
+
+   # Create a logger
+   logger = logging.getLogger(__name__)
+
+   # Set the logger level to debug so that all the messages are printed.
+   logger.setLevel(logging.DEBUG)
+
+   # Create a stream handler to log the messages to the console.
+   stream_handler = logging.StreamHandler()
+
+   # Set the handler level to debug.
+   stream_handler.setLevel(logging.DEBUG)
+
+   # Create a formatter
+   formatter = logging.Formatter('%(message)s')
+
+   # Add the formatter to handler.
+   stream_handler.setFormatter(formatter)
+
+   # Add the stream handler to the logger
+   logger.addHandler(stream_handler)
+
+Now, we can log messages to the console with the format specified in
+the ``Formatter``.
+
+.. code:: python
+
+   logger.debug('Debug message.')
+   logger.info('Info message.')
+   logger.warning('Warn message.')
+   logger.error('Error message.')
+   logger.critical('Critical message.')
+
+The outputs are identical to those printed using the ``print()``
+statement.
+
+.. code::
+
+   Debug message.
+   Info message.
+   Warn message.
+   Error message.
+   Critical message.
+
+By changing the format of the output messages, we can have more
+information about each message.
+
+.. code:: python
+
+   formatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
+
+This is a sample output.
+
+.. code:: python
+
+   2023-04-03 18:46:27 DEBUG: Debug message.
+   2023-04-03 18:46:27 INFO: Info message.
+   2023-04-03 18:46:27 WARNING: Warn message.
+   2023-04-03 18:46:27 ERROR: Error message.
+   2023-04-03 18:46:27 CRITICAL: Critical message.
+
+Note that you may need to clear the handlers after finishing the
+execution. This is to make sure no cached handlers are used in the
+next run. If the cached handlers are not cleared, then a single output
+message may be repeated.
+
+.. code:: python
+
+   logger.handlers.clear()
+
+Logging to a File
+-----------------
+
+This is another example to log the messages to a file named
+``logfile.txt``. The formatter prints the following about each message:
+
+1. The date and time at which the message is logged.
+
+2. The log level.
+
+3. The message.
+
+4. The path of the file.
+
+5. The line number of the log message.
+
+.. code:: python
+
+   import logging
+
+   level = logging.DEBUG
+   name = 'logfile.txt'
+
+   logger = logging.getLogger(name)
+   logger.setLevel(level)
+
+   file_handler = logging.FileHandler(name, 'a+', 'utf-8')
+   file_handler.setLevel(logging.DEBUG)
+   file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s - %(pathname)s:%(lineno)d', datefmt='%Y-%m-%d %H:%M:%S')
+   file_handler.setFormatter(file_format)
+   logger.addHandler(file_handler)
+
+This is how the outputs look.
+
+.. code:: python
+
+   2023-04-03 18:54:03 DEBUG: Debug message. - c:\users\agad069\desktop\logger\example2.py:46
+   2023-04-03 18:54:03 INFO: Info message. - c:\users\agad069\desktop\logger\example2.py:47
+   2023-04-03 18:54:03 WARNING: Warn message. - c:\users\agad069\desktop\logger\example2.py:48
+   2023-04-03 18:54:03 ERROR: Error message. - c:\users\agad069\desktop\logger\example2.py:49
+   2023-04-03 18:54:03 CRITICAL: Critical message. - c:\users\agad069\desktop\logger\example2.py:50
+
+Consider clearing the handlers if necessary.
+
+.. code:: python
+
+   logger.handlers.clear()
+
+Log to Both the Console and a File
+----------------------------------
+
+This is an example to create a single Logger associated with 2
+handlers:
+
+1. A file handler.
+
+2. A stream handler.
+
+.. code:: python
+
+   import logging
+
+   level = logging.DEBUG
+   name = 'logfile.txt'
+
+   logger = logging.getLogger(name)
+   logger.setLevel(level)
+
+   file_handler = logging.FileHandler(name,'a+','utf-8')
+   file_handler.setLevel(logging.DEBUG)
+   file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s - %(pathname)s:%(lineno)d', datefmt='%Y-%m-%d %H:%M:%S')
+   file_handler.setFormatter(file_format)
+   logger.addHandler(file_handler)
+
+   console_handler = logging.StreamHandler()
+   console_handler.setLevel(logging.INFO)
+   console_format = logging.Formatter('%(message)s')
+   console_handler.setFormatter(console_format)
+   logger.addHandler(console_handler)
+
+When a message is logged, it is both printed to the console and saved
+in ``logfile.txt``.
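+
+For example, given the handler levels above, an info-level message
+reaches both the console and the file, while a debug-level message
+passes only the file handler:
+
+.. code:: python
+
+   logger.info('Written to both the console and logfile.txt.')
+   logger.debug('Written only to logfile.txt.')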
+
+Consider clearing the handlers if necessary.
+
+.. code:: python
+
+   logger.handlers.clear()
+
+PyGAD Example
+-------------
+
+To use the logger in PyGAD, just create your custom logger and pass it
+to the ``logger`` parameter.
+
+.. code:: python
+
+   import logging
+   import pygad
+   import numpy
+
+   level = logging.DEBUG
+   name = 'logfile.txt'
+
+   logger = logging.getLogger(name)
+   logger.setLevel(level)
+
+   file_handler = logging.FileHandler(name,'a+','utf-8')
+   file_handler.setLevel(logging.DEBUG)
+   file_format = logging.Formatter('%(asctime)s %(levelname)s: %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
+   file_handler.setFormatter(file_format)
+   logger.addHandler(file_handler)
+
+   console_handler = logging.StreamHandler()
+   console_handler.setLevel(logging.INFO)
+   console_format = logging.Formatter('%(message)s')
+   console_handler.setFormatter(console_format)
+   logger.addHandler(console_handler)
+
+   equation_inputs = [4, -2, 8]
+   desired_output = 2671.1234
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   def on_generation(ga_instance):
+       ga_instance.logger.info("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       ga_instance.logger.info("Fitness = {fitness}".format(fitness=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]))
+
+   ga_instance = pygad.GA(num_generations=10,
+                          sol_per_pop=40,
+                          num_parents_mating=2,
+                          keep_parents=2,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          on_generation=on_generation,
+                          logger=logger)
+   ga_instance.run()
+
+   logger.handlers.clear()
+
+By executing this code, the logged messages are printed to the console
+and also saved in the text file.
+
+.. code:: python
+
+   2023-04-03 19:04:27 INFO: Generation = 1
+   2023-04-03 19:04:27 INFO: Fitness = 0.00038086960368076276
+   2023-04-03 19:04:27 INFO: Generation = 2
+   2023-04-03 19:04:27 INFO: Fitness = 0.00038214871408010853
+   2023-04-03 19:04:27 INFO: Generation = 3
+   2023-04-03 19:04:27 INFO: Fitness = 0.0003832795907974678
+   2023-04-03 19:04:27 INFO: Generation = 4
+   2023-04-03 19:04:27 INFO: Fitness = 0.00038398612055017196
+   2023-04-03 19:04:27 INFO: Generation = 5
+   2023-04-03 19:04:27 INFO: Fitness = 0.00038442348890867516
+   2023-04-03 19:04:27 INFO: Generation = 6
+   2023-04-03 19:04:27 INFO: Fitness = 0.0003854406039137763
+   2023-04-03 19:04:27 INFO: Generation = 7
+   2023-04-03 19:04:27 INFO: Fitness = 0.00038646083174063284
+   2023-04-03 19:04:27 INFO: Generation = 8
+   2023-04-03 19:04:27 INFO: Fitness = 0.0003875169193024936
+   2023-04-03 19:04:27 INFO: Generation = 9
+   2023-04-03 19:04:27 INFO: Fitness = 0.0003888816727311021
+   2023-04-03 19:04:27 INFO: Generation = 10
+   2023-04-03 19:04:27 INFO: Fitness = 0.000389832593101348
+
+Batch Fitness Calculation
+=========================
+
+In `PyGAD
+2.19.0 `__,
+a new optional parameter called ``fitness_batch_size`` is supported to
+calculate the fitness in batches. Thanks to `Linan
+Qiu `__ for opening the `GitHub issue
+#136 `__.
+
+Its values can be:
+
+- ``1`` or ``None``: If the ``fitness_batch_size`` parameter is
+  assigned the value ``1`` or ``None`` (default), then the normal flow
+  is used where the fitness function is called for each individual
+  solution. That is, if there are 15 solutions, then the fitness
+  function is called 15 times.
+
+- ``1 < fitness_batch_size <= sol_per_pop``: If the
+  ``fitness_batch_size`` parameter is assigned a value satisfying this
+  condition ``1 < fitness_batch_size <= sol_per_pop``, then the
+  solutions are grouped into batches of size ``fitness_batch_size``
+  and the fitness function is called once for each batch. In this
+  case, the fitness function must return a list/tuple/numpy.ndarray
+  with a length equal to the number of solutions passed.
+
+.. _example-without-fitnessbatchsize-parameter:
+
+Example without ``fitness_batch_size`` Parameter
+------------------------------------------------
+
+This is an example where the ``fitness_batch_size`` parameter is given
+the value ``None`` (which is the default value). This is equivalent to
+using the value ``1``. In this case, the fitness function will be
+called for each solution. This means the fitness function
+``fitness_func`` will receive only a single solution. This is an
+example of the passed arguments to the fitness function:
+
+.. code::
+
+   solution: [ 2.52860734, -0.94178795, 2.97545704, 0.84131987, -3.78447118, 2.41008358]
+   solution_idx: 3
+
+The fitness function must also return a single numeric value as the
+fitness for the passed solution.
+
+As we have a population of ``20`` solutions, the fitness function is
+called 20 times per generation. For 5 generations, the fitness
+function is called ``20*5 = 100`` times. In PyGAD, the fitness function
+is called after the last generation too, and this adds an additional
+20 calls. So, the total number of calls to the fitness function is
+``20*5 + 20 = 120``.
+
+Note that the ``keep_elitism`` and ``keep_parents`` parameters are set
+to ``0`` to make sure no fitness values are reused and to force calling
+the fitness function for each individual solution.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   number_of_calls = 0
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       global number_of_calls
+       number_of_calls = number_of_calls + 1
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=10,
+                          sol_per_pop=20,
+                          fitness_func=fitness_func,
+                          fitness_batch_size=None,
+                          # fitness_batch_size=1,
+                          num_genes=len(function_inputs),
+                          keep_elitism=0,
+                          keep_parents=0)
+
+   ga_instance.run()
+   print(number_of_calls)
+
+.. code::
+
+   120
+
+.. _example-with-fitnessbatchsize-parameter:
+
+Example with ``fitness_batch_size`` Parameter
+---------------------------------------------
+
+This is an example where the ``fitness_batch_size`` parameter is used
+and assigned the value ``4``. This means the solutions will be grouped
+into batches of ``4`` solutions. The fitness function will be called
+once for each batch (i.e. called once for each 4 solutions).
+
+This is an example of the arguments passed to it:
+
+.. code:: python
+
+   solutions:
+       [[ 3.1129432  -0.69123589  1.93792414  2.23772968 -1.54616001 -0.53930799]
+        [ 3.38508121  0.19890812  1.93792414  2.23095014 -3.08955597  3.10194128]
+        [ 2.37079504 -0.88819803  2.97545704  1.41742256 -3.95594055  2.45028256]
+        [ 2.52860734 -0.94178795  2.97545704  0.84131987 -3.78447118  2.41008358]]
+   solutions_indices:
+       [16, 17, 18, 19]
+
+As we have 20 solutions, there are ``20/4 = 5`` batches. As a result,
+the fitness function is called only 5 times per generation instead of
+20. For each call to the fitness function, it receives a batch of 4
+solutions.
+
+As we have 5 generations, the function is called ``5*5 = 25`` times.
+Given the call to the fitness function after the last generation, the
+total number of calls is ``5*5 + 5 = 30``.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7]
+   desired_output = 44
+
+   number_of_calls = 0
+
+   def fitness_func_batch(ga_instance, solutions, solutions_indices):
+       global number_of_calls
+       number_of_calls = number_of_calls + 1
+       batch_fitness = []
+       for solution in solutions:
+           output = numpy.sum(solution*function_inputs)
+           fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+           batch_fitness.append(fitness)
+       return batch_fitness
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=10,
+                          sol_per_pop=20,
+                          fitness_func=fitness_func_batch,
+                          fitness_batch_size=4,
+                          num_genes=len(function_inputs),
+                          keep_elitism=0,
+                          keep_parents=0)
+
+   ga_instance.run()
+   print(number_of_calls)
+
+.. code::
+
+   30
+
+When batch fitness calculation is used, we save ``120 - 30 = 90``
+calls to the fitness function.
+
+Use Functions and Methods to Build Fitness and Callbacks
+========================================================
+
+In PyGAD 2.19.0, it is possible to pass user-defined functions or
+methods to the following parameters:
+
+1. ``fitness_func``
+
+2. ``on_start``
+
+3. ``on_fitness``
+
+4. ``on_parents``
+
+5. ``on_crossover``
+
+6. ``on_mutation``
+
+7. ``on_generation``
+
+8. ``on_stop``
+
+This section gives 2 examples where these parameters are assigned
+user-defined:
+
+1. Functions.
+
+2. Methods.
+
+Assign Functions
+----------------
+
+This is a dummy example where the fitness function returns a random
+value. Note that the instance of the ``pygad.GA`` class is passed as
+the first parameter to all of these functions.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       return numpy.random.rand()
+
+   def on_start(ga_instance):
+       print("on_start")
+
+   def on_fitness(ga_instance, last_gen_fitness):
+       print("on_fitness")
+
+   def on_parents(ga_instance, last_gen_parents):
+       print("on_parents")
+
+   def on_crossover(ga_instance, last_gen_offspring):
+       print("on_crossover")
+
+   def on_mutation(ga_instance, last_gen_offspring):
+       print("on_mutation")
+
+   def on_generation(ga_instance):
+       print("on_generation\n")
+
+   def on_stop(ga_instance, last_gen_fitness):
+       print("on_stop")
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=4,
+                          sol_per_pop=10,
+                          num_genes=2,
+                          on_start=on_start,
+                          on_fitness=on_fitness,
+                          on_parents=on_parents,
+                          on_crossover=on_crossover,
+                          on_mutation=on_mutation,
+                          on_generation=on_generation,
+                          on_stop=on_stop,
+                          fitness_func=fitness_func)
+
+   ga_instance.run()
+
+Assign Methods
+--------------
+
+The next example has all the methods defined inside the class ``Test``.
+All of the methods accept ``self`` as the first parameter, representing
+the instance of the class ``Test``, followed by the instance of the
+``pygad.GA`` class.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   class Test:
+       def fitness_func(self, ga_instance, solution, solution_idx):
+           return numpy.random.rand()
+
+       def on_start(self, ga_instance):
+           print("on_start")
+
+       def on_fitness(self, ga_instance, last_gen_fitness):
+           print("on_fitness")
+
+       def on_parents(self, ga_instance, last_gen_parents):
+           print("on_parents")
+
+       def on_crossover(self, ga_instance, last_gen_offspring):
+           print("on_crossover")
+
+       def on_mutation(self, ga_instance, last_gen_offspring):
+           print("on_mutation")
+
+       def on_generation(self, ga_instance):
+           print("on_generation\n")
+
+       def on_stop(self, ga_instance, last_gen_fitness):
+           print("on_stop")
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=4,
+                          sol_per_pop=10,
+                          num_genes=2,
+                          on_start=Test().on_start,
+                          on_fitness=Test().on_fitness,
+                          on_parents=Test().on_parents,
+                          on_crossover=Test().on_crossover,
+                          on_mutation=Test().on_mutation,
+                          on_generation=Test().on_generation,
+                          on_stop=Test().on_stop,
+                          fitness_func=Test().fitness_func)
+
+   ga_instance.run()
+
+.. _examples-2:
+
+Examples
+========
+
+This section gives the complete code of some examples that use
+``pygad``. Each subsection builds a different example.
+
+Linear Model Optimization
+-------------------------
+
+This example is discussed in the `Steps to Use
+PyGAD `__
+section which optimizes a linear model. Its complete code is listed
+below.
+
+.. code:: python
+
+   import pygad
+   import numpy
+
+   """
+   Given the following function:
+       y = f(w1:w6) = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
+       where (x1,x2,x3,x4,x5,x6)=(4,-2,3.5,5,-11,-4.7) and y=44
+   What are the best values for the 6 weights (w1 to w6)? We are going to use the genetic algorithm to optimize this function.
+   """
+
+   function_inputs = [4,-2,3.5,5,-11,-4.7] # Function inputs.
+   desired_output = 44 # Function output.
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution*function_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   num_generations = 100 # Number of generations.
+   num_parents_mating = 10 # Number of solutions to be selected as parents in the mating pool.
+
+   sol_per_pop = 20 # Number of solutions in the population.
+   num_genes = len(function_inputs)
+
+   last_fitness = 0
+   def on_generation(ga_instance):
+       global last_fitness
+       print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+       print("Fitness = {fitness}".format(fitness=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]))
+       print("Change = {change}".format(change=ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1] - last_fitness))
+       last_fitness = ga_instance.best_solution(pop_fitness=ga_instance.last_generation_fitness)[1]
+
+   ga_instance = pygad.GA(num_generations=num_generations,
+                          num_parents_mating=num_parents_mating,
+                          sol_per_pop=sol_per_pop,
+                          num_genes=num_genes,
+                          fitness_func=fitness_func,
+                          on_generation=on_generation)
+
+   # Running the GA to optimize the parameters of the function.
+   ga_instance.run()
+
+   ga_instance.plot_fitness()
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution(ga_instance.last_generation_fitness)
+   print("Parameters of the best solution : {solution}".format(solution=solution))
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   prediction = numpy.sum(numpy.array(function_inputs)*solution)
+   print("Predicted output based on the best solution : {prediction}".format(prediction=prediction))
+
+   if ga_instance.best_solution_generation != -1:
+       print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+   # Saving the GA instance.
+   filename = 'genetic' # The filename to which the instance is saved. The name is without extension.
+   ga_instance.save(filename=filename)
+
+   # Loading the saved GA instance.
+   loaded_ga_instance = pygad.load(filename=filename)
+   loaded_ga_instance.plot_fitness()
+
+Reproducing Images
+------------------
+
+This project reproduces a single image using PyGAD by evolving pixel
+values. This project works with both color and gray images. Check this
+project at `GitHub `__:
+https://github.com/ahmedfgad/GARI.
+
+For more information about this project, read this tutorial titled
+`Reproducing Images using a Genetic Algorithm with
+Python `__
+available at these links:
+
+- `Heartbeat `__:
+  https://heartbeat.fritz.ai/reproducing-images-using-a-genetic-algorithm-with-python-91fc701ff84
+
+- `LinkedIn `__:
+  https://www.linkedin.com/pulse/reproducing-images-using-genetic-algorithm-python-ahmed-gad
+
+Project Steps
+~~~~~~~~~~~~~
+
+The steps to follow in order to reproduce an image are as follows:
+
+- Read an image
+
+- Prepare the fitness function
+
+- Create an instance of the pygad.GA class with the appropriate
+  parameters
+
+- Run PyGAD
+
+- Plot results
+
+- Calculate some statistics
+
+The next sections discuss the code of each of these steps.
+
+Read an Image
+~~~~~~~~~~~~~
+
+There is an image named ``fruit.jpg`` in the `GARI
+project `__ which is read using the next
+code.
+
+.. code:: python
+
+   import imageio
+   import numpy
+
+   target_im = imageio.imread('fruit.jpg')
+   target_im = numpy.asarray(target_im/255, dtype=float)
+
+Here is the read image.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948808-f0ac882e-1fe8-11e8-8d07-1307e3477fd0.jpg
+   :alt: 
+
+Based on the chromosome representation used in the example, the pixel
+values can be in the 0-255 range, the 0-1 range, or any other range.
+
+Note that the range of pixel values affects other parameters like the
+range from which the random values are selected during mutation and
+also the range of the values used in the initial population. So, be
+consistent.
+
+Prepare the Fitness Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The next code creates a function that will be used as a fitness
+function for calculating the fitness value for each solution in the
+population. This function must be a maximization function that accepts
+3 parameters representing the instance of the ``pygad.GA`` class, a
+solution, and its index. It returns a value representing the fitness
+value.
+
+.. code:: python
+
+   import gari
+
+   target_chromosome = gari.img2chromosome(target_im)
+
+   def fitness_fun(ga_instance, solution, solution_idx):
+       fitness = numpy.sum(numpy.abs(target_chromosome-solution))
+
+       # Negating the fitness value to make it increasing rather than decreasing.
+       fitness = numpy.sum(target_chromosome) - fitness
+       return fitness
+
+The fitness value is calculated using the sum of absolute differences
+between gene values in the original and reproduced chromosomes. The
+``gari.img2chromosome()`` function is called before the fitness
+function to represent the image as a vector because the genetic
+algorithm works with 1D chromosomes.
+
+The implementation of the ``gari`` module is available at the `GARI
+GitHub
+project `__ and
+its code is listed below.
+
+.. code:: python
+
+   import numpy
+   import functools
+   import operator
+
+   def img2chromosome(img_arr):
+       return numpy.reshape(a=img_arr, newshape=(functools.reduce(operator.mul, img_arr.shape)))
+
+   def chromosome2img(vector, shape):
+       if len(vector) != functools.reduce(operator.mul, shape):
+           raise ValueError("A vector of length {vector_length} cannot be reshaped into an array of shape {shape}.".format(vector_length=len(vector), shape=shape))
+
+       return numpy.reshape(a=vector, newshape=shape)
+
+.. _create-an-instance-of-the-pygadga-class-2:
+
+Create an Instance of the ``pygad.GA`` Class
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is very important to use random mutation and set the
+``mutation_by_replacement`` parameter to ``True``. Based on the range
+of pixel values, the values assigned to the ``init_range_low``,
+``init_range_high``, ``random_mutation_min_val``, and
+``random_mutation_max_val`` parameters should be changed.
+
+If the image pixel values range from 0 to 255, then keep
+``init_range_low`` and ``random_mutation_min_val`` at 0 but change
+``init_range_high`` and ``random_mutation_max_val`` to 255, as shown
+in the sketch after the next code block.
+
+Feel free to change the other parameters or add other parameters.
+Please check the `PyGAD's documentation `__ for
+the full list of parameters.
+
+.. code:: python
+
+   import pygad
+
+   ga_instance = pygad.GA(num_generations=20000,
+                          num_parents_mating=10,
+                          fitness_func=fitness_fun,
+                          sol_per_pop=20,
+                          num_genes=target_im.size,
+                          init_range_low=0.0,
+                          init_range_high=1.0,
+                          mutation_percent_genes=0.01,
+                          mutation_type="random",
+                          mutation_by_replacement=True,
+                          random_mutation_min_val=0.0,
+                          random_mutation_max_val=1.0)
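+
+For the 0-255 case mentioned above, a sketch of the same constructor
+with only the range-related parameters changed would look like this:
+
+.. code:: python
+
+   import pygad
+
+   # A sketch for images whose pixel values are in the 0-255 range.
+   # Only the range-related parameters differ from the previous call.
+   ga_instance = pygad.GA(num_generations=20000,
+                          num_parents_mating=10,
+                          fitness_func=fitness_fun,
+                          sol_per_pop=20,
+                          num_genes=target_im.size,
+                          init_range_low=0.0,
+                          init_range_high=255.0,
+                          mutation_percent_genes=0.01,
+                          mutation_type="random",
+                          mutation_by_replacement=True,
+                          random_mutation_min_val=0.0,
+                          random_mutation_max_val=255.0)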
+
+Run PyGAD
+~~~~~~~~~
+
+Simply call the ``run()`` method to run PyGAD.
+
+.. code:: python
+
+   ga_instance.run()
+
+Plot Results
+~~~~~~~~~~~~
+
+After the ``run()`` method completes, the fitness values of all
+generations can be viewed in a plot using the ``plot_fitness()``
+method.
+
+.. code:: python
+
+   ga_instance.plot_fitness()
+
+Here is the plot after 20,000 generations.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82232124-77762c00-992e-11ea-9fc6-14a1cd7a04ff.png
+   :alt: 
+
+Calculate Some Statistics
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here is some information about the best solution. Note that
+``matplotlib`` must be imported to show the reproduced image.
+
+.. code:: python
+
+   import matplotlib.pyplot
+
+   # Returning the details of the best solution.
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+   print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+   if ga_instance.best_solution_generation != -1:
+       print("Best fitness value reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+   result = gari.chromosome2img(solution, target_im.shape)
+   matplotlib.pyplot.imshow(result)
+   matplotlib.pyplot.title("PyGAD & GARI for Reproducing Images")
+   matplotlib.pyplot.show()
+
+Evolution by Generation
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The solution reached after 20,000 generations is shown below.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82232405-e0f63a80-992e-11ea-984f-b6ed76465bd1.png
+   :alt: 
+
+After more generations, the result can be enhanced as shown below.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82232345-cf149780-992e-11ea-8390-bf1a57a19de7.png
+   :alt: 
+
+The results can also be enhanced by changing the parameters passed to
+the constructor of the ``pygad.GA`` class.
+
+Here is how the image evolves from generation 0 to generation 20,000.
+
+Generation 0
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948589-b47276f0-1fe5-11e8-8efe-0cd1a225ea3a.png
+   :alt: 
+
+Generation 1,000
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948823-16f490ee-1fe9-11e8-97db-3e8905ad5440.png
+   :alt: 
+
+Generation 2,500
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948832-3f314b60-1fe9-11e8-8f4a-4d9a53b99f3d.png
+   :alt: 
+
+Generation 4,500
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948837-53d1849a-1fe9-11e8-9b36-e9e9291e347b.png
+   :alt: 
+
+Generation 7,000
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948852-66f1b176-1fe9-11e8-9f9b-460804e94004.png
+   :alt: 
+
+Generation 8,000
+
+.. figure:: https://user-images.githubusercontent.com/16560492/36948865-7fbb5158-1fe9-11e8-8c04-8ac3c1f7b1b1.png
+   :alt: 
+
+Generation 20,000
+
+.. figure:: https://user-images.githubusercontent.com/16560492/82232405-e0f63a80-992e-11ea-984f-b6ed76465bd1.png
+   :alt: 
+
+Clustering
+----------
+
+For a 2-cluster problem, the code is available
+`here `__.
+For a 3-cluster problem, the code is
+`here `__.
+The 2 examples use artificial samples.
+
+Soon a tutorial will be published at
+`Paperspace `__ to explain how
+clustering works using the genetic algorithm with examples in PyGAD.
+
+CoinTex Game Playing using PyGAD
+--------------------------------
+
+The code is available at the `CoinTex GitHub
+project `__.
+CoinTex is an Android game written in Python using the Kivy framework.
+Find CoinTex at `Google
+Play `__:
+https://play.google.com/store/apps/details?id=coin.tex.cointexreactfast
+
+Check this `Paperspace
+tutorial `__
+to see how the genetic algorithm plays CoinTex:
+https://blog.paperspace.com/building-agent-for-cointex-using-genetic-algorithm.
+Check also this `YouTube video `__ showing
+the genetic algorithm while playing CoinTex.
diff --git a/docs/source/Footer.rst b/docs/source/releases.rst
similarity index 97%
rename from docs/source/Footer.rst
rename to docs/source/releases.rst
index 54814ba..3458aaf 100644
--- a/docs/source/Footer.rst
+++ b/docs/source/releases.rst
@@ -1,2046 +1,2046 @@
-Release History
-===============
-
-.. figure:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
-   :alt: 
-
-.. _pygad-1017:
-
-PyGAD 1.0.17
-------------
-
-Release Date: 15 April 2020
-
-1. The **pygad.GA** class accepts a new argument named ``fitness_func``
-   which accepts a function to be used for calculating the fitness
-   values for the solutions. This allows the project to be customized
-   to any problem by building the right fitness function.
-
-.. _pygad-1020:
-
-PyGAD 1.0.20
-------------
-
-Release Date: 4 May 2020
-
-1. The **pygad.GA** attributes are moved from the class scope to the
-   instance scope.
-
-2. Raising an exception for incorrect values of the passed parameters.
-
-3. Two new parameters are added to the **pygad.GA** class constructor
-   (``init_range_low`` and ``init_range_high``) allowing the user to
-   customize the range from which the gene values in the initial
-   population are selected.
-
-4. The code object ``__code__`` of the passed fitness function is
-   checked to ensure it has the right number of parameters.
-
-.. _pygad-200:
-
-PyGAD 2.0.0
------------
-
-Release Date: 13 May 2020
-
-1. The fitness function accepts a new argument named ``sol_idx``
-   representing the index of the solution within the population.
-
-2. A new parameter to the **pygad.GA** class constructor named
-   ``initial_population`` is supported to allow the user to use a
-   custom initial population to be used by the genetic algorithm. If
-   not ``None``, then the passed population will be used. If ``None``,
-   then the genetic algorithm will create the initial population using
-   the ``sol_per_pop`` and ``num_genes`` parameters.
-
-3. The parameters ``sol_per_pop`` and ``num_genes`` are optional and
-   set to ``None`` by default.
-
-4. A new parameter named ``callback_generation`` is introduced in the
-   **pygad.GA** class constructor. It accepts a function with a single
-   parameter representing the **pygad.GA** class instance. This
-   function is called after each generation. This helps the user to do
-   post-processing or debugging operations after each generation.
-
-.. _pygad-210:
-
-PyGAD 2.1.0
------------
-
-Release Date: 14 May 2020
-
-1. The ``best_solution()`` method in the **pygad.GA** class returns a
-   new output representing the index of the best solution within the
-   population. Now, it returns a total of 3 outputs and their order is:
-   best solution, best solution fitness, and best solution index. Here
-   is an example:
-
-.. code:: python
-
-   solution, solution_fitness, solution_idx = ga_instance.best_solution()
-   print("Parameters of the best solution :", solution)
-   print("Fitness value of the best solution :", solution_fitness, "\n")
-   print("Index of the best solution :", solution_idx, "\n")
-
-2. | A new attribute named ``best_solution_generation`` is added to
-     the instances of the **pygad.GA** class. It holds the generation
-     number at which the best solution is reached. It is only assigned
-     the generation number after the ``run()`` method completes.
-     Otherwise, its value is -1.
-   | Example:
-
-.. code:: python
-
-   print("Best solution reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
-
-3. The ``best_solution_fitness`` attribute is renamed to
-   ``best_solutions_fitness`` (plural ``solutions``).
-
-4. Mutation is applied independently for the genes.
-
-.. _pygad-221:
-
-PyGAD 2.2.1
------------
-
-Release Date: 17 May 2020
-
-1. Adding 2 extra modules (pygad.nn and pygad.gann) for building and
-   training neural networks with the genetic algorithm.
-
-.. _pygad-222:
-
-PyGAD 2.2.2
------------
-
-Release Date: 18 May 2020
-
-1. The initial value of the ``generations_completed`` attribute of
-   instances from the pygad.GA class is ``0`` rather than ``None``.
-
-2. An optional bool parameter named ``mutation_by_replacement`` is
-   added to the constructor of the pygad.GA class. It works only when
-   the selected type of mutation is random
-   (``mutation_type="random"``). In this case, setting
-   ``mutation_by_replacement=True`` means replace the gene by the
-   randomly generated value. If ``False``, then it has no effect and
-   random mutation works by adding the random value to the gene. This
-   parameter should be used when the gene falls within a fixed range
-   and its value must not go out of this range. Here are some
-   examples:
-
-Assume there is a gene with the value 0.5.
-
-If ``mutation_type="random"`` and ``mutation_by_replacement=False``,
-then the generated random value (e.g. 0.1) will be added to the gene
-value. The new gene value is **0.5+0.1=0.6**.
-
-If ``mutation_type="random"`` and ``mutation_by_replacement=True``,
-then the generated random value (e.g. 0.1) will replace the gene
-value. The new gene value is **0.1**.
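-
-A minimal sketch of enabling this behavior (only the relevant
-parameters shown; the ``...`` stands for the other constructor
-parameters):
-
-.. code:: python
-
-   ga_instance = pygad.GA(...,
-                          mutation_type="random",
-                          mutation_by_replacement=True)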
-
-3. A ``None`` value could be assigned to the ``mutation_type`` and
-   ``crossover_type`` parameters of the pygad.GA class constructor.
-   When ``None``, this means the step is bypassed and has no action.
-
-.. _pygad-230:
-
-PyGAD 2.3.0
------------
-
-Release date: 1 June 2020
-
-1. A new module named ``pygad.cnn`` is supported for building
-   convolutional neural networks.
-
-2. A new module named ``pygad.gacnn`` is supported for training
-   convolutional neural networks using the genetic algorithm.
-
-3. The ``pygad.plot_result()`` method has 3 optional parameters named
-   ``title``, ``xlabel``, and ``ylabel`` to customize the plot title,
-   x-axis label, and y-axis label, respectively.
-
-4. The ``pygad.nn`` module supports the softmax activation function.
-
-5. The name of the ``pygad.nn.predict_outputs()`` function is changed
-   to ``pygad.nn.predict()``.
-
-6. The name of the ``pygad.nn.train_network()`` function is changed to
-   ``pygad.nn.train()``.
-
-.. _pygad-240:
-
-PyGAD 2.4.0
------------
-
-Release date: 5 July 2020
-
-1. A new parameter named ``delay_after_gen`` is added which accepts a
-   non-negative number specifying the time in seconds to wait after a
-   generation completes and before going to the next generation. It
-   defaults to ``0.0`` which means no delay after the generation.
-
-2. The function passed to the ``callback_generation`` parameter of the
-   pygad.GA class constructor can terminate the execution of the
-   genetic algorithm if it returns the string ``stop``. This causes
-   the ``run()`` method to stop.
-
-One important use case for that feature is to stop the genetic
-algorithm when a condition is met before passing through all the
-generations. The user may assign a value of 100 to the
-``num_generations`` parameter of the pygad.GA class constructor.
-Assume that at generation 50, for example, a condition is met and the
-user wants to stop the execution without waiting for the remaining 50
-generations. To do that, just make the function passed to the
-``callback_generation`` parameter return the string ``stop``.
-
-Here is an example of a function to be passed to the
-``callback_generation`` parameter which stops the execution if the
-fitness value 70 is reached. The value 70 might be the best possible
-fitness value. Once it is reached, there is no need to pass through
-more generations because no further improvement is possible.
-
-.. code:: python
-
-   def func_generation(ga_instance):
-       if ga_instance.best_solution()[1] >= 70:
-           return "stop"
-
-.. _pygad-250:
-
-PyGAD 2.5.0
------------
-
-Release date: 19 July 2020
-
-1. | 2 new optional parameters are added to the constructor of the
-     ``pygad.GA`` class which are ``crossover_probability`` and
-     ``mutation_probability``.
-   | While applying the crossover operation, each parent has a random
-     value generated between 0.0 and 1.0. If this random value is less
-     than or equal to the value assigned to the
-     ``crossover_probability`` parameter, then the parent is selected
-     for the crossover operation.
-   | For the mutation operation, a random value between 0.0 and 1.0 is
-     generated for each gene in the solution. If this value is less
-     than or equal to the value assigned to the
-     ``mutation_probability``, then this gene is selected for mutation.
-
-2. A new optional parameter named ``linewidth`` is added to the
-   ``plot_result()`` method to specify the width of the curve in the
-   plot. It defaults to 3.0.
-
-3. Previously, the indices of the genes selected for mutation were
-   randomly generated once for all solutions within the generation.
-   Currently, the genes' indices are randomly generated for each
-   solution in the population. If the population has 4 solutions, the
-   indices are randomly generated 4 times inside the single generation,
-   1 time for each solution.
-
-4. Previously, the position of the point(s) for the single-point and
-   two-points crossover was(were) randomly selected once for all
-   solutions within the generation. Currently, the position(s) is(are)
-   randomly selected for each solution in the population. If the
-   population has 4 solutions, the position(s) is(are) randomly
-   generated 4 times inside the single generation, 1 time for each
-   solution.
-
-5. A new optional parameter named ``gene_space`` is added to the
-   ``pygad.GA`` class constructor. It is used to specify the possible
-   values for each gene in case the user wants to restrict the gene
-   values. It is useful if the gene space is restricted to a certain
-   range or to discrete values. For more information, check the `More
-   about the ``gene_space``
-   Parameter `__
-   section. Thanks to `Prof. Tamer A.
-   Farrag `__ for requesting this useful
-   feature.
-
-.. _pygad-260:
-
-PyGAD 2.6.0
------------
-
-Release Date: 6 August 2020
-
-1. A bug fix in assigning the value to the ``initial_population``
-   parameter.
-
-2. A new parameter named ``gene_type`` is added to control the gene
-   type. It can be either ``int`` or ``float``. It has an effect only
-   when the parameter ``gene_space`` is ``None``.
-
-3. 7 new parameters that accept callback functions: ``on_start``,
-   ``on_fitness``, ``on_parents``, ``on_crossover``, ``on_mutation``,
-   ``on_generation``, and ``on_stop``.
-
-.. _pygad-270:
-
-PyGAD 2.7.0
------------
-
-Release Date: 11 September 2020
-
-1. The ``learning_rate`` parameter in the ``pygad.nn.train()`` function
-   defaults to **0.01**.
-
-2. Added support of building neural networks for regression using the
-   new parameter named ``problem_type``. It is added as a parameter to
-   both ``pygad.nn.train()`` and ``pygad.nn.predict()`` functions. The
-   value of this parameter can be either **classification** or
-   **regression** to define the problem type. It defaults to
-   **classification**.
-
-3. The activation function for a layer can be set to the string
-   ``"None"`` to indicate that there is no activation function at this
-   layer. As a result, the supported values for the activation function
-   are ``"sigmoid"``, ``"relu"``, ``"softmax"``, and ``"None"``.
-
-To build a regression network using the ``pygad.nn`` module, just do
-the following:
-
-1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and
-   ``pygad.nn.predict()`` functions to the string ``"regression"``.
-
-2. Set the activation function for the output layer to the string
-   ``"None"``. This sets no limits on the range of the outputs as it
-   will be from ``-infinity`` to ``+infinity``. If you are sure that
-   all outputs will be nonnegative values, then use the ReLU function.
-
-Check the documentation of the ``pygad.nn`` module for an example that
-builds a neural network for regression. The regression example is also
-available at `this GitHub
-project `__:
-https://github.com/ahmedfgad/NumPyANN
-
-To build and train a regression network using the ``pygad.gann``
-module, do the following:
-
-1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and
-   ``pygad.nn.predict()`` functions to the string ``"regression"``.
-
-2. Set the ``output_activation`` parameter in the constructor of the
-   ``pygad.gann.GANN`` class to ``"None"``.
-
-Check the documentation of the ``pygad.gann`` module for an example
-that builds and trains a neural network for regression. The regression
-example is also available at `this GitHub
-project `__:
-https://github.com/ahmedfgad/NeuralGenetic
-
-To build a classification network, either ignore the ``problem_type``
-parameter or set it to ``"classification"`` (default value). In this
-case, the activation function of the last layer can be set to any type
-(e.g. softmax).
-
-.. _pygad-271:
-
-PyGAD 2.7.1
------------
-
-Release Date: 11 September 2020
-
-1. A bug fix when the ``problem_type`` argument is set to
-   ``regression``.
-
-.. _pygad-272:
-
-PyGAD 2.7.2
------------
-
-Release Date: 14 September 2020
-
-1. Bug fix to support building and training regression neural networks
-   with multiple outputs.
-
-.. _pygad-280:
-
-PyGAD 2.8.0
------------
-
-Release Date: 20 September 2020
-
-1. Support of a new module named ``kerasga`` so that the Keras models
-   can be trained by the genetic algorithm using PyGAD.
-
-.. _pygad-281:
-
-PyGAD 2.8.1
------------
-
-Release Date: 3 October 2020
-
-1. Bug fix in applying the crossover operation when the
-   ``crossover_probability`` parameter is used. Thanks to `Eng. Hamada
-   Kassem, Research and Teaching Assistant, Construction Engineering
-   and Management, Faculty of Engineering, Alexandria University,
-   Egypt `__.
-
-.. _pygad-290:
-
-PyGAD 2.9.0
------------
-
-Release Date: 06 December 2020
-
-1. The fitness values of the initial population are considered in the
-   ``best_solutions_fitness`` attribute.
-
-2. An optional parameter named ``save_best_solutions`` is added. It
-   defaults to ``False``. When it is ``True``, then the best solution
-   after each generation is saved into an attribute named
-   ``best_solutions``. If ``False``, then no solutions are saved and
-   the ``best_solutions`` attribute will be empty.
-
-3. Scattered crossover is supported. To use it, assign the
-   ``crossover_type`` parameter the value ``"scattered"``.
-
-4. NumPy arrays are now supported by the ``gene_space`` parameter.
-
-5. The following parameters (``gene_type``, ``crossover_probability``,
-   ``mutation_probability``, ``delay_after_gen``) can be assigned a
-   numeric value of any of these data types: ``int``, ``float``,
-   ``numpy.int``, ``numpy.int8``, ``numpy.int16``, ``numpy.int32``,
-   ``numpy.int64``, ``numpy.float``, ``numpy.float16``,
-   ``numpy.float32``, or ``numpy.float64``.
-
-.. _pygad-2100:
-
-PyGAD 2.10.0
-------------
-
-Release Date: 03 January 2021
-
-1.  Support of a new module ``pygad.torchga`` to train PyTorch models
-    using PyGAD. Check `its
-    documentation `__.
-
-2.  Support of adaptive mutation where the mutation rate is determined
-    by the fitness value of each solution. Read the `Adaptive
-    Mutation `__
-    section for more details. Also, read this paper: `Libelli, S.
-    Marsili, and P. Alba. "Adaptive mutation in genetic algorithms."
-    Soft computing 4.2 (2000):
-    76-80. `__
-
-3.  Before the ``run()`` method completes or exits, the fitness value
-    of the best solution in the current population is appended to the
-    ``best_solutions_fitness`` list attribute. Note that the fitness
-    value of the best solution in the initial population is already
-    saved at the beginning of the list. So, the fitness value of the
-    best solution is saved before the genetic algorithm starts and
-    after it ends.
-
-4.  When the parameter ``parent_selection_type`` is set to ``sss``
-    (steady-state selection), then a warning message is printed if the
-    value of the ``keep_parents`` parameter is set to 0.
-
-5.  More validations of the user input parameters.
-
-6.  The default value of the ``mutation_percent_genes`` is set to the
-    string ``"default"`` rather than the integer 10. This change helps
-    to know whether the user explicitly passed a value to the
-    ``mutation_percent_genes`` parameter or it is left to its default
-    one. The ``"default"`` value is later translated into the integer
-    10.
-
-7.  The ``mutation_percent_genes`` parameter no longer accepts the
-    value 0. It must be ``>0`` and ``<=100``.
-
-8.  The built-in ``warnings`` module is used to show warning messages
-    rather than just using the ``print()`` function.
-
-9.  A new ``bool`` parameter called ``suppress_warnings`` is added to
-    the constructor of the ``pygad.GA`` class. It allows the user to
-    control whether the warning messages are printed or not. It
-    defaults to ``False`` which means the messages are printed.
-
-10. A helper method called ``adaptive_mutation_population_fitness()``
-    is created to calculate the average fitness value used in adaptive
-    mutation to filter the solutions.
-
-11. The ``best_solution()`` method accepts a new optional parameter
-    called ``pop_fitness``. It accepts a list of the fitness values of
-    the solutions in the population. If ``None``, then the
-    ``cal_pop_fitness()`` method is called to calculate the fitness
-    values of the population.
-
-.. _pygad-2101:
-
-PyGAD 2.10.1
-------------
-
-Release Date: 10 January 2021
-
-1. In the ``gene_space`` parameter, any ``None`` value (regardless of
-   its index or axis) is replaced by a randomly generated number based
-   on the 3 parameters ``init_range_low``, ``init_range_high``, and
-   ``gene_type``. So, the ``None`` values in ``[..., None, ...]`` or
-   ``[..., [..., None, ...], ...]`` are replaced with random values.
-   This gives more freedom in building the space of values for the
-   genes.
-
-2. All the numbers passed to the ``gene_space`` parameter are cast to
-   the type specified in the ``gene_type`` parameter.
-
-3. The ``numpy.uint`` data type is supported for the parameters that
-   accept integer values.
-
-4. In the ``pygad.kerasga`` module, the ``model_weights_as_vector()``
-   function uses the ``trainable`` attribute of the model's layers to
-   only return the trainable weights in the network. So, only the
-   trainable layers with their ``trainable`` attribute set to ``True``
-   (``trainable=True``), which is the default value, have their weights
-   evolved. All non-trainable layers with the ``trainable`` attribute
-   set to ``False`` (``trainable=False``) will not be evolved. Thanks
-   to `Prof. Tamer A. Farrag `__ for
-   pointing that out at
-   `GitHub `__.
-
-.. _pygad-2102:
-
-PyGAD 2.10.2
-------------
-
-Release Date: 15 January 2021
-
-1. A bug fix when ``save_best_solutions=True``. Refer to this issue for
-   more information:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/25
-
-.. _pygad-2110:
-
-PyGAD 2.11.0
-------------
-
-Release Date: 16 February 2021
-
-1. In the ``gene_space`` argument, the user can use a dictionary to
-   specify the lower and upper limits of the gene. This dictionary must
-   have only 2 items with keys ``low`` and ``high`` to specify the low
-   and high limits of the gene, respectively. This way, PyGAD takes
-   care of not exceeding the value limits of the gene. For a problem
-   with only 2 genes, using
-   ``gene_space=[{'low': 1, 'high': 5}, {'low': 0.2, 'high': 0.81}]``
-   means the accepted values in the first gene start from 1 (inclusive)
-   to 5 (exclusive) while the second one has values between 0.2
-   (inclusive) and 0.81 (exclusive). For more information, please check
-   the `Limit the Gene Value
-   Range `__
-   section of the documentation.
-
-2. The ``plot_result()`` method returns the figure so that the user can
-   save it.
-
-3. Bug fixes in copying elements from the gene space.
-
-4. For a gene with a set of discrete values (more than 1 value) in the
-   ``gene_space`` parameter like ``[0, 1]``, it was possible that the
-   gene value may not change after mutation. That is, if the current
-   value is 0, then the randomly selected value could also be 0. Now,
-   it is verified that the new value is changed. So, if the current
-   value is 0, then the new value after mutation will not be 0 but 1.
-
-.. _pygad-2120:
-
-PyGAD 2.12.0
-------------
-
-Release Date: 20 February 2021
-
-1. 4 new instance attributes are added to hold temporary results after
-   each generation: ``last_generation_fitness`` holds the fitness
-   values of the solutions in the last generation,
-   ``last_generation_parents`` holds the parents selected from the last
-   generation, ``last_generation_offspring_crossover`` holds the
-   offspring generated after applying the crossover in the last
-   generation, and ``last_generation_offspring_mutation`` holds the
-   offspring generated after applying the mutation in the last
-   generation. You can access these attributes inside the
-   ``on_generation()`` method for example.
-
-2. A bug is fixed when the ``initial_population`` parameter is used.
-   The bug occurred due to a mismatch between the data type of the
-   array assigned to ``initial_population`` and the gene type in the
-   ``gene_type`` attribute. Assume that the array assigned to the
-   ``initial_population`` parameter is
-   ``((1, 1), (3, 3), (5, 5), (7, 7))`` which has type ``int``. When
-   ``gene_type`` is set to ``float``, then the genes will not be float
-   but cast to ``int`` because the defined array has ``int`` type.
-
-.. _pygad-2120:
-
-PyGAD 2.12.0
-------------
-
-Release Date: 20 February 2021
-
-1. 4 new instance attributes are added to hold temporary results after
-   each generation: ``last_generation_fitness`` holds the fitness values
-   of the solutions in the last generation, ``last_generation_parents``
-   holds the parents selected from the last generation,
-   ``last_generation_offspring_crossover`` holds the offspring generated
-   after applying the crossover in the last generation, and
-   ``last_generation_offspring_mutation`` holds the offspring generated
-   after applying the mutation in the last generation. You can access
-   these attributes inside the ``on_generation()`` method for example.
-
-2. A bug was fixed when the ``initial_population`` parameter is used. The
-   bug occurred due to a mismatch between the data type of the array
-   assigned to ``initial_population`` and the gene type in the
-   ``gene_type`` attribute. Assume that the array assigned to the
-   ``initial_population`` parameter is
-   ``((1, 1), (3, 3), (5, 5), (7, 7))`` which has type ``int``. When
-   ``gene_type`` is set to ``float``, then the genes will not be float
-   but cast to ``int`` because the defined array has ``int`` type. The
-   bug is fixed by forcing the array assigned to ``initial_population``
-   to have the data type in the ``gene_type`` attribute. Check the
-   `issue at
-   GitHub `__:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/27
-
-Thanks to Andrei Rozanski [PhD Bioinformatics Specialist, Department of
-Tissue Dynamics and Regeneration, Max Planck Institute for Biophysical
-Chemistry, Germany] for opening my eye to the first change.
-
-Thanks to `Marios
-Giouvanakis `__,
-a PhD candidate in Electrical & Computer Engineering, `Aristotle University
-of Thessaloniki (Αριστοτέλειο Πανεπιστήμιο Θεσσαλονίκης),
-Greece `__, for emailing me about the second
-issue.
-
-.. _pygad-2130:
-
-PyGAD 2.13.0
-------------
-
-Release Date: 12 March 2021
-
-1. A new ``bool`` parameter called ``allow_duplicate_genes`` is
-   supported. If ``True``, which is the default, then a
-   solution/chromosome may have duplicate gene values. If ``False``,
-   then each gene will have a unique value in its solution. Check the
-   `Prevent Duplicates in Gene
-   Values `__
-   section for more details.
-
-2. The ``last_generation_fitness`` attribute is updated at the end of
-   each generation, not at the beginning. This keeps the fitness values
-   of the most up-to-date population assigned to the
-   ``last_generation_fitness`` attribute.
-
-.. _pygad-2140:
-
-PyGAD 2.14.0
-------------
-
-PyGAD 2.14.0 has an issue that is solved in PyGAD 2.14.1. Please
-consider using 2.14.1 not 2.14.0.
-
-Release Date: 19 May 2021
-
-1. `Issue
-   #40 `__
-   is solved. Now, the ``None`` value works with the ``crossover_type``
-   and ``mutation_type`` parameters:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/40
-
-2. The ``gene_type`` parameter supports accepting a
-   ``list/tuple/numpy.ndarray`` of numeric data types for the genes.
-   This helps to control the data type of each individual gene.
-   Previously, the ``gene_type`` could be assigned only a single data
-   type that is applied to all genes. A short example is given after
-   this list. For more information, check the `More about the
-   ``gene_type``
-   Parameter `__
-   section. Thanks to `Rainer
-   Engel `__
-   for asking about this feature in `this
-   discussion `__:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43
-
-3. A new ``bool`` attribute named ``gene_type_single`` is added to the
-   ``pygad.GA`` class. It is ``True`` when there is a single data type
-   assigned to the ``gene_type`` parameter. When the ``gene_type``
-   parameter is assigned a ``list/tuple/numpy.ndarray``, then
-   ``gene_type_single`` is set to ``False``.
-
-4. The ``mutation_by_replacement`` flag now has no effect if
-   ``gene_space`` exists except for the genes with ``None`` values. For
-   example, for ``gene_space=[None, [5, 6]]`` the
-   ``mutation_by_replacement`` flag affects only the first gene which
-   has ``None`` for its value space.
-
-5. When an element has a value of ``None`` in the ``gene_space``
-   parameter (e.g. ``gene_space=[None, [5, 6]]``), then its value will
-   be randomly generated for each solution rather than being generated
-   once for all solutions. Previously, the gene with a ``None`` value in
-   ``gene_space`` was the same across all solutions.
-
-6. Some changes in the documentation according to `issue
-   #32 `__:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/32
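-
-Here is a minimal sketch of assigning a data type per gene, as described
-in the second point above; the fitness function is an illustrative
-assumption.
-
-.. code:: python
-
-    import pygad
-    import numpy
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        return numpy.sum(solution)
-
-    ga_instance = pygad.GA(num_generations=50,
-                           num_parents_mating=2,
-                           sol_per_pop=8,
-                           num_genes=3,
-                           fitness_func=fitness_func,
-                           # One data type per gene.
-                           gene_type=[int, float, numpy.int16])
-    ga_instance.run()
-
-    # Prints False because a list is assigned to gene_type.
-    print(ga_instance.gene_type_single)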
-
-.. _pygad-2142:
-
-PyGAD 2.14.2
-------------
-
-Release Date: 27 May 2021
-
-1. Some bug fixes when the ``gene_type`` parameter is nested. Thanks to
-   `Rainer
-   Engel `__
-   for opening `a
-   discussion `__
-   to report this bug:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43#discussioncomment-763342
-
-`Rainer
-Engel `__
-helped a lot by suggesting new features and enhancements for the
-2.14.0 to 2.14.2 releases.
-
-.. _pygad-2143:
-
-PyGAD 2.14.3
-------------
-
-Release Date: 6 June 2021
-
-1. Some bug fixes when setting the ``save_best_solutions`` parameter to
-   ``True``. Previously, the best solution for generation ``i`` was
-   added into the ``best_solutions`` attribute at generation ``i+1``.
-   Now, the ``best_solutions`` attribute is updated by each best
-   solution at its exact generation.
-
-.. _pygad-2150:
-
-PyGAD 2.15.0
-------------
-
-Release Date: 17 June 2021
-
-1. Control the precision of all genes/individual genes. Thanks to
-   `Rainer `__ for asking about this
-   feature:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43#discussioncomment-763452
-
-2. A new attribute named ``last_generation_parents_indices`` holds the
-   indices of the selected parents in the last generation.
-
-3. In adaptive mutation, there is no need to recalculate the fitness
-   values of the parents selected in the last generation as these
-   values can be returned based on the ``last_generation_fitness`` and
-   ``last_generation_parents_indices`` attributes. This speeds up
-   adaptive mutation.
-
-4. When a sublist has a value of ``None`` in the ``gene_space``
-   parameter (e.g. ``gene_space=[[1, 2, 3], [5, 6, None]]``), then its
-   value will be randomly generated for each solution rather than being
-   generated once for all solutions. Previously, a value of ``None`` in
-   a sublist of the ``gene_space`` parameter was identical across all
-   solutions.
-
-5. The dictionary assigned to the ``gene_space`` parameter itself or
-   one of its elements has a new key called ``"step"`` to specify the
-   step of moving from the start to the end of the range specified by
-   the 2 existing keys ``"low"`` and ``"high"``. An example is
-   ``{"low": 0, "high": 30, "step": 2}`` to have only even values for
-   the gene(s) starting from 0 to 30. For more information, check the
-   `More about the ``gene_space``
-   Parameter `__
-   section.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/48
-
-6. A new function called ``predict()`` is added in both the
-   ``pygad.kerasga`` and ``pygad.torchga`` modules to make predictions.
-   This makes it easier than writing custom code each time a prediction
-   is to be made.
-
-7. A new parameter called ``stop_criteria`` allows the user to specify
-   one or more stop criteria to stop the evolution based on some
-   conditions. Each criterion is passed as a ``str`` which has a stop
-   word. The current 2 supported words are ``reach`` and ``saturate``.
-   ``reach`` stops the ``run()`` method if the fitness value is equal
-   to or greater than a given fitness value. An example for ``reach``
-   is ``"reach_40"`` which stops the evolution if the fitness is >= 40.
-   ``saturate`` means stop the evolution if the fitness saturates for a
-   given number of consecutive generations. An example for ``saturate``
-   is ``"saturate_7"`` which means stop the ``run()`` method if the
-   fitness does not change for 7 consecutive generations. A sketch is
-   given after this list. Thanks to
-   `Rainer `__ for asking about this
-   feature:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/44
-
-8. A new ``bool`` parameter named ``save_solutions``, which defaults to
-   ``False``, is added to the constructor of the ``pygad.GA`` class. If
-   ``True``, then all solutions in each generation are appended into an
-   attribute called ``solutions`` which is a NumPy array.
-
-9. The ``plot_result()`` method is renamed to ``plot_fitness()``. The
-   users should migrate to the new name as the old name will be removed
-   in the future.
-
-10. Four new optional parameters are added to the ``plot_fitness()``
-    function in the ``pygad.GA`` class which are ``font_size=14``,
-    ``save_dir=None``, ``color="#3870FF"``, and ``plot_type="plot"``.
-    Use ``font_size`` to change the font of the plot title and labels.
-    ``save_dir`` accepts the directory to which the figure is saved. It
-    defaults to ``None`` which means do not save the figure. ``color``
-    changes the color of the plot. ``plot_type`` changes the plot type
-    which can be either ``"plot"`` (default), ``"scatter"``, or
-    ``"bar"``.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/47
-
-11. The default value of the ``title`` parameter in the
-    ``plot_fitness()`` method is ``"PyGAD - Generation vs. Fitness"``
-    rather than ``"PyGAD - Iteration vs. Fitness"``.
-
-12. A new method named ``plot_new_solution_rate()`` creates, shows, and
-    returns a figure showing the rate of new/unique solutions explored
-    in each generation. It accepts the same parameters as in the
-    ``plot_fitness()`` method. This method only works when
-    ``save_solutions=True`` in the ``pygad.GA`` class's constructor.
-
-13. A new method named ``plot_genes()`` creates, shows, and returns a
-    figure to show how each gene changes in each generation. It accepts
-    parameters similar to those of the ``plot_fitness()`` method in
-    addition to the ``graph_type``, ``fill_color``, and ``solutions``
-    parameters. The ``graph_type`` parameter can be either ``"plot"``
-    (default), ``"boxplot"``, or ``"histogram"``. ``fill_color`` accepts
-    the fill color which works when ``graph_type`` is either
-    ``"boxplot"`` or ``"histogram"``. ``solutions`` can be either
-    ``"all"`` or ``"best"`` to decide whether all solutions or only best
-    solutions are used.
-
-14. The ``gene_type`` parameter now supports controlling the precision
-    of ``float`` data types. For a gene, rather than assigning just the
-    data type like ``float``, assign a
-    ``list``/``tuple``/``numpy.ndarray`` with 2 elements where the first
-    one is the type and the second one is the precision. For example,
-    ``[float, 2]`` forces a gene with a value like ``0.1234`` to be
-    ``0.12``. For more information, check the `More about the
-    ``gene_type``
-    Parameter `__
-    section.
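-
-A minimal sketch of the ``stop_criteria`` parameter described in point 7
-above; the fitness function is an illustrative assumption.
-
-.. code:: python
-
-    import pygad
-    import numpy
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        return numpy.sum(solution)
-
-    ga_instance = pygad.GA(num_generations=1000,
-                           num_parents_mating=2,
-                           sol_per_pop=8,
-                           num_genes=4,
-                           fitness_func=fitness_func,
-                           init_range_low=0,
-                           init_range_high=10,
-                           # Stop if fitness >= 40 or if it does not
-                           # change for 7 consecutive generations.
-                           stop_criteria=["reach_40", "saturate_7"])
-    ga_instance.run()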
-
-.. _pygad-2151:
-
-PyGAD 2.15.1
-------------
-
-Release Date: 18 June 2021
-
-1. Fix a bug when ``keep_parents`` is set to a positive integer.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/49
-
-.. _pygad-2152:
-
-PyGAD 2.15.2
-------------
-
-Release Date: 18 June 2021
-
-1. Fix a bug when using the ``kerasga`` or ``torchga`` modules.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/51
-
-.. _pygad-2160:
-
-PyGAD 2.16.0
-------------
-
-Release Date: 19 June 2021
-
-1. A user-defined function can be passed to the ``mutation_type``,
-   ``crossover_type``, and ``parent_selection_type`` parameters in the
-   ``pygad.GA`` class to create custom mutation, crossover, and parent
-   selection operators. Check the `User-Defined Crossover, Mutation, and
-   Parent Selection
-   Operators `__
-   section for more details.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/50
-
-.. _pygad-2161:
-
-PyGAD 2.16.1
-------------
-
-Release Date: 28 September 2021
-
-1. The user can use the ``tqdm`` library to show a progress bar.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/50.
-
-.. code:: python
-
-    import pygad
-    import numpy
-    import tqdm
-
-    equation_inputs = [4,-2,3.5]
-    desired_output = 44
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        output = numpy.sum(solution * equation_inputs)
-        fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-        return fitness
-
-    num_generations = 10000
-    with tqdm.tqdm(total=num_generations) as pbar:
-        ga_instance = pygad.GA(num_generations=num_generations,
-                               sol_per_pop=5,
-                               num_parents_mating=2,
-                               num_genes=len(equation_inputs),
-                               fitness_func=fitness_func,
-                               on_generation=lambda _: pbar.update(1))
-
-        ga_instance.run()
-
-    ga_instance.plot_result()
-
-But this approach does not work if the ``ga_instance`` will be pickled
-(i.e. if the ``save()`` method will be called).
-
-.. code:: python
-
-    ga_instance.save("test")
-
-To solve this issue, define a function and pass it to the
-``on_generation`` parameter. In the next code, the
-``on_generation_progress()`` function is defined which updates the
-progress bar.
-
-.. code:: python
-
-    import pygad
-    import numpy
-    import tqdm
-
-    equation_inputs = [4,-2,3.5]
-    desired_output = 44
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        output = numpy.sum(solution * equation_inputs)
-        fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
-        return fitness
-
-    def on_generation_progress(ga):
-        pbar.update(1)
-
-    num_generations = 100
-    with tqdm.tqdm(total=num_generations) as pbar:
-        ga_instance = pygad.GA(num_generations=num_generations,
-                               sol_per_pop=5,
-                               num_parents_mating=2,
-                               num_genes=len(equation_inputs),
-                               fitness_func=fitness_func,
-                               on_generation=on_generation_progress)
-
-        ga_instance.run()
-
-    ga_instance.plot_result()
-
-    ga_instance.save("test")
-
-2. Solved the issue of unequal length between the ``solutions`` and
-   ``solutions_fitness`` when the ``save_solutions`` parameter is set to
-   ``True``. Now, the fitness of the last population is appended to the
-   ``solutions_fitness`` array.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/64
-
-3. There was an issue of getting the length of these 4 variables
-   (``solutions``, ``solutions_fitness``, ``best_solutions``, and
-   ``best_solutions_fitness``) doubled after each call of the ``run()``
-   method. This is solved by resetting these variables at the beginning
-   of the ``run()`` method.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/62
-
-4. Bug fixes when adaptive mutation is used
-   (``mutation_type="adaptive"``).
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/65
-
-.. _pygad-2162:
-
-PyGAD 2.16.2
-------------
-
-Release Date: 2 February 2022
-
-1. A new instance attribute called ``previous_generation_fitness`` is
-   added to the ``pygad.GA`` class. It holds the fitness values of one
-   generation before the fitness values saved in the
-   ``last_generation_fitness`` attribute.
-
-2. An issue in the ``cal_pop_fitness()`` method prevented getting the
-   correct indices of the previous parents. This is solved by using the
-   previous generation's fitness saved in the new attribute
-   ``previous_generation_fitness`` to return the parents' fitness
-   values. Thanks to Tobias Tischhauser (M.Sc. - `Mitarbeiter Institut
-   EMS, Departement Technik, OST – Ostschweizer Fachhochschule,
-   Switzerland `__)
-   for detecting this bug.
-
-.. _pygad-2163:
-
-PyGAD 2.16.3
-------------
-
-Release Date: 2 February 2022
-
-1. Validate the fitness value returned from the fitness function. An
-   exception is raised if something is wrong.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/67
-
-.. _pygad-2170:
-
-PyGAD 2.17.0
-------------
-
-Release Date: 8 July 2022
-
-1. An issue is solved when the ``gene_space`` parameter is given a fixed
-   value, e.g. ``gene_space=[range(5), 4]``. The second gene's value is
-   static (4), which previously caused an exception.
-
-2. Fixed the issue where the ``allow_duplicate_genes`` parameter did not
-   work when mutation is disabled (i.e. ``mutation_type=None``). This is
-   fixed by checking for duplicates directly after crossover.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/39
-
-3. Solve an issue in the ``tournament_selection()`` method as the
-   indices of the selected parents were incorrect.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/89
-
-4. Reuse the fitness values of the previously explored solutions rather
-   than recalculating them. This feature only works if
-   ``save_solutions=True``.
-
-5. Parallel processing is supported. This is by the introduction of a
-   new parameter named ``parallel_processing`` in the constructor of the
-   ``pygad.GA`` class. Thanks to
-   `@windowshopr `__ for opening the
-   issue
-   `#78 `__
-   at GitHub. Check the `Parallel Processing in
-   PyGAD `__
-   section for more information and examples.
-
-.. _pygad-2180:
-
-PyGAD 2.18.0
-------------
-
-Release Date: 9 September 2022
-
-1. Raise an exception if the sum of fitness values is zero while either
-   roulette wheel or stochastic universal parent selection is used.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/129
-
-2. Initialize the value of the ``run_completed`` property to ``False``.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/122
-
-3. The values of these properties are no longer reset with each call to
-   the ``run()`` method -
-   ``self.best_solutions, self.best_solutions_fitness, self.solutions, self.solutions_fitness``:
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/123. Now,
-   the user has the flexibility of calling the ``run()`` method more
-   than once while extending the data collected after each generation.
-   Another advantage happens when the instance is loaded and the
-   ``run()`` method is called, as the old fitness values are shown on
-   the graph alongside the new fitness values. Read more in this
-   section: `Continue without Losing
-   Progress `__
-
-4. Thanks to `Prof. Fernando Jiménez
-   Barrionuevo `__ (Dept. of Information and
-   Communications Engineering, University of Murcia, Murcia, Spain) for
-   editing this
-   `comment `__
-   in the code.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/commit/5315bbec02777df96ce1ec665c94dece81c440f4
-
-5. A bug was fixed when ``crossover_type=None``.
-
-6. Support of elitism selection through a new parameter named
-   ``keep_elitism``. It defaults to 1 which means for each generation
-   keep only the best solution in the next generation. If assigned 0,
-   then it has no effect. A sketch is given after this list. Read more
-   in this section: `Elitism
-   Selection `__.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/74
-
-7. A new instance attribute named ``last_generation_elitism`` is added
-   to hold the elitism in the last generation.
-
-8. A new parameter called ``random_seed`` is added to accept a seed for
-   the random function generators. Credit to this issue
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/70 and
-   `Prof. Fernando Jiménez Barrionuevo `__.
-   Read more in this section: `Random
-   Seed `__.
-
-9. Editing the ``pygad.TorchGA`` module to make sure the tensor data is
-   moved from GPU to CPU. Thanks to Rasmus Johansson for opening this
-   pull request: https://github.com/ahmedfgad/TorchGA/pull/2
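-
-A minimal sketch combining the new ``keep_elitism`` and ``random_seed``
-parameters; the fitness function is an illustrative assumption.
-
-.. code:: python
-
-    import pygad
-    import numpy
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        return numpy.sum(solution)
-
-    ga_instance = pygad.GA(num_generations=100,
-                           num_parents_mating=2,
-                           sol_per_pop=10,
-                           num_genes=5,
-                           fitness_func=fitness_func,
-                           keep_elitism=2,   # Keep the best 2 solutions.
-                           random_seed=42)   # Reproducible runs.
-    ga_instance.run()
-    print(ga_instance.last_generation_elitism)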
-
-.. _pygad-2181:
-
-PyGAD 2.18.1
-------------
-
-Release Date: 19 September 2022
-
-1. A bug fix when ``keep_elitism`` is used.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/132
-
-.. _pygad-2182:
-
-PyGAD 2.18.2
-------------
-
-Release Date: 14 February 2023
-
-1. Remove ``numpy.int`` and ``numpy.float`` from the list of supported
-   data types.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/151
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/152
-
-2. Call the ``on_crossover()`` callback function even if
-   ``crossover_type`` is ``None``.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
-
-3. Call the ``on_mutation()`` callback function even if
-   ``mutation_type`` is ``None``.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
-
-.. _pygad-2183:
-
-PyGAD 2.18.3
-------------
-
-Release Date: 14 February 2023
-
-1. Bug fixes.
-
-.. _pygad-2190:
-
-PyGAD 2.19.0
-------------
-
-Release Date: 22 February 2023
-
-1. A new ``summary()`` method is supported to return a Keras-like
-   summary of the PyGAD lifecycle.
-
-2. A new optional parameter called ``fitness_batch_size`` is supported
-   to calculate the fitness in batches. If it is assigned the value
-   ``1`` or ``None`` (default), then the normal flow is used where the
-   fitness function is called for each individual solution. If the
-   ``fitness_batch_size`` parameter is assigned a value satisfying this
-   condition ``1 < fitness_batch_size <= sol_per_pop``, then the
-   solutions are grouped into batches of size ``fitness_batch_size``
-   and the fitness function is called once for each batch. In this
-   case, the fitness function must return a list/tuple/numpy.ndarray
-   with a length equal to the number of solutions passed. A sketch is
-   given after this list.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/136.
-
-3. The ``cloudpickle`` library
-   (https://github.com/cloudpipe/cloudpickle) is used instead of the
-   ``pickle`` library to pickle the ``pygad.GA`` objects. This solves
-   the issue of having to redefine the functions (e.g. fitness
-   function). The ``cloudpickle`` library is added as a dependency in
-   the ``requirements.txt`` file.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/159
-
-4. Support of assigning methods to these parameters: ``fitness_func``,
-   ``crossover_type``, ``mutation_type``, ``parent_selection_type``,
-   ``on_start``, ``on_fitness``, ``on_parents``, ``on_crossover``,
-   ``on_mutation``, ``on_generation``, and ``on_stop``.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/92
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
-
-5. Validating the output of the parent selection, crossover, and
-   mutation functions.
-
-6. The built-in parent selection operators return the parents' indices
-   as a NumPy array.
-
-7. The outputs of the parent selection, crossover, and mutation
-   operators must be NumPy arrays.
-
-8. Fix an issue when ``allow_duplicate_genes=True``.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/39
-
-9. Fix an issue creating scatter plots of the solutions' fitness.
-
-10. Sampling from a ``set()`` is no longer supported in Python 3.11.
-    Instead, sampling happens from a ``list()``. Thanks to Marco Brenna
-    for pointing out this issue.
-
-11. The lifecycle is updated to reflect that the new population's
-    fitness is calculated at the end of the lifecycle not at the
-    beginning.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/154#issuecomment-1438739483
-
-12. There was an issue when ``save_solutions=True`` that caused the
-    fitness function to be called for solutions already explored that
-    have their fitness pre-calculated.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/160
-
-13. A new instance attribute named ``last_generation_elitism_indices``
-    is added to hold the indices of the selected elitism. This attribute
-    helps to reuse the fitness of the elitism instead of calling the
-    fitness function.
-
-14. Fewer calls to the ``best_solution()`` method, which in turn saves
-    some calls to the fitness function.
-
-15. Some updates in the documentation to give more details about the
-    ``cal_pop_fitness()`` method.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/79#issuecomment-1439605442
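-
-A minimal sketch of batch fitness calculation with ``fitness_batch_size``
-(point 2 above). The batch fitness function is an illustrative
-assumption, and its three-argument signature follows the fitness
-function examples earlier in this history; adjust it for your PyGAD
-version.
-
-.. code:: python
-
-    import pygad
-    import numpy
-
-    def fitness_func_batch(ga_instance, solutions, solutions_indices):
-        # Called once per batch; must return one fitness per solution.
-        return [numpy.sum(solution) for solution in solutions]
-
-    ga_instance = pygad.GA(num_generations=50,
-                           num_parents_mating=2,
-                           sol_per_pop=8,
-                           num_genes=4,
-                           fitness_func=fitness_func_batch,
-                           fitness_batch_size=4)
-    ga_instance.run()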
-
-.. _pygad-2191:
-
-PyGAD 2.19.1
-------------
-
-Release Date: 22 February 2023
-
-1. Add the `cloudpickle `__
-   library as a dependency.
-
-.. _pygad-2192:
-
-PyGAD 2.19.2
-------------
-
-Release Date: 23 February 2023
-
-1. Fix an issue when parallel processing was used where the elitism
-   solutions' fitness values are not re-used.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/160#issuecomment-1441718184
-
-.. _pygad-300:
-
-PyGAD 3.0.0
------------
-
-Release Date: 8 April 2023
-
-1. The structure of the library is changed and some methods defined in
-   the ``pygad.py`` module are moved to the ``pygad.utils``,
-   ``pygad.helper``, and ``pygad.visualize`` submodules.
-
-2. The ``pygad.utils.parent_selection`` module has a class named
-   ``ParentSelection`` where all the parent selection operators exist.
-   The ``pygad.GA`` class extends this class.
-
-3. The ``pygad.utils.crossover`` module has a class named ``Crossover``
-   where all the crossover operators exist. The ``pygad.GA`` class
-   extends this class.
-
-4. The ``pygad.utils.mutation`` module has a class named ``Mutation``
-   where all the mutation operators exist. The ``pygad.GA`` class
-   extends this class.
-
-5. The ``pygad.helper.unique`` module has a class named ``Unique`` where
-   some helper methods exist to solve duplicate genes and make sure
-   every gene is unique. The ``pygad.GA`` class extends this class.
-
-6. The ``pygad.visualize.plot`` module has a class named ``Plot`` where
-   all the methods that create plots exist. The ``pygad.GA`` class
-   extends this class.
-
-7. Support of using the ``logging`` module to log the outputs to both
-   the console and text file instead of using the ``print()`` function.
-   This is by assigning the ``logging.Logger`` to the new ``logger``
-   parameter. A sketch is given after this list. Check the `Logging
-   Outputs `__
-   section for more information.
-
-8. A new instance attribute called ``logger`` to save the logger.
-
-9. The function/method passed to the ``fitness_func`` parameter accepts
-   a new parameter that refers to the instance of the ``pygad.GA``
-   class. Check this for an example: `Use Functions and Methods to
-   Build Fitness Function and
-   Callbacks `__.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/163
-
-10. Update the documentation to include an example of using functions
-    and methods to calculate the fitness and build callbacks. Check this
-    for more details: `Use Functions and Methods to Build Fitness
-    Function and
-    Callbacks `__.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/92#issuecomment-1443635003
-
-11. Validate the value passed to the ``initial_population`` parameter.
-
-12. Validate the type and length of the ``pop_fitness`` parameter of the
-    ``best_solution()`` method.
-
-13. Some edits in the documentation.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/106
-
-14. Fix an issue when building the initial population as (some) genes
-    have their value taken from the mutation range (defined by the
-    parameters ``random_mutation_min_val`` and
-    ``random_mutation_max_val``) instead of using the parameters
-    ``init_range_low`` and ``init_range_high``.
-
-15. The ``summary()`` method returns the summary as a single-line
-    string. Just log/print the returned string to see it properly.
-
-16. The ``callback_generation`` parameter is removed. Use the
-    ``on_generation`` parameter instead.
-
-17. There was an issue when using the ``parallel_processing`` parameter
-    with Keras and PyTorch. As Keras/PyTorch are not thread-safe, the
-    ``predict()`` method gives incorrect and weird results when more
-    than 1 thread is used.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/145
-    https://github.com/ahmedfgad/TorchGA/issues/5
-    https://github.com/ahmedfgad/KerasGA/issues/6. Thanks to this
-    `StackOverflow
-    answer `__.
-
-18. Replace ``numpy.float`` by ``float`` in the 2 parent selection
-    operators roulette wheel and stochastic universal.
-    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/168
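-
-A minimal sketch of the ``logger`` parameter described in point 7 above;
-the fitness function and logger configuration are illustrative
-assumptions.
-
-.. code:: python
-
-    import logging
-    import pygad
-
-    logger = logging.getLogger("pygad_example")
-    logger.setLevel(logging.DEBUG)
-    # Log to the console; a logging.FileHandler can be added similarly.
-    console_handler = logging.StreamHandler()
-    console_handler.setFormatter(logging.Formatter("%(message)s"))
-    logger.addHandler(console_handler)
-
-    def fitness_func(ga_instance, solution, solution_idx):
-        return sum(solution)  # Illustrative fitness.
-
-    ga_instance = pygad.GA(num_generations=10,
-                           num_parents_mating=2,
-                           sol_per_pop=8,
-                           num_genes=4,
-                           fitness_func=fitness_func,
-                           logger=logger)
-    ga_instance.run()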
-
-.. _pygad-301:
-
-PyGAD 3.0.1
------------
-
-Release Date: 20 April 2023
-
-1. Fix an issue with passing a user-defined function/method for parent
-   selection.
-   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/179
-
-PyGAD Projects at GitHub
-========================
-
-The PyGAD library is available at PyPI at this page
-https://pypi.org/project/pygad. PyGAD is built out of a number of
-open-source GitHub projects. A brief note about these projects is given
-in the next subsections.
-
-`GeneticAlgorithmPython `__
---------------------------------------------------------------------------------
-
-GitHub Link: https://github.com/ahmedfgad/GeneticAlgorithmPython
-
-`GeneticAlgorithmPython `__
-is the first of these projects. It is an open-source Python 3 project
-that implements the genetic algorithm based on NumPy.
-
-`NumPyANN `__
-----------------------------------------------------
-
-GitHub Link: https://github.com/ahmedfgad/NumPyANN
-
-`NumPyANN `__ builds artificial
-neural networks in **Python 3** using **NumPy** from scratch. The
-purpose of this project is to only implement the **forward pass** of a
-neural network without using a training algorithm. Currently, it only
-supports classification; regression will also be supported later.
-Moreover, only one class is supported per sample.
-
-`NeuralGenetic `__
---------------------------------------------------------------
-
-GitHub Link: https://github.com/ahmedfgad/NeuralGenetic
-
-`NeuralGenetic `__ trains
-neural networks using the genetic algorithm based on the previous 2
-projects
-`GeneticAlgorithmPython `__
-and `NumPyANN `__.
-
-`NumPyCNN `__
-----------------------------------------------------
-
-GitHub Link: https://github.com/ahmedfgad/NumPyCNN
-
-`NumPyCNN `__ builds
-convolutional neural networks using NumPy. The purpose of this project
-is to only implement the **forward pass** of a convolutional neural
-network without using a training algorithm.
- -`CNNGenetic `__ --------------------------------------------------------- - -GitHub Link: https://github.com/ahmedfgad/CNNGenetic - -`CNNGenetic `__ trains -convolutional neural networks using the genetic algorithm. It uses the -`GeneticAlgorithmPython `__ -project for building the genetic algorithm. - -`KerasGA `__ --------------------------------------------------- - -GitHub Link: https://github.com/ahmedfgad/KerasGA - -`KerasGA `__ trains -`Keras `__ models using the genetic algorithm. It uses -the -`GeneticAlgorithmPython `__ -project for building the genetic algorithm. - -`TorchGA `__ --------------------------------------------------- - -GitHub Link: https://github.com/ahmedfgad/TorchGA - -`TorchGA `__ trains -`PyTorch `__ models using the genetic algorithm. It -uses the -`GeneticAlgorithmPython `__ -project for building the genetic algorithm. - -`pygad.torchga `__: -https://github.com/ahmedfgad/TorchGA - -Stackoverflow Questions about PyGAD -=================================== - -.. _how-do-i-proceed-to-load-a-gainstance-as-pkl-format-in-pygad: - -`How do I proceed to load a ga_instance as “.pkl” format in PyGad? `__ ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - -`Binary Classification NN Model Weights not being Trained in PyGAD `__ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - -`How to solve TSP problem using pyGAD package? `__ ---------------------------------------------------------------------------------------------------------------------------------------------- - -`How can I save a matplotlib plot that is the output of a function in jupyter? `__ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - -`How do I query the best solution of a pyGAD GA instance? `__ -------------------------------------------------------------------------------------------------------------------------------------------------------------------- - -`Multi-Input Multi-Output in Genetic algorithm (python) `__ --------------------------------------------------------------------------------------------------------------------------------------------------------------- - -https://www.linkedin.com/pulse/validation-short-term-parametric-trading-model-genetic-landolfi - -https://itchef.ru/articles/397758 - -https://audhiaprilliant.medium.com/genetic-algorithm-based-clustering-algorithm-in-searching-robust-initial-centroids-for-k-means-e3b4d892a4be - -https://python.plainenglish.io/validation-of-a-short-term-parametric-trading-model-with-genetic-optimization-and-walk-forward-89708b789af6 - -https://ichi.pro/ko/pygadwa-hamkke-yujeon-algolijeum-eul-sayonghayeo-keras-model-eul-hunlyeonsikineun-bangbeob-173299286377169 - -https://ichi.pro/tr/pygad-ile-genetik-algoritmayi-kullanarak-keras-modelleri-nasil-egitilir-173299286377169 - -https://ichi.pro/ru/kak-obucit-modeli-keras-s-pomos-u-geneticeskogo-algoritma-s-pygad-173299286377169 - -https://blog.csdn.net/sinat_38079265/article/details/108449614 - -Submitting Issues -================= - -If there is an issue using PyGAD, then use any of your preferred option -to discuss that issue. 
-
-One way is `submitting an
-issue `__
-into this GitHub project
-(`github.com/ahmedfgad/GeneticAlgorithmPython `__)
-in case something is not working properly or to ask questions.
-
-If this is not a suitable option for you, then check the `Contact
-Us `__
-section for more contact details.
-
-Ask for Feature
-===============
-
-PyGAD is actively developed with the goal of building a dynamic library
-for supporting a wide range of problems to be optimized using the
-genetic algorithm.
-
-To ask for a new feature, either `submit an
-issue `__
-into this GitHub project
-(`github.com/ahmedfgad/GeneticAlgorithmPython `__)
-or send an e-mail to ahmed.f.gad@gmail.com.
-
-Also check the `Contact
-Us `__
-section for more contact details.
-
-Projects Built using PyGAD
-==========================
-
-If you created a project that uses PyGAD, then we can support you by
-mentioning this project here in PyGAD's documentation.
-
-To do that, please send a message to ahmed.f.gad@gmail.com or check the
-`Contact
-Us `__
-section for more contact details.
-
-Within your message, please send the following details:
-
-- Project title
-
-- Brief description
-
-- Preferably, a link that directs the readers to your project
-
-Tutorials about PyGAD
-=====================
-
-`Adaptive Mutation in Genetic Algorithm with Python Examples `__
-------------------------------------------------------------------------------------------------------------------------------------------------------
-
-In this tutorial, we'll see why mutation with a fixed number of genes is
-bad, and how to replace it with adaptive mutation. Using the `PyGAD
-Python 3 library `__, we'll discuss a few
-examples that use both random and adaptive mutation.
-
-`Clustering Using the Genetic Algorithm in Python `__
--------------------------------------------------------------------------------------------------------------------------
-
-This tutorial discusses how the genetic algorithm is used to cluster
-data, starting from random clusters and running until the optimal
-clusters are found. We'll start by briefly revising the K-means
-clustering algorithm to point out its weak points, which are later
-solved by the genetic algorithm. The code examples in this tutorial are
-implemented in Python using the `PyGAD
-library `__.
-
-`Working with Different Genetic Algorithm Representations in Python `__
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
-
-Depending on the nature of the problem being optimized, the genetic
-algorithm (GA) supports two different gene representations: binary, and
-decimal. The binary GA has only two values for its genes, which are 0
-and 1. This is easier to manage as its gene values are limited compared
-to the decimal GA, for which we can use different formats like float or
-integer, and limited or unlimited ranges.
-
-This tutorial discusses how the
-`PyGAD `__ library supports the two GA
-representations, binary and decimal.
-
-.. _5-genetic-algorithm-applications-using-pygad:
-
-`5 Genetic Algorithm Applications Using PyGAD `__
--------------------------------------------------------------------------------------------------------------------------
-
-This tutorial introduces PyGAD, an open-source Python library for
-implementing the genetic algorithm and training machine learning
-algorithms. PyGAD supports 19 parameters for customizing the genetic
-algorithm for various applications.
-
-Within this tutorial we'll discuss 5 different applications of the
-genetic algorithm and build them using PyGAD.
-
-`Train Neural Networks Using a Genetic Algorithm in Python with PyGAD `__
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-
-The genetic algorithm (GA) is a biologically-inspired optimization
-algorithm. It has in recent years gained importance, as it's simple
-while also solving complex problems like travel route optimization,
-training machine learning algorithms, working with single and
-multi-objective problems, game playing, and more.
-
-Deep neural networks are inspired by the idea of how the biological
-brain works. They are universal function approximators, capable of
-simulating any function, and are now used to solve the most complex
-problems in machine learning. What's more, they're able to work with all
-types of data (images, audio, video, and text).
-
-Genetic algorithms (GAs) and neural networks (NNs) are similar in that
-both are biologically-inspired techniques. This similarity motivates us
-to create a hybrid of both to see whether a GA can train NNs with high
-accuracy.
-
-This tutorial uses `PyGAD `__, a Python
-library that supports building and training NNs using a GA.
-`PyGAD `__ offers both classification and
-regression NNs.
-
-`Building a Game-Playing Agent for CoinTex Using the Genetic Algorithm `__
-----------------------------------------------------------------------------------------------------------------------------
-
-In this tutorial we'll see how to build a game-playing agent using only
-the genetic algorithm to play a game called
-`CoinTex `__,
-which is developed in the Kivy Python framework. The objective of
-CoinTex is to collect the randomly distributed coins while avoiding
-collision with fire and monsters (that move randomly). The source code
-of CoinTex can be found `on
-GitHub `__.
-
-The genetic algorithm is the only AI used here; there is no other
-machine/deep learning model used with it. We'll implement the genetic
-algorithm using
-`PyGAD `__.
-This tutorial starts with a quick overview of CoinTex followed by a
-brief explanation of the genetic algorithm, and how it can be used to
-create the playing agent. Finally, we'll see how to implement these
-ideas in Python.
-
-The source code of the genetic algorithm agent is available
-`here `__,
-and you can download the code used in this tutorial from
-`here `__.
-
-`How To Train Keras Models Using the Genetic Algorithm with PyGAD `__
---------------------------------------------------------------------------------------------------------------------------------------------------------
-
-PyGAD is an open-source Python library for building the genetic
-algorithm and training machine learning algorithms. It offers a wide
-range of parameters to customize the genetic algorithm to work with
-different types of problems.
-
-PyGAD has its own modules that support building and training neural
-networks (NNs) and convolutional neural networks (CNNs). Despite these
-modules working well, they are implemented in Python without any
-additional optimization measures. This leads to comparatively high
-computational times for even simple problems.
-
-The latest PyGAD version, 2.8.0 (released on 20 September 2020),
-supports a new module to train Keras models.
-Even though Keras is built
-in Python, it's fast. The reason is that Keras uses TensorFlow as a
-backend, and TensorFlow is highly optimized.
-
-This tutorial discusses how to train Keras models using PyGAD. The
-discussion includes building Keras models using either the Sequential
-Model or the Functional API, building an initial population of Keras
-model parameters, creating an appropriate fitness function, and more.
-
-|image1|
-
-`Train PyTorch Models Using Genetic Algorithm with PyGAD `__
------------------------------------------------------------------------------------------------------------------------------------------------
-
-`PyGAD `__ is a genetic algorithm Python
-3 library for solving optimization problems. One of these problems is
-training machine learning algorithms.
-
-PyGAD has a module called
-`pygad.kerasga `__. It trains
-Keras models using the genetic algorithm. On January 3rd, 2021, a new
-release of `PyGAD 2.10.0 `__ brought a
-new module called
-`pygad.torchga `__ to train
-PyTorch models. It's very easy to use, but there are a few tricky steps.
-
-So, in this tutorial, we'll explore how to use PyGAD to train PyTorch
-models.
-
-|image2|
-
-`A Guide to Genetic 'Learning' Algorithms for Optimization `__
----------------------------------------------------------------------------------------------------------------------------------------------------
-
-PyGAD in Other Languages
-========================
-
-French
-------
-
-`Comment les algorithmes génétiques peuvent rivaliser avec la descente
-de gradient et le
-backprop `__
-
-Although the standard way of training neural networks is gradient
-descent and backpropagation, there are other players in the game. One of
-them is evolutionary algorithms, such as genetic algorithms.
-
-Use a genetic algorithm to train a simple neural network to solve the
-OpenAI CartPole game. In this article, we will train a simple neural
-network to solve the OpenAI CartPole. I will use PyTorch and PyGAD.
-
-|image3|
-
-Spanish
--------
-
-`Cómo los algoritmos genéticos pueden competir con el descenso de
-gradiente y el
-backprop `__
-
-Although the standard way of training neural networks is gradient
-descent and backpropagation, there are other players in the game; one of
-them is evolutionary algorithms, such as genetic algorithms.
-
-Use a genetic algorithm to train a simple neural network to solve the
-OpenAI CartPole game. In this article, we will train a simple neural
-network to solve the OpenAI CartPole. I will use PyTorch and PyGAD.
-
-|image4|
-
-Korean
-------
-
-`[PyGAD] Python 에서 Genetic Algorithm 을 사용해보기 `__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-|image5|
-
-I have not tried every Python package for genetic algorithms, but this
-one looked extensible and I had a reason to try it, so I took a look.
-
-What impressed me most about this package is that hyperparameter search
-for neural networks can be done with a GA instead of gradient descent.
-
-Personally, I think this can play a role in finding reasonably good
-initial values, and it can also serve as an alternative in structures
-where the loss is hard to optimize with gradient descent.
-
-The overall flow is as follows.
-
-To be honest, my understanding of the full flow and of each parameter is
-still incomplete.
-
-Turkish
--------
-
-`PyGAD ile Genetik Algoritmayı Kullanarak Keras Modelleri Nasıl Eğitilir `__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This is a translation of an original English tutorial published at
-Paperspace: `How To Train Keras Models Using the Genetic Algorithm with
-PyGAD `__
-
-PyGAD is an open-source Python library used for building the genetic
-algorithm and training machine learning algorithms. It offers a wide
-range of parameters to customize the genetic algorithm to work with
-different types of problems.
-
-PyGAD has its own modules that support building and training neural
-networks (NNs) and convolutional neural networks (CNNs). Although these
-modules work well, they are implemented in Python without any additional
-optimization measures. This leads to comparatively high computational
-times even for simple problems.
-
-The latest PyGAD version, 2.8.0 (released on 20 September 2020),
-supports a new module for training Keras models. Even though Keras is
-built in Python, it is fast. The reason is that Keras uses TensorFlow
-as a backend, and TensorFlow is highly optimized.
-
-This tutorial explains how to train Keras models using PyGAD. The
-discussion includes building Keras models using either the Sequential
-Model or the Functional API, building an initial population of Keras
-model parameters, creating an appropriate fitness function, and more.
-
-|image6|
-
-Hungarian
----------
-
-.. _tensorflow-alapozó-10-neurális-hálózatok-tenyésztése-genetikus-algoritmussal-pygad-és-openai-gym-használatával:
-
-`Tensorflow alapozó 10. Neurális hálózatok tenyésztése genetikus algoritmussal PyGAD és OpenAI Gym használatával `__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-To put genetic algorithms in context, let's briefly review how gradient
-descent and backpropagation work, which is the standard way of training
-neural networks. You can read my article about that here.
-
-To breed the networks, we use the library called
-`PyGAD `__, so first of all we need to
-install it, together with TensorFlow and Gym, which come preinstalled
-in Colab.
-
-PyGAD itself is a completely general system capable of running genetic
-algorithms. Its extension is KerasGA, which helps run the general engine
-on TensorFlow (Keras) neural networks. The KerasGA object created on
-line 47 is part of this extension and serves to create, from the model
-passed as a parameter, a population of the size given in the second
-parameter. Since our network has 386 adjustable parameters, our DNA here
-will consist of 386 elements. The population size is 10 individuals, so
-our starting population will be a 10x386 matrix. We pass this in the
-initial_population parameter on line 51.
-
-|image7|
-
-Russian
--------
-
-`PyGAD: библиотека для имплементации генетического алгоритма `__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-PyGAD is a library for implementing the genetic algorithm. In addition,
-the library provides access to optimized implementations of machine
-learning algorithms. PyGAD was developed in Python 3.
-
-The PyGAD library supports different types of crossover, mutation, and
-parent selection. PyGAD allows optimizing problems with the genetic
-algorithm by customizing the fitness function.
-
-Besides the genetic algorithm, the library contains optimized
-implementations of machine learning algorithms. At the moment, PyGAD
-supports building and training neural networks for classification
-tasks.
-
-The library is under active development. The creators plan to add
-functionality for solving binary problems and implementing new
-algorithms.
-
-PyGAD was developed in Python 3.7.3. Dependencies include NumPy for
-creating and manipulating arrays and Matplotlib for visualization. One
-use case of the tool is optimizing weights that satisfy a given
-function.
-
-|image8|
-
-Research Papers using PyGAD
-===========================
-
-A number of research papers used PyGAD and here are some of them:
-
-- Jaros, Marta, and Jiri Jaros. "Performance-Cost Optimization of
-  Moldable Scientific Workflows."
-
-- Thorat, Divya. "Enhanced genetic algorithm to reduce makespan of
-  multiple jobs in map-reduce application on serverless platform".
-  Diss. Dublin, National College of Ireland, 2020.
-
-- Koch, Chris, and Edgar Dobriban. "AttenGen: Generating Live
-  Attenuated Vaccine Candidates using Machine Learning." (2021).
-
-- Bhardwaj, Bhavya, et al. "Windfarm optimization using Nelder-Mead and
-  Particle Swarm optimization." *2021 7th International Conference on
-  Electrical Energy Systems (ICEES)*. IEEE, 2021.
-
-- Bernardo, Reginald Christian S. and J. Said. "Towards a
-  model-independent reconstruction approach for late-time Hubble data."
-  (2021).
-
-- Duong, Tri Dung, Qian Li, and Guandong Xu. "Prototype-based
-  Counterfactual Explanation for Causal Classification." *arXiv
-  preprint arXiv:2105.00703* (2021).
-
-- Farrag, Tamer Ahmed, and Ehab E. Elattar. "Optimized Deep Stacked
-  Long Short-Term Memory Network for Long-Term Load Forecasting." *IEEE
-  Access* 9 (2021): 68511-68522.
-
-- Antunes, E. D. O., Caetano, M. F., Marotta, M. A., Araujo, A.,
-  Bondan, L., Meneguette, R. I., & Rocha Filho, G. P. (2021, August).
-  Soluções Otimizadas para o Problema de Localização de Máxima
-  Cobertura em Redes Militarizadas 4G/LTE. In *Anais do XXVI Workshop
-  de Gerência e Operação de Redes e Serviços* (pp. 152-165). SBC.
-
-- M. Yani, F. Ardilla, A. A. Saputra and N. Kubota, "Gradient-Free Deep
-  Q-Networks Reinforcement learning: Benchmark and Evaluation," *2021
-  IEEE Symposium Series on Computational Intelligence (SSCI)*, 2021,
-  pp. 1-5, doi: 10.1109/SSCI50451.2021.9659941.
-
-- Yani, Mohamad, and Naoyuki Kubota. "Deep Convolutional Networks with
-  Genetic Algorithm for Reinforcement Learning Problem."
-
-- Mahendra, Muhammad Ihza, and Isman Kurniawan. "Optimizing
-  Convolutional Neural Network by Using Genetic Algorithm for COVID-19
-  Detection in Chest X-Ray Image." *2021 International Conference on
-  Data Science and Its Applications (ICoDSA)*. IEEE, 2021.
-
-- Glibota, Vjeko. *Umjeravanje mikroskopskog prometnog modela primjenom
-  genetskog algoritma*. Diss. University of Zagreb. Faculty of
-  Transport and Traffic Sciences. Division of Intelligent Transport
-  Systems and Logistics. Department of Intelligent Transport Systems,
-  2021.
-
-- Zhu, Mingda. *Genetic Algorithm-based Parameter Identification for
-  Ship Manoeuvring Model under Wind Disturbance*. MS thesis. NTNU,
-  2021.
-
-- Abdalrahman, Ahmed, and Weihua Zhuang. "Dynamic pricing for
-  differentiated pev charging services using deep reinforcement
-  learning." *IEEE Transactions on Intelligent Transportation Systems*
-  (2020).
-
-More Links
-==========
-
-https://rodriguezanton.com/identifying-contact-states-for-2d-objects-using-pygad-and/
-
-https://torvaney.github.io/projects/t9-optimised
-
-For More Information
-====================
-
-There are different resources that can be used to get started with the
-genetic algorithm and building it in Python.
-
-Tutorial: Implementing Genetic Algorithm in Python
---------------------------------------------------
-
-To start with coding the genetic algorithm, you can check the tutorial
-titled `Genetic Algorithm Implementation in
-Python `__
-available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-`This
-tutorial `__
-is prepared based on a previous version of the project but it is still a
-good resource to start with coding the genetic algorithm.
-
-|image9|
-
-Tutorial: Introduction to Genetic Algorithm
--------------------------------------------
-
-Get started with the genetic algorithm by reading the tutorial titled
-`Introduction to Optimization with Genetic
-Algorithm `__
-which is available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-|image10|
-
-Tutorial: Build Neural Networks in Python
------------------------------------------
-
-Read about building neural networks in Python through the tutorial
-titled `Artificial Neural Network Implementation using NumPy and
-Classification of the Fruits360 Image
-Dataset `__
-available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-|image11|
-
-Tutorial: Optimize Neural Networks with Genetic Algorithm
----------------------------------------------------------
-
-Read about training neural networks using the genetic algorithm through
-the tutorial titled `Artificial Neural Networks Optimization using
-Genetic Algorithm with
-Python `__
-available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-|image12|
-
-Tutorial: Building CNN in Python
---------------------------------
-
-To start with coding convolutional neural networks, you can check the
-tutorial titled `Building Convolutional Neural Network using NumPy from
-Scratch `__
-available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-- `Chinese Translation `__
-
-`This
-tutorial `__
-is prepared based on a previous version of the project but it is still a
-good resource to start with coding CNNs.
-
-|image13|
-
-Tutorial: Derivation of CNN from FCNN
--------------------------------------
-
-Learn how a convolutional neural network is derived from a fully
-connected network by reading the tutorial titled
-`Derivation of Convolutional Neural Network from Fully Connected Network
-Step-By-Step `__
-which is available at these links:
-
-- `LinkedIn `__
-
-- `Towards Data
-  Science `__
-
-- `KDnuggets `__
-
-|image14|
-
-Book: Practical Computer Vision Applications Using Deep Learning with CNNs
---------------------------------------------------------------------------
-
-You can also check my book cited as `Ahmed Fawzy Gad 'Practical Computer
-Vision Applications Using Deep Learning with CNNs'. Dec. 2018, Apress,
-978-1-4842-4167-7 `__
-which discusses neural networks, convolutional neural networks, deep
-learning, genetic algorithm, and more.
-
-Find the book at these links:
-
-- `Amazon `__
-
-- `Springer `__
-
-- `Apress `__
-
-- `O'Reilly `__
-
-- `Google Books `__
-
-.. figure:: https://user-images.githubusercontent.com/16560492/78830077-ae7c2800-79e7-11ea-980b-53b6bd879eeb.jpg
-   :alt:
-
-Contact Us
-==========
-
-- E-mail: ahmed.f.gad@gmail.com
-
-- `LinkedIn `__
-
-- `Amazon Author Page `__
-
-- `Heartbeat `__
-
-- `Paperspace `__
-
-- `KDnuggets `__
-
-- `TowardsDataScience `__
-
-- `GitHub `__
-
-.. figure:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
-   :alt:
-
-Thank you for using
-`PyGAD `__ :)
-
-.. |image1| image:: https://user-images.githubusercontent.com/16560492/111009628-2b372500-8362-11eb-90cf-01b47d831624.png
-   :target: https://blog.paperspace.com/train-keras-models-using-genetic-algorithm-with-pygad
-.. |image2| image:: https://user-images.githubusercontent.com/16560492/111009678-5457b580-8362-11eb-899a-39e2f96984df.png
-   :target: https://neptune.ai/blog/train-pytorch-models-using-genetic-algorithm-with-pygad
-.. |image3| image:: https://user-images.githubusercontent.com/16560492/111009275-3178d180-8361-11eb-9e86-7fb1519acde7.png
-   :target: https://www.hebergementwebs.com/nouvelles/comment-les-algorithmes-genetiques-peuvent-rivaliser-avec-la-descente-de-gradient-et-le-backprop
-.. |image4| image:: https://user-images.githubusercontent.com/16560492/111009257-232ab580-8361-11eb-99a5-7226efbc3065.png
-   :target: https://www.hebergementwebs.com/noticias/como-los-algoritmos-geneticos-pueden-competir-con-el-descenso-de-gradiente-y-el-backprop
-.. |image5| image:: https://user-images.githubusercontent.com/16560492/108586306-85bd0280-731b-11eb-874c-7ac4ce1326cd.jpg
-   :target: https://data-newbie.tistory.com/m/685
-.. |image6| image:: https://user-images.githubusercontent.com/16560492/108586601-85be0200-731d-11eb-98a4-161c75a1f099.jpg
-   :target: https://erencan34.medium.com/pygad-ile-genetik-algoritmay%C4%B1-kullanarak-keras-modelleri-nas%C4%B1l-e%C4%9Fitilir-cf92639a478c
-.. |image7| image:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
-   :target: https://thebojda.medium.com/tensorflow-alapoz%C3%B3-10-24f7767d4a2c
-.. |image8| image:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
-   :target: https://neurohive.io/ru/frameworki/pygad-biblioteka-dlya-implementacii-geneticheskogo-algoritma
-.. |image9| image:: https://user-images.githubusercontent.com/16560492/78830052-a3c19300-79e7-11ea-8b9b-4b343ea4049c.png
-   :target: https://www.linkedin.com/pulse/genetic-algorithm-implementation-python-ahmed-gad
-.. |image10| image:: https://user-images.githubusercontent.com/16560492/82078259-26252d00-96e1-11ea-9a02-52a99e1054b9.jpg
-   :target: https://www.linkedin.com/pulse/introduction-optimization-genetic-algorithm-ahmed-gad
-.. |image11| image:: https://user-images.githubusercontent.com/16560492/82078281-30472b80-96e1-11ea-8017-6a1f4383d602.jpg
-   :target: https://www.linkedin.com/pulse/artificial-neural-network-implementation-using-numpy-fruits360-gad
-.. |image12| image:: https://user-images.githubusercontent.com/16560492/82078300-376e3980-96e1-11ea-821c-aa6b8ceb44d4.jpg
-   :target: https://www.linkedin.com/pulse/artificial-neural-networks-optimization-using-genetic-ahmed-gad
-.. |image13| image:: https://user-images.githubusercontent.com/16560492/82431022-6c3a1200-9a8e-11ea-8f1b-b055196d76e3.png
-   :target: https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad
-.. |image14| image:: https://user-images.githubusercontent.com/16560492/82431369-db176b00-9a8e-11ea-99bd-e845192873fc.png
-   :target: https://www.linkedin.com/pulse/derivation-convolutional-neural-network-from-fully-connected-gad
+Release History
+===============
+
+.. figure:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
+   :alt:
+
+.. _pygad-1017:
+
+PyGAD 1.0.17
+------------
+
+Release Date: 15 April 2020
+
+1. The **pygad.GA** class accepts a new argument named ``fitness_func``
+   which accepts a function to be used for calculating the fitness
+   values of the solutions. This allows the project to be customized to
+   any problem by building the right fitness function.
+
+.. _pygad-1020:
+
+PyGAD 1.0.20
+------------
+
+Release Date: 4 May 2020
+
+1. The **pygad.GA** attributes are moved from the class scope to the
+   instance scope.
+
+2. Raising an exception for incorrect values of the passed parameters.
+
+3. Two new parameters are added to the **pygad.GA** class constructor
+   (``init_range_low`` and ``init_range_high``) allowing the user to
+   customize the range from which the gene values in the initial
+   population are selected.
+
+4. The code object ``__code__`` of the passed fitness function is
+   checked to ensure it has the right number of parameters.
+
+.. _pygad-200:
+
+PyGAD 2.0.0
+-----------
+
+Release Date: 13 May 2020
+
+1. The fitness function accepts a new argument named ``sol_idx``
+   representing the index of the solution within the population.
+
+2. A new parameter to the **pygad.GA** class constructor named
+   ``initial_population`` is supported to allow the user to use a custom
+   initial population to be used by the genetic algorithm. If not
+   ``None``, then the passed population will be used. If ``None``, then
+   the genetic algorithm will create the initial population using the
+   ``sol_per_pop`` and ``num_genes`` parameters.
+
+3. The parameters ``sol_per_pop`` and ``num_genes`` are optional and set
+   to ``None`` by default.
+
+4. A new parameter named ``callback_generation`` is introduced in the
+   **pygad.GA** class constructor. It accepts a function with a single
+   parameter representing the **pygad.GA** class instance. This function
+   is called after each generation. This helps the user to do
+   post-processing or debugging operations after each generation.
+
+.. _pygad-210:
+
+PyGAD 2.1.0
+-----------
+
+Release Date: 14 May 2020
+
+1. The ``best_solution()`` method in the **pygad.GA** class returns a
+   new output representing the index of the best solution within the
+   population.
+   Now, it returns a total of 3 outputs in this order: best solution,
+   best solution fitness, and best solution index. Here is an example:
+
+.. code:: python
+
+   solution, solution_fitness, solution_idx = ga_instance.best_solution()
+   print("Parameters of the best solution :", solution)
+   print("Fitness value of the best solution :", solution_fitness, "\n")
+   print("Index of the best solution :", solution_idx, "\n")
+
+2. | A new attribute named ``best_solution_generation`` is added to the
+   instances of the **pygad.GA** class. It holds the generation number
+   at which the best solution is reached. It is only assigned the
+   generation number after the ``run()`` method completes. Otherwise,
+   its value is -1.
+   | Example:
+
+.. code:: python
+
+   print("Best solution reached after {best_solution_generation} generations.".format(best_solution_generation=ga_instance.best_solution_generation))
+
+3. The ``best_solution_fitness`` attribute is renamed to
+   ``best_solutions_fitness`` (note the plural ``solutions``).
+
+4. Mutation is applied independently for each gene.
+
+.. _pygad-221:
+
+PyGAD 2.2.1
+-----------
+
+Release Date: 17 May 2020
+
+1. Added 2 extra modules (``pygad.nn`` and ``pygad.gann``) for building
+   and training neural networks with the genetic algorithm.
+
+.. _pygad-222:
+
+PyGAD 2.2.2
+-----------
+
+Release Date: 18 May 2020
+
+1. The initial value of the ``generations_completed`` attribute of
+   instances from the pygad.GA class is ``0`` rather than ``None``.
+
+2. An optional bool parameter named ``mutation_by_replacement`` is added
+   to the constructor of the pygad.GA class. It works only when the
+   selected type of mutation is random (``mutation_type="random"``). In
+   this case, setting ``mutation_by_replacement=True`` means that the
+   gene is replaced by the randomly generated value. If ``False``, then
+   it has no effect and random mutation works by adding the random value
+   to the gene. This parameter should be used when the gene falls within
+   a fixed range and its value must not go out of this range. Here are
+   some examples (a short sketch follows this list):
+
+Assume there is a gene with the value 0.5.
+
+If ``mutation_type="random"`` and ``mutation_by_replacement=False``,
+then the generated random value (e.g. 0.1) will be added to the gene
+value. The new gene value is **0.5+0.1=0.6**.
+
+If ``mutation_type="random"`` and ``mutation_by_replacement=True``, then
+the generated random value (e.g. 0.1) will replace the gene value. The
+new gene value is **0.1**.
+
+3. A ``None`` value can be assigned to the ``mutation_type`` and
+   ``crossover_type`` parameters of the pygad.GA class constructor. When
+   ``None``, the corresponding step is bypassed and has no action.
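+
+Here is a minimal, self-contained sketch of the behavior described
+above. The fitness function and all numeric values are illustrative
+only and are not part of the release itself.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): prefer genes close to 0.5.
+       return -float(numpy.sum(numpy.abs(solution - 0.5)))
+
+   # Replacing (instead of shifting) mutated genes keeps every gene
+   # inside the range [0.0, 1.0].
+   ga_instance = pygad.GA(num_generations=10,
+                          num_parents_mating=2,
+                          sol_per_pop=6,
+                          num_genes=3,
+                          fitness_func=fitness_func,
+                          init_range_low=0.0,
+                          init_range_high=1.0,
+                          mutation_type="random",
+                          mutation_by_replacement=True,
+                          random_mutation_min_val=0.0,
+                          random_mutation_max_val=1.0)
+   ga_instance.run()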
+
+.. _pygad-230:
+
+PyGAD 2.3.0
+-----------
+
+Release date: 1 June 2020
+
+1. A new module named ``pygad.cnn`` is supported for building
+   convolutional neural networks.
+
+2. A new module named ``pygad.gacnn`` is supported for training
+   convolutional neural networks using the genetic algorithm.
+
+3. The ``pygad.plot_result()`` method has 3 optional parameters named
+   ``title``, ``xlabel``, and ``ylabel`` to customize the plot title,
+   x-axis label, and y-axis label, respectively.
+
+4. The ``pygad.nn`` module supports the softmax activation function.
+
+5. The name of the ``pygad.nn.predict_outputs()`` function is changed to
+   ``pygad.nn.predict()``.
+
+6. The name of the ``pygad.nn.train_network()`` function is changed to
+   ``pygad.nn.train()``.
+
+.. _pygad-240:
+
+PyGAD 2.4.0
+-----------
+
+Release date: 5 July 2020
+
+1. A new parameter named ``delay_after_gen`` is added. It accepts a
+   non-negative number specifying the time in seconds to wait after a
+   generation completes and before going to the next generation. It
+   defaults to ``0.0``, which means no delay after a generation.
+
+2. The function passed to the ``callback_generation`` parameter of the
+   pygad.GA class constructor can terminate the execution of the genetic
+   algorithm if it returns the string ``stop``. This causes the
+   ``run()`` method to stop.
+
+One important use case for this feature is stopping the genetic
+algorithm when a condition is met before passing through all the
+generations. Suppose the user assigned a value of 100 to the
+``num_generations`` parameter of the pygad.GA class constructor, and at
+generation 50 a condition is met, so the user wants to stop the
+execution without waiting for the remaining 50 generations. To do that,
+just make the function passed to the ``callback_generation`` parameter
+return the string ``stop``.
+
+Here is an example of a function to be passed to the
+``callback_generation`` parameter which stops the execution if the
+fitness value 70 is reached. The value 70 might be the best possible
+fitness value. After it is reached, there is no need to pass through
+more generations because no further improvement is possible.
+
+.. code:: python
+
+   def func_generation(ga_instance):
+       if ga_instance.best_solution()[1] >= 70:
+           return "stop"
+
+.. _pygad-250:
+
+PyGAD 2.5.0
+-----------
+
+Release date: 19 July 2020
+
+1. | Two new optional parameters are added to the constructor of the
+     ``pygad.GA`` class: ``crossover_probability`` and
+     ``mutation_probability``.
+   | While applying the crossover operation, each parent has a random
+     value generated between 0.0 and 1.0. If this random value is less
+     than or equal to the value assigned to the
+     ``crossover_probability`` parameter, then the parent is selected
+     for the crossover operation.
+   | For the mutation operation, a random value between 0.0 and 1.0 is
+     generated for each gene in the solution. If this value is less than
+     or equal to the value assigned to the ``mutation_probability``,
+     then this gene is selected for mutation.
+
+2. A new optional parameter named ``linewidth`` is added to the
+   ``plot_result()`` method to specify the width of the curve in the
+   plot. It defaults to 3.0.
+
+3. Previously, the indices of the genes selected for mutation were
+   randomly generated once for all solutions within the generation.
+   Currently, the genes' indices are randomly generated for each
+   solution in the population. If the population has 4 solutions, the
+   indices are randomly generated 4 times inside the single generation,
+   1 time for each solution.
+
+4. Previously, the position(s) of the point(s) for the single-point and
+   two-points crossover was(were) randomly selected once for all
+   solutions within the generation. Currently, the position(s) is(are)
+   randomly selected for each solution in the population. If the
+   population has 4 solutions, the position(s) is(are) randomly
+   generated 4 times inside the single generation, 1 time for each
+   solution.
+
+5. A new optional parameter named ``gene_space`` is added to the
+   ``pygad.GA`` class constructor. It is used to specify the possible
+   values for each gene in case the user wants to restrict the gene
+   values. It is useful if the gene space is restricted to a certain
+   range or to discrete values.
+   For more information, check the `More about the ``gene_space``
+   Parameter `__
+   section. Thanks to `Prof. Tamer A.
+   Farrag `__ for requesting this useful
+   feature.
+
+.. _pygad-260:
+
+PyGAD 2.6.0
+-----------
+
+Release Date: 6 August 2020
+
+1. A bug fix in assigning the value to the ``initial_population``
+   parameter.
+
+2. A new parameter named ``gene_type`` is added to control the gene
+   type. It can be either ``int`` or ``float``. It has an effect only
+   when the parameter ``gene_space`` is ``None``.
+
+3. 7 new parameters that accept callback functions: ``on_start``,
+   ``on_fitness``, ``on_parents``, ``on_crossover``, ``on_mutation``,
+   ``on_generation``, and ``on_stop``.
+
+.. _pygad-270:
+
+PyGAD 2.7.0
+-----------
+
+Release Date: 11 September 2020
+
+1. The ``learning_rate`` parameter in the ``pygad.nn.train()`` function
+   defaults to **0.01**.
+
+2. Added support for building neural networks for regression using the
+   new parameter named ``problem_type``. It is added as a parameter to
+   both the ``pygad.nn.train()`` and ``pygad.nn.predict()`` functions.
+   The value of this parameter can be either **classification** or
+   **regression** to define the problem type. It defaults to
+   **classification**.
+
+3. The activation function for a layer can be set to the string
+   ``"None"`` to indicate that there is no activation function at this
+   layer. As a result, the supported values for the activation function
+   are ``"sigmoid"``, ``"relu"``, ``"softmax"``, and ``"None"``.
+
+To build a regression network using the ``pygad.nn`` module, just do the
+following:
+
+1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and
+   ``pygad.nn.predict()`` functions to the string ``"regression"``.
+
+2. Set the activation function for the output layer to the string
+   ``"None"``. This sets no limits on the range of the outputs as it
+   will be from ``-infinity`` to ``+infinity``. If you are sure that all
+   outputs will be nonnegative values, then use the ReLU function.
+
+Check the documentation of the ``pygad.nn`` module for an example that
+builds a neural network for regression. The regression example is also
+available at `this GitHub
+project `__:
+https://github.com/ahmedfgad/NumPyANN
+
+To build and train a regression network using the ``pygad.gann`` module,
+do the following:
+
+1. Set the ``problem_type`` parameter in the ``pygad.nn.train()`` and
+   ``pygad.nn.predict()`` functions to the string ``"regression"``.
+
+2. Set the ``output_activation`` parameter in the constructor of the
+   ``pygad.gann.GANN`` class to ``"None"``.
+
+Check the documentation of the ``pygad.gann`` module for an example that
+builds and trains a neural network for regression. The regression
+example is also available at `this GitHub
+project `__:
+https://github.com/ahmedfgad/NeuralGenetic
+
+To build a classification network, either ignore the ``problem_type``
+parameter or set it to ``"classification"`` (the default value). In this
+case, the activation function of the last layer can be set to any type
+(e.g. softmax).
+
+.. _pygad-271:
+
+PyGAD 2.7.1
+-----------
+
+Release Date: 11 September 2020
+
+1. A bug fix when the ``problem_type`` argument is set to
+   ``regression``.
+
+.. _pygad-272:
+
+PyGAD 2.7.2
+-----------
+
+Release Date: 14 September 2020
+
+1. Bug fix to support building and training regression neural networks
+   with multiple outputs.
+
+.. _pygad-280:
+
+PyGAD 2.8.0
+-----------
+
+Release Date: 20 September 2020
+
+1. Support of a new module named ``kerasga`` so that Keras models can
+   be trained by the genetic algorithm using PyGAD.
+
+.. _pygad-281:
+
+PyGAD 2.8.1
+-----------
+
+Release Date: 3 October 2020
+
+1. Bug fix in applying the crossover operation when the
+   ``crossover_probability`` parameter is used. Thanks to `Eng. Hamada
+   Kassem, Research and Teaching Assistant, Construction Engineering and
+   Management, Faculty of Engineering, Alexandria University,
+   Egypt `__.
+
+.. _pygad-290:
+
+PyGAD 2.9.0
+-----------
+
+Release Date: 06 December 2020
+
+1. The fitness values of the initial population are considered in the
+   ``best_solutions_fitness`` attribute.
+
+2. An optional parameter named ``save_best_solutions`` is added. It
+   defaults to ``False``. When it is ``True``, then the best solution
+   after each generation is saved into an attribute named
+   ``best_solutions``. If ``False``, then no solutions are saved and the
+   ``best_solutions`` attribute stays empty.
+
+3. Scattered crossover is supported. To use it, assign the
+   ``crossover_type`` parameter the value ``"scattered"``.
+
+4. NumPy arrays are now supported by the ``gene_space`` parameter.
+
+5. The following parameters (``gene_type``, ``crossover_probability``,
+   ``mutation_probability``, ``delay_after_gen``) can be assigned to a
+   numeric value of any of these data types: ``int``, ``float``,
+   ``numpy.int``, ``numpy.int8``, ``numpy.int16``, ``numpy.int32``,
+   ``numpy.int64``, ``numpy.float``, ``numpy.float16``,
+   ``numpy.float32``, or ``numpy.float64``.
+
+.. _pygad-2100:
+
+PyGAD 2.10.0
+------------
+
+Release Date: 03 January 2021
+
+1. Support of a new module ``pygad.torchga`` to train PyTorch models
+   using PyGAD. Check `its
+   documentation `__.
+
+2. Support of adaptive mutation where the mutation rate is determined
+   by the fitness value of each solution (a short sketch follows this
+   list). Read the `Adaptive
+   Mutation `__
+   section for more details. Also, read this paper: `Libelli, S.
+   Marsili, and P. Alba. "Adaptive mutation in genetic algorithms."
+   Soft computing 4.2 (2000):
+   76-80. `__
+
+3. Before the ``run()`` method completes or exits, the fitness value of
+   the best solution in the current population is appended to the
+   ``best_solutions_fitness`` list attribute. Note that the fitness
+   value of the best solution in the initial population is already
+   saved at the beginning of the list. So, the fitness value of the
+   best solution is saved before the genetic algorithm starts and after
+   it ends.
+
+4. When the parameter ``parent_selection_type`` is set to ``sss``
+   (steady-state selection), a warning message is printed if the
+   value of the ``keep_parents`` parameter is set to 0.
+
+5. More validations of the user input parameters.
+
+6. The default value of the ``mutation_percent_genes`` parameter is set
+   to the string ``"default"`` rather than the integer 10. This change
+   helps to know whether the user explicitly passed a value to the
+   ``mutation_percent_genes`` parameter or left it at its default
+   value. The ``"default"`` value is later translated into the integer
+   10.
+
+7. The ``mutation_percent_genes`` parameter no longer accepts the
+   value 0. It must be ``>0`` and ``<=100``.
+
+8. The built-in ``warnings`` module is used to show warning messages
+   rather than just using the ``print()`` function.
+
+9. A new ``bool`` parameter called ``suppress_warnings`` is added to
+   the constructor of the ``pygad.GA`` class. It allows the user to
+   control whether the warning messages are printed or not. It defaults
+   to ``False``, which means the messages are printed.
+
+10. A helper method called ``adaptive_mutation_population_fitness()`` is
+    created to calculate the average fitness value used in adaptive
+    mutation to filter the solutions.
+
+11. The ``best_solution()`` method accepts a new optional parameter
+    called ``pop_fitness``. It accepts a list of the fitness values of
+    the solutions in the population. If ``None``, then the
+    ``cal_pop_fitness()`` method is called to calculate the fitness
+    values of the population.
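+
+Here is a minimal sketch of adaptive mutation. The fitness function and
+all numeric values are illustrative only; with
+``mutation_type="adaptive"``, the mutation rate is passed as a pair
+where the first value applies to low-quality solutions and the second
+to high-quality solutions.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the sum of the genes.
+       return float(numpy.sum(solution))
+
+   # Low-quality solutions mutate 25% of their genes; high-quality ones 10%.
+   ga_instance = pygad.GA(num_generations=20,
+                          num_parents_mating=2,
+                          sol_per_pop=8,
+                          num_genes=4,
+                          fitness_func=fitness_func,
+                          mutation_type="adaptive",
+                          mutation_percent_genes=[25, 10])
+   ga_instance.run()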
+
+.. _pygad-2101:
+
+PyGAD 2.10.1
+------------
+
+Release Date: 10 January 2021
+
+1. In the ``gene_space`` parameter, any ``None`` value (regardless of
+   its index or axis) is replaced by a randomly generated number based
+   on the 3 parameters ``init_range_low``, ``init_range_high``, and
+   ``gene_type``. So, a ``None`` value in ``[..., None, ...]`` or
+   ``[..., [..., None, ...], ...]`` is replaced with a random value.
+   This gives more freedom in building the space of values for the
+   genes.
+
+2. All the numbers passed to the ``gene_space`` parameter are cast to
+   the type specified in the ``gene_type`` parameter.
+
+3. The ``numpy.uint`` data type is supported for the parameters that
+   accept integer values.
+
+4. In the ``pygad.kerasga`` module, the ``model_weights_as_vector()``
+   function uses the ``trainable`` attribute of the model's layers to
+   return only the trainable weights in the network. So, only the
+   trainable layers with their ``trainable`` attribute set to ``True``
+   (``trainable=True``), which is the default value, have their weights
+   evolved. All non-trainable layers with the ``trainable`` attribute
+   set to ``False`` (``trainable=False``) will not be evolved. Thanks to
+   `Prof. Tamer A. Farrag `__ for
+   pointing that out at
+   `GitHub `__.
+
+.. _pygad-2102:
+
+PyGAD 2.10.2
+------------
+
+Release Date: 15 January 2021
+
+1. A bug fix when ``save_best_solutions=True``. Refer to this issue for
+   more information:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/25
+
+.. _pygad-2110:
+
+PyGAD 2.11.0
+------------
+
+Release Date: 16 February 2021
+
+1. In the ``gene_space`` argument, the user can use a dictionary to
+   specify the lower and upper limits of a gene. This dictionary must
+   have only 2 items with keys ``low`` and ``high`` to specify the low
+   and high limits of the gene, respectively. This way, PyGAD takes care
+   of not exceeding the value limits of the gene. For a problem with
+   only 2 genes, using
+   ``gene_space=[{'low': 1, 'high': 5}, {'low': 0.2, 'high': 0.81}]``
+   means the accepted values of the first gene start from 1 (inclusive)
+   to 5 (exclusive) while the second one has values between 0.2
+   (inclusive) and 0.81 (exclusive). A short sketch follows this list.
+   For more information, please check the `Limit the Gene Value
+   Range `__
+   section of the documentation.
+
+2. The ``plot_result()`` method returns the figure so that the user can
+   save it.
+
+3. Bug fixes in copying elements from the gene space.
+
+4. For a gene with a set of discrete values (more than 1 value) in the
+   ``gene_space`` parameter, like ``[0, 1]``, it was possible that the
+   gene value did not change after mutation. That is, if the current
+   value was 0, then the randomly selected value could also be 0. Now,
+   it is verified that the new value is different. So, if the current
+   value is 0, then the new value after mutation will not be 0 but 1.
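+
+To illustrate the dictionary form of ``gene_space`` described in the
+first item above, here is a minimal sketch. The fitness function and
+the other parameter values are illustrative only.
+
+.. code:: python
+
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the product of the 2 genes.
+       return float(solution[0] * solution[1])
+
+   # Gene 0 is sampled from [1, 5) and gene 1 from [0.2, 0.81).
+   ga_instance = pygad.GA(num_generations=20,
+                          num_parents_mating=2,
+                          sol_per_pop=8,
+                          num_genes=2,
+                          fitness_func=fitness_func,
+                          gene_space=[{'low': 1, 'high': 5},
+                                      {'low': 0.2, 'high': 0.81}])
+   ga_instance.run()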
+
+.. _pygad-2120:
+
+PyGAD 2.12.0
+------------
+
+Release Date: 20 February 2021
+
+1. 4 new instance attributes are added to hold temporary results after
+   each generation: ``last_generation_fitness`` holds the fitness values
+   of the solutions in the last generation, ``last_generation_parents``
+   holds the parents selected from the last generation,
+   ``last_generation_offspring_crossover`` holds the offspring generated
+   after applying the crossover in the last generation, and
+   ``last_generation_offspring_mutation`` holds the offspring generated
+   after applying the mutation in the last generation. You can access
+   these attributes inside the ``on_generation()`` method, for example.
+
+2. A bug is fixed when the ``initial_population`` parameter is used. The
+   bug occurred due to a mismatch between the data type of the array
+   assigned to ``initial_population`` and the gene type in the
+   ``gene_type`` attribute. Assume that the array assigned to the
+   ``initial_population`` parameter is
+   ``((1, 1), (3, 3), (5, 5), (7, 7))``, which has type ``int``. When
+   ``gene_type`` is set to ``float``, then the genes will not be float
+   but cast to ``int`` because the defined array has ``int`` type. The
+   bug is fixed by forcing the array assigned to ``initial_population``
+   to have the data type in the ``gene_type`` attribute. Check the
+   `issue at
+   GitHub `__:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/27
+
+Thanks to Andrei Rozanski [PhD Bioinformatics Specialist, Department of
+Tissue Dynamics and Regeneration, Max Planck Institute for Biophysical
+Chemistry, Germany] for opening my eye to the first change.
+
+Thanks to `Marios
+Giouvanakis `__,
+a PhD candidate in Electrical & Computer Engineering, `Aristotle
+University of Thessaloniki (Αριστοτέλειο Πανεπιστήμιο Θεσσαλονίκης),
+Greece `__, for emailing me about the second
+issue.
+
+.. _pygad-2130:
+
+PyGAD 2.13.0
+------------
+
+Release Date: 12 March 2021
+
+1. A new ``bool`` parameter called ``allow_duplicate_genes`` is
+   supported. If ``True``, which is the default, then a
+   solution/chromosome may have duplicate gene values. If ``False``,
+   then each gene will have a unique value in its solution. Check the
+   `Prevent Duplicates in Gene
+   Values `__
+   section for more details.
+
+2. The ``last_generation_fitness`` attribute is updated at the end of
+   each generation, not at the beginning. This keeps the fitness values
+   of the most up-to-date population assigned to the
+   ``last_generation_fitness`` attribute.
+
+.. _pygad-2140:
+
+PyGAD 2.14.0
+------------
+
+Release Date: 19 May 2021
+
+PyGAD 2.14.0 has an issue that is solved in PyGAD 2.14.1. Please
+consider using 2.14.1, not 2.14.0.
+
+1. `Issue
+   #40 `__
+   is solved. Now, the ``None`` value works with the ``crossover_type``
+   and ``mutation_type`` parameters:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/40
+
+2. The ``gene_type`` parameter supports accepting a
+   ``list/tuple/numpy.ndarray`` of numeric data types for the genes (a
+   short sketch follows this list). This helps to control the data type
+   of each individual gene. Previously, the ``gene_type`` could be
+   assigned only a single data type that is applied to all genes. For
+   more information, check the `More about the ``gene_type``
+   Parameter `__
+   section. Thanks to `Rainer
+   Engel `__
+   for asking about this feature in `this
+   discussion `__:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43
+
+3. A new ``bool`` attribute named ``gene_type_single`` is added to the
+   ``pygad.GA`` class. It is ``True`` when there is a single data type
+   assigned to the ``gene_type`` parameter. When the ``gene_type``
+   parameter is assigned a ``list/tuple/numpy.ndarray``, then
+   ``gene_type_single`` is set to ``False``.
+
+4. The ``mutation_by_replacement`` flag now has no effect if
+   ``gene_space`` exists, except for the genes with ``None`` values. For
+   example, for ``gene_space=[None, [5, 6]]`` the
+   ``mutation_by_replacement`` flag affects only the first gene, which
+   has ``None`` for its value space.
+
+5. When an element has a value of ``None`` in the ``gene_space``
+   parameter (e.g. ``gene_space=[None, [5, 6]]``), then its value will
+   be randomly generated for each solution rather than being generated
+   once for all solutions. Previously, the gene with a ``None`` value in
+   ``gene_space`` was the same across all solutions.
+
+6. Some changes in the documentation according to `issue
+   #32 `__:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/32
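+
+Here is a minimal sketch of the per-gene data types introduced in this
+release. The fitness function and parameter values are illustrative
+only.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the sum of the genes.
+       return float(numpy.sum(solution))
+
+   # Gene 0 is kept as an int while gene 1 is kept as a float.
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=2,
+                          sol_per_pop=6,
+                          num_genes=2,
+                          fitness_func=fitness_func,
+                          gene_type=[int, float])
+   ga_instance.run()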
+
+.. _pygad-2142:
+
+PyGAD 2.14.2
+------------
+
+Release Date: 27 May 2021
+
+1. Some bug fixes when the ``gene_type`` parameter is nested. Thanks to
+   `Rainer
+   Engel `__
+   for opening `a
+   discussion `__
+   to report this bug:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43#discussioncomment-763342
+
+`Rainer
+Engel `__
+helped a lot by suggesting new features and enhancements in the 2.14.0
+to 2.14.2 releases.
+
+.. _pygad-2143:
+
+PyGAD 2.14.3
+------------
+
+Release Date: 6 June 2021
+
+1. Some bug fixes when setting the ``save_best_solutions`` parameter to
+   ``True``. Previously, the best solution for generation ``i`` was
+   added into the ``best_solutions`` attribute at generation ``i+1``.
+   Now, the ``best_solutions`` attribute is updated by each best
+   solution at its exact generation.
+
+.. _pygad-2150:
+
+PyGAD 2.15.0
+------------
+
+Release Date: 17 June 2021
+
+1. Control the precision of all genes/individual genes. Thanks to
+   `Rainer `__ for asking about this
+   feature:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/43#discussioncomment-763452
+
+2. A new attribute named ``last_generation_parents_indices`` holds the
+   indices of the selected parents in the last generation.
+
+3. In adaptive mutation, there is no need to recalculate the fitness
+   values of the parents selected in the last generation as these
+   values can be returned based on the ``last_generation_fitness`` and
+   ``last_generation_parents_indices`` attributes. This speeds up the
+   adaptive mutation.
+
+4. When a sublist has a value of ``None`` in the ``gene_space``
+   parameter (e.g. ``gene_space=[[1, 2, 3], [5, 6, None]]``), then its
+   value will be randomly generated for each solution rather than being
+   generated once for all solutions. Previously, a value of ``None`` in
+   a sublist of the ``gene_space`` parameter was identical across all
+   solutions.
+
+5. The dictionary assigned to the ``gene_space`` parameter itself or
+   one of its elements has a new key called ``"step"`` to specify the
+   step of moving from the start to the end of the range specified by
+   the 2 existing keys ``"low"`` and ``"high"``. An example is
+   ``{"low": 0, "high": 30, "step": 2}`` to have only even values for
+   the gene(s) starting from 0 to 30. For more information, check the
+   `More about the ``gene_space``
+   Parameter `__
+   section.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/48
+
+6. A new function called ``predict()`` is added to both the
+   ``pygad.kerasga`` and ``pygad.torchga`` modules to make predictions.
+   This makes it easier than using custom code each time a prediction
+   is to be made.
+
+7. A new parameter called ``stop_criteria`` allows the user to specify
+   one or more stop criteria to stop the evolution based on some
+   conditions (a short sketch follows this list). Each criterion is
+   passed as a ``str`` that starts with a stop word. The current 2
+   supported words are ``reach`` and ``saturate``. ``reach`` stops the
+   ``run()`` method if the fitness value is equal to or greater than a
+   given fitness value. An example for ``reach`` is ``"reach_40"``,
+   which stops the evolution if the fitness is >= 40. ``saturate``
+   stops the evolution if the fitness saturates for a given number of
+   consecutive generations. An example for ``saturate`` is
+   ``"saturate_7"``, which means stop the ``run()`` method if the
+   fitness does not change for 7 consecutive generations. Thanks to
+   `Rainer `__ for asking about this
+   feature:
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/44
+
+8. A new bool parameter, defaulting to ``False``, named
+   ``save_solutions`` is added to the constructor of the ``pygad.GA``
+   class. If ``True``, then all solutions in each generation are
+   appended into an attribute called ``solutions``, which is a NumPy
+   array.
+
+9. The ``plot_result()`` method is renamed to ``plot_fitness()``. Users
+   should migrate to the new name as the old name will be removed in
+   the future.
+
+10. Four new optional parameters are added to the ``plot_fitness()``
+    function in the ``pygad.GA`` class: ``font_size=14``,
+    ``save_dir=None``, ``color="#3870FF"``, and ``plot_type="plot"``.
+    Use ``font_size`` to change the font of the plot title and labels.
+    ``save_dir`` accepts the directory to which the figure is saved. It
+    defaults to ``None``, which means do not save the figure. ``color``
+    changes the color of the plot. ``plot_type`` changes the plot type,
+    which can be either ``"plot"`` (default), ``"scatter"``, or
+    ``"bar"``.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/47
+
+11. The default value of the ``title`` parameter in the
+    ``plot_fitness()`` method is ``"PyGAD - Generation vs. Fitness"``
+    rather than ``"PyGAD - Iteration vs. Fitness"``.
+
+12. A new method named ``plot_new_solution_rate()`` creates, shows, and
+    returns a figure showing the rate of new/unique solutions explored
+    in each generation. It accepts the same parameters as the
+    ``plot_fitness()`` method. This method only works when
+    ``save_solutions=True`` in the ``pygad.GA`` class's constructor.
+
+13. A new method named ``plot_genes()`` creates, shows, and returns a
+    figure showing how each gene changes per generation. It accepts
+    parameters similar to those of the ``plot_fitness()`` method, in
+    addition to the ``graph_type``, ``fill_color``, and ``solutions``
+    parameters. The ``graph_type`` parameter can be either ``"plot"``
+    (default), ``"boxplot"``, or ``"histogram"``. ``fill_color``
+    accepts the fill color, which works when ``graph_type`` is either
+    ``"boxplot"`` or ``"histogram"``. ``solutions`` can be either
+    ``"all"`` or ``"best"`` to decide whether all solutions or only the
+    best solutions are used.
+
+14. The ``gene_type`` parameter now supports controlling the precision
+    of ``float`` data types. For a gene, rather than assigning just the
+    data type like ``float``, assign a
+    ``list``/``tuple``/``numpy.ndarray`` with 2 elements where the
+    first one is the type and the second one is the precision. For
+    example, ``[float, 2]`` forces a gene with a value like ``0.1234``
+    to be ``0.12``. For more information, check the `More about the
+    ``gene_type``
+    Parameter `__
+    section.
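+
+Here is a minimal sketch of the ``stop_criteria`` parameter described
+in item 7 above. The fitness function mirrors the style used elsewhere
+in this documentation; the numeric values are illustrative only.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   equation_inputs = [4, -2, 3.5]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       return 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+
+   # Stop once the fitness reaches 40, or if it does not change for 7
+   # consecutive generations, whichever happens first.
+   ga_instance = pygad.GA(num_generations=1000,
+                          num_parents_mating=2,
+                          sol_per_pop=10,
+                          num_genes=len(equation_inputs),
+                          fitness_func=fitness_func,
+                          stop_criteria=["reach_40", "saturate_7"])
+   ga_instance.run()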
+
+.. _pygad-2151:
+
+PyGAD 2.15.1
+------------
+
+Release Date: 18 June 2021
+
+1. Fix a bug when ``keep_parents`` is set to a positive integer.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/49
+
+.. _pygad-2152:
+
+PyGAD 2.15.2
+------------
+
+Release Date: 18 June 2021
+
+1. Fix a bug when using the ``kerasga`` or ``torchga`` modules.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/51
+
+.. _pygad-2160:
+
+PyGAD 2.16.0
+------------
+
+Release Date: 19 June 2021
+
+1. A user-defined function can be passed to the ``mutation_type``,
+   ``crossover_type``, and ``parent_selection_type`` parameters in the
+   ``pygad.GA`` class to create custom mutation, crossover, and parent
+   selection operators (a short sketch follows the 2.16.1 notes below).
+   Check the `User-Defined Crossover, Mutation, and Parent Selection
+   Operators `__
+   section for more details.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/50
+
+.. _pygad-2161:
+
+PyGAD 2.16.1
+------------
+
+Release Date: 28 September 2021
+
+1. The user can use the ``tqdm`` library to show a progress bar.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/discussions/50.
+
+.. code:: python
+
+   import pygad
+   import numpy
+   import tqdm
+
+   equation_inputs = [4,-2,3.5]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   num_generations = 10000
+   with tqdm.tqdm(total=num_generations) as pbar:
+       ga_instance = pygad.GA(num_generations=num_generations,
+                              sol_per_pop=5,
+                              num_parents_mating=2,
+                              num_genes=len(equation_inputs),
+                              fitness_func=fitness_func,
+                              on_generation=lambda _: pbar.update(1))
+
+       ga_instance.run()
+
+   ga_instance.plot_result()
+
+However, this approach does not work if the ``ga_instance`` is pickled
+(i.e. the ``save()`` method is called):
+
+.. code:: python
+
+   ga_instance.save("test")
+
+To solve this issue, define a function and pass it to the
+``on_generation`` parameter. In the next code, the
+``on_generation_progress()`` function is defined to update the
+progress bar.
+
+.. code:: python
+
+   import pygad
+   import numpy
+   import tqdm
+
+   equation_inputs = [4,-2,3.5]
+   desired_output = 44
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       output = numpy.sum(solution * equation_inputs)
+       fitness = 1.0 / (numpy.abs(output - desired_output) + 0.000001)
+       return fitness
+
+   def on_generation_progress(ga):
+       pbar.update(1)
+
+   num_generations = 100
+   with tqdm.tqdm(total=num_generations) as pbar:
+       ga_instance = pygad.GA(num_generations=num_generations,
+                              sol_per_pop=5,
+                              num_parents_mating=2,
+                              num_genes=len(equation_inputs),
+                              fitness_func=fitness_func,
+                              on_generation=on_generation_progress)
+
+       ga_instance.run()
+
+   ga_instance.plot_result()
+
+   ga_instance.save("test")
+
+2. Solved the issue of unequal lengths between the ``solutions`` and
+   ``solutions_fitness`` attributes when the ``save_solutions``
+   parameter is set to ``True``. Now, the fitness of the last
+   population is appended to the ``solutions_fitness`` array.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/64
+
+3. There was an issue of the lengths of these 4 variables
+   (``solutions``, ``solutions_fitness``, ``best_solutions``, and
+   ``best_solutions_fitness``) getting doubled after each call of the
+   ``run()`` method. This is solved by resetting these variables at the
+   beginning of the ``run()`` method.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/62
+
+4. Bug fixes when adaptive mutation is used
+   (``mutation_type="adaptive"``).
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/65
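+
+To illustrate the user-defined operators introduced in PyGAD 2.16.0
+above, here is a minimal sketch of a custom mutation function. The
+fitness function and the mutation logic are illustrative only.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the sum of the genes.
+       return float(numpy.sum(solution))
+
+   def mutation_func(offspring, ga_instance):
+       # Custom mutation: add uniform noise to one random gene of each
+       # offspring and return the mutated offspring array.
+       for idx in range(offspring.shape[0]):
+           gene_idx = numpy.random.randint(0, offspring.shape[1])
+           offspring[idx, gene_idx] += numpy.random.uniform(-1.0, 1.0)
+       return offspring
+
+   ga_instance = pygad.GA(num_generations=10,
+                          num_parents_mating=2,
+                          sol_per_pop=6,
+                          num_genes=3,
+                          fitness_func=fitness_func,
+                          mutation_type=mutation_func)
+   ga_instance.run()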
+
+.. _pygad-2162:
+
+PyGAD 2.16.2
+------------
+
+Release Date: 2 February 2022
+
+1. A new instance attribute called ``previous_generation_fitness`` is
+   added to the ``pygad.GA`` class. It holds the fitness values of one
+   generation before the fitness values saved in the
+   ``last_generation_fitness`` attribute.
+
+2. Fixed an issue in the ``cal_pop_fitness()`` method in getting the
+   correct indices of the previous parents. This is solved by using the
+   previous generation's fitness saved in the new attribute
+   ``previous_generation_fitness`` to return the parents' fitness
+   values. Thanks to Tobias Tischhauser (M.Sc. - `Mitarbeiter Institut
+   EMS, Departement Technik, OST – Ostschweizer Fachhochschule,
+   Switzerland `__)
+   for detecting this bug.
+
+.. _pygad-2163:
+
+PyGAD 2.16.3
+------------
+
+Release Date: 2 February 2022
+
+1. Validate the fitness value returned from the fitness function. An
+   exception is raised if something is wrong.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/67
+
+.. _pygad-2170:
+
+PyGAD 2.17.0
+------------
+
+Release Date: 8 July 2022
+
+1. An issue is solved when the ``gene_space`` parameter is given a
+   fixed value, e.g. ``gene_space=[range(5), 4]``. The second gene's
+   value is static (4), which caused an exception.
+
+2. Fixed the issue where the ``allow_duplicate_genes`` parameter did
+   not work when mutation is disabled (i.e. ``mutation_type=None``).
+   This is done by checking for duplicates directly after crossover.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/39
+
+3. Solved an issue in the ``tournament_selection()`` method as the
+   indices of the selected parents were incorrect.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/89
+
+4. Reuse the fitness values of the previously explored solutions rather
+   than recalculating them. This feature only works if
+   ``save_solutions=True``.
+
+5. Parallel processing is supported through a new parameter named
+   ``parallel_processing`` in the constructor of the ``pygad.GA`` class
+   (a short sketch follows this list). Thanks to
+   `@windowshopr `__ for opening the
+   issue
+   `#78 `__
+   at GitHub. Check the `Parallel Processing in
+   PyGAD `__
+   section for more information and examples.
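+
+Here is a minimal sketch of the ``parallel_processing`` parameter. The
+fitness function and values are illustrative only; the parameter also
+accepts other forms, as described in the `Parallel Processing in
+PyGAD` section.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the sum of the genes.
+       return float(numpy.sum(solution))
+
+   # Compute the fitness values using 4 threads.
+   ga_instance = pygad.GA(num_generations=10,
+                          num_parents_mating=2,
+                          sol_per_pop=8,
+                          num_genes=3,
+                          fitness_func=fitness_func,
+                          parallel_processing=["thread", 4])
+   ga_instance.run()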
+
+.. _pygad-2180:
+
+PyGAD 2.18.0
+------------
+
+Release Date: 9 September 2022
+
+1. Raise an exception if the sum of fitness values is zero while either
+   roulette wheel or stochastic universal parent selection is used.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/129
+
+2. Initialize the value of the ``run_completed`` property to ``False``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/122
+
+3. The values of these properties are no longer reset with each call to
+   the ``run()`` method:
+   ``self.best_solutions, self.best_solutions_fitness, self.solutions, self.solutions_fitness``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/123. Now,
+   the user has the flexibility of calling the ``run()`` method more
+   than once while extending the data collected after each generation.
+   Another advantage happens when the instance is loaded and the
+   ``run()`` method is called, as the old fitness values are shown on
+   the graph alongside the new fitness values. Read more in this
+   section: `Continue without Losing
+   Progress `__
+
+4. Thanks to `Prof. Fernando Jiménez
+   Barrionuevo `__ (Dept. of Information and
+   Communications Engineering, University of Murcia, Murcia, Spain) for
+   editing this
+   `comment `__
+   in the code.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/commit/5315bbec02777df96ce1ec665c94dece81c440f4
+
+5. A bug fixed when ``crossover_type=None``.
+
+6. Support of elitism selection through a new parameter named
+   ``keep_elitism``. It defaults to 1, which means that for each
+   generation only the best solution is kept in the next generation. If
+   assigned 0, then it has no effect. Read more in this section:
+   `Elitism
+   Selection `__.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/74
+
+7. A new instance attribute named ``last_generation_elitism`` is added
+   to hold the elitism of the last generation.
+
+8. A new parameter called ``random_seed`` is added to accept a seed for
+   the random function generators. Credit goes to this issue
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/70 and
+   `Prof. Fernando Jiménez Barrionuevo `__.
+   Read more in this section: `Random
+   Seed `__.
+
+9. Editing the ``pygad.TorchGA`` module to make sure the tensor data is
+   moved from GPU to CPU. Thanks to Rasmus Johansson for opening this
+   pull request: https://github.com/ahmedfgad/TorchGA/pull/2
+
+.. _pygad-2181:
+
+PyGAD 2.18.1
+------------
+
+Release Date: 19 September 2022
+
+1. A bug fix when ``keep_elitism`` is used.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/132
+
+.. _pygad-2182:
+
+PyGAD 2.18.2
+------------
+
+Release Date: 14 February 2023
+
+1. Remove ``numpy.int`` and ``numpy.float`` from the list of supported
+   data types.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/151
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/152
+
+2. Call the ``on_crossover()`` callback function even if
+   ``crossover_type`` is ``None``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
+
+3. Call the ``on_mutation()`` callback function even if
+   ``mutation_type`` is ``None``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
+
+.. _pygad-2183:
+
+PyGAD 2.18.3
+------------
+
+Release Date: 14 February 2023
+
+1. Bug fixes.
+
+.. _pygad-2190:
+
+PyGAD 2.19.0
+------------
+
+Release Date: 22 February 2023
+
+1. A new ``summary()`` method is supported to return a Keras-like
+   summary of the PyGAD lifecycle.
+
+2. A new optional parameter called ``fitness_batch_size`` is supported
+   to calculate the fitness in batches (a short sketch follows this
+   list). If it is assigned the value ``1`` or ``None`` (default), then
+   the normal flow is used where the fitness function is called for
+   each individual solution. If the ``fitness_batch_size`` parameter is
+   assigned a value satisfying the condition
+   ``1 < fitness_batch_size <= sol_per_pop``, then the solutions are
+   grouped into batches of size ``fitness_batch_size`` and the fitness
+   function is called once for each batch. In this case, the fitness
+   function must return a list/tuple/numpy.ndarray with a length equal
+   to the number of solutions passed.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/136.
+
+3. The ``cloudpickle`` library
+   (https://github.com/cloudpipe/cloudpickle) is used instead of the
+   ``pickle`` library to pickle the ``pygad.GA`` objects. This solves
+   the issue of having to redefine the functions (e.g. fitness
+   function). The ``cloudpickle`` library is added as a dependency in
+   the ``requirements.txt`` file.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/159
+
+4. Support of assigning methods to these parameters: ``fitness_func``,
+   ``crossover_type``, ``mutation_type``, ``parent_selection_type``,
+   ``on_start``, ``on_fitness``, ``on_parents``, ``on_crossover``,
+   ``on_mutation``, ``on_generation``, and ``on_stop``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/92
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/138
+
+5. Validating the output of the parent selection, crossover, and
+   mutation functions.
+
+6. The built-in parent selection operators return the parents' indices
+   as a NumPy array.
+
+7. The outputs of the parent selection, crossover, and mutation
+   operators must be NumPy arrays.
+
+8. Fix an issue when ``allow_duplicate_genes=True``.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/39
+
+9. Fix an issue creating scatter plots of the solutions' fitness.
+
+10. Sampling from a ``set()`` is no longer supported in Python 3.11.
+    Instead, sampling happens from a ``list()``. Thanks to ``Marco
+    Brenna`` for pointing out this issue.
+
+11. The lifecycle is updated to reflect that the new population's
+    fitness is calculated at the end of the lifecycle, not at the
+    beginning.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/154#issuecomment-1438739483
+
+12. There was an issue when ``save_solutions=True`` that caused the
+    fitness function to be called for solutions that were already
+    explored and had their fitness pre-calculated.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/160
+
+13. A new instance attribute named ``last_generation_elitism_indices``
+    is added to hold the indices of the selected elitism. This
+    attribute helps to reuse the fitness of the elitism instead of
+    calling the fitness function.
+
+14. Fewer calls to the ``best_solution()`` method, which in turn saves
+    some calls to the fitness function.
+
+15. Some updates in the documentation to give more details about the
+    ``cal_pop_fitness()`` method.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/79#issuecomment-1439605442
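+
+Here is a minimal sketch of the batch fitness flow described in item 2
+above. It assumes that, when ``fitness_batch_size`` is used, the
+fitness function receives a batch of solutions and their indices and
+returns one fitness value per solution; all other values are
+illustrative only.
+
+.. code:: python
+
+   import numpy
+   import pygad
+
+   def fitness_batch(ga_instance, solutions, solutions_indices):
+       # Toy fitness (hypothetical): one value per solution in the batch.
+       return [float(numpy.sum(sol)) for sol in solutions]
+
+   # The 8 solutions are split into batches of 4, so the fitness
+   # function is called twice per generation instead of 8 times.
+   ga_instance = pygad.GA(num_generations=10,
+                          num_parents_mating=2,
+                          sol_per_pop=8,
+                          num_genes=3,
+                          fitness_func=fitness_batch,
+                          fitness_batch_size=4)
+   ga_instance.run()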
+
+.. _pygad-2191:
+
+PyGAD 2.19.1
+------------
+
+Release Date: 22 February 2023
+
+1. Add the `cloudpickle `__
+   library as a dependency.
+
+.. _pygad-2192:
+
+PyGAD 2.19.2
+------------
+
+Release Date: 23 February 2023
+
+1. Fix an issue when parallel processing was used where the elitism
+   solutions' fitness values were not reused.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/160#issuecomment-1441718184
+
+.. _pygad-300:
+
+PyGAD 3.0.0
+-----------
+
+Release Date: 8 April 2023
+
+1. The structure of the library is changed and some methods defined in
+   the ``pygad.py`` module are moved to the ``pygad.utils``,
+   ``pygad.helper``, and ``pygad.visualize`` submodules.
+
+2. The ``pygad.utils.parent_selection`` module has a class named
+   ``ParentSelection`` where all the parent selection operators exist.
+   The ``pygad.GA`` class extends this class.
+
+3. The ``pygad.utils.crossover`` module has a class named ``Crossover``
+   where all the crossover operators exist. The ``pygad.GA`` class
+   extends this class.
+
+4. The ``pygad.utils.mutation`` module has a class named ``Mutation``
+   where all the mutation operators exist. The ``pygad.GA`` class
+   extends this class.
+
+5. The ``pygad.helper.unique`` module has a class named ``Unique``
+   where some helper methods exist to solve duplicate genes and make
+   sure every gene is unique. The ``pygad.GA`` class extends this
+   class.
+
+6. The ``pygad.visualize.plot`` module has a class named ``Plot`` where
+   all the methods that create plots exist. The ``pygad.GA`` class
+   extends this class.
+
+7. Support of using the ``logging`` module to log the outputs to both
+   the console and a text file instead of using the ``print()``
+   function (a short sketch follows this release's notes). This is done
+   by assigning a ``logging.Logger`` instance to the new ``logger``
+   parameter. Check the `Logging
+   Outputs `__
+   section for more information.
+
+8. A new instance attribute called ``logger`` to save the logger.
+
+9. The function/method passed to the ``fitness_func`` parameter accepts
+   a new parameter that refers to the instance of the ``pygad.GA``
+   class. Check this for an example: `Use Functions and Methods to
+   Build Fitness Function and
+   Callbacks `__.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/163
+
+10. Update the documentation to include an example of using functions
+    and methods to calculate the fitness and build callbacks. Check
+    this for more details: `Use Functions and Methods to Build Fitness
+    Function and
+    Callbacks `__.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/92#issuecomment-1443635003
+
+11. Validate the value passed to the ``initial_population`` parameter.
+
+12. Validate the type and length of the ``pop_fitness`` parameter of
+    the ``best_solution()`` method.
+
+13. Some edits in the documentation.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/106
+
+14. Fix an issue when building the initial population as (some) genes
+    had their values taken from the mutation range (defined by the
+    parameters ``random_mutation_min_val`` and
+    ``random_mutation_max_val``) instead of using the parameters
+    ``init_range_low`` and ``init_range_high``.
+
+15. The ``summary()`` method returns the summary as a single-line
+    string. Just log/print the returned string to see it properly.
+
+16. The ``callback_generation`` parameter is removed. Use the
+    ``on_generation`` parameter instead.
+
+17. There was an issue when using the ``parallel_processing`` parameter
+    with Keras and PyTorch. As Keras/PyTorch are not thread-safe, the
+    ``predict()`` method gives incorrect and weird results when more
+    than 1 thread is used.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/145
+    https://github.com/ahmedfgad/TorchGA/issues/5
+    https://github.com/ahmedfgad/KerasGA/issues/6. Thanks to this
+    `StackOverflow
+    answer `__.
+
+18. Replace ``numpy.float`` by ``float`` in the 2 parent selection
+    operators roulette wheel and stochastic universal.
+    https://github.com/ahmedfgad/GeneticAlgorithmPython/pull/168
+
+.. _pygad-301:
+
+PyGAD 3.0.1
+-----------
+
+Release Date: 20 April 2023
+
+1. Fix an issue with passing a user-defined function/method for parent
+   selection.
+   https://github.com/ahmedfgad/GeneticAlgorithmPython/issues/179
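+
+Here is a minimal sketch of the ``logger`` parameter introduced in
+PyGAD 3.0.0 (item 7 above). The handler setup and fitness function are
+illustrative only; a ``FileHandler`` could be added to also log to a
+text file.
+
+.. code:: python
+
+   import logging
+   import numpy
+   import pygad
+
+   # Build a logger that writes the library's outputs to the console.
+   logger = logging.getLogger(__name__)
+   logger.setLevel(logging.DEBUG)
+   handler = logging.StreamHandler()
+   handler.setFormatter(logging.Formatter("%(asctime)s - %(message)s"))
+   logger.addHandler(handler)
+
+   def fitness_func(ga_instance, solution, solution_idx):
+       # Toy fitness (hypothetical): maximize the sum of the genes.
+       return float(numpy.sum(solution))
+
+   ga_instance = pygad.GA(num_generations=5,
+                          num_parents_mating=2,
+                          sol_per_pop=6,
+                          num_genes=3,
+                          fitness_func=fitness_func,
+                          logger=logger)
+   ga_instance.run()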
+
+PyGAD Projects at GitHub
+========================
+
+The PyGAD library is available at PyPI at this page
+https://pypi.org/project/pygad. PyGAD is built out of a number of
+open-source GitHub projects. A brief note about these projects is given
+in the next subsections.
+
+`GeneticAlgorithmPython `__
+--------------------------------------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/GeneticAlgorithmPython
+
+`GeneticAlgorithmPython `__
+is the first project; it is an open-source Python 3 project for
+implementing the genetic algorithm based on NumPy.
+
+`NumPyANN `__
+----------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/NumPyANN
+
+`NumPyANN `__ builds artificial
+neural networks in **Python 3** using **NumPy** from scratch. The
+purpose of this project is to only implement the **forward pass** of a
+neural network without using a training algorithm. Currently, it only
+supports classification; regression will be supported later. Moreover,
+only one class is supported per sample.
+
+`NeuralGenetic `__
+--------------------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/NeuralGenetic
+
+`NeuralGenetic `__ trains
+neural networks using the genetic algorithm based on the previous 2
+projects
+`GeneticAlgorithmPython `__
+and `NumPyANN `__.
+
+`NumPyCNN `__
+----------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/NumPyCNN
+
+`NumPyCNN `__ builds
+convolutional neural networks using NumPy. The purpose of this project
+is to only implement the **forward pass** of a convolutional neural
+network without using a training algorithm.
+
+`CNNGenetic `__
+--------------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/CNNGenetic
+
+`CNNGenetic `__ trains
+convolutional neural networks using the genetic algorithm. It uses the
+`GeneticAlgorithmPython `__
+project for building the genetic algorithm.
+
+`KerasGA `__
+--------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/KerasGA
+
+`KerasGA `__ trains
+`Keras `__ models using the genetic algorithm. It uses
+the
+`GeneticAlgorithmPython `__
+project for building the genetic algorithm.
+
+`TorchGA `__
+--------------------------------------------------
+
+GitHub Link: https://github.com/ahmedfgad/TorchGA
+
+`TorchGA `__ trains
+`PyTorch `__ models using the genetic algorithm. It
+uses the
+`GeneticAlgorithmPython `__
+project for building the genetic algorithm.
+
+Stackoverflow Questions about PyGAD
+===================================
+
+.. _how-do-i-proceed-to-load-a-gainstance-as-pkl-format-in-pygad:
+
+`How do I proceed to load a ga_instance as “.pkl” format in PyGad? `__
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+`Binary Classification NN Model Weights not being Trained in PyGAD `__
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+`How to solve TSP problem using pyGAD package? `__
+---------------------------------------------------------------------------------------------------------------------------------------------
+
+`How can I save a matplotlib plot that is the output of a function in jupyter? `__
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+`How do I query the best solution of a pyGAD GA instance? `__
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+`Multi-Input Multi-Output in Genetic algorithm (python) `__
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+https://www.linkedin.com/pulse/validation-short-term-parametric-trading-model-genetic-landolfi
+
+https://itchef.ru/articles/397758
+
+https://audhiaprilliant.medium.com/genetic-algorithm-based-clustering-algorithm-in-searching-robust-initial-centroids-for-k-means-e3b4d892a4be
+
+https://python.plainenglish.io/validation-of-a-short-term-parametric-trading-model-with-genetic-optimization-and-walk-forward-89708b789af6
+
+https://ichi.pro/ko/pygadwa-hamkke-yujeon-algolijeum-eul-sayonghayeo-keras-model-eul-hunlyeonsikineun-bangbeob-173299286377169
+
+https://ichi.pro/tr/pygad-ile-genetik-algoritmayi-kullanarak-keras-modelleri-nasil-egitilir-173299286377169
+
+https://ichi.pro/ru/kak-obucit-modeli-keras-s-pomos-u-geneticeskogo-algoritma-s-pygad-173299286377169
+
+https://blog.csdn.net/sinat_38079265/article/details/108449614
+
+Submitting Issues
+=================
+
+If there is an issue while using PyGAD, then use any of your preferred
+options to discuss it.
+
+One way is `submitting an
+issue `__
+into this GitHub project
+(`github.com/ahmedfgad/GeneticAlgorithmPython `__)
+in case something is not working properly or to ask questions.
+
+If this is not a proper option for you, then check the `Contact
+Us `__
+section for more contact details.
+
+Ask for Feature
+===============
+
+PyGAD is actively developed with the goal of building a dynamic library
+for supporting a wide range of problems to be optimized using the
+genetic algorithm.
+
+To ask for a new feature, either `submit an
+issue `__
+into this GitHub project
+(`github.com/ahmedfgad/GeneticAlgorithmPython `__)
+or send an e-mail to ahmed.f.gad@gmail.com.
+
+Also check the `Contact
+Us `__
+section for more contact details.
+
+Projects Built using PyGAD
+==========================
+
+If you created a project that uses PyGAD, then we can support you by
+mentioning this project here in PyGAD's documentation.
+
+To do that, please send a message to ahmed.f.gad@gmail.com or check the
+`Contact
+Us `__
+section for more contact details.
+
+Within your message, please send the following details:
+
+- Project title
+
+- Brief description
+
+- Preferably, a link that directs the readers to your project
+
+Tutorials about PyGAD
+=====================
+
+`Adaptive Mutation in Genetic Algorithm with Python Examples `__
+-----------------------------------------------------------------------------------------------------------------------------------------------------
+
+In this tutorial, we’ll see why mutation with a fixed number of genes is
+bad, and how to replace it with adaptive mutation. Using the `PyGAD
+Python 3 library `__, we’ll discuss a few
+examples that use both random and adaptive mutation.
+
+`Clustering Using the Genetic Algorithm in Python `__
+-------------------------------------------------------------------------------------------------------------------------
+
+This tutorial discusses how the genetic algorithm is used to cluster
+data, starting from random clusters and running until the optimal
+clusters are found.
+We'll start by briefly reviewing the K-means
+clustering algorithm to point out its weak points, which are later
+solved by the genetic algorithm. The code examples in this tutorial are
+implemented in Python using the `PyGAD
+library `__.
+
+`Working with Different Genetic Algorithm Representations in Python `__
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+Depending on the nature of the problem being optimized, the genetic
+algorithm (GA) supports two different gene representations: binary and
+decimal. The binary GA has only two values for its genes, which are 0
+and 1. This is easier to manage than the decimal GA, for which we can
+use different formats like float or integer, and limited or unlimited
+ranges.
+
+This tutorial discusses how the
+`PyGAD `__ library supports the two GA
+representations, binary and decimal.
+
+.. _5-genetic-algorithm-applications-using-pygad:
+
+`5 Genetic Algorithm Applications Using PyGAD `__
+-------------------------------------------------------------------------------------------------------------------------
+
+This tutorial introduces PyGAD, an open-source Python library for
+implementing the genetic algorithm and training machine learning
+algorithms. PyGAD supports 19 parameters for customizing the genetic
+algorithm for various applications.
+
+Within this tutorial we'll discuss 5 different applications of the
+genetic algorithm and build them using PyGAD.
+
+`Train Neural Networks Using a Genetic Algorithm in Python with PyGAD `__
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+The genetic algorithm (GA) is a biologically-inspired optimization
+algorithm. It has in recent years gained importance, as it’s simple
+while also solving complex problems like travel route optimization,
+training machine learning algorithms, working with single and
+multi-objective problems, game playing, and more.
+
+Deep neural networks are inspired by the idea of how the biological
+brain works. They are universal function approximators, capable of
+simulating any function, and are now used to solve the most complex
+problems in machine learning. What’s more, they’re able to work with
+all types of data (images, audio, video, and text).
+
+Both genetic algorithms (GAs) and neural networks (NNs) are similar, as
+both are biologically-inspired techniques. This similarity motivates us
+to create a hybrid of both to see whether a GA can train NNs with high
+accuracy.
+
+This tutorial uses `PyGAD `__, a Python
+library that supports building and training NNs using a GA.
+`PyGAD `__ offers both classification and
+regression NNs.
+
+`Building a Game-Playing Agent for CoinTex Using the Genetic Algorithm `__
+----------------------------------------------------------------------------------------------------------------------------------------------------------
+
+In this tutorial we'll see how to build a game-playing agent using only
+the genetic algorithm to play a game called
+`CoinTex `__,
+which is developed in the Kivy Python framework. The objective of
+CoinTex is to collect the randomly distributed coins while avoiding
+collision with fire and monsters (that move randomly). The source code
+of CoinTex can be found `on
+GitHub `__.
+
+The genetic algorithm is the only AI used here; there is no other
+machine/deep learning model used with it. We'll implement the genetic
+algorithm using
+`PyGAD `__.
+This tutorial starts with a quick overview of CoinTex followed by a
+brief explanation of the genetic algorithm, and how it can be used to
+create the playing agent. Finally, we'll see how to implement these
+ideas in Python.
+
+The source code of the genetic algorithm agent is available
+`here `__,
+and you can download the code used in this tutorial from
+`here `__.
+
+`How To Train Keras Models Using the Genetic Algorithm with PyGAD `__
+--------------------------------------------------------------------------------------------------------------------------------------------------------
+
+PyGAD is an open-source Python library for building the genetic
+algorithm and training machine learning algorithms. It offers a wide
+range of parameters to customize the genetic algorithm to work with
+different types of problems.
+
+PyGAD has its own modules that support building and training neural
+networks (NNs) and convolutional neural networks (CNNs). Despite these
+modules working well, they are implemented in Python without any
+additional optimization measures. This leads to comparatively high
+computational times for even simple problems.
+
+The latest PyGAD version, 2.8.0 (released on 20 September 2020),
+supports a new module to train Keras models. Even though Keras is built
+in Python, it's fast. The reason is that Keras uses TensorFlow as a
+backend, and TensorFlow is highly optimized.
+
+This tutorial discusses how to train Keras models using PyGAD. The
+discussion includes building Keras models using either the Sequential
+Model or the Functional API, building an initial population of Keras
+model parameters, creating an appropriate fitness function, and more.
+
+|image1|
+
+`Train PyTorch Models Using Genetic Algorithm with PyGAD `__
+---------------------------------------------------------------------------------------------------------------------------------------------
+
+`PyGAD `__ is a genetic algorithm Python
+3 library for solving optimization problems. One of these problems is
+training machine learning algorithms.
+
+PyGAD has a module called
+`pygad.kerasga `__. It trains
+Keras models using the genetic algorithm. On January 3rd, 2021, a new
+release of `PyGAD 2.10.0 `__ brought a
+new module called
+`pygad.torchga `__ to train
+PyTorch models. It’s very easy to use, but there are a few tricky steps.
+
+So, in this tutorial, we’ll explore how to use PyGAD to train PyTorch
+models.
+
+|image2|
+
+`A Guide to Genetic ‘Learning’ Algorithms for Optimization `__
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+PyGAD in Other Languages
+========================
+
+French
+------
+
+`Comment les algorithmes génétiques peuvent rivaliser avec la descente
+de gradient et le
+backprop `__
+
+Although the standard way of training neural networks is gradient
+descent and backpropagation, there are other players in the game. One
+of them is evolutionary algorithms, such as genetic algorithms. In this
+article, we will train a simple neural network to solve the OpenAI
+CartPole game using a genetic algorithm. I will use PyTorch and PyGAD.
|image3|

Spanish
-------

`Cómo los algoritmos genéticos pueden competir con el descenso de
gradiente y el
backprop `__

Although the standard way to train neural networks is gradient descent
and backpropagation, there are other players in the game; one of them is
evolutionary algorithms, such as genetic algorithms.

Use a genetic algorithm to train a simple neural network to solve the
OpenAI CartPole game. In this article, we'll train a simple neural
network to solve OpenAI CartPole. I'll use PyTorch and PyGAD.

|image4|

Korean
------

`[PyGAD] Python 에서 Genetic Algorithm 을 사용해보기 `__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|image5|

I haven't tried all of the Python packages for genetic algorithms, but
this one looked extensible and I had something to try it on, so I took a
look.

What impressed me most about this package is that hyperparameter search
for a neural network can be done with a GA instead of gradient descent.

Personally, I think this can serve both as a way to find reasonably good
initial values and as an alternative in setups where the loss is hard to
optimize with gradient descent.

The overall flow goes as follows.

To be honest, my understanding of the full flow and of each parameter is
still incomplete.

Turkish
-------

`PyGAD ile Genetik Algoritmayı Kullanarak Keras Modelleri Nasıl Eğitilir `__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is a translation of an original English tutorial published at
Paperspace: `How To Train Keras Models Using the Genetic Algorithm with
PyGAD `__

PyGAD is an open-source Python library used for building the genetic
algorithm and training machine learning algorithms. It offers a wide
range of parameters to customize the genetic algorithm to work with
different types of problems.

PyGAD has its own modules that support building and training neural
networks (NNs) and convolutional neural networks (CNNs). Although these
modules work well, they are implemented in Python without any additional
optimization measures, which leads to comparatively high computation
times even for simple problems.

The latest PyGAD version, 2.8.0 (released on 20 September 2020),
supports a new module for training Keras models. Even though Keras is
built in Python, it is fast, because Keras uses TensorFlow as a backend
and TensorFlow is highly optimized.

This tutorial explains how to train Keras models using PyGAD. The
discussion covers building Keras models using either the Sequential
Model or the Functional API, building an initial population of Keras
model parameters, creating an appropriate fitness function, and more.

|image6|

Hungarian
---------

.. _tensorflow-alapozó-10-neurális-hálózatok-tenyésztése-genetikus-algoritmussal-pygad-és-openai-gym-használatával:

`Tensorflow alapozó 10. Neurális hálózatok tenyésztése genetikus algoritmussal PyGAD és OpenAI Gym használatával `__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To put genetic algorithms into context, let's briefly review how
gradient descent and backpropagation work, which is the standard method
for training neural networks. You can read my article about that here.

To breed the networks we use the library called
`PyGAD `__,
so first of all we have to install it, along with Tensorflow and Gym,
which come preinstalled in Colab.

PyGAD itself is a completely general system for running genetic
algorithms. Its extension is KerasGA, which helps run the general engine
on Tensorflow (Keras) neural networks. The KerasGA object created on
line 47 is part of this extension; from the model passed as its first
parameter, it creates a population whose size is given by the second
parameter. Since our network has 386 adjustable parameters, our DNA here
consists of 386 elements. The population size is 10 individuals, so our
starting population is a 10x386 matrix. This is what we pass on line 51
via the initial_population parameter.

|image7|

Russian
-------

`PyGAD: библиотека для имплементации генетического алгоритма `__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

PyGAD is a library for implementing the genetic algorithm. In addition,
the library provides access to optimized implementations of machine
learning algorithms. PyGAD is developed in Python 3.

The PyGAD library supports different types of crossover, mutation, and
parent selection. PyGAD makes it possible to optimize problems with the
genetic algorithm by customizing the objective function.

Besides the genetic algorithm, the library contains optimized
implementations of machine learning algorithms. At the moment, PyGAD
supports building and training neural networks for classification
tasks.

The library is under active development. Its creators plan to add
functionality for solving binary problems and to implement new
algorithms.

PyGAD was developed with Python 3.7.3. Its dependencies include NumPy
for creating and manipulating arrays and Matplotlib for visualization.
One use case for the tool is optimizing weights that satisfy a given
function.

|image8|

Research Papers using PyGAD
===========================

A number of research papers have used PyGAD. Here are some of them:

- Jaros, Marta, and Jiri Jaros. "Performance-Cost Optimization of
  Moldable Scientific Workflows."

- Thorat, Divya. "Enhanced genetic algorithm to reduce makespan of
  multiple jobs in map-reduce application on serverless platform".
  Diss. Dublin, National College of Ireland, 2020.

- Koch, Chris, and Edgar Dobriban. "AttenGen: Generating Live
  Attenuated Vaccine Candidates using Machine Learning." (2021).

- Bhardwaj, Bhavya, et al. "Windfarm optimization using Nelder-Mead and
  Particle Swarm optimization." *2021 7th International Conference on
  Electrical Energy Systems (ICEES)*. IEEE, 2021.

- Bernardo, Reginald Christian S. and J. Said. "Towards a
  model-independent reconstruction approach for late-time Hubble data."
  (2021).

- Duong, Tri Dung, Qian Li, and Guandong Xu. "Prototype-based
  Counterfactual Explanation for Causal Classification." *arXiv
  preprint arXiv:2105.00703* (2021).

- Farrag, Tamer Ahmed, and Ehab E. Elattar. "Optimized Deep Stacked
  Long Short-Term Memory Network for Long-Term Load Forecasting." *IEEE
  Access* 9 (2021): 68511-68522.

- Antunes, E. D. O., Caetano, M. F., Marotta, M. A., Araujo, A.,
  Bondan, L., Meneguette, R. I., & Rocha Filho, G. P. (2021, August).
  Soluções Otimizadas para o Problema de Localização de Máxima
  Cobertura em Redes Militarizadas 4G/LTE. In *Anais do XXVI Workshop
  de Gerência e Operação de Redes e Serviços* (pp. 152-165). SBC.

- M. Yani, F. Ardilla, A. A. Saputra and N. Kubota, "Gradient-Free Deep
  Q-Networks Reinforcement learning: Benchmark and Evaluation," *2021
  IEEE Symposium Series on Computational Intelligence (SSCI)*, 2021,
  pp. 1-5, doi: 10.1109/SSCI50451.2021.9659941.

- Yani, Mohamad, and Naoyuki Kubota. "Deep Convolutional Networks with
  Genetic Algorithm for Reinforcement Learning Problem."

- Mahendra, Muhammad Ihza, and Isman Kurniawan. "Optimizing
  Convolutional Neural Network by Using Genetic Algorithm for COVID-19
  Detection in Chest X-Ray Image." *2021 International Conference on
  Data Science and Its Applications (ICoDSA)*. IEEE, 2021.

- Glibota, Vjeko. *Umjeravanje mikroskopskog prometnog modela primjenom
  genetskog algoritma*. Diss. University of Zagreb. Faculty of
  Transport and Traffic Sciences. Division of Intelligent Transport
  Systems and Logistics. Department of Intelligent Transport Systems,
  2021.

- Zhu, Mingda. *Genetic Algorithm-based Parameter Identification for
  Ship Manoeuvring Model under Wind Disturbance*. MS thesis. NTNU,
  2021.

- Abdalrahman, Ahmed, and Weihua Zhuang. "Dynamic pricing for
  differentiated PEV charging services using deep reinforcement
  learning." *IEEE Transactions on Intelligent Transportation Systems*
  (2020).

More Links
==========

https://rodriguezanton.com/identifying-contact-states-for-2d-objects-using-pygad-and/

https://torvaney.github.io/projects/t9-optimised

For More Information
====================

There are different resources that can be used to get started with the
genetic algorithm and building it in Python.

Tutorial: Implementing Genetic Algorithm in Python
--------------------------------------------------

To start with coding the genetic algorithm, you can check the tutorial
titled `Genetic Algorithm Implementation in
Python `__
available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

`This
tutorial `__
is prepared based on a previous version of the project, but it is still
a good resource for getting started with coding the genetic algorithm.

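For a quick taste of what that tutorial covers, here is a minimal,
self-contained sketch of running the genetic algorithm with PyGAD. The
toy fitness function and the parameter values are illustrative
assumptions, not taken from the tutorial itself.

.. code:: python

   import pygad

   # Toy problem: find weights whose weighted sum over function_inputs hits desired_output.
   function_inputs = [4, -2, 3.5, 5, -11, -4.7]
   desired_output = 44

   def fitness_func(ga_instance, solution, solution_idx):
       # Fitness grows as the weighted sum gets closer to the target value.
       output = sum(w * x for w, x in zip(solution, function_inputs))
       return 1.0 / (abs(output - desired_output) + 0.000001)

   ga_instance = pygad.GA(num_generations=100,
                          num_parents_mating=4,
                          fitness_func=fitness_func,
                          sol_per_pop=8,
                          num_genes=len(function_inputs))
   ga_instance.run()

   solution, solution_fitness, solution_idx = ga_instance.best_solution()
   print("Best solution fitness:", solution_fitness)
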
|image9|

Tutorial: Introduction to Genetic Algorithm
-------------------------------------------

Get started with the genetic algorithm by reading the tutorial titled
`Introduction to Optimization with Genetic
Algorithm `__,
which is available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

|image10|

Tutorial: Build Neural Networks in Python
-----------------------------------------

Read about building neural networks in Python through the tutorial
titled `Artificial Neural Network Implementation using NumPy and
Classification of the Fruits360 Image
Dataset `__,
available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

|image11|

Tutorial: Optimize Neural Networks with Genetic Algorithm
---------------------------------------------------------

Read about training neural networks using the genetic algorithm through
the tutorial titled `Artificial Neural Networks Optimization using
Genetic Algorithm with
Python `__,
available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

|image12|

Tutorial: Building CNN in Python
--------------------------------

To start with coding convolutional neural networks, you can check the
tutorial titled `Building Convolutional Neural Network using NumPy from
Scratch `__,
available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

- `Chinese Translation `__

`This
tutorial `__
is prepared based on a previous version of the project, but it is still
a good resource for getting started with coding CNNs.

|image13|

Tutorial: Derivation of CNN from FCNN
-------------------------------------

Learn how a convolutional neural network is derived from a fully
connected network by reading the tutorial titled `Derivation of
Convolutional Neural Network from Fully Connected Network
Step-By-Step `__,
which is available at these links:

- `LinkedIn `__

- `Towards Data
  Science `__

- `KDnuggets `__

|image14|

Book: Practical Computer Vision Applications Using Deep Learning with CNNs
--------------------------------------------------------------------------

You can also check my book cited as `Ahmed Fawzy Gad 'Practical Computer
Vision Applications Using Deep Learning with CNNs'. Dec. 2018, Apress,
978-1-4842-4167-7 `__,
which discusses neural networks, convolutional neural networks, deep
learning, the genetic algorithm, and more.

Find the book at these links:

- `Amazon `__

- `Springer `__

- `Apress `__

- `O'Reilly `__

- `Google Books `__

.. figure:: https://user-images.githubusercontent.com/16560492/78830077-ae7c2800-79e7-11ea-980b-53b6bd879eeb.jpg
   :alt:

Contact Us
==========

- E-mail: ahmed.f.gad@gmail.com

- `LinkedIn `__

- `Amazon Author Page `__

- `Heartbeat `__

- `Paperspace `__

- `KDnuggets `__

- `TowardsDataScience `__

- `GitHub `__

.. figure:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png
   :alt:

Thank you for using
`PyGAD `__ :)

.. |image1| image:: https://user-images.githubusercontent.com/16560492/111009628-2b372500-8362-11eb-90cf-01b47d831624.png
   :target: https://blog.paperspace.com/train-keras-models-using-genetic-algorithm-with-pygad
.. |image2| image:: https://user-images.githubusercontent.com/16560492/111009678-5457b580-8362-11eb-899a-39e2f96984df.png
   :target: https://neptune.ai/blog/train-pytorch-models-using-genetic-algorithm-with-pygad
..
|image3| image:: https://user-images.githubusercontent.com/16560492/111009275-3178d180-8361-11eb-9e86-7fb1519acde7.png + :target: https://www.hebergementwebs.com/nouvelles/comment-les-algorithmes-genetiques-peuvent-rivaliser-avec-la-descente-de-gradient-et-le-backprop +.. |image4| image:: https://user-images.githubusercontent.com/16560492/111009257-232ab580-8361-11eb-99a5-7226efbc3065.png + :target: https://www.hebergementwebs.com/noticias/como-los-algoritmos-geneticos-pueden-competir-con-el-descenso-de-gradiente-y-el-backprop +.. |image5| image:: https://user-images.githubusercontent.com/16560492/108586306-85bd0280-731b-11eb-874c-7ac4ce1326cd.jpg + :target: https://data-newbie.tistory.com/m/685 +.. |image6| image:: https://user-images.githubusercontent.com/16560492/108586601-85be0200-731d-11eb-98a4-161c75a1f099.jpg + :target: https://erencan34.medium.com/pygad-ile-genetik-algoritmay%C4%B1-kullanarak-keras-modelleri-nas%C4%B1l-e%C4%9Fitilir-cf92639a478c +.. |image7| image:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png + :target: https://thebojda.medium.com/tensorflow-alapoz%C3%B3-10-24f7767d4a2c +.. |image8| image:: https://user-images.githubusercontent.com/16560492/101267295-c74c0180-375f-11eb-9ad0-f8e37bd796ce.png + :target: https://neurohive.io/ru/frameworki/pygad-biblioteka-dlya-implementacii-geneticheskogo-algoritma +.. |image9| image:: https://user-images.githubusercontent.com/16560492/78830052-a3c19300-79e7-11ea-8b9b-4b343ea4049c.png + :target: https://www.linkedin.com/pulse/genetic-algorithm-implementation-python-ahmed-gad +.. |image10| image:: https://user-images.githubusercontent.com/16560492/82078259-26252d00-96e1-11ea-9a02-52a99e1054b9.jpg + :target: https://www.linkedin.com/pulse/introduction-optimization-genetic-algorithm-ahmed-gad +.. |image11| image:: https://user-images.githubusercontent.com/16560492/82078281-30472b80-96e1-11ea-8017-6a1f4383d602.jpg + :target: https://www.linkedin.com/pulse/artificial-neural-network-implementation-using-numpy-fruits360-gad +.. |image12| image:: https://user-images.githubusercontent.com/16560492/82078300-376e3980-96e1-11ea-821c-aa6b8ceb44d4.jpg + :target: https://www.linkedin.com/pulse/artificial-neural-networks-optimization-using-genetic-ahmed-gad +.. |image13| image:: https://user-images.githubusercontent.com/16560492/82431022-6c3a1200-9a8e-11ea-8f1b-b055196d76e3.png + :target: https://www.linkedin.com/pulse/building-convolutional-neural-network-using-numpy-from-ahmed-gad +.. |image14| image:: https://user-images.githubusercontent.com/16560492/82431369-db176b00-9a8e-11ea-99bd-e845192873fc.png + :target: https://www.linkedin.com/pulse/derivation-convolutional-neural-network-from-fully-connected-gad diff --git a/docs/source/README_pygad_torchga_ReadTheDocs.rst b/docs/source/torchga.rst similarity index 97% rename from docs/source/README_pygad_torchga_ReadTheDocs.rst rename to docs/source/torchga.rst index 66a554e..e49a2a2 100644 --- a/docs/source/README_pygad_torchga_ReadTheDocs.rst +++ b/docs/source/torchga.rst @@ -1,946 +1,946 @@ -.. _pygadtorchga-module: - -``pygad.torchga`` Module -======================== - -This section of the PyGAD's library documentation discusses the -**pygad.torchga** module. - -The ``pygad.torchga`` module has a helper class and 2 functions to train -PyTorch models using the genetic algorithm (PyGAD). - -The contents of this module are: - -1. ``TorchGA``: A class for creating an initial population of all - parameters in the PyTorch model. - -2. 
``model_weights_as_vector()``: A function to reshape the PyTorch - model weights to a single vector. - -3. ``model_weights_as_dict()``: A function to restore the PyTorch model - weights from a vector. - -4. ``predict()``: A function to make predictions based on the PyTorch - model and a solution. - -More details are given in the next sections. - -Steps Summary -============= - -The summary of the steps used to train a PyTorch model using PyGAD is as -follows: - -1. Create a PyTorch model. - -2. Create an instance of the ``pygad.torchga.TorchGA`` class. - -3. Prepare the training data. - -4. Build the fitness function. - -5. Create an instance of the ``pygad.GA`` class. - -6. Run the genetic algorithm. - -Create PyTorch Model -==================== - -Before discussing training a PyTorch model using PyGAD, the first thing -to do is to create the PyTorch model. To get started, please check the -`PyTorch library -documentation `__. - -Here is an example of a PyTorch model. - -.. code:: python - - import torch - - input_layer = torch.nn.Linear(3, 5) - relu_layer = torch.nn.ReLU() - output_layer = torch.nn.Linear(5, 1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - output_layer) - -Feel free to add the layers of your choice. - -.. _pygadtorchgatorchga-class: - -``pygad.torchga.TorchGA`` Class -=============================== - -The ``pygad.torchga`` module has a class named ``TorchGA`` for creating -an initial population for the genetic algorithm based on a PyTorch -model. The constructor, methods, and attributes within the class are -discussed in this section. - -.. _init: - -``__init__()`` --------------- - -The ``pygad.torchga.TorchGA`` class constructor accepts the following -parameters: - -- ``model``: An instance of the PyTorch model. - -- ``num_solutions``: Number of solutions in the population. Each - solution has different parameters of the model. - -Instance Attributes -------------------- - -All parameters in the ``pygad.torchga.TorchGA`` class constructor are -used as instance attributes in addition to adding a new attribute called -``population_weights``. - -Here is a list of all instance attributes: - -- ``model`` - -- ``num_solutions`` - -- ``population_weights``: A nested list holding the weights of all - solutions in the population. - -Methods in the ``TorchGA`` Class --------------------------------- - -This section discusses the methods available for instances of the -``pygad.torchga.TorchGA`` class. - -.. _createpopulation: - -``create_population()`` -~~~~~~~~~~~~~~~~~~~~~~~ - -The ``create_population()`` method creates the initial population of the -genetic algorithm as a list of solutions where each solution represents -different model parameters. The list of networks is assigned to the -``population_weights`` attribute of the instance. - -.. _functions-in-the-pygadtorchga-module: - -Functions in the ``pygad.torchga`` Module -========================================= - -This section discusses the functions in the ``pygad.torchga`` module. - -.. _pygadtorchgamodelweightsasvector: - -``pygad.torchga.model_weights_as_vector()`` --------------------------------------------- - -The ``model_weights_as_vector()`` function accepts a single parameter -named ``model`` representing the PyTorch model. It returns a vector -holding all model weights. The reason for representing the model weights -as a vector is that the genetic algorithm expects all parameters of any -solution to be in a 1D vector form. 
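To make the flattening idea concrete, here is a rough sketch of how a
model's parameters can be collapsed into one 1D vector. It is an
illustrative approximation of what the function does, not the module's
exact implementation.

.. code:: python

   import numpy
   import torch

   def weights_as_vector_sketch(model):
       # Flatten each parameter tensor and join all of them into a single 1D array.
       return numpy.concatenate([param.detach().numpy().flatten()
                                 for param in model.parameters()])

   model = torch.nn.Sequential(torch.nn.Linear(3, 5),
                               torch.nn.ReLU(),
                               torch.nn.Linear(5, 1))
   # 3*5 + 5 values in the first layer plus 5*1 + 1 in the second = 26 values.
   print(weights_as_vector_sketch(model).shape)  # (26,)
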
- -The function accepts the following parameters: - -- ``model``: The PyTorch model. - -It returns a 1D vector holding the model weights. - -.. _pygadtorchmodelweightsasdict: - -``pygad.torch.model_weights_as_dict()`` ---------------------------------------- - -The ``model_weights_as_dict()`` function accepts the following -parameters: - -1. ``model``: The PyTorch model. - -2. ``weights_vector``: The model parameters as a vector. - -It returns the restored model weights in the same form used by the -``state_dict()`` method. The returned dictionary is ready to be passed -to the ``load_state_dict()`` method for setting the PyTorch model's -parameters. - -.. _pygadtorchgapredict: - -``pygad.torchga.predict()`` ---------------------------- - -The ``predict()`` function makes a prediction based on a solution. It -accepts the following parameters: - -1. ``model``: The PyTorch model. - -2. ``solution``: The solution evolved. - -3. ``data``: The test data inputs. - -It returns the predictions for the data samples. - -Examples -======== - -This section gives the complete code of some examples that build and -train a PyTorch model using PyGAD. Each subsection builds a different -network. - -Example 1: Regression Example ------------------------------ - -The next code builds a simple PyTorch model for regression. The next -subsections discuss each part in the code. - -.. code:: python - - import torch - import torchga - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, torch_ga, model, loss_function - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - - abs_error = loss_function(predictions, data_outputs).detach().numpy() + 0.00000001 - - solution_fitness = 1.0 / abs_error - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Create the PyTorch model. - input_layer = torch.nn.Linear(3, 5) - relu_layer = torch.nn.ReLU() - output_layer = torch.nn.Linear(5, 1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - output_layer) - # print(model) - - # Create an instance of the pygad.torchga.TorchGA class to build the initial population. - torch_ga = torchga.TorchGA(model=model, - num_solutions=10) - - loss_function = torch.nn.L1Loss() - - # Data inputs - data_inputs = torch.tensor([[0.02, 0.1, 0.15], - [0.7, 0.6, 0.8], - [1.5, 1.2, 1.7], - [3.2, 2.9, 3.1]]) - - # Data outputs - data_outputs = torch.tensor([[0.1], - [0.6], - [1.3], - [2.5]]) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = torch_ga.population_weights # Initial population of network weights - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. 
Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions.detach().numpy()) - - abs_error = loss_function(predictions, data_outputs) - print("Absolute Error : ", abs_error.detach().numpy()) - -Create a PyTorch model -~~~~~~~~~~~~~~~~~~~~~~ - -According to the steps mentioned previously, the first step is to create -a PyTorch model. Here is the code that builds the model using the -Functional API. - -.. code:: python - - import torch - - input_layer = torch.nn.Linear(3, 5) - relu_layer = torch.nn.ReLU() - output_layer = torch.nn.Linear(5, 1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - output_layer) - -.. _create-an-instance-of-the-pygadtorchgatorchga-class: - -Create an Instance of the ``pygad.torchga.TorchGA`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The second step is to create an instance of the -``pygad.torchga.TorchGA`` class. There are 10 solutions per population. -Change this number according to your needs. - -.. code:: python - - import pygad.torchga - - torch_ga = torchga.TorchGA(model=model, - num_solutions=10) - -.. _prepare-the-training-data-1: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The third step is to prepare the training data inputs and outputs. Here -is an example where there are 4 samples. Each sample has 3 inputs and 1 -output. - -.. code:: python - - import numpy - - # Data inputs - data_inputs = numpy.array([[0.02, 0.1, 0.15], - [0.7, 0.6, 0.8], - [1.5, 1.2, 1.7], - [3.2, 2.9, 3.1]]) - - # Data outputs - data_outputs = numpy.array([[0.1], - [0.6], - [1.3], - [2.5]]) - -Build the Fitness Function -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The fourth step is to build the fitness function. This function must -accept 2 parameters representing the solution and its index within the -population. - -The next fitness function calculates the mean absolute error (MAE) of -the PyTorch model based on the parameters in the solution. The -reciprocal of the MAE is used as the fitness value. Feel free to use any -other loss function to calculate the fitness value. - -.. code:: python - - loss_function = torch.nn.L1Loss() - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, torch_ga, model, loss_function - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - - abs_error = loss_function(predictions, data_outputs).detach().numpy() + 0.00000001 - - solution_fitness = 1.0 / abs_error - - return solution_fitness - -.. _create-an-instance-of-the-pygadga-class: - -Create an Instance of the ``pygad.GA`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The fifth step is to instantiate the ``pygad.GA`` class. Note how the -``initial_population`` parameter is assigned to the initial weights of -the PyTorch models. - -For more information, please check the `parameters this class -accepts `__. - -.. code:: python - - # Prepare the PyGAD parameters. 
Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = torch_ga.population_weights # Initial population of network weights - - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - -Run the Genetic Algorithm -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The sixth and last step is to run the genetic algorithm by calling the -``run()`` method. - -.. code:: python - - ga_instance.run() - -After the PyGAD completes its execution, then there is a figure that -shows how the fitness value changes by generation. Call the -``plot_fitness()`` method to show the figure. - -.. code:: python - - ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) - -Here is the figure. - -.. figure:: https://user-images.githubusercontent.com/16560492/103469779-22f5b480-4d37-11eb-80dc-95503065ebb1.png - :alt: - -To get information about the best solution found by PyGAD, use the -``best_solution()`` method. - -.. code:: python - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - -.. code:: python - - Fitness value of the best solution = 145.42425295191546 - Index of the best solution : 0 - -The next code restores the trained model weights using the -``model_weights_as_dict()`` function. The restored weights are used to -calculate the predicted values. - -.. code:: python - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions.detach().numpy()) - -.. code:: python - - Predictions : - [[0.08401088] - [0.60939324] - [1.3010881 ] - [2.5010352 ]] - -The next code measures the trained model error. - -.. code:: python - - abs_error = loss_function(predictions, data_outputs) - print("Absolute Error : ", abs_error.detach().numpy()) - -.. code:: - - Absolute Error : 0.006876422 - -Example 2: XOR Binary Classification ------------------------------------- - -The next code creates a PyTorch model to build the XOR binary -classification problem. Let's highlight the changes compared to the -previous example. - -.. code:: python - - import torch - import torchga - import pygad - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, torch_ga, model, loss_function - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - - solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Create the PyTorch model. 
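    # Architecture note: the XOR problem has 2 inputs and 2 mutually exclusive
    # classes, so the model maps 2 inputs -> 4 hidden ReLU units -> 2 outputs,
    # and Softmax(1) normalizes the 2 outputs into class probabilities.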
- input_layer = torch.nn.Linear(2, 4) - relu_layer = torch.nn.ReLU() - dense_layer = torch.nn.Linear(4, 2) - output_layer = torch.nn.Softmax(1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - dense_layer, - output_layer) - # print(model) - - # Create an instance of the pygad.torchga.TorchGA class to build the initial population. - torch_ga = torchga.TorchGA(model=model, - num_solutions=10) - - loss_function = torch.nn.BCELoss() - - # XOR problem inputs - data_inputs = torch.tensor([[0.0, 0.0], - [0.0, 1.0], - [1.0, 0.0], - [1.0, 1.0]]) - - # XOR problem outputs - data_outputs = torch.tensor([[1.0, 0.0], - [0.0, 1.0], - [0.0, 1.0], - [1.0, 0.0]]) - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 250 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = torch_ga.population_weights # Initial population of network weights. - - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - print("Predictions : \n", predictions.detach().numpy()) - - # Calculate the binary crossentropy for the trained model. - print("Binary Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) - - # Calculate the classification accuracy of the trained model. - a = torch.max(predictions, axis=1) - b = torch.max(data_outputs, axis=1) - accuracy = torch.sum(a.indices == b.indices) / len(data_outputs) - print("Accuracy : ", accuracy.detach().numpy()) - -Compared to the previous regression example, here are the changes: - -- The PyTorch model is changed according to the nature of the problem. - Now, it has 2 inputs and 2 outputs with an in-between hidden layer of - 4 neurons. - -.. code:: python - - input_layer = torch.nn.Linear(2, 4) - relu_layer = torch.nn.ReLU() - dense_layer = torch.nn.Linear(4, 2) - output_layer = torch.nn.Softmax(1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - dense_layer, - output_layer) - -- The train data is changed. Note that the output of each sample is a - 1D vector of 2 values, 1 for each class. - -.. code:: python - - # XOR problem inputs - data_inputs = torch.tensor([[0.0, 0.0], - [0.0, 1.0], - [1.0, 0.0], - [1.0, 1.0]]) - - # XOR problem outputs - data_outputs = torch.tensor([[1.0, 0.0], - [0.0, 1.0], - [0.0, 1.0], - [1.0, 0.0]]) - -- The fitness value is calculated based on the binary cross entropy. - -.. 
code:: python - - loss_function = torch.nn.BCELoss() - -After the previous code completes, the next figure shows how the fitness -value change by generation. - -.. figure:: https://user-images.githubusercontent.com/16560492/103469818-c646c980-4d37-11eb-98c3-d9d591acd5e2.png - :alt: - -Here is some information about the trained model. Its fitness value is -``100000000.0``, loss is ``0.0`` and accuracy is 100%. - -.. code:: python - - Fitness value of the best solution = 100000000.0 - - Index of the best solution : 0 - - Predictions : - [[1.0000000e+00 1.3627675e-10] - [3.8521746e-09 1.0000000e+00] - [4.2789325e-10 1.0000000e+00] - [1.0000000e+00 3.3668417e-09]] - - Binary Crossentropy : 0.0 - - Accuracy : 1.0 - -Example 3: Image Multi-Class Classification (Dense Layers) ----------------------------------------------------------- - -Here is the code. - -.. code:: python - - import torch - import torchga - import pygad - import numpy - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, torch_ga, model, loss_function - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - - solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Build the PyTorch model using the functional API. - input_layer = torch.nn.Linear(360, 50) - relu_layer = torch.nn.ReLU() - dense_layer = torch.nn.Linear(50, 4) - output_layer = torch.nn.Softmax(1) - - model = torch.nn.Sequential(input_layer, - relu_layer, - dense_layer, - output_layer) - - # Create an instance of the pygad.torchga.TorchGA class to build the initial population. - torch_ga = torchga.TorchGA(model=model, - num_solutions=10) - - loss_function = torch.nn.CrossEntropyLoss() - - # Data inputs - data_inputs = torch.from_numpy(numpy.load("dataset_features.npy")).float() - - # Data outputs - data_outputs = torch.from_numpy(numpy.load("outputs.npy")).long() - # The next 2 lines are equivelant to this Keras function to perform 1-hot encoding: tensorflow.keras.utils.to_categorical(data_outputs) - # temp_outs = numpy.zeros((data_outputs.shape[0], numpy.unique(data_outputs).size), dtype=numpy.uint8) - # temp_outs[numpy.arange(data_outputs.shape[0]), numpy.uint8(data_outputs)] = 1 - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 200 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = torch_ga.population_weights # Initial population of network weights. - - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. 
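    # best_solution() returns a 3-tuple: the best solution's parameter vector,
    # its fitness value, and its index within the population.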
- solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Fetch the parameters of the best solution. - best_solution_weights = torchga.model_weights_as_dict(model=model, - weights_vector=solution) - model.load_state_dict(best_solution_weights) - predictions = model(data_inputs) - # print("Predictions : \n", predictions) - - # Calculate the crossentropy loss of the trained model. - print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) - - # Calculate the classification accuracy for the trained model. - accuracy = torch.sum(torch.max(predictions, axis=1).indices == data_outputs) / len(data_outputs) - print("Accuracy : ", accuracy.detach().numpy()) - -Compared to the previous binary classification example, this example has -multiple classes (4) and thus the loss is measured using cross entropy. - -.. code:: python - - loss_function = torch.nn.CrossEntropyLoss() - -.. _prepare-the-training-data-2: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Before building and training neural networks, the training data (input -and output) needs to be prepared. The inputs and the outputs of the -training data are NumPy arrays. - -The data used in this example is available as 2 files: - -1. `dataset_features.npy `__: - Data inputs. - https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy - -2. `outputs.npy `__: - Class labels. - https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy - -The data consists of 4 classes of images. The image shape is -``(100, 100, 3)``. The number of training samples is 1962. The feature -vector extracted from each image has a length 360. - -.. code:: python - - import numpy - - data_inputs = numpy.load("dataset_features.npy") - - data_outputs = numpy.load("outputs.npy") - -The next figure shows how the fitness value changes. - -.. figure:: https://user-images.githubusercontent.com/16560492/103469855-5d138600-4d38-11eb-84b1-b5eff8faa7bc.png - :alt: - -Here are some statistics about the trained model. - -.. code:: - - Fitness value of the best solution = 1.3446997034434534 - Index of the best solution : 0 - Crossentropy : 0.74366045 - Accuracy : 1.0 - -Example 4: Image Multi-Class Classification (Conv Layers) ---------------------------------------------------------- - -Compared to the previous example that uses only dense layers, this -example uses convolutional layers to classify the same dataset. - -Here is the complete code. - -.. code:: python - - import torch - import torchga - import pygad - import numpy - - def fitness_func(ga_instance, solution, sol_idx): - global data_inputs, data_outputs, torch_ga, model, loss_function - - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - - solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) - - return solution_fitness - - def callback_generation(ga_instance): - print("Generation = {generation}".format(generation=ga_instance.generations_completed)) - print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) - - # Build the PyTorch model. 
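    # Shape trace for a (3, 100, 100) input: Conv2d(3->5, kernel 7) -> (5, 94, 94),
    # MaxPool2d(5, stride 5) -> (5, 18, 18), Conv2d(5->3, kernel 3) -> (3, 16, 16),
    # so flattening yields 3*16*16 = 768 features for the first dense layer.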
- input_layer = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=7) - relu_layer1 = torch.nn.ReLU() - max_pool1 = torch.nn.MaxPool2d(kernel_size=5, stride=5) - - conv_layer2 = torch.nn.Conv2d(in_channels=5, out_channels=3, kernel_size=3) - relu_layer2 = torch.nn.ReLU() - - flatten_layer1 = torch.nn.Flatten() - # The value 768 is pre-computed by tracing the sizes of the layers' outputs. - dense_layer1 = torch.nn.Linear(in_features=768, out_features=15) - relu_layer3 = torch.nn.ReLU() - - dense_layer2 = torch.nn.Linear(in_features=15, out_features=4) - output_layer = torch.nn.Softmax(1) - - model = torch.nn.Sequential(input_layer, - relu_layer1, - max_pool1, - conv_layer2, - relu_layer2, - flatten_layer1, - dense_layer1, - relu_layer3, - dense_layer2, - output_layer) - - # Create an instance of the pygad.torchga.TorchGA class to build the initial population. - torch_ga = torchga.TorchGA(model=model, - num_solutions=10) - - loss_function = torch.nn.CrossEntropyLoss() - - # Data inputs - data_inputs = torch.from_numpy(numpy.load("dataset_inputs.npy")).float() - data_inputs = data_inputs.reshape((data_inputs.shape[0], data_inputs.shape[3], data_inputs.shape[1], data_inputs.shape[2])) - - # Data outputs - data_outputs = torch.from_numpy(numpy.load("dataset_outputs.npy")).long() - - # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class - num_generations = 200 # Number of generations. - num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. - initial_population = torch_ga.population_weights # Initial population of network weights. - - # Create an instance of the pygad.GA class - ga_instance = pygad.GA(num_generations=num_generations, - num_parents_mating=num_parents_mating, - initial_population=initial_population, - fitness_func=fitness_func, - on_generation=callback_generation) - - # Start the genetic algorithm evolution. - ga_instance.run() - - # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. - ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) - - # Returning the details of the best solution. - solution, solution_fitness, solution_idx = ga_instance.best_solution() - print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) - print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) - - # Make predictions based on the best solution. - predictions = pygad.torchga.predict(model=model, - solution=solution, - data=data_inputs) - # print("Predictions : \n", predictions) - - # Calculate the crossentropy for the trained model. - print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) - - # Calculate the classification accuracy for the trained model. - accuracy = torch.sum(torch.max(predictions, axis=1).indices == data_outputs) / len(data_outputs) - print("Accuracy : ", accuracy.detach().numpy()) - -Compared to the previous example, the only change is that the -architecture uses convolutional and max-pooling layers. The shape of -each input sample is 100x100x3. - -.. 
code:: python - - input_layer = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=7) - relu_layer1 = torch.nn.ReLU() - max_pool1 = torch.nn.MaxPool2d(kernel_size=5, stride=5) - - conv_layer2 = torch.nn.Conv2d(in_channels=5, out_channels=3, kernel_size=3) - relu_layer2 = torch.nn.ReLU() - - flatten_layer1 = torch.nn.Flatten() - # The value 768 is pre-computed by tracing the sizes of the layers' outputs. - dense_layer1 = torch.nn.Linear(in_features=768, out_features=15) - relu_layer3 = torch.nn.ReLU() - - dense_layer2 = torch.nn.Linear(in_features=15, out_features=4) - output_layer = torch.nn.Softmax(1) - - model = torch.nn.Sequential(input_layer, - relu_layer1, - max_pool1, - conv_layer2, - relu_layer2, - flatten_layer1, - dense_layer1, - relu_layer3, - dense_layer2, - output_layer) - -.. _prepare-the-training-data-3: - -Prepare the Training Data -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The data used in this example is available as 2 files: - -1. `dataset_inputs.npy `__: - Data inputs. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy - -2. `dataset_outputs.npy `__: - Class labels. - https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy - -The data consists of 4 classes of images. The image shape is -``(100, 100, 3)`` and there are 20 images per class for a total of 80 -training samples. For more information about the dataset, check the -`Reading the -Data `__ -section of the ``pygad.cnn`` module. - -Simply download these 2 files and read them according to the next code. - -.. code:: python - - import numpy - - data_inputs = numpy.load("dataset_inputs.npy") - - data_outputs = numpy.load("dataset_outputs.npy") - -The next figure shows how the fitness value changes. - -.. figure:: https://user-images.githubusercontent.com/16560492/103469887-c7c4c180-4d38-11eb-98a7-1c5e73e918d0.png - :alt: - -Here are some statistics about the trained model. The model accuracy is -97.5% after the 200 generations. Note that just running the code again -may give different results. - -.. code:: - - Fitness value of the best solution = 1.3009520689219258 - Index of the best solution : 0 - Crossentropy : 0.7686678 - Accuracy : 0.975 +.. _pygadtorchga-module: + +``pygad.torchga`` Module +======================== + +This section of the PyGAD's library documentation discusses the +**pygad.torchga** module. + +The ``pygad.torchga`` module has a helper class and 2 functions to train +PyTorch models using the genetic algorithm (PyGAD). + +The contents of this module are: + +1. ``TorchGA``: A class for creating an initial population of all + parameters in the PyTorch model. + +2. ``model_weights_as_vector()``: A function to reshape the PyTorch + model weights to a single vector. + +3. ``model_weights_as_dict()``: A function to restore the PyTorch model + weights from a vector. + +4. ``predict()``: A function to make predictions based on the PyTorch + model and a solution. + +More details are given in the next sections. + +Steps Summary +============= + +The summary of the steps used to train a PyTorch model using PyGAD is as +follows: + +1. Create a PyTorch model. + +2. Create an instance of the ``pygad.torchga.TorchGA`` class. + +3. Prepare the training data. + +4. Build the fitness function. + +5. Create an instance of the ``pygad.GA`` class. + +6. Run the genetic algorithm. + +Create PyTorch Model +==================== + +Before discussing training a PyTorch model using PyGAD, the first thing +to do is to create the PyTorch model. 
To get started, please check the
`PyTorch library
documentation `__.

Here is an example of a PyTorch model.

.. code:: python

   import torch

   input_layer = torch.nn.Linear(3, 5)
   relu_layer = torch.nn.ReLU()
   output_layer = torch.nn.Linear(5, 1)

   model = torch.nn.Sequential(input_layer,
                               relu_layer,
                               output_layer)

Feel free to add the layers of your choice.

.. _pygadtorchgatorchga-class:

``pygad.torchga.TorchGA`` Class
===============================

The ``pygad.torchga`` module has a class named ``TorchGA`` for creating
an initial population for the genetic algorithm based on a PyTorch
model. The constructor, methods, and attributes within the class are
discussed in this section.

.. _init:

``__init__()``
--------------

The ``pygad.torchga.TorchGA`` class constructor accepts the following
parameters:

- ``model``: An instance of the PyTorch model.

- ``num_solutions``: Number of solutions in the population. Each
  solution holds a different set of model parameters.

Instance Attributes
-------------------

All parameters passed to the ``pygad.torchga.TorchGA`` class constructor
are used as instance attributes, in addition to a new attribute called
``population_weights``.

Here is a list of all instance attributes:

- ``model``

- ``num_solutions``

- ``population_weights``: A nested list holding the weights of all
  solutions in the population.

Methods in the ``TorchGA`` Class
--------------------------------

This section discusses the methods available for instances of the
``pygad.torchga.TorchGA`` class.

.. _createpopulation:

``create_population()``
~~~~~~~~~~~~~~~~~~~~~~~

The ``create_population()`` method creates the initial population of the
genetic algorithm as a list of solutions where each solution represents
a different set of model parameters. The list of solutions is assigned
to the ``population_weights`` attribute of the instance.

.. _functions-in-the-pygadtorchga-module:

Functions in the ``pygad.torchga`` Module
=========================================

This section discusses the functions in the ``pygad.torchga`` module.

.. _pygadtorchgamodelweightsasvector:

``pygad.torchga.model_weights_as_vector()``
-------------------------------------------

The ``model_weights_as_vector()`` function accepts a single parameter
named ``model`` representing the PyTorch model. It returns a vector
holding all model weights. The reason for representing the model weights
as a vector is that the genetic algorithm expects all parameters of any
solution to be in a 1D vector form.

The function accepts the following parameters:

- ``model``: The PyTorch model.

It returns a 1D vector holding the model weights.

.. _pygadtorchmodelweightsasdict:

``pygad.torchga.model_weights_as_dict()``
-----------------------------------------

The ``model_weights_as_dict()`` function accepts the following
parameters:

1. ``model``: The PyTorch model.

2. ``weights_vector``: The model parameters as a vector.

It returns the restored model weights in the same form used by the
``state_dict()`` method. The returned dictionary is ready to be passed
to the ``load_state_dict()`` method for setting the PyTorch model's
parameters.

.. _pygadtorchgapredict:

``pygad.torchga.predict()``
---------------------------

The ``predict()`` function makes a prediction based on a solution. It
accepts the following parameters:

1. ``model``: The PyTorch model.

2. ``solution``: The evolved solution.

3.
``data``: The test data inputs. + +It returns the predictions for the data samples. + +Examples +======== + +This section gives the complete code of some examples that build and +train a PyTorch model using PyGAD. Each subsection builds a different +network. + +Example 1: Regression Example +----------------------------- + +The next code builds a simple PyTorch model for regression. The next +subsections discuss each part in the code. + +.. code:: python + + import torch + import torchga + import pygad + + def fitness_func(ga_instance, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + + abs_error = loss_function(predictions, data_outputs).detach().numpy() + 0.00000001 + + solution_fitness = 1.0 / abs_error + + return solution_fitness + + def callback_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + + # Create the PyTorch model. + input_layer = torch.nn.Linear(3, 5) + relu_layer = torch.nn.ReLU() + output_layer = torch.nn.Linear(5, 1) + + model = torch.nn.Sequential(input_layer, + relu_layer, + output_layer) + # print(model) + + # Create an instance of the pygad.torchga.TorchGA class to build the initial population. + torch_ga = torchga.TorchGA(model=model, + num_solutions=10) + + loss_function = torch.nn.L1Loss() + + # Data inputs + data_inputs = torch.tensor([[0.02, 0.1, 0.15], + [0.7, 0.6, 0.8], + [1.5, 1.2, 1.7], + [3.2, 2.9, 3.1]]) + + # Data outputs + data_outputs = torch.tensor([[0.1], + [0.6], + [1.3], + [2.5]]) + + # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class + num_generations = 250 # Number of generations. + num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. + initial_population = torch_ga.population_weights # Initial population of network weights + + ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=callback_generation) + + ga_instance.run() + + # After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. + ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) + + # Returning the details of the best solution. + solution, solution_fitness, solution_idx = ga_instance.best_solution() + print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) + print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + + # Make predictions based on the best solution. + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + print("Predictions : \n", predictions.detach().numpy()) + + abs_error = loss_function(predictions, data_outputs) + print("Absolute Error : ", abs_error.detach().numpy()) + +Create a PyTorch model +~~~~~~~~~~~~~~~~~~~~~~ + +According to the steps mentioned previously, the first step is to create +a PyTorch model. Here is the code that builds the model using the +Functional API. + +.. 
code:: python

   import torch

   input_layer = torch.nn.Linear(3, 5)
   relu_layer = torch.nn.ReLU()
   output_layer = torch.nn.Linear(5, 1)

   model = torch.nn.Sequential(input_layer,
                               relu_layer,
                               output_layer)

.. _create-an-instance-of-the-pygadtorchgatorchga-class:

Create an Instance of the ``pygad.torchga.TorchGA`` Class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second step is to create an instance of the
``pygad.torchga.TorchGA`` class. There are 10 solutions per population.
Change this number according to your needs.

.. code:: python

   from pygad import torchga

   torch_ga = torchga.TorchGA(model=model,
                              num_solutions=10)

.. _prepare-the-training-data-1:

Prepare the Training Data
~~~~~~~~~~~~~~~~~~~~~~~~~

The third step is to prepare the training data inputs and outputs. Here
is an example where there are 4 samples. Each sample has 3 inputs and 1
output.

.. code:: python

   import numpy

   # Data inputs
   data_inputs = numpy.array([[0.02, 0.1, 0.15],
                              [0.7, 0.6, 0.8],
                              [1.5, 1.2, 1.7],
                              [3.2, 2.9, 3.1]])

   # Data outputs
   data_outputs = numpy.array([[0.1],
                               [0.6],
                               [1.3],
                               [2.5]])

Build the Fitness Function
~~~~~~~~~~~~~~~~~~~~~~~~~~

The fourth step is to build the fitness function. This function must
accept 3 parameters representing the ``pygad.GA`` instance, the
solution, and the solution's index within the population.

The next fitness function calculates the mean absolute error (MAE) of
the PyTorch model based on the parameters in the solution. The
reciprocal of the MAE is used as the fitness value. Feel free to use any
other loss function to calculate the fitness value.

.. code:: python

   loss_function = torch.nn.L1Loss()

   def fitness_func(ga_instance, solution, sol_idx):
       global data_inputs, data_outputs, torch_ga, model, loss_function

       predictions = pygad.torchga.predict(model=model,
                                           solution=solution,
                                           data=data_inputs)

       abs_error = loss_function(predictions, data_outputs).detach().numpy() + 0.00000001

       solution_fitness = 1.0 / abs_error

       return solution_fitness

.. _create-an-instance-of-the-pygadga-class:

Create an Instance of the ``pygad.GA`` Class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The fifth step is to instantiate the ``pygad.GA`` class. Note how the
``initial_population`` parameter is assigned the initial weights of the
PyTorch models.

For more information, please check the `parameters this class
accepts `__.

.. code:: python

   # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
   num_generations = 250 # Number of generations.
   num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
   initial_population = torch_ga.population_weights # Initial population of network weights

   ga_instance = pygad.GA(num_generations=num_generations,
                          num_parents_mating=num_parents_mating,
                          initial_population=initial_population,
                          fitness_func=fitness_func,
                          on_generation=callback_generation)

Run the Genetic Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~

The sixth and last step is to run the genetic algorithm by calling the
``run()`` method.

.. code:: python

   ga_instance.run()

After PyGAD completes its execution, a figure that shows how the fitness
value changes by generation can be created. Call the ``plot_fitness()``
method to show the figure.

.. code:: python

   ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs.
Fitness", linewidth=4) + +Here is the figure. + +.. figure:: https://user-images.githubusercontent.com/16560492/103469779-22f5b480-4d37-11eb-80dc-95503065ebb1.png + :alt: + +To get information about the best solution found by PyGAD, use the +``best_solution()`` method. + +.. code:: python + + # Returning the details of the best solution. + solution, solution_fitness, solution_idx = ga_instance.best_solution() + print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) + print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +.. code:: python + + Fitness value of the best solution = 145.42425295191546 + Index of the best solution : 0 + +The next code restores the trained model weights using the +``model_weights_as_dict()`` function. The restored weights are used to +calculate the predicted values. + +.. code:: python + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + print("Predictions : \n", predictions.detach().numpy()) + +.. code:: python + + Predictions : + [[0.08401088] + [0.60939324] + [1.3010881 ] + [2.5010352 ]] + +The next code measures the trained model error. + +.. code:: python + + abs_error = loss_function(predictions, data_outputs) + print("Absolute Error : ", abs_error.detach().numpy()) + +.. code:: + + Absolute Error : 0.006876422 + +Example 2: XOR Binary Classification +------------------------------------ + +The next code creates a PyTorch model to build the XOR binary +classification problem. Let's highlight the changes compared to the +previous example. + +.. code:: python + + import torch + import torchga + import pygad + + def fitness_func(ga_instance, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + + solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) + + return solution_fitness + + def callback_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + + # Create the PyTorch model. + input_layer = torch.nn.Linear(2, 4) + relu_layer = torch.nn.ReLU() + dense_layer = torch.nn.Linear(4, 2) + output_layer = torch.nn.Softmax(1) + + model = torch.nn.Sequential(input_layer, + relu_layer, + dense_layer, + output_layer) + # print(model) + + # Create an instance of the pygad.torchga.TorchGA class to build the initial population. + torch_ga = torchga.TorchGA(model=model, + num_solutions=10) + + loss_function = torch.nn.BCELoss() + + # XOR problem inputs + data_inputs = torch.tensor([[0.0, 0.0], + [0.0, 1.0], + [1.0, 0.0], + [1.0, 1.0]]) + + # XOR problem outputs + data_outputs = torch.tensor([[1.0, 0.0], + [0.0, 1.0], + [0.0, 1.0], + [1.0, 0.0]]) + + # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class + num_generations = 250 # Number of generations. + num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. + initial_population = torch_ga.population_weights # Initial population of network weights. 
+
+    # Create an instance of the pygad.GA class
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           on_generation=callback_generation)
+
+    # Start the genetic algorithm evolution.
+    ga_instance.run()
+
+    # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+    ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4)
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    # Make predictions based on the best solution.
+    predictions = pygad.torchga.predict(model=model,
+                                        solution=solution,
+                                        data=data_inputs)
+    print("Predictions : \n", predictions.detach().numpy())
+
+    # Calculate the binary crossentropy for the trained model.
+    print("Binary Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy())
+
+    # Calculate the classification accuracy of the trained model.
+    a = torch.max(predictions, axis=1)
+    b = torch.max(data_outputs, axis=1)
+    accuracy = torch.sum(a.indices == b.indices) / len(data_outputs)
+    print("Accuracy : ", accuracy.detach().numpy())
+
+Compared to the previous regression example, here are the changes:
+
+- The PyTorch model is changed according to the nature of the problem.
+  Now, it has 2 inputs and 2 outputs with a hidden layer of 4 neurons
+  in between.
+
+.. code:: python
+
+    input_layer = torch.nn.Linear(2, 4)
+    relu_layer = torch.nn.ReLU()
+    dense_layer = torch.nn.Linear(4, 2)
+    output_layer = torch.nn.Softmax(1)
+
+    model = torch.nn.Sequential(input_layer,
+                                relu_layer,
+                                dense_layer,
+                                output_layer)
+
+- The training data is changed. Note that the output of each sample is
+  a 1D vector of 2 values, 1 for each class.
+
+.. code:: python
+
+    # XOR problem inputs
+    data_inputs = torch.tensor([[0.0, 0.0],
+                                [0.0, 1.0],
+                                [1.0, 0.0],
+                                [1.0, 1.0]])
+
+    # XOR problem outputs
+    data_outputs = torch.tensor([[1.0, 0.0],
+                                 [0.0, 1.0],
+                                 [0.0, 1.0],
+                                 [1.0, 0.0]])
+
+- The fitness value is calculated based on the binary cross entropy.
+
+.. code:: python
+
+    loss_function = torch.nn.BCELoss()
+
+After the previous code completes, the next figure shows how the fitness
+value changes by generation.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/103469818-c646c980-4d37-11eb-98c3-d9d591acd5e2.png
+   :alt: 
+
+Here is some information about the trained model. Its fitness value is
+``100000000.0``, its loss is ``0.0``, and its accuracy is 100%.
+
+.. code:: python
+
+    Fitness value of the best solution = 100000000.0
+
+    Index of the best solution : 0
+
+    Predictions : 
+    [[1.0000000e+00 1.3627675e-10]
+     [3.8521746e-09 1.0000000e+00]
+     [4.2789325e-10 1.0000000e+00]
+     [1.0000000e+00 3.3668417e-09]]
+
+    Binary Crossentropy : 0.0
+
+    Accuracy : 1.0
+
+Example 3: Image Multi-Class Classification (Dense Layers)
+----------------------------------------------------------
+
+Here is the code.
+
+.. code:: python
+
+    import torch
+    import torchga
+    import pygad
+    import numpy
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global data_inputs, data_outputs, torch_ga, model, loss_function
+
+        predictions = pygad.torchga.predict(model=model,
+                                            solution=solution,
+                                            data=data_inputs)
+
+        solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001)
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+    # Build the PyTorch model using the torch.nn.Sequential class.
+    input_layer = torch.nn.Linear(360, 50)
+    relu_layer = torch.nn.ReLU()
+    dense_layer = torch.nn.Linear(50, 4)
+    output_layer = torch.nn.Softmax(1)
+
+    model = torch.nn.Sequential(input_layer,
+                                relu_layer,
+                                dense_layer,
+                                output_layer)
+
+    # Create an instance of the pygad.torchga.TorchGA class to build the initial population.
+    torch_ga = torchga.TorchGA(model=model,
+                               num_solutions=10)
+
+    loss_function = torch.nn.CrossEntropyLoss()
+
+    # Data inputs
+    data_inputs = torch.from_numpy(numpy.load("dataset_features.npy")).float()
+
+    # Data outputs
+    data_outputs = torch.from_numpy(numpy.load("outputs.npy")).long()
+    # The next 2 lines are equivalent to this Keras function to perform 1-hot encoding: tensorflow.keras.utils.to_categorical(data_outputs)
+    # temp_outs = numpy.zeros((data_outputs.shape[0], numpy.unique(data_outputs).size), dtype=numpy.uint8)
+    # temp_outs[numpy.arange(data_outputs.shape[0]), numpy.uint8(data_outputs)] = 1
+
+    # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+    num_generations = 200 # Number of generations.
+    num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+    initial_population = torch_ga.population_weights # Initial population of network weights.
+
+    # Create an instance of the pygad.GA class
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           on_generation=callback_generation)
+
+    # Start the genetic algorithm evolution.
+    ga_instance.run()
+
+    # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+    ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4)
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    # Fetch the parameters of the best solution.
+    best_solution_weights = torchga.model_weights_as_dict(model=model,
+                                                          weights_vector=solution)
+    model.load_state_dict(best_solution_weights)
+    predictions = model(data_inputs)
+    # print("Predictions : \n", predictions)
+
+    # Calculate the crossentropy loss of the trained model.
+    print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy())
+
+    # Calculate the classification accuracy for the trained model.
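+    # torch.max(..., axis=1) returns a namedtuple of (values, indices);
+    # the indices are the predicted class labels.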
+    accuracy = torch.sum(torch.max(predictions, axis=1).indices == data_outputs) / len(data_outputs)
+    print("Accuracy : ", accuracy.detach().numpy())
+
+Compared to the previous binary classification example, this example has
+multiple classes (4) and thus the loss is measured using cross entropy.
+
+.. code:: python
+
+    loss_function = torch.nn.CrossEntropyLoss()
+
+.. _prepare-the-training-data-2:
+
+Prepare the Training Data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Before building and training neural networks, the training data (input
+and output) needs to be prepared. The inputs and the outputs of the
+training data are NumPy arrays.
+
+The data used in this example is available as 2 files:
+
+1. `dataset_features.npy <https://github.com/ahmedfgad/NumPyANN/blob/master/dataset_features.npy>`__:
+   Data inputs.
+
+2. `outputs.npy <https://github.com/ahmedfgad/NumPyANN/blob/master/outputs.npy>`__:
+   Class labels.
+
+The data consists of 4 classes of images. The image shape is
+``(100, 100, 3)``. The number of training samples is 1962. The feature
+vector extracted from each image has a length of 360.
+
+.. code:: python
+
+    import numpy
+
+    data_inputs = numpy.load("dataset_features.npy")
+
+    data_outputs = numpy.load("outputs.npy")
+
+The next figure shows how the fitness value changes.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/103469855-5d138600-4d38-11eb-84b1-b5eff8faa7bc.png
+   :alt: 
+
+Here are some statistics about the trained model.
+
+.. code::
+
+    Fitness value of the best solution = 1.3446997034434534
+    Index of the best solution : 0
+    Crossentropy : 0.74366045
+    Accuracy : 1.0
+
+Example 4: Image Multi-Class Classification (Conv Layers)
+---------------------------------------------------------
+
+Compared to the previous example that uses only dense layers, this
+example uses convolutional layers to classify the same dataset.
+
+Here is the complete code.
+
+.. code:: python
+
+    import torch
+    import torchga
+    import pygad
+    import numpy
+
+    def fitness_func(ga_instance, solution, sol_idx):
+        global data_inputs, data_outputs, torch_ga, model, loss_function
+
+        predictions = pygad.torchga.predict(model=model,
+                                            solution=solution,
+                                            data=data_inputs)
+
+        solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001)
+
+        return solution_fitness
+
+    def callback_generation(ga_instance):
+        print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+        print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+    # Build the PyTorch model.
+    input_layer = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=7)
+    relu_layer1 = torch.nn.ReLU()
+    max_pool1 = torch.nn.MaxPool2d(kernel_size=5, stride=5)
+
+    conv_layer2 = torch.nn.Conv2d(in_channels=5, out_channels=3, kernel_size=3)
+    relu_layer2 = torch.nn.ReLU()
+
+    flatten_layer1 = torch.nn.Flatten()
+    # The value 768 is pre-computed by tracing the sizes of the layers' outputs.
+    dense_layer1 = torch.nn.Linear(in_features=768, out_features=15)
+    relu_layer3 = torch.nn.ReLU()
+
+    dense_layer2 = torch.nn.Linear(in_features=15, out_features=4)
+    output_layer = torch.nn.Softmax(1)
+
+    model = torch.nn.Sequential(input_layer,
+                                relu_layer1,
+                                max_pool1,
+                                conv_layer2,
+                                relu_layer2,
+                                flatten_layer1,
+                                dense_layer1,
+                                relu_layer3,
+                                dense_layer2,
+                                output_layer)
+
+    # Create an instance of the pygad.torchga.TorchGA class to build the initial population.
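+    # The initial population holds the model's current weights plus
+    # randomly perturbed copies of them, one per solution.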
+    torch_ga = torchga.TorchGA(model=model,
+                               num_solutions=10)
+
+    loss_function = torch.nn.CrossEntropyLoss()
+
+    # Data inputs
+    data_inputs = torch.from_numpy(numpy.load("dataset_inputs.npy")).float()
+    data_inputs = data_inputs.reshape((data_inputs.shape[0], data_inputs.shape[3], data_inputs.shape[1], data_inputs.shape[2]))
+
+    # Data outputs
+    data_outputs = torch.from_numpy(numpy.load("dataset_outputs.npy")).long()
+
+    # Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+    num_generations = 200 # Number of generations.
+    num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+    initial_population = torch_ga.population_weights # Initial population of network weights.
+
+    # Create an instance of the pygad.GA class
+    ga_instance = pygad.GA(num_generations=num_generations,
+                           num_parents_mating=num_parents_mating,
+                           initial_population=initial_population,
+                           fitness_func=fitness_func,
+                           on_generation=callback_generation)
+
+    # Start the genetic algorithm evolution.
+    ga_instance.run()
+
+    # After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+    ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4)
+
+    # Returning the details of the best solution.
+    solution, solution_fitness, solution_idx = ga_instance.best_solution()
+    print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness))
+    print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx))
+
+    # Make predictions based on the best solution.
+    predictions = pygad.torchga.predict(model=model,
+                                        solution=solution,
+                                        data=data_inputs)
+    # print("Predictions : \n", predictions)
+
+    # Calculate the crossentropy for the trained model.
+    print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy())
+
+    # Calculate the classification accuracy for the trained model.
+    accuracy = torch.sum(torch.max(predictions, axis=1).indices == data_outputs) / len(data_outputs)
+    print("Accuracy : ", accuracy.detach().numpy())
+
+Compared to the previous example, the only change is that the
+architecture uses convolutional and max-pooling layers. The shape of
+each input sample is ``(100, 100, 3)``.
+
+.. code:: python
+
+    input_layer = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=7)
+    relu_layer1 = torch.nn.ReLU()
+    max_pool1 = torch.nn.MaxPool2d(kernel_size=5, stride=5)
+
+    conv_layer2 = torch.nn.Conv2d(in_channels=5, out_channels=3, kernel_size=3)
+    relu_layer2 = torch.nn.ReLU()
+
+    flatten_layer1 = torch.nn.Flatten()
+    # The value 768 is pre-computed by tracing the sizes of the layers' outputs.
+    dense_layer1 = torch.nn.Linear(in_features=768, out_features=15)
+    relu_layer3 = torch.nn.ReLU()
+
+    dense_layer2 = torch.nn.Linear(in_features=15, out_features=4)
+    output_layer = torch.nn.Softmax(1)
+
+    model = torch.nn.Sequential(input_layer,
+                                relu_layer1,
+                                max_pool1,
+                                conv_layer2,
+                                relu_layer2,
+                                flatten_layer1,
+                                dense_layer1,
+                                relu_layer3,
+                                dense_layer2,
+                                output_layer)
+
+.. _prepare-the-training-data-3:
+
+Prepare the Training Data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The data used in this example is available as 2 files:
+
+1. `dataset_inputs.npy <https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_inputs.npy>`__:
+   Data inputs.
+
+2. `dataset_outputs.npy <https://github.com/ahmedfgad/NumPyCNN/blob/master/dataset_outputs.npy>`__:
+   Class labels.
+
+The data consists of 4 classes of images. The image shape is
+``(100, 100, 3)`` and there are 20 images per class for a total of 80
+training samples. For more information about the dataset, check the
+`Reading the
+Data `__
+section of the ``pygad.cnn`` module.
+
+Simply download these 2 files and read them according to the next code.
+
+.. code:: python
+
+    import numpy
+
+    data_inputs = numpy.load("dataset_inputs.npy")
+
+    data_outputs = numpy.load("dataset_outputs.npy")
+
+The next figure shows how the fitness value changes.
+
+.. figure:: https://user-images.githubusercontent.com/16560492/103469887-c7c4c180-4d38-11eb-98a7-1c5e73e918d0.png
+   :alt: 
+
+Here are some statistics about the trained model. The model accuracy is
+97.5% after 200 generations. Note that rerunning the code may give
+different results.
+
+.. code::
+
+    Fitness value of the best solution = 1.3009520689219258
+    Index of the best solution : 0
+    Crossentropy : 0.7686678
+    Accuracy : 0.975
diff --git a/examples/KerasGA/XOR_classification.py b/examples/KerasGA/XOR_classification.py
new file mode 100644
index 0000000..3b0a1e1
--- /dev/null
+++ b/examples/KerasGA/XOR_classification.py
@@ -0,0 +1,86 @@
+import tensorflow.keras
+import pygad.kerasga
+import numpy
+import pygad
+
+def fitness_func(ga_instance, solution, sol_idx):
+    global data_inputs, data_outputs, keras_ga, model
+
+    predictions = pygad.kerasga.predict(model=model,
+                                        solution=solution,
+                                        data=data_inputs)
+
+    bce = tensorflow.keras.losses.BinaryCrossentropy()
+    solution_fitness = 1.0 / (bce(data_outputs, predictions).numpy() + 0.00000001)
+
+    return solution_fitness
+
+def on_generation(ga_instance):
+    print("Generation = {generation}".format(generation=ga_instance.generations_completed))
+    print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1]))
+
+# Build the keras model using the functional API.
+input_layer = tensorflow.keras.layers.Input(2)
+dense_layer = tensorflow.keras.layers.Dense(4, activation="relu")(input_layer)
+output_layer = tensorflow.keras.layers.Dense(2, activation="softmax")(dense_layer)
+
+model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer)
+
+# Create an instance of the pygad.kerasga.KerasGA class to build the initial population.
+keras_ga = pygad.kerasga.KerasGA(model=model,
+                                 num_solutions=10)
+
+# XOR problem inputs
+data_inputs = numpy.array([[0.0, 0.0],
+                           [0.0, 1.0],
+                           [1.0, 0.0],
+                           [1.0, 1.0]])
+
+# XOR problem outputs
+data_outputs = numpy.array([[1.0, 0.0],
+                            [0.0, 1.0],
+                            [0.0, 1.0],
+                            [1.0, 0.0]])
+
+# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class
+num_generations = 250 # Number of generations.
+num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool.
+initial_population = keras_ga.population_weights # Initial population of network weights.
+
+# Create an instance of the pygad.GA class
+ga_instance = pygad.GA(num_generations=num_generations,
+                       num_parents_mating=num_parents_mating,
+                       initial_population=initial_population,
+                       fitness_func=fitness_func,
+                       on_generation=on_generation)
+
+# Start the genetic algorithm evolution.
+ga_instance.run()
+
+# After the generations complete, some plots are shown that summarize how the outputs/fitness values evolve over generations.
+ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. 
Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) +print("Predictions : \n", predictions) + +# Calculate the binary crossentropy for the trained model. +bce = tensorflow.keras.losses.BinaryCrossentropy() +print("Binary Crossentropy : ", bce(data_outputs, predictions).numpy()) + +# Calculate the classification accuracy for the trained model. +ba = tensorflow.keras.metrics.BinaryAccuracy() +ba.update_state(data_outputs, predictions) +accuracy = ba.result().numpy() +print("Accuracy : ", accuracy) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/KerasGA/cancer_dataset.py b/examples/KerasGA/cancer_dataset.py new file mode 100644 index 0000000..5aceae6 --- /dev/null +++ b/examples/KerasGA/cancer_dataset.py @@ -0,0 +1,95 @@ +import tensorflow as tf +import tensorflow.keras +import pygad.kerasga +import pygad +import numpy + +def fitness_func(ga_instanse, solution, sol_idx): + global train_data, data_outputs, keras_ga, model + + predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=train_data) + + cce = tensorflow.keras.losses.CategoricalCrossentropy() + solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution(ga_instance.last_generation_fitness)[1])) + +# The dataset path. +dataset_path = r'../data/Skin_Cancer_Dataset' + +num_classes = 2 +img_size = 224 + +# Create a simple CNN. This does not gurantee high classification accuracy. +model = tf.keras.models.Sequential() +model.add(tf.keras.layers.Input(shape=(img_size, img_size, 3))) +model.add(tf.keras.layers.Conv2D(32, (3,3), activation="relu", padding="same")) +model.add(tf.keras.layers.MaxPooling2D((2, 2))) +model.add(tf.keras.layers.Flatten()) +model.add(tf.keras.layers.Dropout(rate=0.2)) +model.add(tf.keras.layers.Dense(num_classes, activation="softmax")) + +# Create an instance of the pygad.kerasga.KerasGA class to build the initial population. +keras_ga = pygad.kerasga.KerasGA(model=model, + num_solutions=10) + +train_data = tf.keras.utils.image_dataset_from_directory( + directory=dataset_path, + image_size=(img_size, img_size), + label_mode="categorical", + batch_size=32 +) + +# Get the dataset labels. +# train_data.class_indices +data_outputs = numpy.array([]) +for x, y in train_data: + data_outputs = numpy.concatenate([data_outputs, numpy.argmax(y.numpy(), axis=-1)]) +data_outputs = tf.keras.utils.to_categorical(data_outputs) + +# Check the documentation for more information about the parameters: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +initial_population = keras_ga.population_weights # Initial population of network weights. 
+ +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=10, + num_parents_mating=5, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution(ga_instance.last_generation_fitness) +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=train_data) +# print("Predictions : \n", predictions) + +# Calculate the categorical crossentropy for the trained model. +cce = tensorflow.keras.losses.CategoricalCrossentropy() +print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) + +# Calculate the classification accuracy for the trained model. +ca = tensorflow.keras.metrics.CategoricalAccuracy() +ca.update_state(data_outputs, predictions) +accuracy = ca.result().numpy() +print("Accuracy : ", accuracy) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/KerasGA/cancer_dataset_generator.py b/examples/KerasGA/cancer_dataset_generator.py new file mode 100644 index 0000000..3f8afeb --- /dev/null +++ b/examples/KerasGA/cancer_dataset_generator.py @@ -0,0 +1,89 @@ +import tensorflow as tf +import tensorflow.keras +import pygad.kerasga +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global train_generator, data_outputs, keras_ga, model + + predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=train_generator) + + cce = tensorflow.keras.losses.CategoricalCrossentropy() + solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution(ga_instance.last_generation_fitness)[1])) + +# The dataset path. +dataset_path = r'../data/Skin_Cancer_Dataset' + +num_classes = 2 +img_size = 224 + +# Create a simple CNN. This does not gurantee high classification accuracy. +model = tf.keras.models.Sequential() +model.add(tf.keras.layers.Input(shape=(img_size, img_size, 3))) +model.add(tf.keras.layers.Conv2D(32, (3,3), activation="relu", padding="same")) +model.add(tf.keras.layers.MaxPooling2D((2, 2))) +model.add(tf.keras.layers.Flatten()) +model.add(tf.keras.layers.Dropout(rate=0.2)) +model.add(tf.keras.layers.Dense(num_classes, activation="softmax")) + +# Create an instance of the pygad.kerasga.KerasGA class to build the initial population. 
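+# Each solution in the population is a 1D vector of all model parameters.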
+keras_ga = pygad.kerasga.KerasGA(model=model, + num_solutions=10) + +data_generator = tf.keras.preprocessing.image.ImageDataGenerator() +train_generator = data_generator.flow_from_directory(dataset_path, + class_mode='categorical', + target_size=(224, 224), + batch_size=32, + shuffle=False) +# train_generator.class_indices +data_outputs = tf.keras.utils.to_categorical(train_generator.labels) + +# Check the documentation for more information about the parameters: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +initial_population = keras_ga.population_weights # Initial population of network weights. + +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=10, + num_parents_mating=5, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution(ga_instance.last_generation_fitness) +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=train_generator) +# print("Predictions : \n", predictions) + +# Calculate the categorical crossentropy for the trained model. +cce = tensorflow.keras.losses.CategoricalCrossentropy() +print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) + +# Calculate the classification accuracy for the trained model. +ca = tensorflow.keras.metrics.CategoricalAccuracy() +ca.update_state(data_outputs, predictions) +accuracy = ca.result().numpy() +print("Accuracy : ", accuracy) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/KerasGA/image_classification_CNN.py b/examples/KerasGA/image_classification_CNN.py new file mode 100644 index 0000000..9fb4563 --- /dev/null +++ b/examples/KerasGA/image_classification_CNN.py @@ -0,0 +1,90 @@ +import tensorflow.keras +import pygad.kerasga +import numpy +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, keras_ga, model + + predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) + + cce = tensorflow.keras.losses.CategoricalCrossentropy() + solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Build the keras model using the functional API. 
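+# Each layer is called on the previous layer's output tensor to chain the layers.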
+input_layer = tensorflow.keras.layers.Input(shape=(100, 100, 3)) +conv_layer1 = tensorflow.keras.layers.Conv2D(filters=5, + kernel_size=7, + activation="relu")(input_layer) +max_pool1 = tensorflow.keras.layers.MaxPooling2D(pool_size=(5,5), + strides=5)(conv_layer1) +conv_layer2 = tensorflow.keras.layers.Conv2D(filters=3, + kernel_size=3, + activation="relu")(max_pool1) +flatten_layer = tensorflow.keras.layers.Flatten()(conv_layer2) +dense_layer = tensorflow.keras.layers.Dense(15, activation="relu")(flatten_layer) +output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer) + +model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) + +# Create an instance of the pygad.kerasga.KerasGA class to build the initial population. +keras_ga = pygad.kerasga.KerasGA(model=model, + num_solutions=10) + +# Data inputs +data_inputs = numpy.load("../data/dataset_inputs.npy") + +# Data outputs +data_outputs = numpy.load("../data/dataset_outputs.npy") +data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 200 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. +initial_population = keras_ga.population_weights # Initial population of network weights. + +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) +# print("Predictions : \n", predictions) + +# Calculate the categorical crossentropy for the trained model. +cce = tensorflow.keras.losses.CategoricalCrossentropy() +print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) + +# Calculate the classification accuracy for the trained model. 
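+# CategoricalAccuracy compares the argmax of the predictions against the argmax of the one-hot labels.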
+ca = tensorflow.keras.metrics.CategoricalAccuracy() +ca.update_state(data_outputs, predictions) +accuracy = ca.result().numpy() +print("Accuracy : ", accuracy) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/KerasGA/image_classification_Dense.py b/examples/KerasGA/image_classification_Dense.py new file mode 100644 index 0000000..002e36c --- /dev/null +++ b/examples/KerasGA/image_classification_Dense.py @@ -0,0 +1,82 @@ +import tensorflow.keras +import pygad.kerasga +import numpy +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, keras_ga, model + + predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) + + cce = tensorflow.keras.losses.CategoricalCrossentropy() + solution_fitness = 1.0 / (cce(data_outputs, predictions).numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Build the keras model using the functional API. +input_layer = tensorflow.keras.layers.Input(360) +dense_layer = tensorflow.keras.layers.Dense(50, activation="relu")(input_layer) +output_layer = tensorflow.keras.layers.Dense(4, activation="softmax")(dense_layer) + +model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) + +# Create an instance of the pygad.kerasga.KerasGA class to build the initial population. +keras_ga = pygad.kerasga.KerasGA(model=model, + num_solutions=10) + +# Data inputs +data_inputs = numpy.load("../data/dataset_features.npy") + +# Data outputs +data_outputs = numpy.load("../data/outputs.npy") +data_outputs = tensorflow.keras.utils.to_categorical(data_outputs) + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 100 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. +initial_population = keras_ga.population_weights # Initial population of network weights. + +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +# Fetch the parameters of the best solution. +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) +# print("Predictions : \n", predictions) + +# Calculate the categorical crossentropy for the trained model. 
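+# CategoricalCrossentropy expects one-hot targets; data_outputs was one-hot encoded with to_categorical() above.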
+cce = tensorflow.keras.losses.CategoricalCrossentropy() +print("Categorical Crossentropy : ", cce(data_outputs, predictions).numpy()) + +# Calculate the classification accuracy for the trained model. +ca = tensorflow.keras.metrics.CategoricalAccuracy() +ca.update_state(data_outputs, predictions) +accuracy = ca.result().numpy() +print("Accuracy : ", accuracy) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/KerasGA/regression_example.py b/examples/KerasGA/regression_example.py new file mode 100644 index 0000000..2deec1f --- /dev/null +++ b/examples/KerasGA/regression_example.py @@ -0,0 +1,79 @@ +import tensorflow.keras +import pygad.kerasga +import numpy +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, keras_ga, model + + predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) + + mae = tensorflow.keras.losses.MeanAbsoluteError() + abs_error = mae(data_outputs, predictions).numpy() + 0.00000001 + solution_fitness = 1.0 / abs_error + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Create the Keras model. +input_layer = tensorflow.keras.layers.Input(3) +dense_layer1 = tensorflow.keras.layers.Dense(5, activation="relu")(input_layer) +dense_layer1.trainable = False +output_layer = tensorflow.keras.layers.Dense(1, activation="linear")(dense_layer1) + +model = tensorflow.keras.Model(inputs=input_layer, outputs=output_layer) + +keras_ga = pygad.kerasga.KerasGA(model=model, + num_solutions=10) + +# Data inputs +data_inputs = numpy.array([[0.02, 0.1, 0.15], + [0.7, 0.6, 0.8], + [1.5, 1.2, 1.7], + [3.2, 2.9, 3.1]]) + +# Data outputs +data_outputs = numpy.array([[0.1], + [0.6], + [1.3], + [2.5]]) + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 250 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. +initial_population = keras_ga.population_weights # Initial population of network weights + +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & Keras - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. 
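+# best_solution() returns the best solution, its fitness value, and its index within the population.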
+solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.kerasga.predict(model=model, + solution=solution, + data=data_inputs) +print("Predictions : \n", predictions) + +mae = tensorflow.keras.losses.MeanAbsoluteError() +abs_error = mae(data_outputs, predictions).numpy() +print("Absolute Error : ", abs_error) + +# model.compile(optimizer="Adam", loss="mse", metrics=["mae"]) + +# _ = model.fit(x, y, verbose=0) +# r = model.predict(data_inputs) diff --git a/examples/TorchGA/XOR_classification.py b/examples/TorchGA/XOR_classification.py new file mode 100644 index 0000000..f7a2f44 --- /dev/null +++ b/examples/TorchGA/XOR_classification.py @@ -0,0 +1,86 @@ +import torch +import pygad.torchga +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + + solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Create the PyTorch model. +input_layer = torch.nn.Linear(2, 4) +relu_layer = torch.nn.ReLU() +dense_layer = torch.nn.Linear(4, 2) +output_layer = torch.nn.Softmax(1) + +model = torch.nn.Sequential(input_layer, + relu_layer, + dense_layer, + output_layer) +# print(model) + +# Create an instance of the pygad.torchga.TorchGA class to build the initial population. +torch_ga = pygad.torchga.TorchGA(model=model, + num_solutions=10) + +loss_function = torch.nn.BCELoss() + +# XOR problem inputs +data_inputs = torch.tensor([[0.0, 0.0], + [0.0, 1.0], + [1.0, 0.0], + [1.0, 1.0]]) + +# XOR problem outputs +data_outputs = torch.tensor([[1.0, 0.0], + [0.0, 1.0], + [0.0, 1.0], + [1.0, 0.0]]) + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 250 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. +initial_population = torch_ga.population_weights # Initial population of network weights. + +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + parallel_processing=3, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. 
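+# Note: when no fitness values are passed to best_solution(), PyGAD re-evaluates the population's fitness.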
+solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) +print("Predictions : \n", predictions.detach().numpy()) + +# Calculate the binary crossentropy for the trained model. +print("Binary Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) + +# Calculate the classification accuracy of the trained model. +a = torch.max(predictions, axis=1) +b = torch.max(data_outputs, axis=1) +accuracy = torch.true_divide(torch.sum(a.indices == b.indices), len(data_outputs)) +print("Accuracy : ", accuracy.detach().numpy()) diff --git a/examples/TorchGA/image_classification_CNN.py b/examples/TorchGA/image_classification_CNN.py new file mode 100644 index 0000000..baf1f1b --- /dev/null +++ b/examples/TorchGA/image_classification_CNN.py @@ -0,0 +1,94 @@ +import torch +import pygad.torchga +import pygad +import numpy + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + + solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Build the PyTorch model. +input_layer = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=7) +relu_layer1 = torch.nn.ReLU() +max_pool1 = torch.nn.MaxPool2d(kernel_size=5, stride=5) + +conv_layer2 = torch.nn.Conv2d(in_channels=5, out_channels=3, kernel_size=3) +relu_layer2 = torch.nn.ReLU() + +flatten_layer1 = torch.nn.Flatten() +# The value 768 is pre-computed by tracing the sizes of the layers' outputs. +dense_layer1 = torch.nn.Linear(in_features=768, out_features=15) +relu_layer3 = torch.nn.ReLU() + +dense_layer2 = torch.nn.Linear(in_features=15, out_features=4) +output_layer = torch.nn.Softmax(1) + +model = torch.nn.Sequential(input_layer, + relu_layer1, + max_pool1, + conv_layer2, + relu_layer2, + flatten_layer1, + dense_layer1, + relu_layer3, + dense_layer2, + output_layer) + +# Create an instance of the pygad.torchga.TorchGA class to build the initial population. +torch_ga = pygad.torchga.TorchGA(model=model, + num_solutions=10) + +loss_function = torch.nn.CrossEntropyLoss() + +# Data inputs +data_inputs = torch.from_numpy(numpy.load("../data/dataset_inputs.npy")).float() +data_inputs = data_inputs.reshape((data_inputs.shape[0], data_inputs.shape[3], data_inputs.shape[1], data_inputs.shape[2])) + +# Data outputs +data_outputs = torch.from_numpy(numpy.load("../data/dataset_outputs.npy")).long() + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 200 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. +initial_population = torch_ga.population_weights # Initial population of network weights. 
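+# Each solution flattens all Conv2d and Linear parameters into a single 1D vector.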
+ +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) +# print("Predictions : \n", predictions) + +# Calculate the crossentropy for the trained model. +print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) + +# Calculate the classification accuracy for the trained model. +accuracy = torch.true_divide(torch.sum(torch.max(predictions, axis=1).indices == data_outputs), len(data_outputs)) +print("Accuracy : ", accuracy.detach().numpy()) diff --git a/examples/TorchGA/image_classification_Dense.py b/examples/TorchGA/image_classification_Dense.py new file mode 100644 index 0000000..91bb4c1 --- /dev/null +++ b/examples/TorchGA/image_classification_Dense.py @@ -0,0 +1,80 @@ +import torch +import pygad.torchga +import pygad +import numpy + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + + solution_fitness = 1.0 / (loss_function(predictions, data_outputs).detach().numpy() + 0.00000001) + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Build the PyTorch model using the functional API. +input_layer = torch.nn.Linear(360, 50) +relu_layer = torch.nn.ReLU() +dense_layer = torch.nn.Linear(50, 4) +output_layer = torch.nn.Softmax(1) + +model = torch.nn.Sequential(input_layer, + relu_layer, + dense_layer, + output_layer) + +# Create an instance of the pygad.torchga.TorchGA class to build the initial population. +torch_ga = pygad.torchga.TorchGA(model=model, + num_solutions=10) + +loss_function = torch.nn.CrossEntropyLoss() + +# Data inputs +data_inputs = torch.from_numpy(numpy.load("../data/dataset_features.npy")).float() + +# Data outputs +data_outputs = torch.from_numpy(numpy.load("../data/outputs.npy")).long() +# The next 2 lines are equivelant to this Keras function to perform 1-hot encoding: tensorflow.keras.utils.to_categorical(data_outputs) +# temp_outs = numpy.zeros((data_outputs.shape[0], numpy.unique(data_outputs).size), dtype=numpy.uint8) +# temp_outs[numpy.arange(data_outputs.shape[0]), numpy.uint8(data_outputs)] = 1 + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 200 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. 
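+# Parent selection uses PyGAD's default (steady-state selection) since parent_selection_type is not set.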
+initial_population = torch_ga.population_weights # Initial population of network weights. + +# Create an instance of the pygad.GA class +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +# Start the genetic algorithm evolution. +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) +# print("Predictions : \n", predictions) + +# Calculate the crossentropy loss of the trained model. +print("Crossentropy : ", loss_function(predictions, data_outputs).detach().numpy()) + +# Calculate the classification accuracy for the trained model. +accuracy = torch.true_divide(torch.sum(torch.max(predictions, axis=1).indices == data_outputs), len(data_outputs)) +print("Accuracy : ", accuracy.detach().numpy()) diff --git a/examples/TorchGA/regression_example.py b/examples/TorchGA/regression_example.py new file mode 100644 index 0000000..5bf2fc1 --- /dev/null +++ b/examples/TorchGA/regression_example.py @@ -0,0 +1,76 @@ +import torch +import pygad.torchga +import pygad + +def fitness_func(ga_instanse, solution, sol_idx): + global data_inputs, data_outputs, torch_ga, model, loss_function + + predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) + abs_error = loss_function(predictions, data_outputs).detach().numpy() + 0.00000001 + + solution_fitness = 1.0 / abs_error + + return solution_fitness + +def on_generation(ga_instance): + print("Generation = {generation}".format(generation=ga_instance.generations_completed)) + print("Fitness = {fitness}".format(fitness=ga_instance.best_solution()[1])) + +# Create the PyTorch model. +input_layer = torch.nn.Linear(3, 2) +relu_layer = torch.nn.ReLU() +output_layer = torch.nn.Linear(2, 1) + +model = torch.nn.Sequential(input_layer, + relu_layer, + output_layer) +# print(model) + +# Create an instance of the pygad.torchga.TorchGA class to build the initial population. +torch_ga = pygad.torchga.TorchGA(model=model, + num_solutions=10) + +loss_function = torch.nn.L1Loss() + +# Data inputs +data_inputs = torch.tensor([[0.02, 0.1, 0.15], + [0.7, 0.6, 0.8], + [1.5, 1.2, 1.7], + [3.2, 2.9, 3.1]]) + +# Data outputs +data_outputs = torch.tensor([[0.1], + [0.6], + [1.3], + [2.5]]) + +# Prepare the PyGAD parameters. Check the documentation for more information: https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class +num_generations = 250 # Number of generations. +num_parents_mating = 5 # Number of solutions to be selected as parents in the mating pool. 
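+# The GA evolves these weight vectors directly; no gradient-based optimizer is involved.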
+initial_population = torch_ga.population_weights # Initial population of network weights + +ga_instance = pygad.GA(num_generations=num_generations, + num_parents_mating=num_parents_mating, + initial_population=initial_population, + fitness_func=fitness_func, + on_generation=on_generation) + +ga_instance.run() + +# After the generations complete, some plots are showed that summarize how the outputs/fitness values evolve over generations. +ga_instance.plot_fitness(title="PyGAD & PyTorch - Iteration vs. Fitness", linewidth=4) + +# Returning the details of the best solution. +solution, solution_fitness, solution_idx = ga_instance.best_solution() +print("Fitness value of the best solution = {solution_fitness}".format(solution_fitness=solution_fitness)) +print("Index of the best solution : {solution_idx}".format(solution_idx=solution_idx)) + +predictions = pygad.torchga.predict(model=model, + solution=solution, + data=data_inputs) +print("Predictions : \n", predictions.detach().numpy()) + +abs_error = loss_function(predictions, data_outputs) +print("Absolute Error : ", abs_error.detach().numpy()) diff --git a/examples/example_clustering_2.py b/examples/clustering/example_clustering_2.py similarity index 97% rename from examples/example_clustering_2.py rename to examples/clustering/example_clustering_2.py index fa14bb7..877e318 100644 --- a/examples/example_clustering_2.py +++ b/examples/clustering/example_clustering_2.py @@ -1,122 +1,122 @@ -import numpy -import matplotlib.pyplot -import pygad - -cluster1_num_samples = 10 -cluster1_x1_start = 0 -cluster1_x1_end = 5 -cluster1_x2_start = 2 -cluster1_x2_end = 6 -cluster1_x1 = numpy.random.random(size=(cluster1_num_samples)) -cluster1_x1 = cluster1_x1 * (cluster1_x1_end - cluster1_x1_start) + cluster1_x1_start -cluster1_x2 = numpy.random.random(size=(cluster1_num_samples)) -cluster1_x2 = cluster1_x2 * (cluster1_x2_end - cluster1_x2_start) + cluster1_x2_start - -cluster2_num_samples = 10 -cluster2_x1_start = 10 -cluster2_x1_end = 15 -cluster2_x2_start = 8 -cluster2_x2_end = 12 -cluster2_x1 = numpy.random.random(size=(cluster2_num_samples)) -cluster2_x1 = cluster2_x1 * (cluster2_x1_end - cluster2_x1_start) + cluster2_x1_start -cluster2_x2 = numpy.random.random(size=(cluster2_num_samples)) -cluster2_x2 = cluster2_x2 * (cluster2_x2_end - cluster2_x2_start) + cluster2_x2_start - -c1 = numpy.array([cluster1_x1, cluster1_x2]).T -c2 = numpy.array([cluster2_x1, cluster2_x2]).T - -data = numpy.concatenate((c1, c2), axis=0) - -matplotlib.pyplot.scatter(cluster1_x1, cluster1_x2) -matplotlib.pyplot.scatter(cluster2_x1, cluster2_x2) -matplotlib.pyplot.title("Optimal Clustering") -matplotlib.pyplot.show() - -def euclidean_distance(X, Y): - """ - Calculate the euclidean distance between X and Y. It accepts: - :X should be a matrix of size (N, f) where N is the number of samples and f is the number of features for each sample. - :Y should be of size f. In other words, it is a single sample. - - Returns a vector of N elements with the distances between the N samples and the Y. - """ - - return numpy.sqrt(numpy.sum(numpy.power(X - Y, 2), axis=1)) - -def cluster_data(solution, solution_idx): - """ - Clusters the data based on the current solution. - """ - - global num_cluster, data - feature_vector_length = data.shape[1] - cluster_centers = [] # A list of size (C, f) where C is the number of clusters and f is the number of features representing each sample. 
- all_clusters_dists = [] # A list of size (C, N) where C is the number of clusters and N is the number of data samples. It holds the distances between each cluster center and all the data samples. - clusters = [] # A list with C elements where each element holds the indices of the samples within a cluster. - clusters_sum_dist = [] # A list with C elements where each element represents the sum of distances of the samples with a cluster. - - for clust_idx in range(num_clusters): - # Return the current cluster center. - cluster_centers.append(solution[feature_vector_length*clust_idx:feature_vector_length*(clust_idx+1)]) - # Calculate the distance (e.g. euclidean) between the current cluster center and all samples. - cluster_center_dists = euclidean_distance(data, cluster_centers[clust_idx]) - all_clusters_dists.append(numpy.array(cluster_center_dists)) - - cluster_centers = numpy.array(cluster_centers) - all_clusters_dists = numpy.array(all_clusters_dists) - - # A 1D array that, for each sample, holds the index of the cluster with the smallest distance. - # In other words, the array holds the sample's cluster index. - cluster_indices = numpy.argmin(all_clusters_dists, axis=0) - for clust_idx in range(num_clusters): - clusters.append(numpy.where(cluster_indices == clust_idx)[0]) - # Calculate the sum of distances for the cluster. - if len(clusters[clust_idx]) == 0: - # In case the cluster is empty (i.e. has zero samples). - clusters_sum_dist.append(0) - else: - # When the cluster is not empty (i.e. has at least 1 sample). - clusters_sum_dist.append(numpy.sum(all_clusters_dists[clust_idx, clusters[clust_idx]])) - # clusters_sum_dist.append(numpy.sum(euclidean_distance(data[clusters[clust_idx], :], cluster_centers[clust_idx]))) - - clusters_sum_dist = numpy.array(clusters_sum_dist) - - return cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist - -def fitness_func(ga_instance, solution, solution_idx): - _, _, _, _, clusters_sum_dist = cluster_data(solution, solution_idx) - - # The tiny value 0.00000001 is added to the denominator in case the average distance is 0. 
-    fitness = 1.0 / (numpy.sum(clusters_sum_dist) + 0.00000001)
-
-    return fitness
-
-num_clusters = 2
-num_genes = num_clusters * data.shape[1]
-
-ga_instance = pygad.GA(num_generations=100,
-                       sol_per_pop=10,
-                       num_parents_mating=5,
-                       init_range_low=-6,
-                       init_range_high=20,
-                       keep_parents=2,
-                       num_genes=num_genes,
-                       fitness_func=fitness_func,
-                       suppress_warnings=True)
-
-ga_instance.run()
-
-best_solution, best_solution_fitness, best_solution_idx = ga_instance.best_solution()
-print("Best solution is {bs}".format(bs=best_solution))
-print("Fitness of the best solution is {bsf}".format(bsf=best_solution_fitness))
-print("Best solution found after {gen} generations".format(gen=ga_instance.best_solution_generation))
-
-cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist = cluster_data(best_solution, best_solution_idx)
-
-for cluster_idx in range(num_clusters):
-    cluster_x = data[clusters[cluster_idx], 0]
-    cluster_y = data[clusters[cluster_idx], 1]
-    matplotlib.pyplot.scatter(cluster_x, cluster_y)
-    matplotlib.pyplot.scatter(cluster_centers[cluster_idx, 0], cluster_centers[cluster_idx, 1], linewidths=5)
-matplotlib.pyplot.title("Clustering using PyGAD")
-matplotlib.pyplot.show()
+import numpy
+import matplotlib.pyplot
+import pygad
+
+cluster1_num_samples = 10
+cluster1_x1_start = 0
+cluster1_x1_end = 5
+cluster1_x2_start = 2
+cluster1_x2_end = 6
+cluster1_x1 = numpy.random.random(size=(cluster1_num_samples))
+cluster1_x1 = cluster1_x1 * (cluster1_x1_end - cluster1_x1_start) + cluster1_x1_start
+cluster1_x2 = numpy.random.random(size=(cluster1_num_samples))
+cluster1_x2 = cluster1_x2 * (cluster1_x2_end - cluster1_x2_start) + cluster1_x2_start
+
+cluster2_num_samples = 10
+cluster2_x1_start = 10
+cluster2_x1_end = 15
+cluster2_x2_start = 8
+cluster2_x2_end = 12
+cluster2_x1 = numpy.random.random(size=(cluster2_num_samples))
+cluster2_x1 = cluster2_x1 * (cluster2_x1_end - cluster2_x1_start) + cluster2_x1_start
+cluster2_x2 = numpy.random.random(size=(cluster2_num_samples))
+cluster2_x2 = cluster2_x2 * (cluster2_x2_end - cluster2_x2_start) + cluster2_x2_start
+
+c1 = numpy.array([cluster1_x1, cluster1_x2]).T
+c2 = numpy.array([cluster2_x1, cluster2_x2]).T
+
+data = numpy.concatenate((c1, c2), axis=0)
+
+matplotlib.pyplot.scatter(cluster1_x1, cluster1_x2)
+matplotlib.pyplot.scatter(cluster2_x1, cluster2_x2)
+matplotlib.pyplot.title("Optimal Clustering")
+matplotlib.pyplot.show()
+
+def euclidean_distance(X, Y):
+    """
+    Calculate the Euclidean distance between X and Y. It accepts:
+    :X should be a matrix of size (N, f) where N is the number of samples and f is the number of features for each sample.
+    :Y should be of size f. In other words, it is a single sample.
+
+    Returns a vector of N elements with the distances between the N samples and Y.
+    """
+
+    return numpy.sqrt(numpy.sum(numpy.power(X - Y, 2), axis=1))
+
+def cluster_data(solution, solution_idx):
+    """
+    Clusters the data based on the current solution.
+    """
+
+    global num_clusters, data
+    feature_vector_length = data.shape[1]
+    cluster_centers = [] # A list of size (C, f) where C is the number of clusters and f is the number of features representing each sample.
+    all_clusters_dists = [] # A list of size (C, N) where C is the number of clusters and N is the number of data samples. It holds the distances between each cluster center and all the data samples.
+    clusters = [] # A list with C elements where each element holds the indices of the samples within a cluster.
+    clusters_sum_dist = [] # A list with C elements where each element represents the sum of distances of the samples within a cluster.
+
+    for clust_idx in range(num_clusters):
+        # Fetch the current cluster center.
+        cluster_centers.append(solution[feature_vector_length*clust_idx:feature_vector_length*(clust_idx+1)])
+        # Calculate the distance (e.g. Euclidean) between the current cluster center and all samples.
+        cluster_center_dists = euclidean_distance(data, cluster_centers[clust_idx])
+        all_clusters_dists.append(numpy.array(cluster_center_dists))
+
+    cluster_centers = numpy.array(cluster_centers)
+    all_clusters_dists = numpy.array(all_clusters_dists)
+
+    # A 1D array that, for each sample, holds the index of the cluster with the smallest distance.
+    # In other words, the array holds the sample's cluster index.
+    cluster_indices = numpy.argmin(all_clusters_dists, axis=0)
+    for clust_idx in range(num_clusters):
+        clusters.append(numpy.where(cluster_indices == clust_idx)[0])
+        # Calculate the sum of distances for the cluster.
+        if len(clusters[clust_idx]) == 0:
+            # In case the cluster is empty (i.e. has zero samples).
+            clusters_sum_dist.append(0)
+        else:
+            # When the cluster is not empty (i.e. has at least 1 sample).
+            clusters_sum_dist.append(numpy.sum(all_clusters_dists[clust_idx, clusters[clust_idx]]))
+            # clusters_sum_dist.append(numpy.sum(euclidean_distance(data[clusters[clust_idx], :], cluster_centers[clust_idx])))
+
+    clusters_sum_dist = numpy.array(clusters_sum_dist)
+
+    return cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist
+
+def fitness_func(ga_instance, solution, solution_idx):
+    _, _, _, _, clusters_sum_dist = cluster_data(solution, solution_idx)
+
+    # The tiny value 0.00000001 is added to the denominator in case the sum of distances is 0.
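+    # PyGAD maximizes fitness, so taking the reciprocal of the total
+    # within-cluster distance turns the distance-minimization objective into a
+    # maximization one: the tighter the clusters, the larger the fitness value.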
+ fitness = 1.0 / (numpy.sum(clusters_sum_dist) + 0.00000001) + + return fitness + +num_clusters = 2 +num_genes = num_clusters * data.shape[1] + +ga_instance = pygad.GA(num_generations=100, + sol_per_pop=10, + num_parents_mating=5, + init_range_low=-6, + init_range_high=20, + keep_parents=2, + num_genes=num_genes, + fitness_func=fitness_func, + suppress_warnings=True) + +ga_instance.run() + +best_solution, best_solution_fitness, best_solution_idx = ga_instance.best_solution() +print("Best solution is {bs}".format(bs=best_solution)) +print("Fitness of the best solution is {bsf}".format(bsf=best_solution_fitness)) +print("Best solution found after {gen} generations".format(gen=ga_instance.best_solution_generation)) + +cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist = cluster_data(best_solution, best_solution_idx) + +for cluster_idx in range(num_clusters): + cluster_x = data[clusters[cluster_idx], 0] + cluster_y = data[clusters[cluster_idx], 1] + matplotlib.pyplot.scatter(cluster_x, cluster_y) + matplotlib.pyplot.scatter(cluster_centers[cluster_idx, 0], cluster_centers[cluster_idx, 1], linewidths=5) +matplotlib.pyplot.title("Clustering using PyGAD") +matplotlib.pyplot.show() diff --git a/examples/example_clustering_3.py b/examples/clustering/example_clustering_3.py similarity index 97% rename from examples/example_clustering_3.py rename to examples/clustering/example_clustering_3.py index 08e3dd7..608d54b 100644 --- a/examples/example_clustering_3.py +++ b/examples/clustering/example_clustering_3.py @@ -1,134 +1,134 @@ -import numpy -import matplotlib.pyplot -import pygad - -cluster1_num_samples = 20 -cluster1_x1_start = 0 -cluster1_x1_end = 5 -cluster1_x2_start = 2 -cluster1_x2_end = 6 -cluster1_x1 = numpy.random.random(size=(cluster1_num_samples)) -cluster1_x1 = cluster1_x1 * (cluster1_x1_end - cluster1_x1_start) + cluster1_x1_start -cluster1_x2 = numpy.random.random(size=(cluster1_num_samples)) -cluster1_x2 = cluster1_x2 * (cluster1_x2_end - cluster1_x2_start) + cluster1_x2_start - -cluster2_num_samples = 20 -cluster2_x1_start = 4 -cluster2_x1_end = 12 -cluster2_x2_start = 14 -cluster2_x2_end = 18 -cluster2_x1 = numpy.random.random(size=(cluster2_num_samples)) -cluster2_x1 = cluster2_x1 * (cluster2_x1_end - cluster2_x1_start) + cluster2_x1_start -cluster2_x2 = numpy.random.random(size=(cluster2_num_samples)) -cluster2_x2 = cluster2_x2 * (cluster2_x2_end - cluster2_x2_start) + cluster2_x2_start - -cluster3_num_samples = 20 -cluster3_x1_start = 12 -cluster3_x1_end = 18 -cluster3_x2_start = 8 -cluster3_x2_end = 11 -cluster3_x1 = numpy.random.random(size=(cluster3_num_samples)) -cluster3_x1 = cluster3_x1 * (cluster3_x1_end - cluster3_x1_start) + cluster3_x1_start -cluster3_x2 = numpy.random.random(size=(cluster3_num_samples)) -cluster3_x2 = cluster3_x2 * (cluster3_x2_end - cluster3_x2_start) + cluster3_x2_start - -c1 = numpy.array([cluster1_x1, cluster1_x2]).T -c2 = numpy.array([cluster2_x1, cluster2_x2]).T -c3 = numpy.array([cluster3_x1, cluster3_x2]).T - -data = numpy.concatenate((c1, c2, c3), axis=0) - -matplotlib.pyplot.scatter(cluster1_x1, cluster1_x2) -matplotlib.pyplot.scatter(cluster2_x1, cluster2_x2) -matplotlib.pyplot.scatter(cluster3_x1, cluster3_x2) -matplotlib.pyplot.title("Optimal Clustering") -matplotlib.pyplot.show() - -def euclidean_distance(X, Y): - """ - Calculate the euclidean distance between X and Y. It accepts: - :X should be a matrix of size (N, f) where N is the number of samples and f is the number of features for each sample. 
- :Y should be of size f. In other words, it is a single sample. - - Returns a vector of N elements with the distances between the N samples and the Y. - """ - - return numpy.sqrt(numpy.sum(numpy.power(X - Y, 2), axis=1)) - -def cluster_data(solution, solution_idx): - """ - Clusters the data based on the current solution. - """ - - global num_clusters, feature_vector_length, data - cluster_centers = [] # A list of size (C, f) where C is the number of clusters and f is the number of features representing each sample. - all_clusters_dists = [] # A list of size (C, N) where C is the number of clusters and N is the number of data samples. It holds the distances between each cluster center and all the data samples. - clusters = [] # A list with C elements where each element holds the indices of the samples within a cluster. - clusters_sum_dist = [] # A list with C elements where each element represents the sum of distances of the samples with a cluster. - - for clust_idx in range(num_clusters): - # Return the current cluster center. - cluster_centers.append(solution[feature_vector_length*clust_idx:feature_vector_length*(clust_idx+1)]) - # Calculate the distance (e.g. euclidean) between the current cluster center and all samples. - cluster_center_dists = euclidean_distance(data, cluster_centers[clust_idx]) - all_clusters_dists.append(numpy.array(cluster_center_dists)) - - cluster_centers = numpy.array(cluster_centers) - all_clusters_dists = numpy.array(all_clusters_dists) - - # A 1D array that, for each sample, holds the index of the cluster with the smallest distance. - # In other words, the array holds the sample's cluster index. - cluster_indices = numpy.argmin(all_clusters_dists, axis=0) - for clust_idx in range(num_clusters): - clusters.append(numpy.where(cluster_indices == clust_idx)[0]) - # Calculate the sum of distances for the cluster. - if len(clusters[clust_idx]) == 0: - # In case the cluster is empty (i.e. has zero samples). - clusters_sum_dist.append(0) - else: - # When the cluster is not empty (i.e. has at least 1 sample). - clusters_sum_dist.append(numpy.sum(all_clusters_dists[clust_idx, clusters[clust_idx]])) - # clusters_sum_dist.append(numpy.sum(euclidean_distance(data[clusters[clust_idx], :], cluster_centers[clust_idx]))) - - clusters_sum_dist = numpy.array(clusters_sum_dist) - - return cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist - -def fitness_func(ga_instance, solution, solution_idx): - _, _, _, _, clusters_sum_dist = cluster_data(solution, solution_idx) - - # The tiny value 0.00000001 is added to the denominator in case the average distance is 0. 
-    fitness = 1.0 / (numpy.sum(clusters_sum_dist) + 0.00000001)
-
-    return fitness
-
-num_clusters = 3
-feature_vector_length = data.shape[1]
-num_genes = num_clusters * feature_vector_length
-
-ga_instance = pygad.GA(num_generations=100,
-                       sol_per_pop=10,
-                       init_range_low=0,
-                       init_range_high=20,
-                       num_parents_mating=5,
-                       keep_parents=2,
-                       num_genes=num_genes,
-                       fitness_func=fitness_func,
-                       suppress_warnings=True)
-
-ga_instance.run()
-
-best_solution, best_solution_fitness, best_solution_idx = ga_instance.best_solution()
-print("Best solution is {bs}".format(bs=best_solution))
-print("Fitness of the best solution is {bsf}".format(bsf=best_solution_fitness))
-print("Best solution found after {gen} generations".format(gen=ga_instance.best_solution_generation))
-
-cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist = cluster_data(best_solution, best_solution_idx)
-
-for cluster_idx in range(num_clusters):
-    cluster_x = data[clusters[cluster_idx], 0]
-    cluster_y = data[clusters[cluster_idx], 1]
-    matplotlib.pyplot.scatter(cluster_x, cluster_y)
-    matplotlib.pyplot.scatter(cluster_centers[cluster_idx, 0], cluster_centers[cluster_idx, 1], linewidths=5)
-matplotlib.pyplot.title("Clustering using PyGAD")
-matplotlib.pyplot.show()
+import numpy
+import matplotlib.pyplot
+import pygad
+
+cluster1_num_samples = 20
+cluster1_x1_start = 0
+cluster1_x1_end = 5
+cluster1_x2_start = 2
+cluster1_x2_end = 6
+cluster1_x1 = numpy.random.random(size=(cluster1_num_samples))
+cluster1_x1 = cluster1_x1 * (cluster1_x1_end - cluster1_x1_start) + cluster1_x1_start
+cluster1_x2 = numpy.random.random(size=(cluster1_num_samples))
+cluster1_x2 = cluster1_x2 * (cluster1_x2_end - cluster1_x2_start) + cluster1_x2_start
+
+cluster2_num_samples = 20
+cluster2_x1_start = 4
+cluster2_x1_end = 12
+cluster2_x2_start = 14
+cluster2_x2_end = 18
+cluster2_x1 = numpy.random.random(size=(cluster2_num_samples))
+cluster2_x1 = cluster2_x1 * (cluster2_x1_end - cluster2_x1_start) + cluster2_x1_start
+cluster2_x2 = numpy.random.random(size=(cluster2_num_samples))
+cluster2_x2 = cluster2_x2 * (cluster2_x2_end - cluster2_x2_start) + cluster2_x2_start
+
+cluster3_num_samples = 20
+cluster3_x1_start = 12
+cluster3_x1_end = 18
+cluster3_x2_start = 8
+cluster3_x2_end = 11
+cluster3_x1 = numpy.random.random(size=(cluster3_num_samples))
+cluster3_x1 = cluster3_x1 * (cluster3_x1_end - cluster3_x1_start) + cluster3_x1_start
+cluster3_x2 = numpy.random.random(size=(cluster3_num_samples))
+cluster3_x2 = cluster3_x2 * (cluster3_x2_end - cluster3_x2_start) + cluster3_x2_start
+
+c1 = numpy.array([cluster1_x1, cluster1_x2]).T
+c2 = numpy.array([cluster2_x1, cluster2_x2]).T
+c3 = numpy.array([cluster3_x1, cluster3_x2]).T
+
+data = numpy.concatenate((c1, c2, c3), axis=0)
+
+matplotlib.pyplot.scatter(cluster1_x1, cluster1_x2)
+matplotlib.pyplot.scatter(cluster2_x1, cluster2_x2)
+matplotlib.pyplot.scatter(cluster3_x1, cluster3_x2)
+matplotlib.pyplot.title("Optimal Clustering")
+matplotlib.pyplot.show()
+
+def euclidean_distance(X, Y):
+    """
+    Calculate the Euclidean distance between X and Y. It accepts:
+    :X should be a matrix of size (N, f) where N is the number of samples and f is the number of features for each sample.
+    :Y should be of size f. In other words, it is a single sample.
+
+    Returns a vector of N elements with the distances between the N samples and Y.
+    """
+
+    return numpy.sqrt(numpy.sum(numpy.power(X - Y, 2), axis=1))
+
+def cluster_data(solution, solution_idx):
+    """
+    Clusters the data based on the current solution.
+    """
+
+    global num_clusters, feature_vector_length, data
+    cluster_centers = [] # A list of size (C, f) where C is the number of clusters and f is the number of features representing each sample.
+    all_clusters_dists = [] # A list of size (C, N) where C is the number of clusters and N is the number of data samples. It holds the distances between each cluster center and all the data samples.
+    clusters = [] # A list with C elements where each element holds the indices of the samples within a cluster.
+    clusters_sum_dist = [] # A list with C elements where each element represents the sum of distances of the samples within a cluster.
+
+    for clust_idx in range(num_clusters):
+        # Fetch the current cluster center.
+        cluster_centers.append(solution[feature_vector_length*clust_idx:feature_vector_length*(clust_idx+1)])
+        # Calculate the distance (e.g. Euclidean) between the current cluster center and all samples.
+        cluster_center_dists = euclidean_distance(data, cluster_centers[clust_idx])
+        all_clusters_dists.append(numpy.array(cluster_center_dists))
+
+    cluster_centers = numpy.array(cluster_centers)
+    all_clusters_dists = numpy.array(all_clusters_dists)
+
+    # A 1D array that, for each sample, holds the index of the cluster with the smallest distance.
+    # In other words, the array holds the sample's cluster index.
+    cluster_indices = numpy.argmin(all_clusters_dists, axis=0)
+    for clust_idx in range(num_clusters):
+        clusters.append(numpy.where(cluster_indices == clust_idx)[0])
+        # Calculate the sum of distances for the cluster.
+        if len(clusters[clust_idx]) == 0:
+            # In case the cluster is empty (i.e. has zero samples).
+            clusters_sum_dist.append(0)
+        else:
+            # When the cluster is not empty (i.e. has at least 1 sample).
+            clusters_sum_dist.append(numpy.sum(all_clusters_dists[clust_idx, clusters[clust_idx]]))
+            # clusters_sum_dist.append(numpy.sum(euclidean_distance(data[clusters[clust_idx], :], cluster_centers[clust_idx])))
+
+    clusters_sum_dist = numpy.array(clusters_sum_dist)
+
+    return cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist
+
+def fitness_func(ga_instance, solution, solution_idx):
+    _, _, _, _, clusters_sum_dist = cluster_data(solution, solution_idx)
+
+    # The tiny value 0.00000001 is added to the denominator in case the sum of distances is 0.
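+    # The reciprocal again converts PyGAD's fitness maximization into minimizing
+    # the total within-cluster distance. The epsilon guards against division by
+    # zero: the sum can only be 0 when every non-empty cluster's samples coincide
+    # with their nearest center (empty clusters contribute 0 to the sum).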
+ fitness = 1.0 / (numpy.sum(clusters_sum_dist) + 0.00000001) + + return fitness + +num_clusters = 3 +feature_vector_length = data.shape[1] +num_genes = num_clusters * feature_vector_length + +ga_instance = pygad.GA(num_generations=100, + sol_per_pop=10, + init_range_low=0, + init_range_high=20, + num_parents_mating=5, + keep_parents=2, + num_genes=num_genes, + fitness_func=fitness_func, + suppress_warnings=True) + +ga_instance.run() + +best_solution, best_solution_fitness, best_solution_idx = ga_instance.best_solution() +print("Best solution is {bs}".format(bs=best_solution)) +print("Fitness of the best solution is {bsf}".format(bsf=best_solution_fitness)) +print("Best solution found after {gen} generations".format(gen=ga_instance.best_solution_generation)) + +cluster_centers, all_clusters_dists, cluster_indices, clusters, clusters_sum_dist = cluster_data(best_solution, best_solution_idx) + +for cluster_idx in range(num_clusters): + cluster_x = data[clusters[cluster_idx], 0] + cluster_y = data[clusters[cluster_idx], 1] + matplotlib.pyplot.scatter(cluster_x, cluster_y) + matplotlib.pyplot.scatter(cluster_centers[cluster_idx, 0], cluster_centers[cluster_idx, 1], linewidths=5) +matplotlib.pyplot.title("Clustering using PyGAD") +matplotlib.pyplot.show() diff --git a/examples/data/Skin_Cancer_Dataset/benign/1.jpg b/examples/data/Skin_Cancer_Dataset/benign/1.jpg new file mode 100644 index 0000000..60cc36e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/1.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/10.jpg b/examples/data/Skin_Cancer_Dataset/benign/10.jpg new file mode 100644 index 0000000..bb8e34f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/10.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/100.jpg b/examples/data/Skin_Cancer_Dataset/benign/100.jpg new file mode 100644 index 0000000..3bbab93 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/100.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/11.jpg b/examples/data/Skin_Cancer_Dataset/benign/11.jpg new file mode 100644 index 0000000..3d8ecfe Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/11.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/12.jpg b/examples/data/Skin_Cancer_Dataset/benign/12.jpg new file mode 100644 index 0000000..f47c0ab Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/12.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/13.jpg b/examples/data/Skin_Cancer_Dataset/benign/13.jpg new file mode 100644 index 0000000..d84106b Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/13.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/14.jpg b/examples/data/Skin_Cancer_Dataset/benign/14.jpg new file mode 100644 index 0000000..f8442dd Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/14.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/15.jpg b/examples/data/Skin_Cancer_Dataset/benign/15.jpg new file mode 100644 index 0000000..9d451f4 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/15.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/16.jpg b/examples/data/Skin_Cancer_Dataset/benign/16.jpg new file mode 100644 index 0000000..f4f75b2 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/16.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/17.jpg b/examples/data/Skin_Cancer_Dataset/benign/17.jpg new file mode 100644 
index 0000000..8f30c39 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/17.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/18.jpg b/examples/data/Skin_Cancer_Dataset/benign/18.jpg new file mode 100644 index 0000000..70061f6 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/18.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/19.jpg b/examples/data/Skin_Cancer_Dataset/benign/19.jpg new file mode 100644 index 0000000..9605db8 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/19.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/2.jpg b/examples/data/Skin_Cancer_Dataset/benign/2.jpg new file mode 100644 index 0000000..c509505 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/2.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/20.jpg b/examples/data/Skin_Cancer_Dataset/benign/20.jpg new file mode 100644 index 0000000..b6cc9c3 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/20.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/21.jpg b/examples/data/Skin_Cancer_Dataset/benign/21.jpg new file mode 100644 index 0000000..ca55f2a Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/21.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/22.jpg b/examples/data/Skin_Cancer_Dataset/benign/22.jpg new file mode 100644 index 0000000..b928dfe Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/22.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/23.jpg b/examples/data/Skin_Cancer_Dataset/benign/23.jpg new file mode 100644 index 0000000..e113fb6 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/23.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/24.jpg b/examples/data/Skin_Cancer_Dataset/benign/24.jpg new file mode 100644 index 0000000..0262e9b Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/24.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/25.jpg b/examples/data/Skin_Cancer_Dataset/benign/25.jpg new file mode 100644 index 0000000..e84d529 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/25.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/26.jpg b/examples/data/Skin_Cancer_Dataset/benign/26.jpg new file mode 100644 index 0000000..603b8e1 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/26.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/27.jpg b/examples/data/Skin_Cancer_Dataset/benign/27.jpg new file mode 100644 index 0000000..8382e01 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/27.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/28.jpg b/examples/data/Skin_Cancer_Dataset/benign/28.jpg new file mode 100644 index 0000000..5b7b776 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/28.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/29.jpg b/examples/data/Skin_Cancer_Dataset/benign/29.jpg new file mode 100644 index 0000000..371a95f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/29.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/3.jpg b/examples/data/Skin_Cancer_Dataset/benign/3.jpg new file mode 100644 index 0000000..01fcb1c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/3.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/30.jpg 
b/examples/data/Skin_Cancer_Dataset/benign/30.jpg new file mode 100644 index 0000000..b3ce109 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/30.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/31.jpg b/examples/data/Skin_Cancer_Dataset/benign/31.jpg new file mode 100644 index 0000000..d322bbf Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/31.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/32.jpg b/examples/data/Skin_Cancer_Dataset/benign/32.jpg new file mode 100644 index 0000000..50e8363 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/32.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/33.jpg b/examples/data/Skin_Cancer_Dataset/benign/33.jpg new file mode 100644 index 0000000..c040ab0 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/33.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/34.jpg b/examples/data/Skin_Cancer_Dataset/benign/34.jpg new file mode 100644 index 0000000..9e3628f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/34.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/35.jpg b/examples/data/Skin_Cancer_Dataset/benign/35.jpg new file mode 100644 index 0000000..eb7fa5e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/35.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/36.jpg b/examples/data/Skin_Cancer_Dataset/benign/36.jpg new file mode 100644 index 0000000..eca336b Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/36.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/37.jpg b/examples/data/Skin_Cancer_Dataset/benign/37.jpg new file mode 100644 index 0000000..dd06486 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/37.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/38.jpg b/examples/data/Skin_Cancer_Dataset/benign/38.jpg new file mode 100644 index 0000000..13a0509 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/38.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/39.jpg b/examples/data/Skin_Cancer_Dataset/benign/39.jpg new file mode 100644 index 0000000..ae84827 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/39.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/4.jpg b/examples/data/Skin_Cancer_Dataset/benign/4.jpg new file mode 100644 index 0000000..ac6cf50 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/4.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/40.jpg b/examples/data/Skin_Cancer_Dataset/benign/40.jpg new file mode 100644 index 0000000..29ad202 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/40.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/41.jpg b/examples/data/Skin_Cancer_Dataset/benign/41.jpg new file mode 100644 index 0000000..743a598 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/41.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/42.jpg b/examples/data/Skin_Cancer_Dataset/benign/42.jpg new file mode 100644 index 0000000..e181514 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/42.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/43.jpg b/examples/data/Skin_Cancer_Dataset/benign/43.jpg new file mode 100644 index 0000000..b596a60 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/43.jpg differ diff --git 
a/examples/data/Skin_Cancer_Dataset/benign/44.jpg b/examples/data/Skin_Cancer_Dataset/benign/44.jpg new file mode 100644 index 0000000..00286ea Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/44.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/45.jpg b/examples/data/Skin_Cancer_Dataset/benign/45.jpg new file mode 100644 index 0000000..79096ac Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/45.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/46.jpg b/examples/data/Skin_Cancer_Dataset/benign/46.jpg new file mode 100644 index 0000000..0c38c2c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/46.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/47.jpg b/examples/data/Skin_Cancer_Dataset/benign/47.jpg new file mode 100644 index 0000000..1d1de9f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/47.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/48.jpg b/examples/data/Skin_Cancer_Dataset/benign/48.jpg new file mode 100644 index 0000000..7317ed1 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/48.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/49.jpg b/examples/data/Skin_Cancer_Dataset/benign/49.jpg new file mode 100644 index 0000000..9d5f512 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/49.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/5.jpg b/examples/data/Skin_Cancer_Dataset/benign/5.jpg new file mode 100644 index 0000000..7e3bbff Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/5.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/50.jpg b/examples/data/Skin_Cancer_Dataset/benign/50.jpg new file mode 100644 index 0000000..c3d6743 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/50.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/51.jpg b/examples/data/Skin_Cancer_Dataset/benign/51.jpg new file mode 100644 index 0000000..ff175df Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/51.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/52.jpg b/examples/data/Skin_Cancer_Dataset/benign/52.jpg new file mode 100644 index 0000000..3fbed5c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/52.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/53.jpg b/examples/data/Skin_Cancer_Dataset/benign/53.jpg new file mode 100644 index 0000000..1be8d1c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/53.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/54.jpg b/examples/data/Skin_Cancer_Dataset/benign/54.jpg new file mode 100644 index 0000000..87c93fe Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/54.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/55.jpg b/examples/data/Skin_Cancer_Dataset/benign/55.jpg new file mode 100644 index 0000000..0cf1a76 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/55.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/56.jpg b/examples/data/Skin_Cancer_Dataset/benign/56.jpg new file mode 100644 index 0000000..5877a65 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/56.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/57.jpg b/examples/data/Skin_Cancer_Dataset/benign/57.jpg new file mode 100644 index 0000000..6f3c81f Binary files /dev/null and 
b/examples/data/Skin_Cancer_Dataset/benign/57.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/58.jpg b/examples/data/Skin_Cancer_Dataset/benign/58.jpg new file mode 100644 index 0000000..7743f13 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/58.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/59.jpg b/examples/data/Skin_Cancer_Dataset/benign/59.jpg new file mode 100644 index 0000000..bd8239d Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/59.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/6.jpg b/examples/data/Skin_Cancer_Dataset/benign/6.jpg new file mode 100644 index 0000000..ffa901e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/6.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/60.jpg b/examples/data/Skin_Cancer_Dataset/benign/60.jpg new file mode 100644 index 0000000..179b295 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/60.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/61.jpg b/examples/data/Skin_Cancer_Dataset/benign/61.jpg new file mode 100644 index 0000000..1d0344e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/61.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/62.jpg b/examples/data/Skin_Cancer_Dataset/benign/62.jpg new file mode 100644 index 0000000..d33b482 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/62.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/63.jpg b/examples/data/Skin_Cancer_Dataset/benign/63.jpg new file mode 100644 index 0000000..52375b1 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/63.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/64.jpg b/examples/data/Skin_Cancer_Dataset/benign/64.jpg new file mode 100644 index 0000000..ce81a48 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/64.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/65.jpg b/examples/data/Skin_Cancer_Dataset/benign/65.jpg new file mode 100644 index 0000000..9753e81 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/65.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/66.jpg b/examples/data/Skin_Cancer_Dataset/benign/66.jpg new file mode 100644 index 0000000..c0127e6 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/66.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/67.jpg b/examples/data/Skin_Cancer_Dataset/benign/67.jpg new file mode 100644 index 0000000..d119d00 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/67.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/68.jpg b/examples/data/Skin_Cancer_Dataset/benign/68.jpg new file mode 100644 index 0000000..c68f552 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/68.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/69.jpg b/examples/data/Skin_Cancer_Dataset/benign/69.jpg new file mode 100644 index 0000000..f6cda52 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/69.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/7.jpg b/examples/data/Skin_Cancer_Dataset/benign/7.jpg new file mode 100644 index 0000000..0ab462c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/7.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/70.jpg b/examples/data/Skin_Cancer_Dataset/benign/70.jpg new file mode 100644 index 0000000..dbfbefb 
Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/70.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/71.jpg b/examples/data/Skin_Cancer_Dataset/benign/71.jpg new file mode 100644 index 0000000..559456a Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/71.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/72.jpg b/examples/data/Skin_Cancer_Dataset/benign/72.jpg new file mode 100644 index 0000000..0921796 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/72.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/73.jpg b/examples/data/Skin_Cancer_Dataset/benign/73.jpg new file mode 100644 index 0000000..72aaaf5 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/73.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/74.jpg b/examples/data/Skin_Cancer_Dataset/benign/74.jpg new file mode 100644 index 0000000..dd7734d Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/74.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/75.jpg b/examples/data/Skin_Cancer_Dataset/benign/75.jpg new file mode 100644 index 0000000..e83da99 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/75.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/76.jpg b/examples/data/Skin_Cancer_Dataset/benign/76.jpg new file mode 100644 index 0000000..4190c79 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/76.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/77.jpg b/examples/data/Skin_Cancer_Dataset/benign/77.jpg new file mode 100644 index 0000000..143eb7f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/77.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/78.jpg b/examples/data/Skin_Cancer_Dataset/benign/78.jpg new file mode 100644 index 0000000..766a82e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/78.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/79.jpg b/examples/data/Skin_Cancer_Dataset/benign/79.jpg new file mode 100644 index 0000000..f5bc6ba Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/79.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/8.jpg b/examples/data/Skin_Cancer_Dataset/benign/8.jpg new file mode 100644 index 0000000..b4aaa20 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/8.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/80.jpg b/examples/data/Skin_Cancer_Dataset/benign/80.jpg new file mode 100644 index 0000000..e8a12fa Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/80.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/81.jpg b/examples/data/Skin_Cancer_Dataset/benign/81.jpg new file mode 100644 index 0000000..d2729be Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/81.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/82.jpg b/examples/data/Skin_Cancer_Dataset/benign/82.jpg new file mode 100644 index 0000000..bdcee12 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/82.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/83.jpg b/examples/data/Skin_Cancer_Dataset/benign/83.jpg new file mode 100644 index 0000000..6a89185 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/83.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/84.jpg b/examples/data/Skin_Cancer_Dataset/benign/84.jpg new file mode 
100644 index 0000000..ae3330e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/84.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/85.jpg b/examples/data/Skin_Cancer_Dataset/benign/85.jpg new file mode 100644 index 0000000..63ae63c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/85.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/86.jpg b/examples/data/Skin_Cancer_Dataset/benign/86.jpg new file mode 100644 index 0000000..3af6247 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/86.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/87.jpg b/examples/data/Skin_Cancer_Dataset/benign/87.jpg new file mode 100644 index 0000000..ff94ca9 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/87.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/88.jpg b/examples/data/Skin_Cancer_Dataset/benign/88.jpg new file mode 100644 index 0000000..0629508 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/88.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/89.jpg b/examples/data/Skin_Cancer_Dataset/benign/89.jpg new file mode 100644 index 0000000..e1b55e2 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/89.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/9.jpg b/examples/data/Skin_Cancer_Dataset/benign/9.jpg new file mode 100644 index 0000000..87f6972 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/9.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/90.jpg b/examples/data/Skin_Cancer_Dataset/benign/90.jpg new file mode 100644 index 0000000..589db4c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/90.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/91.jpg b/examples/data/Skin_Cancer_Dataset/benign/91.jpg new file mode 100644 index 0000000..b887964 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/91.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/92.jpg b/examples/data/Skin_Cancer_Dataset/benign/92.jpg new file mode 100644 index 0000000..f810690 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/92.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/93.jpg b/examples/data/Skin_Cancer_Dataset/benign/93.jpg new file mode 100644 index 0000000..4a90dd2 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/93.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/94.jpg b/examples/data/Skin_Cancer_Dataset/benign/94.jpg new file mode 100644 index 0000000..a5da9ec Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/94.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/95.jpg b/examples/data/Skin_Cancer_Dataset/benign/95.jpg new file mode 100644 index 0000000..3bc0fed Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/95.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/96.jpg b/examples/data/Skin_Cancer_Dataset/benign/96.jpg new file mode 100644 index 0000000..5e59ce9 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/96.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/97.jpg b/examples/data/Skin_Cancer_Dataset/benign/97.jpg new file mode 100644 index 0000000..018def7 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/97.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/98.jpg 
b/examples/data/Skin_Cancer_Dataset/benign/98.jpg new file mode 100644 index 0000000..2edecb7 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/98.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/benign/99.jpg b/examples/data/Skin_Cancer_Dataset/benign/99.jpg new file mode 100644 index 0000000..438b3cb Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/benign/99.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/1.jpg b/examples/data/Skin_Cancer_Dataset/malignant/1.jpg new file mode 100644 index 0000000..c73d797 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/1.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/10.jpg b/examples/data/Skin_Cancer_Dataset/malignant/10.jpg new file mode 100644 index 0000000..c5c98c8 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/10.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/100.jpg b/examples/data/Skin_Cancer_Dataset/malignant/100.jpg new file mode 100644 index 0000000..e9e1a34 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/100.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/11.jpg b/examples/data/Skin_Cancer_Dataset/malignant/11.jpg new file mode 100644 index 0000000..077a2c5 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/11.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/12.jpg b/examples/data/Skin_Cancer_Dataset/malignant/12.jpg new file mode 100644 index 0000000..daef157 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/12.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/13.jpg b/examples/data/Skin_Cancer_Dataset/malignant/13.jpg new file mode 100644 index 0000000..a47cf52 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/13.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/14.jpg b/examples/data/Skin_Cancer_Dataset/malignant/14.jpg new file mode 100644 index 0000000..bdf815d Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/14.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/15.jpg b/examples/data/Skin_Cancer_Dataset/malignant/15.jpg new file mode 100644 index 0000000..b8573bf Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/15.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/16.jpg b/examples/data/Skin_Cancer_Dataset/malignant/16.jpg new file mode 100644 index 0000000..83ea997 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/16.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/17.jpg b/examples/data/Skin_Cancer_Dataset/malignant/17.jpg new file mode 100644 index 0000000..90dab86 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/17.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/18.jpg b/examples/data/Skin_Cancer_Dataset/malignant/18.jpg new file mode 100644 index 0000000..b75bd28 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/18.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/19.jpg b/examples/data/Skin_Cancer_Dataset/malignant/19.jpg new file mode 100644 index 0000000..4dd30bb Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/19.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/2.jpg b/examples/data/Skin_Cancer_Dataset/malignant/2.jpg new file mode 100644 index 0000000..a550d6a 
Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/2.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/20.jpg b/examples/data/Skin_Cancer_Dataset/malignant/20.jpg new file mode 100644 index 0000000..8aca62f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/20.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/21.jpg b/examples/data/Skin_Cancer_Dataset/malignant/21.jpg new file mode 100644 index 0000000..5e0159e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/21.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/22.jpg b/examples/data/Skin_Cancer_Dataset/malignant/22.jpg new file mode 100644 index 0000000..a6b007f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/22.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/23.jpg b/examples/data/Skin_Cancer_Dataset/malignant/23.jpg new file mode 100644 index 0000000..0a346e4 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/23.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/24.jpg b/examples/data/Skin_Cancer_Dataset/malignant/24.jpg new file mode 100644 index 0000000..3c3d9e7 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/24.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/25.jpg b/examples/data/Skin_Cancer_Dataset/malignant/25.jpg new file mode 100644 index 0000000..6bd74db Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/25.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/26.jpg b/examples/data/Skin_Cancer_Dataset/malignant/26.jpg new file mode 100644 index 0000000..4c42fc1 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/26.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/27.jpg b/examples/data/Skin_Cancer_Dataset/malignant/27.jpg new file mode 100644 index 0000000..046d59e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/27.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/28.jpg b/examples/data/Skin_Cancer_Dataset/malignant/28.jpg new file mode 100644 index 0000000..eff036d Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/28.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/29.jpg b/examples/data/Skin_Cancer_Dataset/malignant/29.jpg new file mode 100644 index 0000000..83bb701 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/29.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/3.jpg b/examples/data/Skin_Cancer_Dataset/malignant/3.jpg new file mode 100644 index 0000000..b3287c6 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/3.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/30.jpg b/examples/data/Skin_Cancer_Dataset/malignant/30.jpg new file mode 100644 index 0000000..dbe824e Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/30.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/31.jpg b/examples/data/Skin_Cancer_Dataset/malignant/31.jpg new file mode 100644 index 0000000..1d214ea Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/31.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/32.jpg b/examples/data/Skin_Cancer_Dataset/malignant/32.jpg new file mode 100644 index 0000000..306e049 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/32.jpg differ 
diff --git a/examples/data/Skin_Cancer_Dataset/malignant/33.jpg b/examples/data/Skin_Cancer_Dataset/malignant/33.jpg new file mode 100644 index 0000000..16b20b3 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/33.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/34.jpg b/examples/data/Skin_Cancer_Dataset/malignant/34.jpg new file mode 100644 index 0000000..905918f Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/34.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/35.jpg b/examples/data/Skin_Cancer_Dataset/malignant/35.jpg new file mode 100644 index 0000000..5000518 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/35.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/36.jpg b/examples/data/Skin_Cancer_Dataset/malignant/36.jpg new file mode 100644 index 0000000..f53f952 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/36.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/37.jpg b/examples/data/Skin_Cancer_Dataset/malignant/37.jpg new file mode 100644 index 0000000..2573a86 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/37.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/38.jpg b/examples/data/Skin_Cancer_Dataset/malignant/38.jpg new file mode 100644 index 0000000..a12966c Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/38.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/39.jpg b/examples/data/Skin_Cancer_Dataset/malignant/39.jpg new file mode 100644 index 0000000..20feb07 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/39.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/4.jpg b/examples/data/Skin_Cancer_Dataset/malignant/4.jpg new file mode 100644 index 0000000..014ab21 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/4.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/40.jpg b/examples/data/Skin_Cancer_Dataset/malignant/40.jpg new file mode 100644 index 0000000..a2d35b7 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/40.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/41.jpg b/examples/data/Skin_Cancer_Dataset/malignant/41.jpg new file mode 100644 index 0000000..abcc1c2 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/41.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/42.jpg b/examples/data/Skin_Cancer_Dataset/malignant/42.jpg new file mode 100644 index 0000000..a61692d Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/42.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/43.jpg b/examples/data/Skin_Cancer_Dataset/malignant/43.jpg new file mode 100644 index 0000000..b047bb9 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/43.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/44.jpg b/examples/data/Skin_Cancer_Dataset/malignant/44.jpg new file mode 100644 index 0000000..fe2adb8 Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/44.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/45.jpg b/examples/data/Skin_Cancer_Dataset/malignant/45.jpg new file mode 100644 index 0000000..f7524fc Binary files /dev/null and b/examples/data/Skin_Cancer_Dataset/malignant/45.jpg differ diff --git a/examples/data/Skin_Cancer_Dataset/malignant/46.jpg 
b/examples/data/Skin_Cancer_Dataset/malignant/46.jpg differ
[Further binary file additions omitted: examples/data/Skin_Cancer_Dataset/malignant/47.jpg through 99.jpg, plus examples/data/dataset_features.npy, dataset_inputs.npy, dataset_outputs.npy, and outputs.npy, are all added as new binary files (mode 100644).]
diff --git a/pygad/helper/unique.py b/pygad/helper/unique.py
index af381fe..1a0ba63 100644
--- a/pygad/helper/unique.py
+++ b/pygad/helper/unique.py
@@ -603,11 +603,15 @@ def find_two_duplicates(self,
     return None, gene

 def unpack_gene_space(self,
+                      range_min,
+                      range_max,
                       num_values_from_inf_range=100):
     """
     Unpack the gene_space for the purpose of selecting a value that solves the duplicates.
     This is by replacing each range by a list of values.
     It accepts:
+        range_min: The range minimum value.
+        range_max: The range maximum value.
         num_values_from_inf_range: For infinite range of float values, a fixed number of values equal to num_values_from_inf_range is selected using the numpy.linspace() function.
     It returns the unpacked gene space.
     """
@@ -662,8 +666,8 @@ def unpack_gene_space(self,
             gene_space_unpacked[space_idx] = [space]
         elif space is None:
             # Randomly generate the value using the mutation range.
-            gene_space_unpacked[space_idx] = numpy.arange(start=self.random_mutation_min_val,
-                                                          stop=self.random_mutation_max_val)
+            gene_space_unpacked[space_idx] = numpy.arange(start=range_min,
+                                                          stop=range_max)
         elif type(space) is range:
             # Convert the range to a list.
             gene_space_unpacked[space_idx] = list(space)
@@ -720,8 +724,8 @@ def unpack_gene_space(self,
         none_indices = numpy.where(numpy.array(gene_space_unpacked[space_idx]) == None)[0]
         if len(none_indices) > 0:
             for idx in none_indices:
-                random_value = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                    high=self.random_mutation_max_val,
+                random_value = numpy.random.uniform(low=range_min,
+                                                    high=range_max,
                                                     size=1)[0]
                 gene_space_unpacked[space_idx][idx] = random_value
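For context, the reworked helper now takes the interval explicitly instead of always reading ``self.random_mutation_min_val``/``self.random_mutation_max_val``. A minimal sketch of exercising the new signature; the GA instance and its parameters below are hypothetical, not part of this patch:

.. code:: python

    import pygad

    # Hypothetical GA instance; any fitness function works for this sketch.
    ga_instance = pygad.GA(num_generations=10,
                           num_parents_mating=2,
                           sol_per_pop=4,
                           num_genes=3,
                           gene_space=[None, [1, 2, 3], None],
                           fitness_func=lambda ga, sol, idx: sum(sol))

    # Mirror the call added in pygad.py: unpack against the init range.
    unpacked = ga_instance.unpack_gene_space(range_min=ga_instance.init_range_low,
                                             range_max=ga_instance.init_range_high)
    print(unpacked)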
diff --git a/pygad/kerasga/kerasga.py b/pygad/kerasga/kerasga.py
index 0e1b618..cda2c4b 100644
--- a/pygad/kerasga/kerasga.py
+++ b/pygad/kerasga/kerasga.py
@@ -3,6 +3,20 @@ import tensorflow.keras

 def model_weights_as_vector(model):
+    """
+    Reshapes the Keras model weights as a vector.
+
+    Parameters
+    ----------
+    model : tensorflow.keras.Model
+        The Keras model.
+
+    Returns
+    -------
+    numpy.ndarray
+        The weights as a 1D vector.
+
+    """
     weights_vector = []

     for layer in model.layers: # model.get_weights():
@@ -15,6 +29,22 @@ def model_weights_as_matrix(model, weights_vector):
     return numpy.array(weights_vector)

 def model_weights_as_matrix(model, weights_vector):
+    """
+    Reshapes the PyGAD 1D solution as a Keras weight matrix.
+
+    Parameters
+    ----------
+    model : tensorflow.keras.Model
+        The Keras model.
+    weights_vector : numpy.ndarray
+        The PyGAD solution as a 1D vector.
+
+    Returns
+    -------
+    weights_matrix : list
+        The Keras weights as a matrix.
+
+    """
     weights_matrix = []

     start = 0
@@ -37,13 +67,45 @@ def model_weights_as_matrix(model, weights_vector):

     return weights_matrix

-def predict(model, solution, data):
+def predict(model,
+            solution,
+            data,
+            batch_size=None,
+            verbose=0,
+            steps=None):
+    """
+    Use a PyGAD solution to make predictions using the Keras model.
+
+    Parameters
+    ----------
+    model : tensorflow.keras.Model
+        The Keras model.
+    solution : numpy.ndarray
+        A single PyGAD solution as a 1D vector.
+    data : numpy.ndarray
+        The data or a generator.
+    batch_size : int, optional
+        The batch size (i.e. number of samples per step or batch). The default is None. Check the documentation of the Keras Model.predict() method for more information.
+    verbose : int, optional
+        Verbosity mode. The default is 0. Check the documentation of the Keras Model.predict() method for more information.
+    steps : int, optional
+        The total number of steps (batches of samples). The default is None. Check the documentation of the Keras Model.predict() method for more information.
+
+    Returns
+    -------
+    predictions : numpy.ndarray
+        The Keras model predictions.
+
+    """
     # Fetch the parameters of the best solution.
     solution_weights = model_weights_as_matrix(model=model,
                                                weights_vector=solution)
     _model = tensorflow.keras.models.clone_model(model)
     _model.set_weights(solution_weights)
-    predictions = _model.predict(data)
+    predictions = _model.predict(x=data,
+                                 batch_size=batch_size,
+                                 verbose=verbose,
+                                 steps=steps)

     return predictions
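Because the new parameters are forwarded verbatim to ``Model.predict()``, a short usage sketch may help; the model, the data, and the random stand-in solution below are illustrative assumptions:

.. code:: python

    import numpy
    import tensorflow.keras
    import pygad.kerasga

    # A small hypothetical model; any Keras model works the same way.
    model = tensorflow.keras.Sequential([
        tensorflow.keras.Input(shape=(2,)),
        tensorflow.keras.layers.Dense(4, activation="relu"),
        tensorflow.keras.layers.Dense(1)
    ])

    # A random vector standing in for an evolved PyGAD solution.
    num_weights = sum(w.size for w in model.get_weights())
    solution = numpy.random.uniform(low=-1.0, high=1.0, size=num_weights)

    data = numpy.random.random(size=(100, 2))

    # batch_size, verbose, and steps are forwarded to Model.predict().
    predictions = pygad.kerasga.predict(model=model,
                                        solution=solution,
                                        data=data,
                                        batch_size=32,
                                        verbose=0)
    print(predictions.shape)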
TypeError(f"Type mismatch between the 2 parameters 'init_range_low' {type(init_range_low)} and 'init_range_high' {type(init_range_high)}.") + elif type(init_range_low) in [list, tuple, numpy.ndarray]: + # The self.num_genes attribute is not created yet. + # if len(init_range_low) == self.num_genes: + # pass + # else: + # self.valid_parameters = False + # raise ValueError(f"The length of the 'init_range_low' parameter is {len(init_range_low)} which is different from the number of genes {self.num_genes}.") + + # Get the number of genes before validating the num_genes parameter. + if num_genes is None: + if initial_population is None: + self.valid_parameters = False + raise TypeError("When the parameter 'initial_population' is None, then the 2 parameters 'sol_per_pop' and 'num_genes' cannot be None too.") + elif not len(init_range_low) == len(initial_population[0]): + self.valid_parameters = False + raise ValueError(f"The length of the 'init_range_low' parameter is {len(init_range_low)} which is different from the number of genes {len(initial_population[0])}.") + elif not len(init_range_low) == num_genes: + self.valid_parameters = False + raise ValueError(f"The length of the 'init_range_low' parameter is {len(init_range_low)} which is different from the number of genes {num_genes}.") - # Validate random_mutation_min_val and random_mutation_max_val - if type(random_mutation_min_val) in GA.supported_int_float_types: - if type(random_mutation_max_val) in GA.supported_int_float_types: - if random_mutation_min_val == random_mutation_max_val: - if not self.suppress_warnings: - warnings.warn("The values of the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val' are equal and this causes a fixed change to all genes.") + if type(init_range_high) in [list, tuple, numpy.ndarray]: + if len(init_range_low) == len(init_range_high): + pass + else: + self.valid_parameters = False + raise ValueError(f"Size mismatch between the 2 parameters 'init_range_low' {len(init_range_low)} and 'init_range_high' {len(init_range_high)}.") + + # Validate the values in init_range_low + for val in init_range_low: + if type(val) in GA.supported_int_float_types: + pass + else: + self.valid_parameters = False + raise TypeError(f"When an iterable (list/tuple/numpy.ndarray) is assigned to the 'init_range_low' parameter, its elements must be numeric but the value {val} of type {type(val)} found.") + + # Validate the values in init_range_high + for val in init_range_high: + if type(val) in GA.supported_int_float_types: + pass + else: + self.valid_parameters = False + raise TypeError(f"When an iterable (list/tuple/numpy.ndarray) is assigned to the 'init_range_high' parameter, its elements must be numeric but the value {val} of type {type(val)} found.") else: self.valid_parameters = False - raise TypeError(f"The expected type of the 'random_mutation_max_val' parameter is numeric but {type(random_mutation_max_val)} found.") + raise TypeError(f"Type mismatch between the 2 parameters 'init_range_low' {type(init_range_low)} and 'init_range_high' {type(init_range_high)}. 
Both of them can be either numeric or iterable (list/tuple/numpy.ndarray).") else: self.valid_parameters = False - raise TypeError(f"The expected type of the 'random_mutation_min_val' parameter is numeric but {type(random_mutation_min_val)} found.") - self.random_mutation_min_val = random_mutation_min_val - self.random_mutation_max_val = random_mutation_max_val + raise TypeError(f"The expected type of the 'init_range_low' parameter is numeric or list/tuple/numpy.ndarray but {type(init_range_low)} found.") + + self.init_range_low = init_range_low + self.init_range_high = init_range_high # Validate gene_type if gene_type in GA.supported_int_float_types: @@ -304,6 +350,7 @@ def __init__(self, self.gene_type_single = False raise ValueError(f"Integers cannot have precision. Please use the integer data type directly instead of {gene_type}.") elif type(gene_type) in [list, tuple, numpy.ndarray]: + # Get the number of genes before validating the num_genes parameter. if num_genes is None: if initial_population is None: self.valid_parameters = False @@ -346,7 +393,8 @@ def __init__(self, raise ValueError(f"The value passed to the 'gene_type' parameter must be either a single integer, floating-point, list, tuple, or numpy.ndarray but ({gene_type}) of type {type(gene_type)} found.") # Call the unpack_gene_space() method in the pygad.helper.unique.Unique class. - self.gene_space_unpacked = self.unpack_gene_space() + self.gene_space_unpacked = self.unpack_gene_space(range_min=self.init_range_low, + range_max=self.init_range_high) # Build the initial population if initial_population is None: @@ -374,11 +422,11 @@ def __init__(self, # Number of solutions in the population. self.sol_per_pop = sol_per_pop - self.initialize_population(self.init_range_low, - self.init_range_high, - allow_duplicate_genes, - True, - self.gene_type) + self.initialize_population(low=self.init_range_low, + high=self.init_range_high, + allow_duplicate_genes=allow_duplicate_genes, + mutation_by_replacement=True, + gene_type=self.gene_type) else: self.valid_parameters = False raise TypeError(f"The expected type of both the sol_per_pop and num_genes parameters is int but {type(sol_per_pop)} and {type(num_genes)} found.") @@ -457,6 +505,53 @@ def __init__(self, self.valid_parameters = False raise ValueError(f"When the parameter 'gene_space' is nested, then its length must be equal to the value passed to the 'num_genes' parameter. 
Instead, length of gene_space ({len(gene_space)}) != num_genes ({self.num_genes})") + # Validate random_mutation_min_val and random_mutation_max_val + if type(random_mutation_min_val) in GA.supported_int_float_types: + if type(random_mutation_max_val) in GA.supported_int_float_types: + if random_mutation_min_val == random_mutation_max_val: + if not self.suppress_warnings: + warnings.warn("The values of the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val' are equal and this might cause a fixed mutation to some genes.") + else: + self.valid_parameters = False + raise TypeError(f"Type mismatch between the 2 parameters 'random_mutation_min_val' {type(random_mutation_min_val)} and 'random_mutation_max_val' {type(random_mutation_max_val)}.") + elif type(random_mutation_min_val) in [list, tuple, numpy.ndarray]: + if len(random_mutation_min_val) == self.num_genes: + pass + else: + self.valid_parameters = False + raise ValueError(f"The length of the 'random_mutation_min_val' parameter is {len(random_mutation_min_val)} which is different from the number of genes {self.num_genes}.") + if type(random_mutation_max_val) in [list, tuple, numpy.ndarray]: + if len(random_mutation_min_val) == len(random_mutation_max_val): + pass + else: + self.valid_parameters = False + raise ValueError(f"Size mismatch between the 2 parameters 'random_mutation_min_val' {len(random_mutation_min_val)} and 'random_mutation_max_val' {len(random_mutation_max_val)}.") + + # Validate the values in random_mutation_min_val + for val in random_mutation_min_val: + if type(val) in GA.supported_int_float_types: + pass + else: + self.valid_parameters = False + raise TypeError(f"When an iterable (list/tuple/numpy.ndarray) is assigned to the 'random_mutation_min_val' parameter, its elements must be numeric but the value {val} of type {type(val)} found.") + + # Validate the values in random_mutation_max_val + for val in random_mutation_max_val: + if type(val) in GA.supported_int_float_types: + pass + else: + self.valid_parameters = False + raise TypeError(f"When an iterable (list/tuple/numpy.ndarray) is assigned to the 'random_mutation_max_val' parameter, its elements must be numeric but the value {val} of type {type(val)} found.") + else: + self.valid_parameters = False + raise TypeError(f"Type mismatch between the 2 parameters 'random_mutation_min_val' {type(random_mutation_min_val)} and 'random_mutation_max_val' {type(random_mutation_max_val)}.") + else: + self.valid_parameters = False + raise TypeError(f"The expected type of the 'random_mutation_min_val' parameter is numeric or list/tuple/numpy.ndarray but {type(random_mutation_min_val)} found.") + + self.random_mutation_min_val = random_mutation_min_val + self.random_mutation_max_val = random_mutation_max_val + # Validating the number of parents to be selected for mating (num_parents_mating) if num_parents_mating <= 0: self.valid_parameters = False @@ -1252,9 +1347,17 @@ def initialize_population(self, shape=self.pop_size, dtype=object) # Loop through the genes, randomly generate the values of a single gene across the entire population, and add the values of each gene to the population. for gene_idx in range(self.num_genes): + + if type(self.init_range_low) in self.supported_int_float_types: + range_min = self.init_range_low + range_max = self.init_range_high + else: + range_min = self.init_range_low[gene_idx] + range_max = self.init_range_high[gene_idx] + # A vector of all values of this single gene across all solutions in the population. 
- gene_values = numpy.asarray(numpy.random.uniform(low=low, - high=high, + gene_values = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, size=self.pop_size[0]), dtype=self.gene_type[gene_idx][0]) # Adding the current gene values to the population. @@ -1280,6 +1383,14 @@ def initialize_population(self, dtype=self.gene_type[0]) for sol_idx in range(self.sol_per_pop): for gene_idx in range(self.num_genes): + + if type(self.init_range_low) in self.supported_int_float_types: + range_min = self.init_range_low + range_max = self.init_range_high + else: + range_min = self.init_range_low[gene_idx] + range_max = self.init_range_high[gene_idx] + if self.gene_space[gene_idx] is None: # The following commented code replace the None value with a single number that will not change again. @@ -1290,8 +1401,8 @@ def initialize_population(self, # self.population[sol_idx, gene_idx] = list(self.gene_space[gene_idx]).copy() # The above problem is solved by keeping the None value in the gene_space parameter. This forces PyGAD to generate this value for each solution. - self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.uniform(low=low, - high=high, + self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, size=1), dtype=self.gene_type[0])[0] elif type(self.gene_space[gene_idx]) in [numpy.ndarray, list, tuple, range]: @@ -1308,16 +1419,15 @@ def initialize_population(self, for idx, val in enumerate(self.gene_space[gene_idx]): if val is None: - self.gene_space[gene_idx][idx] = numpy.asarray(numpy.random.uniform(low=low, - high=high, + self.gene_space[gene_idx][idx] = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, size=1), dtype=self.gene_type[0])[0] # Find the difference between the current gene space and the current values in the solution. unique_gene_values = list(set(self.gene_space[gene_idx]).difference( set(self.population[sol_idx, :gene_idx]))) if len(unique_gene_values) > 0: - self.population[sol_idx, gene_idx] = random.choice( - unique_gene_values) + self.population[sol_idx, gene_idx] = random.choice(unique_gene_values) else: # If there is no unique values, then we have to select a duplicate value. self.population[sol_idx, gene_idx] = random.choice( @@ -1331,20 +1441,17 @@ def initialize_population(self, elif type(self.gene_space[gene_idx]) is dict: if 'step' in self.gene_space[gene_idx].keys(): self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.choice(numpy.arange(start=self.gene_space[gene_idx]['low'], - stop=self.gene_space[ - gene_idx]['high'], + stop=self.gene_space[gene_idx]['high'], step=self.gene_space[gene_idx]['step']), size=1), dtype=self.gene_type[0])[0] else: self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.uniform(low=self.gene_space[gene_idx]['low'], - high=self.gene_space[ - gene_idx]['high'], + high=self.gene_space[gene_idx]['high'], size=1), dtype=self.gene_type[0])[0] elif type(self.gene_space[gene_idx]) in GA.supported_int_float_types: - self.population[sol_idx, - gene_idx] = self.gene_space[gene_idx] + self.population[sol_idx, gene_idx] = self.gene_space[gene_idx] else: # There is no more options. 
pass @@ -1356,53 +1463,54 @@ def initialize_population(self, dtype=object) for sol_idx in range(self.sol_per_pop): for gene_idx in range(self.num_genes): + + if type(self.init_range_low) in self.supported_int_float_types: + range_min = self.init_range_low + range_max = self.init_range_high + else: + range_min = self.init_range_low[gene_idx] + range_max = self.init_range_high[gene_idx] + if type(self.gene_space[gene_idx]) in [numpy.ndarray, list, tuple, range]: # Convert to list because tuple and range do not have copy(). # We copy the gene_space to a temp variable to keep its original value. # In the next for loop, the gene_space is changed. # Later, the gene_space is restored to its original value using the temp variable. - temp_gene_space = list( - self.gene_space[gene_idx]).copy() + temp_gene_space = list(self.gene_space[gene_idx]).copy() # Check if the gene space has None values. If any, then replace it with randomly generated values according to the 3 attributes init_range_low, init_range_high, and gene_type. for idx, val in enumerate(self.gene_space[gene_idx]): if val is None: - self.gene_space[gene_idx][idx] = numpy.asarray(numpy.random.uniform(low=low, - high=high, + self.gene_space[gene_idx][idx] = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, size=1), dtype=self.gene_type[gene_idx][0])[0] - self.population[sol_idx, gene_idx] = random.choice( - self.gene_space[gene_idx]) - self.population[sol_idx, gene_idx] = self.gene_type[gene_idx][0]( - self.population[sol_idx, gene_idx]) + self.population[sol_idx, gene_idx] = random.choice(self.gene_space[gene_idx]) + self.population[sol_idx, gene_idx] = self.gene_type[gene_idx][0](self.population[sol_idx, gene_idx]) # Restore the gene_space from the temp_gene_space variable. self.gene_space[gene_idx] = temp_gene_space.copy() elif type(self.gene_space[gene_idx]) is dict: if 'step' in self.gene_space[gene_idx].keys(): self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.choice(numpy.arange(start=self.gene_space[gene_idx]['low'], - stop=self.gene_space[ - gene_idx]['high'], + stop=self.gene_space[gene_idx]['high'], step=self.gene_space[gene_idx]['step']), size=1), dtype=self.gene_type[gene_idx][0])[0] else: self.population[sol_idx, gene_idx] = numpy.asarray(numpy.random.uniform(low=self.gene_space[gene_idx]['low'], - high=self.gene_space[ - gene_idx]['high'], + high=self.gene_space[gene_idx]['high'], size=1), dtype=self.gene_type[gene_idx][0])[0] elif type(self.gene_space[gene_idx]) == type(None): - temp_gene_value = numpy.asarray(numpy.random.uniform(low=low, - high=high, + temp_gene_value = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, size=1), dtype=self.gene_type[gene_idx][0])[0] - self.population[sol_idx, - gene_idx] = temp_gene_value.copy() + self.population[sol_idx, gene_idx] = temp_gene_value.copy() elif type(self.gene_space[gene_idx]) in GA.supported_int_float_types: - self.population[sol_idx, - gene_idx] = self.gene_space[gene_idx] + self.population[sol_idx, gene_idx] = self.gene_space[gene_idx] else: # There is no more options. pass @@ -1414,12 +1522,20 @@ def initialize_population(self, # 2) gene_type is not nested (gene_type_single is True). # Replace all the None values with random values using the init_range_low, init_range_high, and gene_type attributes. 
- for idx, curr_gene_space in enumerate(self.gene_space): + for gene_idx, curr_gene_space in enumerate(self.gene_space): + + if type(self.init_range_low) in self.supported_int_float_types: + range_min = self.init_range_low + range_max = self.init_range_high + else: + range_min = self.init_range_low[gene_idx] + range_max = self.init_range_high[gene_idx] + if curr_gene_space is None: - self.gene_space[idx] = numpy.asarray(numpy.random.uniform(low=low, - high=high, - size=1), - dtype=self.gene_type[0])[0] + self.gene_space[gene_idx] = numpy.asarray(numpy.random.uniform(low=range_min, + high=range_max, + size=1), + dtype=self.gene_type[0])[0] # Creating the initial population by randomly selecting the genes' values from the values inside the 'gene_space' parameter. if type(self.gene_space) is dict: @@ -1471,8 +1587,7 @@ def initialize_population(self, # It can be either range, numpy.ndarray, or list. # Create an empty population of dtype=object to support storing mixed data types within the same array. - self.population = numpy.zeros( - shape=self.pop_size, dtype=object) + self.population = numpy.zeros(shape=self.pop_size, dtype=object) # Loop through the genes, randomly generate the values of a single gene across the entire population, and add the values of each gene to the population. for gene_idx in range(self.num_genes): # A vector of all values of this single gene across all solutions in the population. @@ -1901,8 +2016,7 @@ def run(self): if not type(self.last_generation_offspring_mutation) is numpy.ndarray: raise TypeError(f"The output of the mutation step is expected to be of type (numpy.ndarray) but {type(self.last_generation_offspring_mutation)} found.") else: - self.last_generation_offspring_mutation = self.mutation( - self.last_generation_offspring_crossover) + self.last_generation_offspring_mutation = self.mutation(self.last_generation_offspring_crossover) if self.last_generation_offspring_mutation.shape != (self.num_offspring, self.num_genes): if self.last_generation_offspring_mutation.shape[0] != self.num_offspring: diff --git a/pygad/utils/mutation.py b/pygad/utils/mutation.py index 39a1a5d..6a9f6a9 100644 --- a/pygad/utils/mutation.py +++ b/pygad/utils/mutation.py @@ -53,6 +53,13 @@ def mutation_by_space(self, offspring): mutation_indices = numpy.array(random.sample(range(0, self.num_genes), self.mutation_num_genes)) for gene_idx in mutation_indices: + if type(self.random_mutation_min_val) in self.supported_int_float_types: + range_min = self.random_mutation_min_val + range_max = self.random_mutation_max_val + else: + range_min = self.random_mutation_min_val[gene_idx] + range_max = self.random_mutation_max_val[gene_idx] + if self.gene_space_nested: # Returning the current gene space from the 'gene_space' attribute. if type(self.gene_space[gene_idx]) in [numpy.ndarray, list]: @@ -65,8 +72,8 @@ def mutation_by_space(self, offspring): value_from_space = curr_gene_space # If the gene space is None, apply mutation by adding a random value between the range defined by the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val'. elif curr_gene_space is None: - rand_val = numpy.random.uniform(low=self.random_mutation_min_val, - high=self.random_mutation_max_val, + rand_val = numpy.random.uniform(low=range_min, + high=range_max, size=1)[0] if self.mutation_by_replacement: value_from_space = rand_val @@ -125,8 +132,8 @@ def mutation_by_space(self, offspring): if value_from_space is None: # TODO: Return index 0. # TODO: Check if this if statement is necessary. 
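With the validation above, ``init_range_low``/``init_range_high`` accept per-gene lists (or tuples/arrays) as well as single numbers. A minimal sketch under the patched behavior; the fitness function and the bounds are arbitrary:

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        return sum(solution)

    # One bound per gene: gene 0 starts in [0, 1), gene 1 in [10, 20),
    # and gene 2 in [100, 200).
    ga_instance = pygad.GA(num_generations=50,
                           num_parents_mating=2,
                           sol_per_pop=8,
                           num_genes=3,
                           init_range_low=[0, 10, 100],
                           init_range_high=[1, 20, 200],
                           fitness_func=fitness_func)
    print(ga_instance.initial_population)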
diff --git a/pygad/utils/mutation.py b/pygad/utils/mutation.py
index 39a1a5d..6a9f6a9 100644
--- a/pygad/utils/mutation.py
+++ b/pygad/utils/mutation.py
@@ -53,6 +53,13 @@ def mutation_by_space(self, offspring):
             mutation_indices = numpy.array(random.sample(range(0, self.num_genes), self.mutation_num_genes))
             for gene_idx in mutation_indices:
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if self.gene_space_nested:
                     # Returning the current gene space from the 'gene_space' attribute.
                     if type(self.gene_space[gene_idx]) in [numpy.ndarray, list]:
@@ -65,8 +72,8 @@ def mutation_by_space(self, offspring):
                         value_from_space = curr_gene_space
                 # If the gene space is None, apply mutation by adding a random value between the range defined by the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val'.
                 elif curr_gene_space is None:
-                    rand_val = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                    high=self.random_mutation_max_val,
+                    rand_val = numpy.random.uniform(low=range_min,
+                                                    high=range_max,
                                                     size=1)[0]
                     if self.mutation_by_replacement:
                         value_from_space = rand_val
@@ -125,8 +132,8 @@ def mutation_by_space(self, offspring):
                 if value_from_space is None:
                     # TODO: Return index 0.
                     # TODO: Check if this if statement is necessary.
-                    value_from_space = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                            high=self.random_mutation_max_val,
+                    value_from_space = numpy.random.uniform(low=range_min,
+                                                            high=range_max,
                                                             size=1)[0]

                 # Assinging the selected value from the space to the gene.
@@ -163,6 +170,14 @@ def mutation_probs_by_space(self, offspring):
         for offspring_idx in range(offspring.shape[0]):
             probs = numpy.random.random(size=offspring.shape[1])
             for gene_idx in range(offspring.shape[1]):
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if probs[gene_idx] <= self.mutation_probability:
                     if self.gene_space_nested:
                         # Returning the current gene space from the 'gene_space' attribute.
@@ -176,8 +191,8 @@ def mutation_probs_by_space(self, offspring):
                             value_from_space = curr_gene_space
                     # If the gene space is None, apply mutation by adding a random value between the range defined by the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val'.
                     elif curr_gene_space is None:
-                        rand_val = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                        high=self.random_mutation_max_val,
+                        rand_val = numpy.random.uniform(low=range_min,
+                                                        high=range_max,
                                                         size=1)[0]
                         if self.mutation_by_replacement:
                             value_from_space = rand_val
@@ -260,9 +275,17 @@ def mutation_randomly(self, offspring):
         for offspring_idx in range(offspring.shape[0]):
             mutation_indices = numpy.array(random.sample(range(0, self.num_genes), self.mutation_num_genes))
             for gene_idx in mutation_indices:
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 # Generating a random value.
-                random_value = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                    high=self.random_mutation_max_val,
+                random_value = numpy.random.uniform(low=range_min,
+                                                    high=range_max,
                                                     size=1)[0]
                 # If the mutation_by_replacement attribute is True, then the random value replaces the current gene value.
                 if self.mutation_by_replacement:
@@ -293,8 +316,8 @@ def mutation_randomly(self, offspring):

             if self.allow_duplicate_genes == False:
                 offspring[offspring_idx], _, _ = self.solve_duplicate_genes_randomly(solution=offspring[offspring_idx],
-                                                                                     min_val=self.random_mutation_min_val,
-                                                                                     max_val=self.random_mutation_max_val,
+                                                                                     min_val=range_min,
+                                                                                     max_val=range_max,
                                                                                      mutation_by_replacement=self.mutation_by_replacement,
                                                                                      gene_type=self.gene_type,
                                                                                      num_trials=10)
@@ -314,10 +337,18 @@ def mutation_probs_randomly(self, offspring):
         for offspring_idx in range(offspring.shape[0]):
             probs = numpy.random.random(size=offspring.shape[1])
             for gene_idx in range(offspring.shape[1]):
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if probs[gene_idx] <= self.mutation_probability:
                     # Generating a random value.
-                    random_value = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                        high=self.random_mutation_max_val,
+                    random_value = numpy.random.uniform(low=range_min,
+                                                        high=range_max,
                                                         size=1)[0]
                     # If the mutation_by_replacement attribute is True, then the random value replaces the current gene value.
                     if self.mutation_by_replacement:
@@ -348,8 +379,8 @@ def mutation_probs_randomly(self, offspring):

             if self.allow_duplicate_genes == False:
                 offspring[offspring_idx], _, _ = self.solve_duplicate_genes_randomly(solution=offspring[offspring_idx],
-                                                                                     min_val=self.random_mutation_min_val,
-                                                                                     max_val=self.random_mutation_max_val,
+                                                                                     min_val=range_min,
+                                                                                     max_val=range_max,
                                                                                      mutation_by_replacement=self.mutation_by_replacement,
                                                                                      gene_type=self.gene_type,
                                                                                      num_trials=10)
@@ -532,6 +563,13 @@ def adaptive_mutation_by_space(self, offspring):
             mutation_indices = numpy.array(random.sample(range(0, self.num_genes), adaptive_mutation_num_genes))
             for gene_idx in mutation_indices:
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if self.gene_space_nested:
                     # Returning the current gene space from the 'gene_space' attribute.
                     if type(self.gene_space[gene_idx]) in [numpy.ndarray, list]:
@@ -544,8 +582,8 @@ def adaptive_mutation_by_space(self, offspring):
                         value_from_space = curr_gene_space
                 # If the gene space is None, apply mutation by adding a random value between the range defined by the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val'.
                 elif curr_gene_space is None:
-                    rand_val = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                    high=self.random_mutation_max_val,
+                    rand_val = numpy.random.uniform(low=range_min,
+                                                    high=range_max,
                                                     size=1)[0]
                     if self.mutation_by_replacement:
                         value_from_space = rand_val
@@ -600,8 +638,8 @@ def adaptive_mutation_by_space(self, offspring):

                 if value_from_space is None:
-                    value_from_space = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                            high=self.random_mutation_max_val,
+                    value_from_space = numpy.random.uniform(low=range_min,
+                                                            high=range_max,
                                                             size=1)[0]

                 # Assinging the selected value from the space to the gene.
@@ -646,9 +684,17 @@ def adaptive_mutation_randomly(self, offspring):
                 adaptive_mutation_num_genes = self.mutation_num_genes[1]
             mutation_indices = numpy.array(random.sample(range(0, self.num_genes), adaptive_mutation_num_genes))
             for gene_idx in mutation_indices:
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 # Generating a random value.
-                random_value = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                    high=self.random_mutation_max_val,
+                random_value = numpy.random.uniform(low=range_min,
+                                                    high=range_max,
                                                     size=1)[0]
                 # If the mutation_by_replacement attribute is True, then the random value replaces the current gene value.
                 if self.mutation_by_replacement:
@@ -678,8 +724,8 @@ def adaptive_mutation_randomly(self, offspring):

             if self.allow_duplicate_genes == False:
                 offspring[offspring_idx], _, _ = self.solve_duplicate_genes_randomly(solution=offspring[offspring_idx],
-                                                                                     min_val=self.random_mutation_min_val,
-                                                                                     max_val=self.random_mutation_max_val,
+                                                                                     min_val=range_min,
+                                                                                     max_val=range_max,
                                                                                      mutation_by_replacement=self.mutation_by_replacement,
                                                                                      gene_type=self.gene_type,
                                                                                      num_trials=10)
@@ -710,6 +756,14 @@ def adaptive_mutation_probs_by_space(self, offspring):

             probs = numpy.random.random(size=offspring.shape[1])
             for gene_idx in range(offspring.shape[1]):
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if probs[gene_idx] <= adaptive_mutation_probability:
                     if self.gene_space_nested:
                         # Returning the current gene space from the 'gene_space' attribute.
@@ -723,8 +777,8 @@ def adaptive_mutation_probs_by_space(self, offspring):
                             value_from_space = curr_gene_space
                     # If the gene space is None, apply mutation by adding a random value between the range defined by the 2 parameters 'random_mutation_min_val' and 'random_mutation_max_val'.
                     elif curr_gene_space is None:
-                        rand_val = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                        high=self.random_mutation_max_val,
+                        rand_val = numpy.random.uniform(low=range_min,
+                                                        high=range_max,
                                                         size=1)[0]
                         if self.mutation_by_replacement:
                             value_from_space = rand_val
@@ -778,8 +832,8 @@ def adaptive_mutation_probs_by_space(self, offspring):
                         value_from_space = random.choice(values_to_select_from)

                 if value_from_space is None:
-                    value_from_space = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                            high=self.random_mutation_max_val,
+                    value_from_space = numpy.random.uniform(low=range_min,
+                                                            high=range_max,
                                                             size=1)[0]

                 # Assinging the selected value from the space to the gene.
@@ -825,10 +879,18 @@ def adaptive_mutation_probs_randomly(self, offspring):

             probs = numpy.random.random(size=offspring.shape[1])
             for gene_idx in range(offspring.shape[1]):
+
+                if type(self.random_mutation_min_val) in self.supported_int_float_types:
+                    range_min = self.random_mutation_min_val
+                    range_max = self.random_mutation_max_val
+                else:
+                    range_min = self.random_mutation_min_val[gene_idx]
+                    range_max = self.random_mutation_max_val[gene_idx]
+
                 if probs[gene_idx] <= adaptive_mutation_probability:
                     # Generating a random value.
-                    random_value = numpy.random.uniform(low=self.random_mutation_min_val,
-                                                        high=self.random_mutation_max_val,
+                    random_value = numpy.random.uniform(low=range_min,
+                                                        high=range_max,
                                                         size=1)[0]
                     # If the mutation_by_replacement attribute is True, then the random value replaces the current gene value.
                     if self.mutation_by_replacement:
@@ -858,8 +920,8 @@ def adaptive_mutation_probs_randomly(self, offspring):

             if self.allow_duplicate_genes == False:
                 offspring[offspring_idx], _, _ = self.solve_duplicate_genes_randomly(solution=offspring[offspring_idx],
-                                                                                     min_val=self.random_mutation_min_val,
-                                                                                     max_val=self.random_mutation_max_val,
+                                                                                     min_val=range_min,
+                                                                                     max_val=range_max,
                                                                                      mutation_by_replacement=self.mutation_by_replacement,
                                                                                      gene_type=self.gene_type,
                                                                                      num_trials=10)
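Every mutation variant above now resolves ``range_min``/``range_max`` per gene, so ``random_mutation_min_val``/``random_mutation_max_val`` may be lists as well as single numbers. A minimal sketch under the patched behavior; the fitness function and bounds are arbitrary:

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        return sum(solution)

    # One mutation interval per gene: for example, gene 2 only ever
    # mutates to a value drawn from [5, 10).
    ga_instance = pygad.GA(num_generations=50,
                           num_parents_mating=2,
                           sol_per_pop=8,
                           num_genes=3,
                           random_mutation_min_val=[-1, 0, 5],
                           random_mutation_max_val=[1, 2, 10],
                           mutation_by_replacement=True,
                           fitness_func=fitness_func)
    ga_instance.run()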
diff --git a/tests/test_crossover_mutation.py b/tests/test_crossover_mutation.py
index f65a795..757d912 100644
--- a/tests/test_crossover_mutation.py
+++ b/tests/test_crossover_mutation.py
@@ -154,6 +154,82 @@ def test_zero_crossover_probability_zero_mutation_probability():

     assert result == True

+def test_random_mutation_manual_call():
+    result, ga_instance = output_crossover_mutation(mutation_type="random",
+                                                    random_mutation_min_val=888,
+                                                    random_mutation_max_val=999)
+    ga_instance.mutation_num_genes = 9
+
+    temp_offspring = numpy.array(initial_population[0:1])
+    offspring = ga_instance.random_mutation(offspring=temp_offspring.copy())
+
+    comp = offspring - temp_offspring
+    comp_sorted = sorted(comp.copy())
+    comp_sorted = numpy.abs(numpy.unique(comp_sorted))
+
+    # The extra 1 is added so the range includes its last value.
+    assert len(comp_sorted) in range(1, 1 + 1 + ga_instance.mutation_num_genes)
+    assert comp_sorted[0] == 0
+
+def test_random_mutation_manual_call2():
+    result, ga_instance = output_crossover_mutation(mutation_type="random",
+                                                    random_mutation_min_val=888,
+                                                    random_mutation_max_val=999)
+    ga_instance.mutation_num_genes = 10
+
+    temp_offspring = numpy.array(initial_population[0:1])
+    offspring = ga_instance.random_mutation(offspring=temp_offspring.copy())
+
+    comp = offspring - temp_offspring
+    comp_sorted = sorted(comp.copy())
+    comp_sorted = numpy.abs(numpy.unique(comp_sorted))
+
+    # The extra 1 is added so the range includes its last value.
+    assert len(comp_sorted) in range(1, 1 + 1 + ga_instance.mutation_num_genes)
+    # assert comp_sorted[0] == 0
+
+def test_random_mutation_manual_call3():
+    # Use random_mutation_min_val & random_mutation_max_val as numbers.
+    random_mutation_min_val = 888
+    random_mutation_max_val = 999
+    result, ga_instance = output_crossover_mutation(mutation_type="random",
+                                                    random_mutation_min_val=random_mutation_min_val,
+                                                    random_mutation_max_val=random_mutation_max_val,
+                                                    mutation_by_replacement=True)
+    ga_instance.mutation_num_genes = 10
+
+    temp_offspring = numpy.array(initial_population[0:1])
+    offspring = ga_instance.random_mutation(offspring=temp_offspring.copy())
+
+    comp = offspring
+    comp_sorted = sorted(comp.copy())
+    comp_sorted = numpy.abs(numpy.unique(comp))
+
+    value_space = list(range(random_mutation_min_val, random_mutation_max_val))
+    for value in comp_sorted:
+        assert value in value_space
+
+def test_random_mutation_manual_call4():
+    # Use random_mutation_min_val & random_mutation_max_val as lists.
+    random_mutation_min_val = [888]*10
+    random_mutation_max_val = [999]*10
+    result, ga_instance = output_crossover_mutation(mutation_type="random",
+                                                    random_mutation_min_val=random_mutation_min_val,
+                                                    random_mutation_max_val=random_mutation_max_val,
+                                                    mutation_by_replacement=True)
+    ga_instance.mutation_num_genes = 10
+
+    temp_offspring = numpy.array(initial_population[0:1])
+    offspring = ga_instance.random_mutation(offspring=temp_offspring.copy())
+
+    comp = offspring
+    comp_sorted = sorted(comp.copy())
+    comp_sorted = numpy.abs(numpy.unique(comp))
+
+    value_space = list(range(random_mutation_min_val[0], random_mutation_max_val[0]))
+    for value in comp_sorted:
+        assert value in value_space
+
 if __name__ == "__main__":
     print()
     test_no_crossover_no_mutation()
@@ -186,3 +262,15 @@ def test_zero_crossover_probability_zero_mutation_probability():
     test_zero_crossover_probability_zero_mutation_probability()
     print()

+    test_random_mutation_manual_call()
+    print()
+
+    test_random_mutation_manual_call2()
+    print()
+
+    test_random_mutation_manual_call3()
+    print()
+
+    test_random_mutation_manual_call4()
+    print()
+
diff --git a/tests/test_gene_space.py b/tests/test_gene_space.py
index 7a375f2..3a1b21f 100644
--- a/tests/test_gene_space.py
+++ b/tests/test_gene_space.py
@@ -65,12 +65,37 @@ def fitness_func(ga, solution, idx):
     num_outside = 0
     if ga_instance.gene_space_nested == True:
         for gene_idx in range(ga_instance.num_genes):
+
+            if type(ga_instance.init_range_low) in ga_instance.supported_int_float_types:
+                range_min_init = ga_instance.init_range_low
+                range_max_init = ga_instance.init_range_high
+            else:
+                range_min_init = ga_instance.init_range_low[gene_idx]
+                range_max_init = ga_instance.init_range_high[gene_idx]
+            if type(ga_instance.random_mutation_min_val) in ga_instance.supported_int_float_types:
+                range_min_mutation = ga_instance.random_mutation_min_val
+                range_max_mutation = ga_instance.random_mutation_max_val
+            else:
+                range_min_mutation = ga_instance.random_mutation_min_val[gene_idx]
+                range_max_mutation = ga_instance.random_mutation_max_val[gene_idx]
+
             all_gene_values = ga_instance.solutions[:, gene_idx]
             if type(ga_instance.gene_space[gene_idx]) in [list, tuple, range, numpy.ndarray]:
                 current_gene_space = list(ga_instance.gene_space[gene_idx])
-                for val in all_gene_values:
-                    if val in current_gene_space:
-                        # print(val, current_gene_space)
+                # print("current_gene_space", current_gene_space)
+                for val_idx, val in enumerate(all_gene_values):
+                    if None in current_gene_space:
+                        if (val in current_gene_space) or (val >= range_min_init and val < range_max_init) or (val >= range_min_mutation and val < range_max_mutation):
+                            pass
+                        else:
+                            # print("###########")
+                            # print(gene_idx, val)
+                            # print(current_gene_space)
+                            # print(range_min_mutation, range_max_mutation)
+                            # print("\n\n")
+                            num_outside += 1
+                    elif val in current_gene_space:
+                        # print("val, current_gene_space", val, current_gene_space)
                         pass
                     else:
                         # print(gene_idx, val, current_gene_space)
@@ -98,14 +123,51 @@ def fitness_func(ga, solution, idx):
                         pass
                     else:
                         num_outside += 1
+            elif ga_instance.gene_space[gene_idx] is None:
+                for val in all_gene_values:
+                    # print(val)
+                    if (val >= range_min_init and val < range_max_init) or (val >= range_min_mutation and val < range_max_mutation):
+                        pass
+                    else:
+                        # print("###########")
+                        # print(gene_idx, val)
+                        # print(ga_instance.gene_space[gene_idx])
+                        # print(range_min_init, range_max_init)
+                        # print(range_min_mutation, range_max_mutation)
+                        # print("\n\n")
+                        num_outside += 1
     else:
         for gene_idx in range(ga_instance.num_genes):
+
+            if type(ga_instance.init_range_low) in ga_instance.supported_int_float_types:
+                range_min_init = ga_instance.init_range_low
+                range_max_init = ga_instance.init_range_high
+            else:
+                range_min_init = ga_instance.init_range_low[gene_idx]
+                range_max_init = ga_instance.init_range_high[gene_idx]
+            if type(ga_instance.random_mutation_min_val) in ga_instance.supported_int_float_types:
+                range_min_mutation = ga_instance.random_mutation_min_val
+                range_max_mutation = ga_instance.random_mutation_max_val
+            else:
+                range_min_mutation = ga_instance.random_mutation_min_val[gene_idx]
+                range_max_mutation = ga_instance.random_mutation_max_val[gene_idx]
+
             all_gene_values = ga_instance.solutions[:, gene_idx]
             # print("all_gene_values", gene_idx, all_gene_values)
             if type(ga_instance.gene_space) in [list, tuple, range, numpy.ndarray]:
                 current_gene_space = list(ga_instance.gene_space)
                 for val in all_gene_values:
-                    if val in current_gene_space:
+                    if None in current_gene_space:
+                        if (val in current_gene_space) or (val >= range_min_init and val < range_max_init) or (val >= range_min_mutation and val < range_max_mutation):
+                            pass
+                        else:
+                            # print("###########")
+                            # print(gene_idx, val)
+                            # print(current_gene_space)
+                            # print(range_min_mutation, range_max_mutation)
+                            # print("\n\n")
+                            num_outside += 1
+                    elif val in current_gene_space:
                         pass
                     else:
                         num_outside += 1
@@ -144,6 +206,11 @@ def test_gene_space_list():

     assert num_outside == 0

+def test_gene_space_list_None():
+    num_outside, _ = number_respect_gene_space(gene_space=[30, None, 40, 50, None, 60, 70, None, None, None])
+
+    assert num_outside == 0
+
 def test_gene_space_numpy():
     num_outside, _ = number_respect_gene_space(gene_space=numpy.array(list(range(10))))

@@ -318,6 +385,38 @@ def test_nested_gene_space_list2():

     assert num_outside == 0

+def test_nested_gene_space_list3_None():
+    num_outside, ga_instance = number_respect_gene_space(gene_space=[[0, None],
+                                                                     [1, 2],
+                                                                     [2, None],
+                                                                     [3, 4],
+                                                                     [None, 5],
+                                                                     None,
+                                                                     [None, 7],
+                                                                     [None, None],
+                                                                     [8, 9],
+                                                                     None],
+                                                         mutation_by_replacement=True)
+
+    assert num_outside == 0
+
+def test_nested_gene_space_list4_None_custom_mutation_range():
+    num_outside, ga_instance = number_respect_gene_space(gene_space=[[0, None],
+                                                                     [1, 2],
+                                                                     [2, None],
+                                                                     [3, 4],
+                                                                     [None, 5],
+                                                                     None,
+                                                                     [None, 7],
+                                                                     [None, None],
+                                                                     [8, 9],
+                                                                     None],
+                                                         random_mutation_min_val=20,
+                                                         random_mutation_max_val=40,
+                                                         mutation_by_replacement=True)
+
+    assert num_outside == 0
+
 def test_nested_gene_space_mix():
     num_outside, ga_instance = number_respect_gene_space(gene_space=[[0, 1, 2, 3, 4],
                                                                      numpy.arange(5, 10),
@@ -329,7 +428,8 @@ def test_nested_gene_space_mix():
                                                                      numpy.arange(35, 40),
                                                                      numpy.arange(40, 45),
                                                                      [45, 46, 47, 48, 49]],
-                                                         gene_type=int)
+                                                         gene_type=int,
+                                                         mutation_by_replacement=True)

     assert num_outside == 0

@@ -344,7 +444,7 @@ def test_nested_gene_space_mix_nested_gene_type():
                                                                      numpy.arange(35, 40),
                                                                      numpy.arange(40, 45),
                                                                      [45, 46, 47, 48, 49]],
-                                                         gene_type=[int, float, numpy.float64, [float, 3], [float, 4], numpy.int16, [numpy.float32, 1], int, float, [float, 3]])
+                                                         gene_type=[int, float, numpy.float64, [float, 3], int, numpy.int16, [numpy.float32, 1], int, float, [float, 3]])
     # print(ga_instance.population)

     assert num_outside == 0
@@ -434,6 +534,8 @@ def test_nested_gene_space_nested_gene_type_adaptive_mutation():

     test_gene_space_list()
     print()
+    test_gene_space_list_None()
+    print()

     test_gene_space_list_nested_gene_type()
     print()
@@ -478,6 +580,12 @@ def test_nested_gene_space_nested_gene_type_adaptive_mutation():
     test_nested_gene_space_list2()
     print()

+    test_nested_gene_space_list3_None()
+    print()
+
+    test_nested_gene_space_list4_None_custom_mutation_range()
+    print()
+
     test_nested_gene_space_mix()
     print()

diff --git a/tests/test_gene_space_allow_duplicate_genes.py b/tests/test_gene_space_allow_duplicate_genes.py
index 2a65f1a..4d78dea 100644
--- a/tests/test_gene_space_allow_duplicate_genes.py
+++ b/tests/test_gene_space_allow_duplicate_genes.py
@@ -379,39 +379,39 @@ def test_nested_gene_space_mix_initial_population_single_gene_type():
     assert num_outside == 0

 if __name__ == "__main__":
-    # print()
-    # test_gene_space_range()
-    # print()
-    # test_gene_space_range_nested_gene_type()
-    # print()
-
-    # test_gene_space_numpy_arange()
-    # print()
-    # test_gene_space_numpy_arange_nested_gene_type()
-    # print()
-
-    # test_gene_space_list()
-    # print()
-    # test_gene_space_list_nested_gene_type()
-    # print()
-
-    # test_gene_space_list_single_value()
-    # print()
-    # test_gene_space_list_single_value_nested_gene_type()
-    # print()
-
-    # test_gene_space_numpy()
-    # print()
-    # test_gene_space_numpy_nested_gene_type()
-    # print()
-
-    # test_gene_space_dict_without_step()
-    # print()
-    # test_gene_space_dict_without_step_nested_gene_type()
-    # print()
-
-    # test_gene_space_dict_with_step()
-    # print()
+    print()
+    test_gene_space_range()
+    print()
+    test_gene_space_range_nested_gene_type()
+    print()
+
+    test_gene_space_numpy_arange()
+    print()
+    test_gene_space_numpy_arange_nested_gene_type()
+    print()
+
+    test_gene_space_list()
+    print()
+    test_gene_space_list_nested_gene_type()
+    print()
+
+    test_gene_space_list_single_value()
+    print()
+    test_gene_space_list_single_value_nested_gene_type()
+    print()
+
+    test_gene_space_numpy()
+    print()
+    test_gene_space_numpy_nested_gene_type()
+    print()
+
+    test_gene_space_dict_without_step()
+    print()
+    test_gene_space_dict_without_step_nested_gene_type()
+    print()
+
+    test_gene_space_dict_with_step()
+    print()
     test_gene_space_dict_with_step_nested_gene_type()
     print()
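The new and re-enabled tests center on ``None`` entries in ``gene_space`` falling back to the initialization and mutation ranges. A minimal end-to-end sketch of that behavior; the fitness function and all parameter values are hypothetical:

.. code:: python

    import pygad

    def fitness_func(ga_instance, solution, solution_idx):
        return sum(solution)

    # None entries (whole genes or single values inside a nested space)
    # are filled from the init range at creation time and from the
    # mutation range during mutation, as exercised by the tests above.
    ga_instance = pygad.GA(num_generations=30,
                           num_parents_mating=2,
                           sol_per_pop=8,
                           num_genes=4,
                           gene_space=[[0, None], None, [3, 4], [None, 7]],
                           random_mutation_min_val=20,
                           random_mutation_max_val=40,
                           mutation_by_replacement=True,
                           fitness_func=fitness_func)
    ga_instance.run()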