# Keras and the Last Number Problem

Let's see if Keras can do better on the last number problem than our simple single-hidden-layer neural network.

In [3]:
import numpy as np
import keras
from keras.utils import np_utils

2023-05-02 08:24:38.881568: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-05-02 08:24:38.910368: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-05-02 08:24:38.910877: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


We'll use the same data class as before:

In [4]:
class ModelDataCategorical:
    """this is the model data for our "last number" training set.  We
    produce input of length N, consisting of numbers 0-9 and store
    the result in a 10-element array as categorical data.

    """
    def __init__(self, N=10):
        self.N = N
        
        # our model input data
        self.x = np.random.randint(0, high=10, size=N)
        self.x_scaled = self.x / 10 + 0.05
        
        # our scaled model output data
        self.y = np.array([self.x[-1]])
        self.y_scaled = np.zeros(10) + 0.01
        self.y_scaled[self.x[-1]] = 0.99
        
    def interpret_result(self, out):
        """take the network output and return the number we predict"""
        return np.argmax(out)
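
To make the scaling concrete, here is a minimal sketch of what the class computes for one sample. The fixed input `x` is an illustrative choice (the class itself draws it at random), so treat the numbers as an example rather than as output from the original run.

```python
import numpy as np

# illustrative input sequence; ModelDataCategorical draws this at random
x = np.array([1, 3, 3, 2, 2, 5, 7, 7, 9, 1])

# digits 0-9 map to 0.05, 0.15, ..., 0.95, so no input is exactly 0 or 1
x_scaled = x / 10 + 0.05

# target: 0.99 in the slot of the last digit, 0.01 everywhere else
y_scaled = np.zeros(10) + 0.01
y_scaled[x[-1]] = 0.99

# interpret_result() is just an argmax over the 10 output slots
predicted = int(np.argmax(y_scaled))
print(x_scaled)
print(predicted)
```

The argmax recovers the last digit of `x`, which is exactly what `interpret_result()` does with the network output.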

For Keras, we need to pack the data (both input and output) into arrays.  We'll use
the Keras `np_utils.to_categorical()` to one-hot encode the integer labels `m.y`, so the
class's own `y_scaled` is not needed here.
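
As a rough illustration of what one-hot encoding does, here is a NumPy-only sketch. The helper `one_hot` is a hypothetical stand-in written for this example, not the Keras implementation, and the three labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical NumPy stand-in for one-hot encoding integer labels."""
    # flatten so that a list of 1-element arrays becomes a 1-d label vector
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # put a 1 in the column given by each label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# made-up labels, one 1-element array per sample (the shape of `m.y`)
labels = [np.array([1]), np.array([7]), np.array([0])]
encoded = one_hot(labels, 10)
print(encoded.shape)
print(encoded[0])
```

Each row has a single 1 in the column of the class label and 0 elsewhere, which is the form the categorical cross-entropy loss expects.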

Let's make both a training set and a test set:

In [6]:
x_train = []
y_train = []
for _ in range(10000):
    m = ModelDataCategorical()
    x_train.append(m.x_scaled)
    y_train.append(m.y)

x_train = np.asarray(x_train)
y_train = np_utils.to_categorical(y_train, 10)

In [7]:
x_test = []
y_test = []
for _ in range(1000):
    m = ModelDataCategorical()
    x_test.append(m.x_scaled)
    y_test.append(m.y)

x_test = np.asarray(x_test)
y_test = np_utils.to_categorical(y_test, 10)

Check to make sure the data looks like we expect:

In [10]:
x_train[0]

array([0.15, 0.35, 0.35, 0.25, 0.25, 0.55, 0.75, 0.75, 0.95, 0.15])

In [11]:
y_train[0]

array([0., 1., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)

Now let's build our network.  We'll use just a single hidden layer,
but instead of the sigmoid used before, we'll use the ReLU activation for the hidden layer
and the softmax activation for the output layer.
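
For reference, the two activations can be sketched in a few lines of NumPy. These are the standard textbook definitions written out for illustration, not the Keras internals; the input `z` is a made-up example.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: pass positive values, zero out negatives."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    # subtracting the max does not change the result but avoids overflow in exp
    shifted = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

z = np.array([-1.0, 0.0, 2.0])  # made-up scores
print(relu(z))
print(softmax(z))
```

Because softmax outputs are non-negative and sum to 1, the 10 output values can be read as the probability of each digit.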

In [12]:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from tensorflow.keras.optimizers import RMSprop

In [13]:
model = Sequential()
model.add(Dense(100, input_dim=10, activation="relu"))
model.add(Dropout(0.1))
model.add(Dense(10, activation="softmax"))

In [14]:
rms = RMSprop()
model.compile(loss='categorical_crossentropy',
              optimizer=rms, metrics=['accuracy'])
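
To give a feel for the loss being minimized, here is a minimal NumPy sketch of categorical cross-entropy averaged over a batch. The function and the two example predictions are illustrative assumptions; Keras computes the loss internally and may differ in details such as clipping.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum_k y_true[k] * log(y_pred[k])."""
    # clip to avoid log(0); the exact eps is an assumption of this sketch
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# one sample whose true class is 1
y_true = np.zeros((1, 10))
y_true[0, 1] = 1.0

# an uninformed prediction: every digit equally likely
uniform = np.full((1, 10), 0.1)

# a confident, correct prediction
confident = np.full((1, 10), 0.01)
confident[0, 1] = 0.91

loss_uniform = categorical_crossentropy(y_true, uniform)
loss_confident = categorical_crossentropy(y_true, confident)
print(loss_uniform, loss_confident)
```

The uniform guess gives a loss of ln(10), and the loss shrinks toward 0 as the probability assigned to the correct digit approaches 1.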

Now we can train, evaluating on the test set after each epoch, to see how we do:

In [13]:
epochs = 100
batch_size = 256
model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
          validation_data=(x_test, y_test), verbose=2)

Epoch 1/100
40/40 - 0s - loss: 1.0717 - accuracy: 0.6682 - val_loss: 1.0455 - val_accuracy: 0.7210 - 72ms/epoch - 2ms/step
Epoch 2/100
40/40 - 0s - loss: 1.0422 - accuracy: 0.6832 - val_loss: 1.0176 - val_accuracy: 0.7890 - 50ms/epoch - 1ms/step
Epoch 3/100
40/40 - 0s - loss: 1.0140 - accuracy: 0.7036 - val_loss: 0.9844 - val_accuracy: 0.7540 - 51ms/epoch - 1ms/step
Epoch 4/100
40/40 - 0s - loss: 0.9882 - accuracy: 0.7129 - val_loss: 0.9576 - val_accuracy: 0.8150 - 50ms/epoch - 1ms/step
Epoch 5/100
40/40 - 0s - loss: 0.9619 - accuracy: 0.7325 - val_loss: 0.9358 - val_accuracy: 0.7930 - 50ms/epoch - 1ms/step
Epoch 6/100
40/40 - 0s - loss: 0.9370 - accuracy: 0.7417 - val_loss: 0.9178 - val_accuracy: 0.7960 - 51ms/epoch - 1ms/step
Epoch 7/100
40/40 - 0s - loss: 0.9147 - accuracy: 0.7533 - val_loss: 0.8856 - val_accuracy: 0.8390 - 52ms/epoch - 1ms/step
Epoch 8/100
40/40 - 0s - loss: 0.8883 - accuracy: 0.7750 - val_loss: 0.8759 - val_accuracy: 0.8050 - 50ms/epoch - 1ms/step
Epoch 9/100
...

<keras.callbacks.History at 0x7f63906cd790>
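
The `40/40` at the start of each log line is the number of batches processed per epoch. A quick check of that arithmetic, using the training-set size and batch size from the cells above:

```python
import math

n_train = 10000     # number of training samples generated earlier
batch_size = 256    # batch size passed to model.fit()

# the last batch is smaller than the rest, so round up
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch = n_train - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, last_batch)
```

So each epoch consists of 39 full batches of 256 samples plus one final batch of 16.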

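The validation accuracy climbs steadily in the epochs shown, from 0.72 after the first epoch to roughly 0.8 by epoch 8, and by the end of training the network is essentially perfect.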
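
The accuracy that Keras reports can be reproduced by hand from the predicted probabilities. The sketch below uses small made-up arrays; in the notebook, `probs` would be the result of `model.predict(x_test)` and `y_onehot` would be `y_test`.

```python
import numpy as np

def accuracy(probs, y_onehot):
    """Fraction of samples where the most probable class matches the label."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(y_onehot, axis=1)
    return float(np.mean(predicted == actual))

# made-up stand-ins: 4 samples with true digits 1, 7, 0, 4
y_onehot = np.eye(10)[[1, 7, 0, 4]]

# made-up predictions: the last sample is wrongly assigned to digit 3
probs = np.full((4, 10), 0.01)
probs[np.arange(4), [1, 7, 0, 3]] = 0.91

print(accuracy(probs, y_onehot))
```

Three of the four made-up predictions are correct, so this prints 0.75. Applying the same argmax to a single prediction is what `interpret_result()` does.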