
Raising To A Square With Tensorflow With A Dataset Class

I want to write a neural network which looks for an x^2 distribution without a predefined model. Precisely, it is given some points in [-1,1] with their squares to train on, and then it should predict the squares of new points from that interval.

Solution 1:

There are two problems that conspire to give your model poor accuracy, and both involve this line:

sess.run(optimizer, {X: sess.run(ds_x), Y: sess.run(ds_y)})
  1. Only one training step will execute because this code is not in a loop. Your original code ran len(x_train)//rozm_paczki steps, which ought to make more progress.

  2. The two calls to sess.run(ds_x) and sess.run(ds_y) run in separate steps, which means they return values from different, unrelated batches. Each call to sess.run(ds_x) or sess.run(ds_y) moves the Iterator on to the next batch, and discards any parts of the input element that you did not explicitly request in the sess.run() call. Essentially, you will get X from batch i and Y from batch i+1 (or vice versa), and the model will train on invalid data. If you want to get values from the same batch, you need to do it in a single sess.run([ds_x, ds_y]) call, as sketched just after this list.
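
For illustration (a minimal sketch reusing the ds_x / ds_y names from your code), the difference between the two patterns is:

# Two separate calls advance the Iterator twice, so the values come from
# different, unrelated elements:
#   batch_x = sess.run(ds_x)  # element i
#   batch_y = sess.run(ds_y)  # element i+1 -- not the partner of batch_x
# A single call returns both parts of the same element:
batch_x, batch_y = sess.run([ds_x, ds_y])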

There are two further concerns that affect training quality and efficiency:

  3. The Dataset is not shuffled. Your original code calls np.random.shuffle() at the beginning of each epoch, but the Dataset pipeline never does. You should add dataset = dataset.shuffle(len(x_train)) before dataset = dataset.repeat().

  4. It is inefficient to fetch the values from the Iterator back to Python (e.g. when you do sess.run(ds_x)) and then feed them back into the training step. It is more efficient to pass the output of the Iterator.get_next() operation directly into the feed-forward step as its inputs (see the sketch below this list).
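
A rough sketch of the difference (reusing names from the program below; in the feed_dict version, X and Y would be placeholders):

# Inefficient: fetch each batch back into Python, then feed it in again.
batch_x, batch_y = sess.run([ds_x, ds_y])
sess.run(optimizer, feed_dict={X: batch_x, Y: batch_y})

# More efficient: build the network directly on the Iterator's output tensors,
# so each training step consumes the next batch without a round trip to Python.
X, Y = iterator.get_next()
output_layer = forward_pass(X, weights, biases)
sess.run(optimizer)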

Putting this all together, here's a rewritten version of your program that addresses these four points, and achieves the correct results. (Unfortunately my Polish isn't good enough to preserve the comments, so I've translated to English.)

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# Generate training data.
x_train = np.random.rand(10**3, 1).astype(np.float32) * 4 - 2
y_train = x_train ** 2

# Define hyperparameters.
DIMENSIONS = [50,50,50,1]
NUM_EPOCHS = 500
BATCH_SIZE = 200

dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train))
dataset = dataset.shuffle(len(x_train))  # (Point 3.) Shuffle each epoch.
dataset = dataset.repeat(NUM_EPOCHS)
dataset = dataset.batch(BATCH_SIZE)
iterator = dataset.make_one_shot_iterator()

# (Point 2.) Ensure that `X` and `Y` correspond to the same batch of data.
# (Point 4.) Pass the tensors returned from `iterator.get_next()`
# directly as the input of the network.
X, Y = iterator.get_next()

# Initialize variables.
weights = []
biases = []
n_inputs = 1
for i, n_outputs in enumerate(DIMENSIONS):
  with tf.variable_scope("layer_{}".format(i)):
    w = tf.get_variable(name="W", shape=[n_inputs, n_outputs],
                        initializer=tf.random_normal_initializer(
                            mean=0.0, stddev=0.02, seed=42))
    b = tf.get_variable(name="b", shape=[n_outputs],
                        initializer=tf.zeros_initializer)
    weights.append(w)
    biases.append(b)
    n_inputs = n_outputs

def forward_pass(X, weights, biases):
  h = X
  for i in range(len(weights)):
    h = tf.add(tf.matmul(h, weights[i]), biases[i])
    h = tf.nn.relu(h)
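    # Note: ReLU is also applied after the final layer; that is harmless here
    # because the targets x**2 are never negative.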
  return h

output_layer = forward_pass(X, weights, biases)
loss = tf.reduce_sum(tf.reduce_mean(
    tf.squared_difference(output_layer, Y), 1))
optimizer = tf.train.AdamOptimizer(learning_rate=0.003).minimize(loss)
saver = tf.train.Saver()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())

  # (Point 1.) Run the `optimizer` in a loop. Use try-while-except to iterate
  # until all elements in `dataset` have been consumed.
  try:
    while True:
      sess.run(optimizer)
  except tf.errors.OutOfRangeError:
    pass

  save = saver.save(sess, "./model.ckpt")
  print("Model saved to path: %s" % save)

  # Evaluate network.
  x_test = np.linspace(-1, 1, 600)
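  # Feeding a value for `X` here overrides the batch the Iterator would
  # otherwise produce, so the network can be evaluated on arbitrary inputs.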
  network_outputs = sess.run(output_layer, feed_dict={X: x_test.reshape(-1, 1)})

plt.plot(x_test,x_test**2,color='r',label='y=x^2')
plt.plot(x_test,network_outputs,color='b',label='NN prediction')
plt.legend(loc='right')
plt.show()
