LSTM Inputs for TensorFlow
Solution 1:
An RNN predicts the value at step N+1 given the values from steps 1 to N. (An LSTM is just one way to implement an RNN cell.)
The short answer is:
- train your model using backpropagation on your complete sequences [[x_11, x_12, x_13], ..., [x_m1, x_m2, x_m3]]
- run your trained model forward on the start of your sequence [x_11, x_12, x_13, ...], then sample from the model to predict the rest of your sequence [x_m1, x_m2, x_m3, ...].
The longer answer is:
Your example just shows the initialization of the model. You also need to implement a training function to run back propagation as well as a sample function that predicts the results.
The following code snippets are mix & match and are for illustration purposes only...
For training, just feed in your complete sequences (start + rest) through your data iterator.
For example, in the sample code tensorflow/models/rnn/ptb_word_lm.py, the training loop computes a cost function for batches of input_data against targets (which are the input_data shifted by one timestep):
    # compute a learning rate decay
    session.run(tf.assign(self.learning_rate_variable, learning_rate))
    logger.info("Epoch: %d Learning rate: %.3f" % (i + 1, session.run(self.learning_rate_variable)))

    """Runs the model on the given data."""
    epoch_size = ((len(training_data) // self.batch_size) - 1) // self.num_steps
    costs = 0.0
    iters = 0
    state = self.initial_state.eval()
    for step, (x, y) in enumerate(self.data_iterator(training_data, self.batch_size, self.num_steps)):
        # x and y should have shape [batch_size, num_steps]
        cost, state, _ = session.run([self.cost_function, self.final_state, self.train_op],
                                     {self.input_data: x,
                                      self.targets: y,
                                      self.initial_state: state})
        costs += cost
        iters += self.num_steps
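The training loop above feeds each batch's final_state back in as the next batch's initial_state. The following is a minimal sketch of why that matters, using a toy tanh RNN cell written in plain NumPy rather than TensorFlow. The names rnn_forward, w_in and w_rec are made up for illustration and are not part of the PTB sample code. It shows that processing a sequence in windows of num_steps gives the same hidden states as one long pass, provided the state is carried over.

```python
import numpy as np

def rnn_forward(inputs, state, w_in, w_rec):
    """Run a toy tanh RNN cell over inputs; return all hidden states and the last one."""
    outputs = []
    for x_t in inputs:
        state = np.tanh(w_in * x_t + w_rec @ state)
        outputs.append(state)
    return np.array(outputs), state

rng = np.random.default_rng(0)
hidden = 4
w_in = rng.normal(size=hidden)                    # input weights (illustrative)
w_rec = rng.normal(size=(hidden, hidden)) * 0.5   # recurrent weights (illustrative)
sequence = rng.normal(size=10)
zero_state = np.zeros(hidden)

# One pass over the whole sequence.
full_outputs, full_final = rnn_forward(sequence, zero_state, w_in, w_rec)

# Same sequence in windows of num_steps, carrying the state between windows.
num_steps = 5
state = zero_state
chunks = []
for start in range(0, len(sequence), num_steps):
    window = sequence[start:start + num_steps]
    out, state = rnn_forward(window, state, w_in, w_rec)
    chunks.append(out)
chunked_outputs = np.concatenate(chunks)

print(np.allclose(full_outputs, chunked_outputs))  # prints True
```

If the state were reset to zeros at every window, the second window would no longer see anything from the first, and the two results would differ.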
Note that the data iterator in tensorflow/models/rnn/reader.py returns the input data as 'x' and the targets as 'y', which are just x shifted one step forward. (You would need to create a data iterator like this that packages your set of training sequences.)
    import numpy as np

    def ptb_iterator(raw_data, batch_size, num_steps):
        raw_data = np.array(raw_data, dtype=np.int32)
        data_len = len(raw_data)
        batch_len = data_len // batch_size
        # split the data into batch_size contiguous rows
        data = np.zeros([batch_size, batch_len], dtype=np.int32)
        for i in range(batch_size):
            data[i] = raw_data[batch_len * i:batch_len * (i + 1)]
        # the -1 leaves room for the targets, which are shifted by one
        epoch_size = (batch_len - 1) // num_steps
        if epoch_size == 0:
            raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
        for i in range(epoch_size):
            x = data[:, i*num_steps:(i+1)*num_steps]
            y = data[:, i*num_steps+1:(i+1)*num_steps+1]
            yield (x, y)
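To see the one-step shift concretely, here is a small self-contained sketch. The helper shifted_batches is a hypothetical, compact variant of the iterator above (it uses reshape instead of the fill loop), and the integers 0..19 stand in for a real training sequence.

```python
import numpy as np

def shifted_batches(raw_data, batch_size, num_steps):
    """Yield (x, y) pairs where y is x shifted one step forward (illustrative helper)."""
    raw_data = np.array(raw_data, dtype=np.int32)
    batch_len = len(raw_data) // batch_size
    # Drop any leftover tail, then lay the data out as batch_size contiguous rows.
    data = raw_data[:batch_size * batch_len].reshape(batch_size, batch_len)
    epoch_size = (batch_len - 1) // num_steps
    if epoch_size == 0:
        raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
    for i in range(epoch_size):
        x = data[:, i * num_steps:(i + 1) * num_steps]
        y = data[:, i * num_steps + 1:(i + 1) * num_steps + 1]
        yield x, y

batches = list(shifted_batches(range(20), batch_size=2, num_steps=3))
for x, y in batches:
    print(x.tolist(), "->", y.tolist())
# [[0, 1, 2], [10, 11, 12]] -> [[1, 2, 3], [11, 12, 13]]
# [[3, 4, 5], [13, 14, 15]] -> [[4, 5, 6], [14, 15, 16]]
# [[6, 7, 8], [16, 17, 18]] -> [[7, 8, 9], [17, 18, 19]]
```

Each row of y holds the "next value" for the corresponding position in x, which is exactly what the cost function compares the model's predictions against.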
After training, you run the model forward to make predictions by feeding in the start of your sequence, start_x=[X1, X2, X3,...]. This snippet assumes binary values representing classes; you'd have to adjust the sampling function for float values.
    def sample(self, sess, start_x, num=25):
        # start_x comes before num: a parameter without a default
        # cannot follow one that has a default
        # zero state for batch size 1, evaluated to an array
        state = self.rnn_layers.zero_state(1, tf.float32).eval()

        # run model forward through the start of the sequence
        # (all but the last element, which seeds the prediction loop below)
        for char in start_x[:-1]:
            # create a [1, 1] array set to zero
            x = np.zeros((1, 1))
            # set to the vocab index
            x[0, 0] = char
            # fetch: final_state
            # input_data = x, initial_state = state
            [state] = sess.run([self.final_state], {self.input_data: x, self.initial_state: state})

        def weighted_pick(weights):
            # an array of the cumulative sum of weights
            t = np.cumsum(weights)
            # scalar sum of the weights
            s = np.sum(weights)
            # randomly select an index from the probability distribution
            return int(np.searchsorted(t, np.random.rand() * s))

        # PREDICT REST OF SEQUENCE
        rest_x = []
        # get last character in init
        char = start_x[-1]
        # sample next num chars in the sequence after init
        score = 0.0
        for n in range(num):
            # init input to zeros
            x = np.zeros((1, 1))
            # lookup character index
            x[0, 0] = char
            # probs = tf.nn.softmax(self.logits)
            # fetch: probs, final_state
            # input_data = x, initial_state = state
            [probs, state] = sess.run([self.output_layer, self.final_state], {self.input_data: x, self.initial_state: state})
            p = probs[0]
            logger.info("output=%s" % np.shape(p))
            # alternative: sample = int(np.random.choice(len(p), p=p))
            # select a random value from the probability distribution
            sample = weighted_pick(p)
            score += p[sample]
            # look up the key with the index
            logger.debug("sample[%d]=%d" % (n, sample))
            pred = self.vocabulary[sample]
            logger.debug("pred=%s" % pred)
            # add the char to the output
            rest_x.append(pred)
            # set the next input character
            char = pred
        return rest_x, score
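The weighted_pick helper in the sample function draws an index by inverse-CDF sampling: it builds the cumulative sum of the weights and finds where a uniform random threshold falls. Below is a standalone sketch of that idea, so it can be tried without a trained model. Passing a NumPy Generator as rng, using side="right", and the final clamp are choices made here for illustration and are not in the original snippet.

```python
import numpy as np

def weighted_pick(weights, rng):
    """Draw an index with probability proportional to weights (inverse-CDF sampling)."""
    cumulative = np.cumsum(weights)
    threshold = rng.random() * cumulative[-1]
    # side="right" skips zero-weight entries when the threshold sits on a boundary.
    index = int(np.searchsorted(cumulative, threshold, side="right"))
    # Guard against floating-point rounding pushing the index past the end.
    return min(index, len(weights) - 1)

rng = np.random.default_rng(0)
probs = np.array([0.1, 0.2, 0.7])

draws = [weighted_pick(probs, rng) for _ in range(10000)]
frequencies = np.bincount(draws, minlength=len(probs)) / len(draws)
print(frequencies)  # empirical frequencies, expected to be close to probs

# The alternative mentioned in the comment: choice() requires p to sum to 1.
alternative = int(rng.choice(len(probs), p=probs))
```

Because the threshold is scaled by the total weight, this version also works when the weights are not normalized, whereas the choice() call raises an error if p does not sum to 1.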