python - How to set parameters of the Adadelta algorithm in TensorFlow correctly?


I've been using TensorFlow for regression purposes. My neural net is small, with 10 input neurons, 12 hidden neurons in a single layer, and 5 output neurons.

  • The activation function is ReLU.
  • The cost is the squared distance between the output and the real value.
  • My neural net trains correctly with other optimizers such as GradientDescent, Adam, and Adagrad.

However, when I try to use Adadelta, the neural net simply won't train: the variables stay the same at every step.

I have tried every initial learning_rate possible (from 1.0e-6 to 10) and different weight initializations: it's always the same.

Does anyone have a slight idea of what is going on?

Thanks a lot

Short answer: don't use Adadelta

Very few people use it today; you should instead stick to:

  • tf.train.MomentumOptimizer with 0.9 momentum is standard and works well. The drawback is that you have to find the best learning rate yourself.
  • tf.train.RMSPropOptimizer: the results are less dependent on a good learning rate. The algorithm is very similar to Adadelta, but performs better in my opinion.

If you really want to use Adadelta, use the parameters from the paper: learning_rate=1., rho=0.95, epsilon=1e-6. A bigger epsilon will help at the start, but be prepared to wait a bit longer than with other optimizers to see convergence.

Note that in the paper, they don't even use a learning rate, which is the same as keeping it equal to 1.


long answer

Adadelta has a very slow start. The full algorithm from the paper is:

[algorithm figure from the Adadelta paper]
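The figure above comes from the paper; for reference, the per-parameter update rules it describes (Zeiler, "ADADELTA: An Adaptive Learning Rate Method") are:

```latex
\begin{align}
E[g^2]_t &= \rho\, E[g^2]_{t-1} + (1-\rho)\, g_t^2 \\
\Delta x_t &= -\frac{\sqrt{E[\Delta x^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}}\; g_t \\
E[\Delta x^2]_t &= \rho\, E[\Delta x^2]_{t-1} + (1-\rho)\, \Delta x_t^2 \\
x_{t+1} &= x_t + \Delta x_t
\end{align}
```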

The issue is that they accumulate the square of the updates:

  • At step 0, the running average of these updates is zero, so the first update will be very small.
  • As the first update is very small, the running average of the updates will stay very small at the beginning, which is kind of a vicious circle at the start.
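This slow start can be reproduced with a minimal pure-Python sketch of scalar Adadelta (parameters rho=0.95, epsilon=1e-6 as in the paper; TensorFlow's internal epsilon handling may differ slightly, so the exact digits won't match the table below) on the toy loss v**2:

```python
def adadelta_steps(v=10.0, rho=0.95, eps=1e-6, n=100):
    """Run n scalar Adadelta steps on the loss v**2 and return the final v."""
    accum = 0.0         # running average of squared gradients, E[g^2]
    accum_update = 0.0  # running average of squared updates, E[dx^2]
    for _ in range(n):
        g = 2.0 * v  # gradient of the loss v**2
        accum = rho * accum + (1 - rho) * g * g
        # accum_update starts at zero, so this ratio -- and the update -- is tiny
        update = -((accum_update + eps) ** 0.5 / (accum + eps) ** 0.5) * g
        accum_update = rho * accum_update + (1 - rho) * update * update
        v += update
    return v
```

Even after 100 steps, v has barely moved from 10.0, while plain gradient descent with a well-chosen rate would already be near the minimum: each update is scaled by the (initially zero) running average of past updates.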

I think Adadelta performs better with bigger networks than yours, and after some iterations it should equal the performance of RMSProp or Adam.


Here is some code to play a bit with the Adadelta optimizer:

    import tensorflow as tf

    v = tf.Variable(10.)
    loss = v * v

    optimizer = tf.train.AdadeltaOptimizer(1., 0.95, 1e-6)
    train_op = optimizer.minimize(loss)

    accum = optimizer.get_slot(v, "accum")  # accumulator of the squared gradients
    accum_update = optimizer.get_slot(v, "accum_update")  # accumulator of the squared updates

    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    for i in range(100):
        sess.run(train_op)
        print("%.3f \t %.3f \t %.6f" % tuple(sess.run([v, accum, accum_update])))

The first 10 lines of output are:

    v        accum      accum_update
    9.994    20.000     0.000001
    9.988    38.975     0.000002
    9.983    56.979     0.000003
    9.978    74.061     0.000004
    9.973    90.270     0.000005
    9.968    105.648    0.000006
    9.963    120.237    0.000006
    9.958    134.077    0.000007
    9.953    147.205    0.000008
    9.948    159.658    0.000009
