job-gpu-5c00aa091afd942d0d0ee1d3.log @6135863

2018-11-30T03:10:34.743220274Z SYSTEM: Preparing env...
2018-11-30T03:10:35.306998732Z SYSTEM: Running...
2018-11-30T03:10:40.29264211Z Writing to /home/jovyan/work/results/tb_results/tensorboard_log/1543547440
2018-11-30T03:10:40.292681573Z 
2018-11-30T03:10:40.566405264Z ============================================================
2018-11-30T03:10:40.566450076Z All final and intermediate outputs will be stored in ./results/output_poem/
2018-11-30T03:10:40.566456622Z ============================================================
2018-11-30T03:10:40.566461037Z 
2018-11-30T03:10:40.569689958Z 11:10:40 INFO:args are:
2018-11-30T03:10:40.569718805Z Namespace(batch_size=16, best_model='', best_valid_ppl=inf, cell_type='lstm', data_path='./datasets/yangsaisai-poetrydatasets-0_0_1/', debug=False, dropout=0.0, embedding_size=128, encoding='utf-8', hidden_size=128, init_dir='', init_model='', input_dropout=0.0, learning_rate=0.005, max_grad_norm=5.0, num_epochs=8, num_layers=2, num_unrollings=64, output_dir='./results/output_poem', progress_freq=100, save_best_model='./results/output_poem/best_model/model', save_model='./results/output_poem/save_model/model', tb_log_dir='/home/jovyan/work/results/tb_results/tensorboard_log/1543547440', test=False, train_frac=0.9, valid_frac=0.05, verbose=0)
2018-11-30T03:10:40.569730565Z 11:10:40 INFO:Parameters are:
2018-11-30T03:10:40.569735401Z {
2018-11-30T03:10:40.569748051Z     "batch_size": 16,
2018-11-30T03:10:40.569753585Z     "cell_type": "lstm",
2018-11-30T03:10:40.569809768Z     "dropout": 0.0,
2018-11-30T03:10:40.569814986Z     "embedding_size": 128,
2018-11-30T03:10:40.569819543Z     "hidden_size": 128,
2018-11-30T03:10:40.569823986Z     "input_dropout": 0.0,
2018-11-30T03:10:40.569828419Z     "learning_rate": 0.005,
2018-11-30T03:10:40.569832716Z     "max_grad_norm": 5.0,
2018-11-30T03:10:40.569837163Z     "num_layers": 2,
2018-11-30T03:10:40.569841581Z     "num_unrollings": 64
2018-11-30T03:10:40.569846028Z }
2018-11-30T03:10:40.569850169Z 
2018-11-30T03:10:40.772924673Z tensor_file:./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:40.77330874Z Loading dataset from ./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:41.189893627Z file maxSeqLen = 64
2018-11-30T03:10:41.192855107Z Loaded ./datasets/yangsaisai-poetrydatasets-0_0_1/:  training  samples:65235 ,validationSamples:3837,testingSamples:7676
2018-11-30T03:10:42.059726048Z tensor_file:./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:42.060155614Z Loading dataset from ./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:42.696564323Z file maxSeqLen = 64
2018-11-30T03:10:42.696616762Z Loaded ./datasets/yangsaisai-poetrydatasets-0_0_1/:  training  samples:65235 ,validationSamples:3837,testingSamples:7676
2018-11-30T03:10:42.795364938Z tensor_file:./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:42.795388552Z Loading dataset from ./datasets/yangsaisai-poetrydatasets-0_0_1/poem_ids.txt
2018-11-30T03:10:43.616568931Z file maxSeqLen = 64
2018-11-30T03:10:43.616608912Z Loaded ./datasets/yangsaisai-poetrydatasets-0_0_1/:  training  samples:65235 ,validationSamples:3837,testingSamples:7676
2018-11-30T03:10:43.651584221Z 11:10:43 INFO:Creating graph
2018-11-30T03:11:00.143061766Z 11:11:00 INFO:Start training
2018-11-30T03:11:00.143107049Z 
2018-11-30T03:11:00.144052163Z 2018-11-30 11:11:00.143303: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-11-30T03:11:00.380932571Z 2018-11-30 11:11:00.379799: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-30T03:11:00.380987034Z 2018-11-30 11:11:00.380696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
2018-11-30T03:11:00.380997311Z name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
2018-11-30T03:11:00.381003247Z pciBusID: 0000:00:07.0
2018-11-30T03:11:00.381008171Z totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-11-30T03:11:00.381013071Z 2018-11-30 11:11:00.380730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-11-30T03:11:02.841998771Z 2018-11-30 11:11:02.841332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-30T03:11:02.842049974Z 2018-11-30 11:11:02.841366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2018-11-30T03:11:02.842057724Z 2018-11-30 11:11:02.841373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2018-11-30T03:11:02.84220613Z 2018-11-30 11:11:02.841810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15129 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0)
2018-11-30T03:11:09.973909133Z 11:11:09 INFO:=================== Epoch 0 ===================
2018-11-30T03:11:09.973936137Z 
2018-11-30T03:11:09.973942317Z 11:11:09 INFO:Training on training set
2018-11-30T03:11:31.524400055Z 11:11:31 INFO:2.5%, step:99, perplexity: 611.084, speed: 4794 words