Hello.
I am trying to run K-fold cross-validation, but the process keeps dying partway through with a Killed message, as shown below.
gpu_device.cc:1001] 0: N
2019-02-22 11:23:21.115342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21551 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:08:00.0, compute capability: 6.1)
./train.sh: line 6: 10 Killed python3 /stroke/train.py
2019-02-22 23:55:25.407161: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-02-22 23:55:25.666164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:08:00.0
It runs fine locally, but I can't tell why it gets killed inside docker.
The ID is c3c5e1a5-35f5-40c9-8e96-4241baf9ff57.
Could you please look into it?
Thank you.
Created by checksys
The container state shows "OOMKilled": true.
It looks like the process exceeded its main memory limit (host RAM, not GPU memory).
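
To narrow down where the memory goes, it may help to log the peak memory of the training process after each fold and compare it with the container's limit. Below is a minimal sketch using only the Python standard library. It is not taken from the original train.py: `run_cv`, `kfold_indices`, and the `train_one_fold` callback are hypothetical names used for illustration, and the cgroup paths are the usual Linux locations, which may differ on your setup.

```python
import gc
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def container_memory_limit_mb():
    """Container memory limit in MiB, or None if no numeric limit is found."""
    candidates = (
        "/sys/fs/cgroup/memory.max",                    # cgroup v2
        "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
    )
    for path in candidates:
        try:
            with open(path) as f:
                raw = f.read().strip()
        except OSError:
            continue
        # cgroup v2 reports "max" when unlimited; cgroup v1 reports a
        # very large number instead, which is returned as-is.
        if raw.isdigit():
            return int(raw) / (1024 * 1024)
    return None


def kfold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def run_cv(n_samples, k, train_one_fold):
    """Run train_one_fold (hypothetical callback) per fold and log memory."""
    limit = container_memory_limit_mb()
    scores = []
    for fold, (train_idx, val_idx) in enumerate(kfold_indices(n_samples, k)):
        scores.append(train_one_fold(train_idx, val_idx))
        # Release objects left over from this fold before starting the next.
        gc.collect()
        print(f"fold {fold}: peak RSS {peak_rss_mb():.0f} MiB, limit {limit} MiB")
    return scores
```

If the peak keeps climbing from fold to fold, something from earlier folds (datasets, models, graphs) is probably still being held. With Keras models, calling `tf.keras.backend.clear_session()` between folds is one commonly used way to release that state; whether it applies here depends on how train.py builds its model.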