0. Host information
A : gpucluster.kobic.kr (head node)
B : compute-0-0 (compute node)
A,B # vi /etc/hosts
192.168.1.1 gpucluster.kobic.kr gpucluster
192.168.1.2 compute-0-0.local compute-0-0
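Both hosts need to resolve each other's names before torque is installed. As a minimal sketch, the helper below looks up a name in a hosts-format file and prints its IP address. The function name `hosts_lookup` is hypothetical and not part of torque or the OS.

```shell
# Print the IP address mapped to a host name in a hosts-format file.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
# hosts_lookup is a hypothetical helper name used only for illustration.
hosts_lookup() (
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # drop comments before matching
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; found = 1; exit } }
        END { exit !found }   # non-zero exit status when the name is absent
    ' "$file"
)
```

Running `hosts_lookup compute-0-0` on A and `hosts_lookup gpucluster.kobic.kr` on B should print the addresses entered above. A non-zero exit status means the entry is missing.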
1. Install the GPU drivers, CUDA Toolkit, and SDK on host B
http://ctrlcv.co.cc/entry/Install-NVIDIA-DRIVER-CUDA-Toolkit-SDK-%EC%84%A4%EC%B9%98CentOS-55
2. Install torque on the head node
Download: http://clusterresources.com/downloads/torque
A # cd /usr/local/src
A # wget http://www.adaptivecomputing.com/resources/downloads/torque/torque-3.0.5.tar.gz
A # tar zxvfp torque-3.0.5.tar.gz
A # cd torque-3.0.5
### Compile and install
A # ./configure --with-debug --enable-nvidia-gpus --enable-server --enable-clients --enable-docs --enable-mom --with-default-server=gpucluster.kobic.kr --with-scp && make && make install
### Build the packages
A # make packages
### Basic server setup
A # ./torque.setup gpucluster.kobic.kr
initializing TORQUE (admin: gpucluster.kobic.kr@gpucluster.kobic.kr)
Max open servers: 10239
Max open servers: 10239
### Add nodes
A # vi /var/spool/torque/server_priv/nodes
compute-0-0 np=16
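With more than one compute node, the entries in server_priv/nodes can be generated instead of typed by hand. The sketch below prints one line per node. `node_line` is a hypothetical helper. The optional `gpus=N` attribute assumes the installed torque build accepts it in the nodes file, and the source above uses only `np`.

```shell
# Print one line for server_priv/nodes.
# Usage: node_line NAME NP [GPUS]
# node_line is a hypothetical helper; gpus=N is appended only when a GPU
# count is given (assumption: the torque build accepts the gpus attribute).
node_line() (
    name=$1
    np=$2
    gpus=${3:-0}
    line="$name np=$np"
    if [ "$gpus" -gt 0 ]; then
        line="$line gpus=$gpus"
    fi
    printf '%s\n' "$line"
)
```

For example, `node_line compute-0-0 16` reproduces the entry shown above. Redirect the output into /var/spool/torque/server_priv/nodes, then restart pbs_server.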
### Start the daemons
A # pbs_server
A # pbs_sched
### Restart the daemons
A # killall pbs_server
A # pbs_server
A # killall pbs_sched
A # pbs_sched
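The restart sequence can be wrapped in a small function so it is always run in the same order. This is only a sketch: `run`, `restart_torque`, and the `DRY_RUN` variable are hypothetical names. With `DRY_RUN=1` the commands are printed instead of executed.

```shell
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop the scheduler and the server, then start the server and the scheduler.
restart_torque() {
    run killall pbs_sched
    run killall pbs_server
    run pbs_server
    run pbs_sched
}
```

Running `DRY_RUN=1 restart_torque` first shows what would be executed on A without touching the running daemons.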
### Add to rc.local
A # echo "/usr/local/sbin/pbs_server" >> /etc/rc.d/rc.local
A # echo "/usr/local/sbin/pbs_sched" >> /etc/rc.d/rc.local
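Note that `echo ... >>` adds a duplicate line every time it is re-run. As a hedged alternative, the helper below appends a line only when the file does not already contain it. `append_once` is a hypothetical name.

```shell
# Append LINE to FILE only if FILE does not already contain that exact line.
# Usage: append_once LINE FILE
# append_once is a hypothetical helper name used only for illustration.
append_once() (
    line=$1
    file=$2
    touch "$file"
    # -x: match whole lines, -F: fixed string, -q: no output
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
)
```

For example, `append_once /usr/local/sbin/pbs_server /etc/rc.d/rc.local` can be repeated safely.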
### Check the PBS nodes
A # pbsnodes -a
A # qnodes
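The `pbsnodes -a` output is long, and often only the state of each node matters. The sketch below condenses it to one line per node. It assumes the usual layout: an unindented node name followed by indented `key = value` lines. `node_states` is a hypothetical name.

```shell
# Read `pbsnodes -a` style text on stdin and print "<node> <state>" per node.
# Assumes an unindented node name followed by indented "key = value" lines.
node_states() {
    awk '
        /^[^ \t]/                  { node = $1 }       # unindented line: node name
        $1 == "state" && $2 == "=" { print node, $3 }  # indented "state = ..." line
    '
}
```

Usage on A: `pbsnodes -a | node_states`.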
3. Install torque on the compute node
A # scp *.sh compute-0-0:/usr/local/src/
B # cd /usr/local/src
B # ./torque-package-mom-linux-x86_64.sh --install
B # ./torque-package-clients-linux-x86_64.sh --install
B # ./torque-package-doc-linux-x86_64.sh --install
B # ./torque-package-devel-linux-x86_64.sh --install
B # echo "/usr/local/sbin/pbs_mom" >> /etc/rc.d/rc.local
B # echo "\$pbsserver gpucluster.kobic.kr" > /var/spool/torque/mom_priv/config
B # echo "\$logevent 255" >> /var/spool/torque/mom_priv/config
B # cat /var/spool/torque/server_name
gpucluster.kobic.kr
B # pbs_mom
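pbs_mom can only report to the server if server_name and the `$pbsserver` entry in mom_priv/config point to the same host. As a minimal sketch, the check below compares the two files. `check_mom_server` is a hypothetical name. It takes the torque directory as an optional argument and defaults to /var/spool/torque, the path used above.

```shell
# Compare server_name with the $pbsserver entry in mom_priv/config.
# Usage: check_mom_server [TORQUE_HOME]   (defaults to /var/spool/torque)
# check_mom_server is a hypothetical helper name used only for illustration.
check_mom_server() (
    home=${1:-/var/spool/torque}
    server=$(cat "$home/server_name") || exit 2
    configured=$(awk '$1 == "$pbsserver" { print $2; exit }' "$home/mom_priv/config") || exit 2
    if [ "$server" = "$configured" ]; then
        echo "ok: $server"
    else
        echo "mismatch: server_name=$server, \$pbsserver=$configured" >&2
        exit 1
    fi
)
```

Running `check_mom_server` on B before starting pbs_mom catches a typo in either file early.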