Greenplum Workload Management (gp-wlm)
1) What is Greenplum Workload Management?
■ Reference URLs
- http://gpcc.docs.pivotal.io/300/gp-wlm/topics/gpwlm-docs.html
- http://gpcc.docs.pivotal.io/210/gp-wlm/welcome.html
2) Installing gp-wlm
■ Prerequisites
- Red Hat Enterprise Linux (RHEL) 64-bit 5.5+ or 6, or CentOS 64-bit 5.5+ or 6
- Greenplum Database version 4.3.x
- Pivotal Greenplum Command Center installer
■ Installation file
- Download from Network.pivotal.io
- Download location: Greenplum Command Center
- Installation file: after installing Greenplum Database -- Command Center 3.0.1, the gp-wlm-1.6.0.bin installer is found under /usr/local/greenplum-cc-web.
■ Installing gp-wlm
- The installation only succeeds when the install path is /home/gpadmin, because it runs under the gpadmin account.
- Installing to /usr/local/gp-wlm fails with an error during installation.
$ su - gpadmin
$ cd /usr/local/greenplum-cc-web
$ chmod +x gp-wlm-1.6.0.bin
$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/
## If a reinstall is needed
$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/ --force
## Source gp-wlm_path.sh from the shell profile
$ vi ~/.bash_profile
. /home/gpadmin/gp-wlm/gp-wlm_path.sh
## To uninstall
$ /home/gpadmin/gp-wlm/bin/uninstall --symlink /home/gpadmin/gp-wlm
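The profile step above can be scripted so that repeated installs do not add the same line twice. The following is a minimal sketch, not part of gp-wlm: the function name ensure_profile_line is hypothetical, and the demo writes to a temporary file instead of the real ~/.bash_profile.

```shell
# Append a line to a profile file only if the exact line is not there yet.
# ensure_profile_line is a hypothetical helper, not a gp-wlm command.
ensure_profile_line() {
    local profile="$1"
    local line="$2"
    touch "$profile"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -- "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Demo against a temporary file; calling twice still leaves one line.
demo_profile="$(mktemp)"
ensure_profile_line "$demo_profile" ". /home/gpadmin/gp-wlm/gp-wlm_path.sh"
ensure_profile_line "$demo_profile" ". /home/gpadmin/gp-wlm/gp-wlm_path.sh"
cat "$demo_profile"
```

To use it for real, pass "$HOME/.bash_profile" as the first argument while logged in as gpadmin.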
3) gp-wlm services
■ Path to the gp-wlm service management utility
- /home/gpadmin/gp-wlm/bin/svc-mgr.sh
- $ svc-mgr.sh --help
■ gp-wlm service commands
Action          | Command
gp-wlm start    | ./svc-mgr.sh --service=all --action=cluster-start
gp-wlm stop     | ./svc-mgr.sh --service=all --action=cluster-stop
gp-wlm status   | ./svc-mgr.sh --service=all --action=cluster-status
gp-wlm restart  | ./svc-mgr.sh --service=all --action=cluster-restart
gp-wlm enable   | ./svc-mgr.sh --service=all --action=cluster-enable
gp-wlm disable  | ./svc-mgr.sh --service=all --action=cluster-disable
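Since the six commands differ only in the action suffix, a small wrapper can build them from a short keyword. This is a sketch under that assumption; wlm_cluster_cmd is a hypothetical name, and it only prints the command line (a dry run) rather than executing svc-mgr.sh.

```shell
# Print the svc-mgr.sh command line for a cluster-wide action.
# wlm_cluster_cmd is a hypothetical helper; it does not run anything.
wlm_cluster_cmd() {
    case "$1" in
        start|stop|status|restart|enable|disable)
            printf '%s\n' "./svc-mgr.sh --service=all --action=cluster-$1"
            ;;
        *)
            echo "unknown action: $1" >&2
            return 1
            ;;
    esac
}

# Dry run: show the command that would check the cluster status.
wlm_cluster_cmd status
```

Printing first and running second keeps a mistyped action from reaching the cluster.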
■ Checking gp-wlm status (normal case)
## Run on a specific host
./svc-mgr.sh --service=all --action=status
RabbitMQ is running out of the current installation. (PID=22541)
agent (pid 22732) is running...
cfgmon (pid 22858) is running...
rulesengine (pid 22921) is running...
## Run across the cluster
[gpadmin@gpmdw bin]$ ./svc-mgr.sh --service=all --action=cluster-status
gpmdw.gphd.local:
RabbitMQ is running out of the current installation. (PID=7396)
gpsdw1.gphd.local:
RabbitMQ is running out of the current installation. (PID=4047)
gpsdw2.gphd.local:
RabbitMQ is running out of the current installation. (PID=4027)
agent (pid 7614) is running...
gpsdw1.gphd.local:
agent (pid 4320) is running...
gpsdw2.gphd.local:
agent (pid 4300) is running...
cfgmon (pid 7766) is running...
gpsdw1.gphd.local:
cfgmon (pid 4481) is running...
gpsdw2.gphd.local:
cfgmon (pid 4461) is running...
rulesengine (pid 7850) is running...
gpsdw1.gphd.local:
rulesengine (pid 4561) is running...
gpsdw2.gphd.local:
rulesengine (pid 4545) is running...
svcmon (pid 8001) is running...
gpsdw1.gphd.local:
svcmon (pid 4899) is running...
gpsdw2.gphd.local:
svcmon (pid 4876) is running...
[gpadmin@gpmdw bin]$
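With several hosts the status output gets long, so it can help to count the running services by name. The sketch below assumes each service reports on a single line that starts with the service name and contains "is running", as in the output above; summarize_status is a hypothetical helper, and the sample text is a shortened copy of that output.

```shell
# Count "is running" lines per service name (the first word of each line).
# summarize_status is a hypothetical helper that reads status text on stdin.
summarize_status() {
    awk '/is running/ { n[$1]++ } END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Shortened sample of cluster-status output, used only for the demo.
sample='gpmdw.gphd.local:
RabbitMQ is running out of the current installation. (PID=7396)
gpsdw1.gphd.local:
RabbitMQ is running out of the current installation. (PID=4047)
agent (pid 7614) is running...
gpsdw1.gphd.local:
agent (pid 4320) is running...'

printf '%s\n' "$sample" | summarize_status
```

In practice the input would be the output of the cluster-status command piped into summarize_status; a count lower than the number of hosts points at a service that is down somewhere.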
4) gp-wlm usage
Usage: gp-wlm [-g | gptop]
[--rq-add=
[--rq-delete=
[--rq-modify=
[--rq-useradd=
[--rq-userdel=
[--rule-add=[transient]
[--rule-delete=all|
[--rule-modify=[transient]
[--rule-show=all|
[--describe=
[--config-show
[--config-modify
[--set-domain=
[--version] [--help] [--usage]
5) gptop (monitoring)
■ PuTTY settings
- Connection > Data > Terminal details > Terminal-type string: set to xterm-color or putty
- Window > Translation > Remote character set: set to "Use font encoding"
■ PuTTY settings screen
■ gptop running in PuTTY
- Press F2 to open the menu, then use the left/right arrow keys (<- / ->) to select the view to monitor.
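6) Applying rules
■ Basic rule actions
- host:throttle_gpdb_query : controls CPU, memory, and I/O while a query runs
- host:pg_cancel_backend : cancels the query
- pg_terminate_backend : stops the query by terminating its session
- gpdb_record : writes a log record when system resource usage reaches a threshold
■ Rule scope
- Account / session / host / process
- System resources: cpu / memory / io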
■ Applying rules
$ gp-wlm
## Show the rules currently applied
gpmdw.gphd.local/gpdb-cluster> rule show all
--- Name ---                  ----------- Expression -----------
udba_ss_tot_cpu_throttle_log  gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'udba'
udba_ss_tot_cpu_throttle      host:throttle_gpdb_query(max_cpu=5) when session_id:host:total_cpu > 200 and session_id:host:pid:usename = 'udba' and session_id:host:pid:runtime > 0
7) Rule samples
■ Caution when applying rules
- Each command must be entered on a single line; a command split across multiple lines fails with an error.
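Rules are easier to read when wrapped, so one option is to keep them wrapped in a file and collapse them just before pasting. A minimal sketch follows; one_line is a hypothetical helper. Note the assumption that squeezing whitespace is harmless: it would also shrink runs of spaces inside a message="..." string.

```shell
# Collapse multi-line text on stdin into a single line:
# newlines become spaces, runs of whitespace are squeezed, ends are trimmed.
# one_line is a hypothetical helper, not a gp-wlm command.
one_line() {
    tr '\n' ' ' | sed -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//'
}

# Demo: a wrapped rule becomes one pasteable line.
one_line <<'EOF'
rule add kill_long pg_terminate_backend()
    when session_id:host:pid:runtime > 120
EOF
echo
```

The result can then be pasted at the gp-wlm prompt as a single command.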
■ Record high CPU utilization queries
- When CPU usage exceeds the threshold, a record is written to the DB log (it is stored in an actual file and can be queried through an external table).
rule add simple gpdb_record(message="Too much cpu for gpadmin") when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'gpadmin'
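■ Throttle the CPU utilization of a query
- Sets the maximum CPU for an individual process. Only the condition of the rule is shown here; it follows a host:throttle_gpdb_query(max_cpu=...) action.
when host:pid:cpu_util > 20 and session_id:host:pid:usename = 'gpadmin' and session_id:host:pid:runtime > 20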
■ Cancel any query running longer than 120 seconds
- Terminates the backend of any query whose runtime exceeds 120 seconds.
rule add kill_long pg_terminate_backend() when session_id:host:pid:runtime > 120
■ Throttle and even out skew
- Limits CPU to 50 for queries matching select.*skewtest once the session's total CPU on a host exceeds 100.
rule add skewrule host:throttle_gpdb_query(max_cpu=50) when session_id:host:total_cpu > 100 and session_id:host:pid:current_query =~ /select.*skewtest/
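■ Complex rule
- Combines conditions with and/or: records a message for queries matching select.*test when either total CPU is above 90 with runtime above 45, or CPU skew is above 20.
rule add comborule gpdb_record(message="My Message") when ((session_id:host:total_cpu > 90 and session_id:host:pid:runtime > 45) or session_id:cpu_skew > 20) and session_id:host:pid:current_query =~ /select.*test/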
■ Record queries with high memory usage
- Records queries from any user when a process's resident memory percentage exceeds 20.
rule add transient mem_high_segment_useage_20 gpdb_record(message="MEM: high segment pctusage - 20%") when host:pid:resident_size_pct > 20 and session_id:host:pid:usename =~ /.*/
■ Record queries with memory (rss) skew above 10%
- Records queries from any user when the session's resident memory skew exceeds 10.
rule add mem_skew_10 gpdb_record(message="MEM: query skew 10") when session_id:resident_size_pct_skew > 10 and session_id:host:pid:usename =~ /.*/
■ Case: log and then throttle CPU when a specific account's session exceeds a total CPU of 100
rule add udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'udba'
rule modify udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'udba'
rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'udba'
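The log-then-throttle pair for one account follows a fixed pattern, so it can be generated instead of typed. This is only a sketch: make_user_rules and the rule names it produces (<user>_cpu_log, <user>_cpu_throttle) are hypothetical, and the function prints the rule add commands on one line each without sending them to gp-wlm.

```shell
# Print a "log" rule and a "throttle" rule for one account, one line each.
# make_user_rules and the generated rule names are hypothetical.
# Arguments: account name, total_cpu threshold, max_cpu for the throttle.
make_user_rules() {
    local user="$1"
    local cpu="$2"
    local max_cpu="$3"
    printf "rule add %s_cpu_log gpdb_record(message=\"%s_cpu_log\") when session_id:host:total_cpu > %s and session_id:host:pid:usename = '%s'\n" \
        "$user" "$user" "$cpu" "$user"
    printf "rule add %s_cpu_throttle host:throttle_gpdb_query(max_cpu=%s) when session_id:host:total_cpu > %s and session_id:host:pid:usename = '%s'\n" \
        "$user" "$max_cpu" "$cpu" "$user"
}

# Demo: rules for the udba account, threshold 100, throttle to max_cpu=10.
make_user_rules udba 100 10
```

Each printed line can be pasted at the gp-wlm prompt as is, which also satisfies the single-line requirement.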
■ Modifying a rule (modify)
rule add udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'udba'
rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 200 and session_id:host:pid:usename = 'udba'
8) CPU / memory resource monitoring
■ Monitoring script
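[gpadmin@gpsdw1 ~]$ cat chk_process.sh
# Every 2 seconds, print per-session process count, CPU %, and memory %
# for sessions that exceed any of the thresholds.
HEADER="=========Date========|===Session===|=Pcnt=|==Cpu==|==Mem=="
for i in `seq 1 14200`
do
    echo $HEADER | awk -F"|" '{print $1"\t"$2"\t"$3"\t"$4"\t"$5}'
    # Field 17 of "ps auxwww" is the session id (conNNNN) of a postgres backend.
    # The first awk prints two tabs after the session id, so with -F"\t" the
    # count, CPU, and memory land in fields 3, 4, and 5 of the filter stage.
    # In the date format, %m is the month (%M is minutes and would print
    # values such as 2017-54-12).
    ps auxwww | grep gpadmin | grep postgres | grep con | grep -v grep \
      | awk '{cpu[$17] += $3; cnt[$17] += 1; mem[$17] += $4}
             END {for (i in cpu) print i"\t\t"cnt[i]"\t"cpu[i]"\t"mem[i]}' \
      | awk -F"\t" '{ if ($3>40 || $4>300 || $5>10) print $0 }' \
      | awk -v date=`date "+%Y-%m-%d_%H:%M:%S"` '{print date"\t"$0}'
    echo
    sleep 2
done
[gpadmin@gpsdw1 ~]$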
[gpadmin@gpsdw1 ~]$ ./chk_process.sh
=========Date========  ===Session===  =Pcnt=  ==Cpu==  ==Mem==
2017-54-12_12:54:29    con2107        112     82.6     22.6

=========Date========  ===Session===  =Pcnt=  ==Cpu==  ==Mem==
2017-54-12_12:54:31    con2107        112     82.7     22.6

=========Date========  ===Session===  =Pcnt=  ==Cpu==  ==Mem==
2017-54-12_12:54:33    con2107        112     80.2     22.6
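The core of the script is the awk aggregation that sums CPU and memory per session. The sketch below isolates that step on a simplified three-column input (session, cpu, mem) so it can be tried without a running database; aggregate_sessions is a hypothetical name and the input values are made up for illustration.

```shell
# Sum process count, CPU %, and memory % per session id.
# Input: one process per line as "<session> <cpu> <mem>" (simplified layout,
# not the real "ps auxwww" columns). aggregate_sessions is hypothetical.
aggregate_sessions() {
    awk '{ cnt[$1]++; cpu[$1] += $2; mem[$1] += $3 }
         END { for (s in cnt) printf "%s\t%d\t%.1f\t%.1f\n", s, cnt[s], cpu[s], mem[s] }'
}

# Demo with made-up values: two processes in con2107, one in con9.
printf '%s\n' 'con2107 40.5 10.0' 'con2107 42.1 12.6' 'con9 1.0 0.5' | aggregate_sessions
```

Keeping the aggregation separate from the ps and grep stages makes the thresholds easier to adjust and test.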