Greenplum Workload Management (gp-wlm)

1)        What is Greenplum Workload Management?

■  Reference URLs
               -       http://gpcc.docs.pivotal.io/300/gp-wlm/topics/gpwlm-docs.html
               -       http://gpcc.docs.pivotal.io/210/gp-wlm/welcome.html

2)        Installing gp-wlm

■  Prerequisites
               -       Red Hat Enterprise Linux (RHEL) 64-bit 5.5+ or 6 or CentOS 64-bit 5.5+ or 6
               -       Greenplum Database version 4.3.x
               -       Pivotal Greenplum Command Center installer

■  Installation file
                -       Download from network.pivotal.io
                -       Download location: Greenplum Command Center
                -       Installation file: install Greenplum Command Center 3.0.1;
                                the gp-wlm-1.6.0.bin installer is then found under /usr/local/greenplum-cc-web.

■  Installing gp-wlm
               -       The install path must be under /home/gpadmin, because the installation runs as the gpadmin account.
               -       Installing to /usr/local/gp-wlm fails with an error.


$ su - gpadmin

$ cd /usr/local/greenplum-cc-web

$ chmod +x gp-wlm-1.6.0.bin

$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/

## If a reinstall is needed

$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/ --force


## Source gp-wlm_path.sh from the shell profile

$ vi ~/.bash_profile

. /home/gpadmin/gp-wlm/gp-wlm_path.sh
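
The profile edit can also be scripted so that running it twice does not add the line twice. A minimal sketch: ensure_profile_line is a hypothetical helper name (not part of gp-wlm), and the path in the example assumes the install location used in this section.

```shell
#!/bin/bash
# Append a line to a profile file only if that exact line is not already there.
# ensure_profile_line is a hypothetical helper, not shipped with gp-wlm.
ensure_profile_line() {
    local file="$1" line="$2"
    touch "$file"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (assumes the install path used above):
# ensure_profile_line ~/.bash_profile ". /home/gpadmin/gp-wlm/gp-wlm_path.sh"
```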


## To uninstall

$ /home/gpadmin/gp-wlm/bin/uninstall --symlink /home/gpadmin/gp-wlm

3)        gp-wlm services

■  Path to the gp-wlm service-management utility
           -       /home/gpadmin/gp-wlm/bin/svc-mgr.sh
           -       $ svc-mgr.sh --help
■  gp-wlm service commands

Action            Command
----------------  ----------------------------------------------------
gp-wlm start      ./svc-mgr.sh --service=all --action=cluster-start
gp-wlm stop       ./svc-mgr.sh --service=all --action=cluster-stop
gp-wlm status     ./svc-mgr.sh --service=all --action=cluster-status
gp-wlm restart    ./svc-mgr.sh --service=all --action=cluster-restart
gp-wlm enable     ./svc-mgr.sh --service=all --action=cluster-enable
gp-wlm disable    ./svc-mgr.sh --service=all --action=cluster-disable
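
The six commands above differ only in the suffix after cluster-, so a small wrapper can build them from a short action name. This is only a sketch: wlm_cluster_cmd is a hypothetical helper, and it prints the command rather than running it, so it can be tried on a machine without gp-wlm.

```shell
#!/bin/bash
# Print the svc-mgr.sh command line for a cluster-wide action.
# wlm_cluster_cmd is a hypothetical helper; it accepts only the six
# actions listed above and rejects anything else.
wlm_cluster_cmd() {
    case "$1" in
        start|stop|status|restart|enable|disable)
            printf './svc-mgr.sh --service=all --action=cluster-%s\n' "$1"
            ;;
        *)
            echo "unknown action: $1" >&2
            return 1
            ;;
    esac
}

wlm_cluster_cmd status   # prints ./svc-mgr.sh --service=all --action=cluster-status
```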

■  Checking gp-wlm status (healthy case)

## Run on a single host

./svc-mgr.sh --service=all --action=status

RabbitMQ is running out of the current installation. (PID=22541)

agent (pid 22732) is running...

cfgmon (pid 22858) is running...

rulesengine (pid 22921) is running...


## Run across the cluster

[gpadmin@gpmdw bin]$ ./svc-mgr.sh --service=all --action=cluster-status


RabbitMQ is running out of the current installation. (PID=7396)


RabbitMQ is running out of the current installation. (PID=4047)


RabbitMQ is running out of the current installation. (PID=4027)

agent (pid 7614) is running...


agent (pid 4320) is running...


agent (pid 4300) is running...

cfgmon (pid 7766) is running...


cfgmon (pid 4481) is running...


cfgmon (pid 4461) is running...

rulesengine (pid 7850) is running...


rulesengine (pid 4561) is running...


rulesengine (pid 4545) is running...

svcmon (pid 8001) is running...


svcmon (pid 4899) is running...


svcmon (pid 4876) is running...

[gpadmin@gpmdw bin]$
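
In the healthy case every non-empty output line contains "is running", which makes the check easy to automate. A minimal sketch under that assumption: wlm_status_ok is a hypothetical helper, and the "is running" wording is taken only from the sample output above, so other gp-wlm versions may print something different.

```shell
#!/bin/bash
# Read svc-mgr.sh status output on stdin, print every non-empty line that
# does not report a running service, and return non-zero if any was found.
# wlm_status_ok is a hypothetical helper; the "is running" text is an
# assumption based on the sample output shown above.
wlm_status_ok() {
    awk 'NF && !/is running/ { print; bad = 1 } END { exit bad + 0 }'
}

# Example:
# ./svc-mgr.sh --service=all --action=cluster-status | wlm_status_ok \
#     || echo "some gp-wlm services are not running"
```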

4)        gp-wlm usage

Usage: gp-wlm [-g | gptop]

            [--rq-add= with ]


            [--rq-modify= with ] [--rq-show=all]

            [--rq-useradd= to ]

            [--rq-userdel= from ]

            [--rule-add=[transient] ]

            [--rule-delete=all|] [--rule-dump=] [--rule-import=]

            [--rule-modify=[transient] ] [--rule-restore=]

            [--rule-show=all| [ ]]


            [--config-show ] [--config-describe ]

            [--config-modify =]

            [--set-domain=] [--set-host=] [--schema-path=]

            [--version] [--help] [--usage]

5)        gptop (monitoring)

■  PuTTY settings
            -       Connection > Data > Terminal details > Terminal-type string: set to xterm-color or putty
            -       Window > Translation > Remote character set: set to "Use font encoding"

■  PuTTY settings screen (screenshot)

■  gptop running in PuTTY (screenshot)
           -       Press F2 to open the menu, then use the left/right arrow keys (<- ->) to switch to the view you want to monitor.


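6)        Applying rules

■  Basic rule actions
           -       host:throttle_gpdb_query  : throttles the CPU, memory, and I/O used by a running query
           -       host:pg_cancel_backend    : cancels the running query
           -       pg_terminate_backend      : terminates the session, which also ends its running query
           -       gpdb_record               : writes a log record when system resource usage reaches a threshold
■  Rule scope
            -       Account / session / host / process
            -       System resources: CPU / memory / I/O

■  Applying a rule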


$ gp-wlm

## Show the rules currently applied

gpmdw.gphd.local/gpdb-cluster> rule show all


--- Name ---    ----------- Expression -----------

 udba_ss_tot_cpu_throttle_log    gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'


 udba_ss_tot_cpu_throttle        host:throttle_gpdb_query(max_cpu=5) when session_id:host:total_cpu > 200 and  session_id:host:pid:usename = 'udba' and session_id:host:pid:runtime > 0


7)        Rule samples

■  Caution when entering rules
         -       Enter each command on a single line (a command split across several lines fails with an error).
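
Rules copied from documentation or web pages are often wrapped over several lines and may contain typographic quotes. The following sketch joins such text into one line before it is pasted at the gp-wlm prompt. rule_one_line is a hypothetical helper name; note that it also collapses runs of spaces inside a message string.

```shell
#!/bin/bash
# Join a rule written over several lines into a single line and replace
# typographic quotes with plain ASCII quotes.
# rule_one_line is a hypothetical helper, not part of gp-wlm.
rule_one_line() {
    sed -e "s/‘/'/g" -e "s/’/'/g" -e 's/“/"/g' -e 's/”/"/g' \
      | awk 'NF { $1 = $1; out = (out == "" ? $0 : out " " $0) } END { print out }'
}

printf '%s\n' 'rule add kill_long pg_terminate_backend()' '' \
    'when session_id:host:pid:runtime > 120' | rule_one_line
# prints: rule add kill_long pg_terminate_backend() when session_id:host:pid:runtime > 120
```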

■  Record high CPU utilization queries
         -       When CPU usage exceeds the threshold, a record is written to the DB log
                     (it is kept in an actual file and can be queried through an external table).


rule add simple gpdb_record(message="Too much cpu for gpadmin") when session_id:host:total_cpu > 100 and session_id:host:pid:usename = 'gpadmin'


■  Throttle the CPU utilization of a query
         -       Sets a maximum CPU for an individual process.


when host:pid:cpu_util > 20

and session_id:host:pid:usename = 'gpadmin'

and session_id:host:pid:runtime > 20


■  Cancel any query running longer than 120 seconds
         -       Terminates the session of any query that has been running for more than 120 seconds.


rule add kill_long pg_terminate_backend() when session_id:host:pid:runtime > 120


■  Throttle and even out skew
        -       Throttles queries matching select.*skewtest to max_cpu=50 when the session's total CPU on a host exceeds 100.


rule add skewrule host:throttle_gpdb_query(max_cpu=50) when session_id:host:total_cpu > 100 and session_id:host:pid:current_query =~ /select.*skewtest/


■  Complex rule
         -       Logs a record for queries matching select.*test when either total CPU > 90 with runtime > 45, or the session's CPU skew > 20.


rule add comborule gpdb_record(message="My Message") when ((session_id:host:total_cpu > 90 and session_id:host:pid:runtime > 45) or session_id:cpu_skew > 20) and session_id:host:pid:current_query =~ /select.*test/


■  Record queries with high memory usage
         -       Transient rule that logs a record when a process's resident_size_pct exceeds 20, for any user.


rule add transient mem_high_segment_useage_20 gpdb_record(message="MEM: high segment pctusage - 20%") when host:pid:resident_size_pct > 20 and session_id:host:pid:usename =~/.*/


■  Record queries with memory (rss) skew above 10%
         -       Logs a record when a session's resident_size_pct_skew exceeds 10, for any user.


rule add mem_skew_10 gpdb_record(message="MEM: query skew 10") when session_id:resident_size_pct_skew > 10 and session_id:host:pid:usename =~/.*/



■  Case: for sessions of a specific account, log when total CPU exceeds 100 and throttle the CPU


rule add udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'


rule modify udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'


rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'


■  Modifying a rule (modify)


rule add  udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'


rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 200 and  session_id:host:pid:usename = 'udba'
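
When only the account, the CPU threshold, or max_cpu changes, the rule expression can be generated instead of retyped. A minimal sketch: throttle_rule_expr is a hypothetical helper that reproduces the expression used in the two commands above. The usage listing in section 4 shows --rule-add= and --rule-modify= options, but their exact argument format is not shown there, so the non-interactive call in the comment is an assumption.

```shell
#!/bin/bash
# Build the throttle rule expression for one account.
# throttle_rule_expr is a hypothetical helper, not part of gp-wlm.
#   $1 = database user, $2 = total_cpu threshold, $3 = max_cpu to apply
throttle_rule_expr() {
    local user="$1" cpu_threshold="$2" max_cpu="$3"
    printf "host:throttle_gpdb_query(max_cpu=%s) when session_id:host:total_cpu > %s and session_id:host:pid:usename = '%s'\n" \
        "$max_cpu" "$cpu_threshold" "$user"
}

throttle_rule_expr udba 200 10

# Assumed non-interactive form (argument format "<name> <expression>" is a guess):
# gp-wlm --rule-modify="udba_ss_tot_cpu_throttle $(throttle_rule_expr udba 200 10)"
```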


8)        CPU / memory resource monitoring

■  Monitoring script


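[gpadmin@gpsdw1 ~]$ cat chk_process.sh

#!/bin/bash
# Every 2 seconds, sum the process count, %CPU and %MEM per session (conNNNN)
# and print sessions with more than 40 processes, 300 %CPU, or 10 %MEM.
HEADER="=========Date========|===Session===|=Pcnt=|==Cpu==|==Mem=="

for i in `seq 1 14200`
do
    echo "$HEADER" | awk -F"|" '{print $1"\t"$2"\t"$3"\t"$4"\t"$5}'

    # ps columns: $17 = session id, $3 = %CPU, $4 = %MEM; in the timestamp %m is the month (%M is the minute)
    ps auxwww | grep gpadmin | grep postgres | grep con | grep -v grep | awk '{cpu[$17] += $3}{ cnt[$17] += 1}{mem[$17] += $4}  END {for ( i in cpu) print i"\t\t" cnt[i]"\t"cpu[i]"\t"mem[i]}' | awk -F"\t" '{ if($2>40 || $3>300 || $4>10)print $0}' | awk -v date=`date "+%Y-%m-%d_%H:%M:%S"` '{print date"\t" $0}'

    sleep 2
done

[gpadmin@gpsdw1 ~]$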


[gpadmin@gpsdw1 ~]$ ./chk_process.sh

=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:29     con2107         112     82.6    22.6


=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:31     con2107         112     82.7    22.6


=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:33     con2107         112     80.2    22.6
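
If the script output is saved to a file, the peak CPU per session can be pulled out afterwards. A minimal sketch assuming the column layout shown above (date, session, process count, CPU, memory); peak_cpu_by_session is a hypothetical helper name.

```shell
#!/bin/bash
# Read chk_process.sh output on stdin and print the highest CPU value seen
# for each session as "<session><TAB><peak cpu>". Header and blank lines are
# skipped because their second field is not a conNNNN session id.
# peak_cpu_by_session is a hypothetical helper.
peak_cpu_by_session() {
    awk '$2 ~ /^con[0-9]+$/ {
             if (!($2 in peak) || $4 + 0 > peak[$2]) peak[$2] = $4 + 0
         }
         END { for (s in peak) printf "%s\t%s\n", s, peak[s] }' | sort
}

# Example (chk_process.log is a hypothetical file name):
# ./chk_process.sh > chk_process.log
# peak_cpu_by_session < chk_process.log
```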



