Tuesday, October 31, 2017

Greenplum Workload Management (gp-wlm)




1)        What is Greenplum Workload Management?


■  Reference URLs
               -       http://gpcc.docs.pivotal.io/300/gp-wlm/topics/gpwlm-docs.html
               -       http://gpcc.docs.pivotal.io/210/gp-wlm/welcome.html

2)        Installing gp-wlm


■  Prerequisites
               -       Red Hat Enterprise Linux (RHEL) 64-bit 5.5+ or 6 or CentOS 64-bit 5.5+ or 6
               -       Greenplum Database version 4.3.x
               -       Pivotal Greenplum Command Center installer

■  Installation file
                -       Download from network.pivotal.io
                -       Download location: Greenplum Command Center
                -       Installation file: install Greenplum Database -- Command Center 3.0.1
                                The gp-wlm-1.6.0.bin installer is then available under /usr/local/greenplum-cc-web.

■  gp-wlm installation
               -       Installation works only when the install path is under /home/gpadmin (the gpadmin account is used).
               -       Installing to /usr/local/gp-wlm fails with an error.



 

$ su - gpadmin

$ cd /usr/local/greenplum-cc-web

$ chmod +x gp-wlm-1.6.0.bin

$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/

## If reinstallation is required

$ ./gp-wlm-1.6.0.bin --install=/home/gpadmin/ --force

 

## Source gp-wlm_path.sh from the shell profile

$ vi ~/.bash_profile

. /home/gpadmin/gp-wlm/gp-wlm_path.sh

 

## To uninstall

$ /home/gpadmin/gp-wlm/bin/uninstall --symlink /home/gpadmin/gp-wlm
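
The profile step above (adding the gp-wlm_path.sh line with vi) can also be scripted. The following is a minimal sketch, not part of gp-wlm: the function name add_source_line is made up for illustration, and it only appends the line when the profile does not already contain it, so running it after a reinstall does not create duplicates.

```shell
# Hypothetical helper (not shipped with gp-wlm): append a line to a profile
# file only if that exact line is not already present.
#   $1 = profile file, $2 = line to add
add_source_line() {
    touch "$1"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example call (commented out so that loading this snippet changes nothing):
# add_source_line "$HOME/.bash_profile" ". /home/gpadmin/gp-wlm/gp-wlm_path.sh"
```

After the call, log in again or source the profile so that the gp-wlm commands are on the PATH.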
 

3)        gp-wlm services


■  Path of the utility that controls the gp-wlm services
           -       /home/gpadmin/gp-wlm/bin/svc-mgr.sh
           -       $ svc-mgr.sh --help
 
■  gp-wlm service commands

Operation          Command
gp-wlm start       ./svc-mgr.sh --service=all --action=cluster-start
gp-wlm stop        ./svc-mgr.sh --service=all --action=cluster-stop
gp-wlm status      ./svc-mgr.sh --service=all --action=cluster-status
gp-wlm restart     ./svc-mgr.sh --service=all --action=cluster-restart
gp-wlm enable      ./svc-mgr.sh --service=all --action=cluster-enable
gp-wlm disable     ./svc-mgr.sh --service=all --action=cluster-disable

 
■  Checking gp-wlm status (normal case)



## Run on a specific host

./svc-mgr.sh --service=all --action=status

RabbitMQ is running out of the current installation. (PID=22541)

agent (pid 22732) is running...

cfgmon (pid 22858) is running...

rulesengine (pid 22921) is running...

 

## Run for the whole cluster

[gpadmin@gpmdw bin]$ ./svc-mgr.sh --service=all --action=cluster-status

gpmdw.gphd.local:

RabbitMQ is running out of the current installation. (PID=7396)

gpsdw1.gphd.local:

RabbitMQ is running out of the current installation. (PID=4047)

gpsdw2.gphd.local:

RabbitMQ is running out of the current installation. (PID=4027)

agent (pid 7614) is running...

gpsdw1.gphd.local:

agent (pid 4320) is running...

gpsdw2.gphd.local:

agent (pid 4300) is running...

cfgmon (pid 7766) is running...

gpsdw1.gphd.local:

cfgmon (pid 4481) is running...

gpsdw2.gphd.local:

cfgmon (pid 4461) is running...

rulesengine (pid 7850) is running...

gpsdw1.gphd.local:

rulesengine (pid 4561) is running...

gpsdw2.gphd.local:

rulesengine (pid 4545) is running...

svcmon (pid 8001) is running...

gpsdw1.gphd.local:

svcmon (pid 4899) is running...

gpsdw2.gphd.local:

svcmon (pid 4876) is running...

[gpadmin@gpmdw bin]$
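
With many hosts the status listing gets long, so it may help to summarize it. The sketch below is an illustration only: check_wlm_status is a made-up name, the service names are taken from the output above, and it assumes that any service line without the text "is running" indicates a problem (the exact wording printed for a stopped service is not shown in this post).

```shell
# Hypothetical helper: read svc-mgr.sh status output on stdin, count the
# service lines, and report those that do not say "is running".
# Returns non-zero when at least one service line is not running.
check_wlm_status() {
    awk '
        /^(RabbitMQ|agent|cfgmon|rulesengine|svcmon) / {
            total++
            if ($0 !~ /is running/) { bad++; print "NOT RUNNING: " $0 }
        }
        END {
            printf "%d service lines, %d not running\n", total, bad + 0
            exit (bad > 0)
        }'
}
```

It would be used as ./svc-mgr.sh --service=all --action=cluster-status | check_wlm_status, for example from a cron job that sends an alert when the return code is non-zero.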

4)        gp-wlm usage




Usage: gp-wlm [-g | gptop]

            [--rq-add= with ]

            [--rq-delete=]

            [--rq-modify= with ] [--rq-show=all]

            [--rq-useradd= to ]

            [--rq-userdel= from ]

            [--rule-add=[transient] ]

            [--rule-delete=all|] [--rule-dump=] [--rule-import=]

            [--rule-modify=[transient] ] [--rule-restore=]

            [--rule-show=all| [ ]]

            [--describe=]

            [--config-show ] [--config-describe ]

            [--config-modify =]

            [--set-domain=] [--set-host=] [--schema-path=]

            [--version] [--help] [--usage]

5)        gptop (monitoring)


■  PuTTY settings
            -       Connection > Data > Terminal details > Terminal-type string: set to xterm-color or putty
            -       Window > Translation > Remote character set: set to Use font encoding

■  PuTTY settings screen (screenshot)


■  gptop running in PuTTY
           -       Press F2 to open the menu, then use the arrow keys (<- ->) to move to the view you want to monitor.




 

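6)        Applying rules

■  Basic rule actions
           -       host:throttle_gpdb_query  : controls the CPU, memory, and IO used by a running query
           -       host:pg_cancel_backend    : cancels the running query
           -       pg_terminate_backend      : terminates the session, which also ends its query
           -       gpdb_record               : logs a message when a query uses system resources beyond a threshold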
 
■  Rule scope
            -       Account / session / host / process
            -       System resources: CPU / memory / IO

■  Applying a rule



 

$ gp-wlm

## Check the rules currently applied

gpmdw.gphd.local/gpdb-cluster> rule show all

 

--- Name ---    ----------- Expression -----------

 udba_ss_tot_cpu_throttle_log    gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'

 

 udba_ss_tot_cpu_throttle        host:throttle_gpdb_query(max_cpu=5) when session_id:host:total_cpu > 200 and  session_id:host:pid:usename = 'udba' and session_id:host:pid:runtime > 0

 

7)        Rule samples


■  Caution when applying rules
         -       Each command must be entered on a single line. (Entering a command across several lines causes an error.)
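
The samples in this section are wrapped over several lines for readability, so they need to be joined before they are entered. As a rough sketch (join_rule is a made-up helper, not a gp-wlm command), the following collapses a multi-line rule read from stdin into one line: blank lines are skipped, each line is trimmed, and the pieces are joined with a single space. Spacing inside a line, such as inside a message string, is left untouched.

```shell
# Hypothetical helper: join a rule written across several lines into one line.
join_rule() {
    awk '
        NF {                                  # skip blank lines
            sub(/^[ \t]+/, "")                # trim leading whitespace
            sub(/[ \t]+$/, "")                # trim trailing whitespace
            out = (out == "" ? "" : out " ") $0
        }
        END { print out }'
}
```

For example, piping the two lines of the kill_long sample through join_rule produces a single "rule add kill_long pg_terminate_backend() when session_id:host:pid:runtime > 120" line that can be pasted at the gp-wlm prompt.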

■  Record high cpu utilization queries
         -       Writes a record to the DB log when CPU usage exceeds the threshold
                     (the records are kept in an actual file and can be checked through an external table)



 

rule add simple gpdb_record(message="Too much cpu for gpadmin")

when session_id:host:total_cpu > 100

and session_id:host:pid:usename = 'gpadmin'

 


■  Throttle the cpu utilization of a query
         -       Sets the maximum CPU for an individual process.



 


when host:pid:cpu_util > 20

and session_id:host:pid:usename = 'gpadmin'

and session_id:host:pid:runtime > 20

 

 
■  Cancel any query running longer than 120 seconds
         -       Terminates any session whose query has been running for more than 120 seconds.



 

rule add kill_long pg_terminate_backend()

when session_id:host:pid:runtime > 120

 

■  Throttle and even out skew
        -       Caps CPU at 50 for queries matching the pattern when the session's total CPU on a host exceeds 100.



 

rule add skewrule host:throttle_gpdb_query(max_cpu=50)

when session_id:host:total_cpu > 100

and session_id:host:pid:current_query =~ /select.*skewtest/

 

■  Complex rule
         -       Combines several conditions with and/or and logs a message when they match.



 

rule add comborule gpdb_record(message="My Message")

when ((session_id:host:total_cpu > 90 and session_id:host:pid:runtime > 45)

or session_id:cpu_skew > 20)

and session_id:host:pid:current_query =~ /select.*test/

 

 
■  Record queries with high memory usage
         -       Logs queries whose process resident memory exceeds 20%.



 

rule add transient mem_high_segment_useage_20

gpdb_record(message="MEM: high segment pctusage - 20%") when

host:pid:resident_size_pct > 20

and session_id:host:pid:usename =~/.*/

 
 

■  Record queries with memory (rss) skew above 10%
         -       Logs queries whose resident-memory skew across the session exceeds 10%.



 

rule add mem_skew_10 gpdb_record(message="MEM: query skew 10")

when session_id:resident_size_pct_skew > 10

and session_id:host:pid:usename =~/.*/

 

 

■  Case: log when a specific account's session exceeds total CPU 100, and throttle its CPU



 

rule add udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'

 

rule modify udba_ss_tot_cpu_throttle_log gpdb_record(message="udba_ss_tot_cpu_throttle_log") when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'

 

rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'

 

■  Modifying a rule (modify)



 

rule add  udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 100 and  session_id:host:pid:usename = 'udba'

 

rule modify udba_ss_tot_cpu_throttle host:throttle_gpdb_query(max_cpu=10) when session_id:host:total_cpu > 200 and  session_id:host:pid:usename = 'udba'

 

8)        CPU / memory resource monitoring


■  Per-session process monitoring script (chk_process.sh)



 

[gpadmin@gpsdw1 ~]$ cat chk_process.sh

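# %m is the month. (%M is minutes; the sample output below was captured with %M, hence "2017-54-12".)
DT=`date "+%Y-%m-%d %H:%M:%S"`

HEADER="=========Date========|===Session===|=Pcnt=|==Cpu==|==Mem=="

for i in `seq 1 14200`

do

 

    # Print the column header
    echo $HEADER | awk -F"|" '{print $1"\t"$2"\t"$3"\t"$4"\t"$5}'

    # Per session (column 17 of ps): count processes and sum %CPU and %MEM, then keep sessions with more than 40 processes, CPU > 300, or memory > 10
    ps auxwww | grep gpadmin | grep postgres | grep con | grep -v grep | awk '{cpu[$17] += $3}{ cnt[$17] += 1}{mem[$17] += $4}  END {for ( i in cpu) print i"\t\t" cnt[i]"\t"cpu[i]"\t"mem[i]}' | awk -F"\t" '{ if($2>40 || $3>300 || $4>10)print $0}' | awk -v date=`date "+%Y-%m-%d_%H:%M:%S"` '{print date"\t" $0}'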

 

    echo

    sleep 2

done

[gpadmin@gpsdw1 ~]$

 

[gpadmin@gpsdw1 ~]$ ./chk_process.sh

=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:29     con2107         112     82.6    22.6

 

=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:31     con2107         112     82.7    22.6

 

=========Date========   ===Session===   =Pcnt=  ==Cpu== ==Mem==

2017-54-12_12:54:33     con2107         112     80.2    22.6
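
The aggregation inside chk_process.sh can be tried without a running cluster. The sketch below is an illustration under stated assumptions: summarize_sessions is a made-up name, the input is expected as "session cpu mem" triples (one process per line), and the three optional arguments are thresholds that default to the values used in the script (40 processes, CPU 300, memory 10).

```shell
# Hypothetical helper: per session, count processes and sum %CPU and %MEM,
# then print only the sessions that exceed at least one threshold.
#   $1 = max process count, $2 = max total CPU, $3 = max total memory
summarize_sessions() {
    awk -v max_cnt="${1:-40}" -v max_cpu="${2:-300}" -v max_mem="${3:-10}" '
        { cnt[$1]++; cpu[$1] += $2; mem[$1] += $3 }
        END {
            for (s in cnt)
                if (cnt[s] > max_cnt || cpu[s] > max_cpu || mem[s] > max_mem)
                    printf "%s\t%d\t%.1f\t%.1f\n", s, cnt[s], cpu[s], mem[s]
        }'
}
```

On a segment host it could be fed with ps auxwww | grep postgres | grep con | grep -v grep | awk '{print $17, $3, $4}' | summarize_sessions, assuming, as in the script above, that the session ID (conNNNN) is the 17th column of the ps output.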

 

 
 
 
