[AWS/SQS] Querying CloudWatch datapoints

Hello! This time I'd like to share how to query CloudWatch metric statistics, something I picked up while improving an event-driven service.

Here is why I took on this work.

We currently have a service that runs in an event-driven setup built on SQS.

It is used to keep various pieces of information in sync, and it runs multithreaded.

In case events get delayed, I had set it up so that the thread count can be adjusted manually at any time.

And if a delay occurs and an SQS message stays in the queue unprocessed for a certain amount of time after being published, an alert fires.

즉, SQS의 ApproximateAgeOfOldestMessage μ§€ν‘œκ°’μ„λ³΄κ³  λ„ˆλ¬΄ μ˜€λž˜λ™μ•ˆ μ²˜λ¦¬κ°€ μ•ˆλ  경우 alertλ₯Ό λ°›κ³  μˆ˜λ™μœΌλ‘œ μ“°λ ˆλ“œ 개수λ₯Ό μ‘°μ ˆν•˜λŠ” ν˜•νƒœλ‘œ μœ„κΈ°λ₯Ό λ²—μ–΄λ‚˜κ³  μžˆμŠ΅λ‹ˆλ‹€. 그런데 이 λ“œλ¬Όλ”” λ“œλ¬Έ 사건이라도 개발자라면, 그리고 κ°€λŠ₯ν•œ μΌ€μ΄μŠ€λΌλ©΄ κ·Έλƒ₯ μ „λΆ€ λ‹€ μžλ™ν™”λ₯Ό 해놓아야 ν•˜μ§€ μ•Šμ„κΉŒ 생각이 λ“€μ–΄ κ°œμ„  μž‘μ—…μ— λ“€μ–΄κ°”μŠ΅λ‹ˆλ‹€.

 

일단 SQSμ—μ„œ μ œκ³΅ν•˜λŠ” λͺ¨λ‹ˆν„°λ§ μ§€ν‘œλŠ” μ•„λž˜μ™€ κ°™μŠ΅λ‹ˆλ‹€. (AWS μ½˜μ†”ν™”λ©΄μ˜ λͺ¨λ‹ˆν„°λ§ νƒ­μ—μ„œ λ³Ό 수 μžˆλŠ” κ²ƒλ“€μž…λ‹ˆλ‹€)

  • Approximate Number Of Messages Delayed
  • Approximate Number Of Messages Not Visible
  • Approximate Number Of Messages Visible
  • Approximate Age Of Oldest Message
  • Number Of Empty Receives
  • Number Of Messages Deleted
  • Number Of Messages Received
  • Number Of Messages Sent
  • Sent Message Size

이 μ§€ν‘œμ— ν•΄λ‹Ήν•˜λŠ” ν†΅κ³„μˆ˜μΉ˜λŠ” cloudwatch μ—μ„œ μˆ˜μ§‘μ΄ λ©λ‹ˆλ‹€.

ν΄λΌμš°λ“œμ›ŒμΉ˜μ—μ„œ μˆ˜μ§‘λœ 데이터λ₯Ό 가지고 λͺ¨λ‹ˆν„°λ§νƒ­μ— κ·Έλž˜ν”„λ‘œ λ³΄μ—¬μ£ΌλŠ” 것이죠.

 

이제 μ œκ°€ μ›ν•˜λŠ” Approximate Age Of Oldest Message 에 λŒ€ν•œ 데이터λ₯Ό 뽑아보도둝 ν•˜κ² μŠ΅λ‹ˆλ‹€.

일단 μ•±μ˜ ꡬ성은 λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

  • SpringBoot 2.3.x
  • AWS Java SDK 1.11.x
  • Java 11

The AWS SDK for Java provides a CloudWatch client that calls the CloudWatch service API so you can query CloudWatch data. The first step is to register this client as a bean.

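import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSCredentialsProviderChain;
import com.amazonaws.auth.InstanceProfileCredentialsProvider;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.util.ArrayList;
import java.util.List;

@Configuration
public class AWSConfig {

    // Try the EC2 instance profile first, then fall back to the local profile file.
    private AWSCredentialsProvider awsCredentialsProvider() {
        List<AWSCredentialsProvider> credentialsProviders = new ArrayList<>();
        credentialsProviders.add(new InstanceProfileCredentialsProvider(true));
        credentialsProviders.add(new ProfileCredentialsProvider());
        return new AWSCredentialsProviderChain(credentialsProviders);
    }

    // CloudWatch client for the ap-northeast-2 region.
    @Bean
    public AmazonCloudWatch cloudWatchClient() {
        return AmazonCloudWatchClientBuilder.standard()
                .withCredentials(awsCredentialsProvider())
                .withRegion(Regions.fromName("ap-northeast-2"))
                .build();
    }
}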

 

그리고 μ„œλΉ„μŠ€ λ ˆμ΄μ–΄μ—μ„œ 이 cloudWatchClientλ₯Ό κ°€μ Έλ‹€ μ¨λ³΄κ² μŠ΅λ‹ˆλ‹€.

    // cloudWatch is the injected AmazonCloudWatch bean (cloudWatchClient) registered above.
    private void getQueueStatus(String queueName) {
        long currentMillis = System.currentTimeMillis();
        long fiveMinutesInMillis = 5 * 60 * 1000;
        // Request the Maximum of ApproximateAgeOfOldestMessage over the last 5 minutes,
        // aggregated into 300-second periods, for the given queue.
        GetMetricStatisticsRequest statisticsRequest = new GetMetricStatisticsRequest()
                .withNamespace("AWS/SQS").withMetricName("ApproximateAgeOfOldestMessage")
                .withStatistics(Statistic.Maximum).withPeriod(300)
                .withStartTime(new Date(currentMillis - fiveMinutesInMillis))
                .withEndTime(new Date(currentMillis))
                .withDimensions(new Dimension().withName("QueueName").withValue(queueName));

        GetMetricStatisticsResult result = cloudWatch.getMetricStatistics(statisticsRequest);
        log.debug("dataPoints: {}", result.getDatapoints());
    }

 

cloudWatchClient λ₯Ό μ΄μš©ν•˜μ—¬ ν†΅κ³„μˆ˜μΉ˜ 데이터λ₯Ό μ‘°νšŒν•˜λ €λ©΄ GetMetricStatisticsRequest 객체λ₯Ό λ§Œλ“€μ–΄μ„œ λ„£μ–΄μ£Όμ–΄μ•Ό ν•©λ‹ˆλ‹€. 

이 객체에 μ„€μ •ν•΄μ€˜μ•Ό ν•˜λŠ” κ°’λ“€ 쀑 ν•„μˆ˜μ μΈ κ²ƒλ“€λ§Œ μ„€μ •ν•΄λ³΄μ•˜μŠ΅λ‹ˆλ‹€. 

κ°„λž΅νžˆ μ„€λͺ…ν•˜μžλ©΄ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

  • withNamespace: the value that identifies the service in CloudWatch (e.g. "AWS/SQS", "AWS/EC2")
  • withMetricName: the name of the metric to query
  • withStatistics: the statistic to return, defined in the Statistic enum
    • SampleCount
    • Average
    • Sum
    • Minimum
    • Maximum
  • withStartTime, withEndTime: the time range to query (the start and end of the data)
  • withPeriod: the interval between data points within that range, in seconds. For example, it decides at how many minutes' interval the data for the past hour is returned.
  • withDimensions: for SQS there is only "QueueName", and its value identifies which queue the data belongs to.

(See: Available CloudWatch metrics for Amazon SQS)

 

섀정값은 ν˜„μž¬ μ§€λ‚œ 5λΆ„λ™μ•ˆ(from endTime to startTime) 5뢄간격(period)의 데이터λ₯Ό μ‘°νšŒν•˜λ„λ‘ λ˜μ–΄μžˆμœΌλ―€λ‘œ 1개의 data point κ°€ μ‘°νšŒκ°€ λ©λ‹ˆλ‹€. 그리고 unit은 μ΄ˆλ‹¨μœ„λ‘œ λ‚˜μ˜΅λ‹ˆλ‹€.

μœ„ μ½”λ“œλ₯Ό μ‹€ν–‰ν•΄μ„œ μ‘°νšŒν•œ sqs의 Approximate Age Of Oldest Message μ§€ν‘œκ°’μ€ λ‹€μŒκ³Ό 같이 좜λ ₯λ©λ‹ˆλ‹€.

dataPoints: [{Timestamp: Thu Dec 29 16:08:00 KST 2022,Maximum: 249.0,Unit: Seconds,}]

 

섀정값을 λ³€κ²½ν•˜μ—¬ periodλ₯Ό 60으둜 λ„£μ–΄μ„œ μ‹€ν–‰ν•˜λ©΄ 5κ°œκ°€ μ‘°νšŒλ©λ‹ˆλ‹€.

dataPoints: [{Timestamp: Thu Dec 29 17:03:00 KST 2022,Maximum: 3254.0,Unit: Seconds,}, {Timestamp: Thu Dec 29 17:07:00 KST 2022,Maximum: 3554.0,Unit: Seconds,}, {Timestamp: Thu Dec 29 17:05:00 KST 2022,Maximum: 3433.0,Unit: Seconds,}, {Timestamp: Thu Dec 29 17:06:00 KST 2022,Maximum: 3491.0,Unit: Seconds,}, {Timestamp: Thu Dec 29 17:04:00 KST 2022,Maximum: 3370.0,Unit: Seconds,}]
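
As a rough check on the two results so far (1 data point with a 300-second period, 5 with a 60-second period), the count can be estimated by dividing the query window by the period. The sketch below is plain Java with no AWS dependency; DatapointCount and expectedDatapoints are names made up for illustration, and CloudWatch may return fewer points when a period has no data.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of the AWS SDK.
class DatapointCount {

    // Rough estimate of how many data points one statistic yields:
    // the length of the query window divided by the period.
    static long expectedDatapoints(Duration window, int periodSeconds) {
        if (periodSeconds <= 0) {
            throw new IllegalArgumentException("periodSeconds must be positive");
        }
        return window.getSeconds() / periodSeconds;
    }

    public static void main(String[] args) {
        Duration fiveMinutes = Duration.ofMinutes(5);
        System.out.println(expectedDatapoints(fiveMinutes, 300)); // 1
        System.out.println(expectedDatapoints(fiveMinutes, 60));  // 5
    }
}
```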

 

쑰회된 5개의 λ°μ΄ν„°λŠ” μ§€λ‚œ 5λΆ„ ꡬ간(startTime, endTime)μ—μ„œ 1λΆ„ 간격(period) 데이터λ₯Ό μ‘°νšŒν–ˆμ„ λ•Œμ˜ κ²°κ³Όμž…λ‹ˆλ‹€. 그리고 이 값은 Approximate Age Of Oldest Message, 즉, λŒ€λž΅μ μœΌλ‘œ μ–Όλ§ˆλ‚˜ μ˜€λž˜λ˜μ—ˆλŠ”κ°€λ₯Ό λ‚˜νƒ€λ‚΄λŠ” κ°’μ΄λ―€λ‘œ 1λΆ„ 간격 데이터λ₯Ό μ‘°νšŒν•œλ‹€λ©΄ μ•½ 1λΆ„(60초)의 μ‹œκ°„μ°¨μ΄κ°€ 있겠죠. 좜λ ₯된 λ°μ΄ν„°μ˜ μˆœμ„œκ°€ μ‹œκ°„μˆœμ΄ μ•„λ‹ˆλ‹ˆ μ‹œκ°„μˆœμœΌλ‘œ 정렬해보면 μ•½ 1λΆ„ 정도 차이가 λ‚œλ‹€λŠ” 것을 확인할 수 μžˆμŠ΅λ‹ˆλ‹€. 

[
    {Timestamp: Thu Dec 29 17:03:00 KST 2022,Maximum: 3254.0,Unit: Seconds,},
    {Timestamp: Thu Dec 29 17:04:00 KST 2022,Maximum: 3370.0,Unit: Seconds,},
    {Timestamp: Thu Dec 29 17:05:00 KST 2022,Maximum: 3433.0,Unit: Seconds,},
    {Timestamp: Thu Dec 29 17:06:00 KST 2022,Maximum: 3491.0,Unit: Seconds,},
    {Timestamp: Thu Dec 29 17:07:00 KST 2022,Maximum: 3554.0,Unit: Seconds,}
]

 

17:03 μ—μ„œ 17:04λŠ” μ˜ˆμ™Έμ μœΌλ‘œ μ•½ 2λΆ„ 차이가 λ‚˜λ„€μš” ^^;;
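
The sorting and the roughly 1-minute differences can be checked with a few lines of Java. This is only a sketch: Point is a hypothetical stand-in for the SDK's datapoint type (just a timestamp and the Maximum value), and the numbers are copied from the output above.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class AgeDeltas {

    // Hypothetical stand-in for a CloudWatch datapoint: timestamp plus Maximum (seconds).
    static final class Point {
        final LocalTime timestamp;
        final double maximumSeconds;

        Point(LocalTime timestamp, double maximumSeconds) {
            this.timestamp = timestamp;
            this.maximumSeconds = maximumSeconds;
        }
    }

    // Sorts a copy of the points by timestamp and returns the difference
    // between each value and the one before it.
    static List<Double> deltasInTimeOrder(List<Point> points) {
        List<Point> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparing((Point p) -> p.timestamp));
        List<Double> deltas = new ArrayList<>();
        for (int i = 1; i < sorted.size(); i++) {
            deltas.add(sorted.get(i).maximumSeconds - sorted.get(i - 1).maximumSeconds);
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Same order as the unsorted output returned by the query.
        List<Point> points = List.of(
                new Point(LocalTime.of(17, 3), 3254.0),
                new Point(LocalTime.of(17, 7), 3554.0),
                new Point(LocalTime.of(17, 5), 3433.0),
                new Point(LocalTime.of(17, 6), 3491.0),
                new Point(LocalTime.of(17, 4), 3370.0));
        System.out.println(deltasInTimeOrder(points)); // [116.0, 63.0, 58.0, 63.0]
    }
}
```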

That covers how to query SQS metrics through the AWS CloudWatch API.

Using the data queried this way, I implemented the service so that it scales the thread count in or out when the age exceeds a certain threshold.
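
For readers who want a starting point, here is a minimal sketch of that kind of scaling decision. It is not the actual implementation from this service: the ThreadScaler class, the thresholds, and the thread bounds are all assumptions chosen for illustration, and it uses a plain java.util.concurrent.ThreadPoolExecutor as the worker pool.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadScaler {

    // Hypothetical thresholds and bounds, for illustration only.
    static final double SCALE_OUT_AGE_SECONDS = 300.0;
    static final double SCALE_IN_AGE_SECONDS = 60.0;
    static final int MIN_THREADS = 2;
    static final int MAX_THREADS = 16;

    // Doubles the thread count when the oldest message is too old,
    // removes one thread when the queue is healthy, otherwise keeps it.
    static int targetThreads(double oldestAgeSeconds, int currentThreads) {
        if (oldestAgeSeconds >= SCALE_OUT_AGE_SECONDS) {
            return Math.min(currentThreads * 2, MAX_THREADS);
        }
        if (oldestAgeSeconds <= SCALE_IN_AGE_SECONDS) {
            return Math.max(currentThreads - 1, MIN_THREADS);
        }
        return currentThreads;
    }

    // Applies the target size. The maximum is raised before the core size when
    // growing, because the core size must never exceed the maximum.
    static void resize(ThreadPoolExecutor pool, int target) {
        if (target > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(target);
            pool.setCorePoolSize(target);
        } else {
            pool.setCorePoolSize(target);
            pool.setMaximumPoolSize(target);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // 3554.0 is the latest Maximum value from the output above.
        resize(pool, targetThreads(3554.0, pool.getCorePoolSize()));
        System.out.println(pool.getCorePoolSize()); // 8
        pool.shutdown();
    }
}
```

In a real service the age value would come from the latest datapoint of the CloudWatch query, and the check would run on a schedule.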

 

도움이 λ˜μ…¨λ‹€λ©΄ 곡감꾹 λΆ€νƒλ“œλ €μš”~