Spiky Kubernetes HPA with metric number of Pub/Sub unacked messages



























Currently we have a data streaming pipeline: API call -> Google Pub/Sub -> BigQuery. The number of API calls depends on the traffic on the website.



We created a Kubernetes deployment (in GKE) for ingesting data from Pub/Sub into BigQuery. This deployment has a horizontal pod autoscaler (HPA) with metricName: pubsub.googleapis.com|subscription|num_undelivered_messages and targetValue: "5000". This setup is able to autoscale when the traffic increases suddenly. However, it causes spiky scaling.
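For reference, this is my understanding of how the HPA turns that target into a replica count: the documented rule is desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue), skipped when the ratio is within the default tolerance of 0.1. The sketch below is only an illustration of that rule. The function name and the min/max pod limits (1 and 10) are hypothetical and not taken from our actual configuration.

```python
import math


def desired_replicas(current_replicas, backlog, target_value,
                     min_pods, max_pods, tolerance=0.1):
    """Replica count the HPA rule would ask for with a Value-type target.

    min_pods and max_pods are hypothetical limits used for illustration.
    """
    ratio = backlog / target_value
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum number of pods.
    return max(min_pods, min(max_pods, desired))


if __name__ == "__main__":
    # 20k unacked messages against a 5000 target: 2 pods become 8.
    print(desired_replicas(2, 20000, 5000, min_pods=1, max_pods=10))
```

Because the current replica count is multiplied by the ratio, the request keeps growing on every evaluation for as long as the backlog stays above the target.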



What I mean by spiky is as follows:




  1. The number of unacked messages goes above the target value

  2. The autoscaler increases the number of pods

  3. The number of unacked messages decreases slowly, but since it is still above the target value the autoscaler keeps increasing the number of pods --> this happens until we hit the maximum number of pods in the autoscaler

  4. The number of unacked messages decreases until it goes below the target, and it stays very low

  5. The autoscaler reduces the number of pods to the minimum number of pods

  6. The number of unacked messages increases again, we are back in the situation of (1), and it goes into a loop/cycle of spikes
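To make the cycle above concrete, here is a toy simulation of it. Everything except the target of 5000 is an assumption chosen for illustration: the constant arrival rate, the per-pod throughput, the pod limits, and in particular the metric lag (the autoscaler seeing a backlog value a few steps old). It is a sketch of the loop, not a model of our real cluster or of measured data.

```python
def simulate(steps=60, arrival=1000, per_pod=300, target=5000,
             min_pods=1, max_pods=10, lag=3):
    """Toy model of the scaling loop; all defaults except target are made up.

    arrival : messages published per step (constant, non-spiky traffic)
    per_pod : messages one pod can ack per step
    lag     : how many steps old the backlog value seen by the autoscaler is
    """
    pods, backlog = min_pods, 0
    backlogs, pod_counts = [], []
    for _ in range(steps):
        # Backlog grows with arrivals and shrinks with the pods' throughput.
        backlog = max(0, backlog + arrival - pods * per_pod)
        backlogs.append(backlog)
        # The autoscaler reads a stale backlog value (assumed metric lag).
        observed = backlogs[max(0, len(backlogs) - 1 - lag)]
        # ceil(pods * observed / target), done with integer arithmetic.
        desired = -(-(pods * observed) // target)
        pods = max(min_pods, min(max_pods, desired))
        pod_counts.append(pods)
    return backlogs, pod_counts


if __name__ == "__main__":
    backlogs, pod_counts = simulate()
    for step, (b, p) in enumerate(zip(backlogs, pod_counts)):
        print(f"step {step:2d}  backlog {b:5d}  pods {p:2d}")
```

With these made-up numbers the pod count climbs from the minimum to the maximum, falls back to the minimum, and then repeats, even though the arrival rate never changes. Whether metric lag is what actually drives the behaviour on our cluster is something I have not verified.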


Here is the chart when it goes spiky (the traffic is going up, but it is stable and non-spiky):
[Chart: the spiky number of unacknowledged messages in Pub/Sub]



We set an alarm in Stackdriver that fires if the number of unacknowledged messages exceeds 20k, and in this situation it is triggered frequently.



Is there a way to make the HPA more stable (non-spiky) in this case?



Any comment, suggestion, or answer is much appreciated.



Thanks!

































  • Have you checked this document about 'Autoscaling on metrics not related to Kubernetes objects'? See if that suits your scenario.

    – Digil
    Nov 23 '18 at 1:59











  • Yes, I have read the documentation. I use the External metric type and have tried both Value and AverageValue. Unfortunately the autoscaling is still very spiky...

    – Yosua Michael
    Nov 26 '18 at 4:04











  • Seems like this is a defect within the GKE version. Which version are you using? As per the documentation, this issue is already addressed in Kubernetes version 1.12. Hopefully the same will be applied to the latest GKE version, maybe GKE 1.12 or later.

    – Digil
    Nov 30 '18 at 1:43













  • Currently I am still using version 1.10.6-gke.11. The latest version of Kubernetes available in GKE is 1.11.3-gke.18. I will try to upgrade then. Thanks!

    – Yosua Michael
    Nov 30 '18 at 10:17
















kubernetes google-cloud-platform autoscaling google-cloud-pubsub google-kubernetes-engine
















asked Nov 22 '18 at 10:12 by Yosua Michael












