Spiky Kubernetes HPA when scaling on the number of Pub/Sub unacked messages
We currently have a data streaming pipeline: API call -> Google Pub/Sub -> BigQuery. The number of API calls depends on the traffic to the website.
We created a Kubernetes deployment (in GKE) that ingests data from Pub/Sub into BigQuery. This deployment has a horizontal pod autoscaler (HPA) with metricName: pubsub.googleapis.com|subscription|num_undelivered_messages and targetValue: "5000". This setup is able to autoscale when traffic suddenly increases. However, the scaling it produces is spiky.
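For reference, the Kubernetes documentation describes the replica count the HPA asks for as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). Below is a minimal Python sketch of that calculation for the two external-metric target types. The function names and the sample backlog of 12,000 messages are made up for illustration, and the tolerance band and min/max clamping of the real controller are left out.

```python
import math

def desired_replicas_value(current_replicas: int, metric_value: float, target_value: float) -> int:
    """Target type Value: the current replica count is scaled by metric/target."""
    return math.ceil(current_replicas * metric_value / target_value)

def desired_replicas_average_value(metric_value: float, target_average_value: float) -> int:
    """Target type AverageValue: the metric is divided by the replica count before
    being compared to the target, so the result no longer depends on how many
    replicas are currently running."""
    return math.ceil(metric_value / target_average_value)

# Hypothetical backlog of 12,000 unacked messages against the target of 5000,
# with 2 pods currently running.
print(desired_replicas_value(2, 12_000, 5_000))       # ceil(2 * 2.4) = 5
print(desired_replicas_average_value(12_000, 5_000))  # ceil(2.4) = 3
```

With the Value target, every evaluation made while the backlog is still above 5000 multiplies the current replica count again, which would be consistent with the ramp up to the maximum described in this question.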
What I mean by spiky is the following:
- The number of unacked messages goes above the target value
- The autoscaler increases the number of pods
- The number of unacked messages decreases slowly, but because it is still above the target value the autoscaler keeps increasing the number of pods --> this continues until we hit the autoscaler's maximum number of pods
- The number of unacked messages decreases until it goes below the target, and then it stays very low
- The autoscaler reduces the number of pods to the minimum
- The number of unacked messages increases again, we are back in the situation of the first step, and the whole thing turns into a loop/cycle of spikes
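The cycle above can be reproduced with a toy simulation. This is only a sketch under stated assumptions: the arrival rate, per-pod throughput, pod limits and the one-evaluation-per-step timing are invented numbers, and the real HPA also applies a tolerance band and a downscale delay that are ignored here.

```python
import math

def simulate(steps=40, arrival_rate=900.0, per_pod_rate=100.0,
             target=5000.0, min_pods=1, max_pods=20):
    """Toy model of the feedback loop: each step the backlog grows by the
    arrivals and shrinks by what the pods consume, then the Value-type HPA
    rule ceil(pods * backlog / target) picks the next pod count."""
    pods, backlog = min_pods, 0.0
    history = []
    for _ in range(steps):
        backlog = max(0.0, backlog + arrival_rate - pods * per_pod_rate)
        desired = math.ceil(pods * backlog / target)
        pods = min(max_pods, max(min_pods, desired))  # clamp to HPA limits
        history.append((pods, round(backlog)))
    return history

for step, (pods, backlog) in enumerate(simulate(), start=1):
    print(f"step {step:2d}: pods={pods:2d} backlog={backlog}")
```

In this toy run the steady-state need is 9 pods (900 / 100), yet the pod count overshoots to the cap of 20 and then falls back below 9 while the arrival rate stays constant. The backlog is an accumulated quantity, so it keeps reading "above target" long after enough pods have been added.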
Here is the chart when it goes spiky (the traffic is going up, but it is stable and non-spiky):
The spiky number of unacknowledged messages in Pub/Sub
We set an alarm in Stackdriver that fires if the number of unacknowledged messages exceeds 20k, and in this situation it is triggered frequently.
Is there a way to make the HPA more stable (non-spiky) in this case?
Any comment, suggestion, or answer is much appreciated.
Thanks!
kubernetes google-cloud-platform autoscaling google-cloud-pubsub google-kubernetes-engine
Have you checked this document about 'Autoscaling on metrics not related to Kubernetes objects'? See if that suits your scenario.
– Digil
Nov 23 '18 at 1:59
Yes, I have read the documentation. I use the External metric type and have tried both Value and AverageValue. Unfortunately the autoscaling is still very spiky...
– Yosua Michael
Nov 26 '18 at 4:04
This seems like a defect in the GKE version. Which version are you using? According to the documentation, this issue is already addressed in Kubernetes version 1.12. Hopefully the same applies to the latest GKE version, maybe GKE 1.12 or later.
– Digil
Nov 30 '18 at 1:43
Currently I am still using version 1.10.6-gke.11. The latest Kubernetes version available in GKE is 1.11.3-gke.18. I will try to upgrade to it then. Thanks!
– Yosua Michael
Nov 30 '18 at 10:17
asked Nov 22 '18 at 10:12
Yosua Michael