Detect changes during bulk indexing
We are using Elasticsearch v5.6.12 for our database. We update this frequently using the bulk REST api. Some of the time the individual requests won't change anything (i.e. the value of the document that Elasticsearch is already up to date). How can I detect these instances?
I saw this (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html) but I'm not sure it's applicable in our situation.
elasticsearch
add a comment |
We are using Elasticsearch v5.6.12 for our database. We update this frequently using the bulk REST api. Some of the time the individual requests won't change anything (i.e. the value of the document that Elasticsearch is already up to date). How can I detect these instances?
I saw this (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html) but I'm not sure it's applicable in our situation.
elasticsearch
add a comment |
We are using Elasticsearch v5.6.12 for our database. We update this frequently using the bulk REST api. Some of the time the individual requests won't change anything (i.e. the value of the document that Elasticsearch is already up to date). How can I detect these instances?
I saw this (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html) but I'm not sure it's applicable in our situation.
elasticsearch
We are using Elasticsearch v5.6.12 for our database. We update this frequently using the bulk REST api. Some of the time the individual requests won't change anything (i.e. the value of the document that Elasticsearch is already up to date). How can I detect these instances?
I saw this (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html) but I'm not sure it's applicable in our situation.
elasticsearch
elasticsearch
asked Nov 20 at 15:28
bm1729
1,277824
1,277824
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
You can use the noop
detection when checking the result of your bulk queries.
When the bulk query returns, you can iterate over each update result and check if the result
field has a value of noop
(vs updated
)
# Say the document is indexed
PUT test/doc/1
{
"test": "123"
}
# Now you want to bulk update it
POST test/doc/_bulk
{"update":{"_id": "1"}}
{"doc":{"test":"123"}} <-- this will yield `result: noop`
{"update":{"_id": "1"}}
{"doc":{"test":"1234"}} <-- this will yield `result: updated`
{"update":{"_id": "2"}}
{"doc":{"test":"3456"}, "doc_as_upsert": true} <-- this will yield `result: created`
Result:
{
"took" : 6,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 2,
"result" : "noop", <-- see "noop"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"status" : 200
}
},
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 3,
"result" : "updated", <-- see "updated"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_version" : 1,
"result" : "created", <-- see "created"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
]
}
As you can see, when specifying doc_as_upsert: true
for document with id 2, the document will be created and the result
field value will be created
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
|
show 3 more comments
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396315%2fdetect-changes-during-bulk-indexing%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can use the noop
detection when checking the result of your bulk queries.
When the bulk query returns, you can iterate over each update result and check if the result
field has a value of noop
(vs updated
)
# Say the document is indexed
PUT test/doc/1
{
"test": "123"
}
# Now you want to bulk update it
POST test/doc/_bulk
{"update":{"_id": "1"}}
{"doc":{"test":"123"}} <-- this will yield `result: noop`
{"update":{"_id": "1"}}
{"doc":{"test":"1234"}} <-- this will yield `result: updated`
{"update":{"_id": "2"}}
{"doc":{"test":"3456"}, "doc_as_upsert": true} <-- this will yield `result: created`
Result:
{
"took" : 6,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 2,
"result" : "noop", <-- see "noop"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"status" : 200
}
},
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 3,
"result" : "updated", <-- see "updated"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_version" : 1,
"result" : "created", <-- see "created"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
]
}
As you can see, when specifying doc_as_upsert: true
for document with id 2, the document will be created and the result
field value will be created
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
|
show 3 more comments
You can use the noop
detection when checking the result of your bulk queries.
When the bulk query returns, you can iterate over each update result and check if the result
field has a value of noop
(vs updated
)
# Say the document is indexed
PUT test/doc/1
{
"test": "123"
}
# Now you want to bulk update it
POST test/doc/_bulk
{"update":{"_id": "1"}}
{"doc":{"test":"123"}} <-- this will yield `result: noop`
{"update":{"_id": "1"}}
{"doc":{"test":"1234"}} <-- this will yield `result: updated`
{"update":{"_id": "2"}}
{"doc":{"test":"3456"}, "doc_as_upsert": true} <-- this will yield `result: created`
Result:
{
"took" : 6,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 2,
"result" : "noop", <-- see "noop"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"status" : 200
}
},
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 3,
"result" : "updated", <-- see "updated"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_version" : 1,
"result" : "created", <-- see "created"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
]
}
As you can see, when specifying doc_as_upsert: true
for document with id 2, the document will be created and the result
field value will be created
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
|
show 3 more comments
You can use the noop
detection when checking the result of your bulk queries.
When the bulk query returns, you can iterate over each update result and check if the result
field has a value of noop
(vs updated
)
# Say the document is indexed
PUT test/doc/1
{
"test": "123"
}
# Now you want to bulk update it
POST test/doc/_bulk
{"update":{"_id": "1"}}
{"doc":{"test":"123"}} <-- this will yield `result: noop`
{"update":{"_id": "1"}}
{"doc":{"test":"1234"}} <-- this will yield `result: updated`
{"update":{"_id": "2"}}
{"doc":{"test":"3456"}, "doc_as_upsert": true} <-- this will yield `result: created`
Result:
{
"took" : 6,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 2,
"result" : "noop", <-- see "noop"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"status" : 200
}
},
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 3,
"result" : "updated", <-- see "updated"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_version" : 1,
"result" : "created", <-- see "created"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
]
}
As you can see, when specifying doc_as_upsert: true
for document with id 2, the document will be created and the result
field value will be created
You can use the noop
detection when checking the result of your bulk queries.
When the bulk query returns, you can iterate over each update result and check if the result
field has a value of noop
(vs updated
)
# Say the document is indexed
PUT test/doc/1
{
"test": "123"
}
# Now you want to bulk update it
POST test/doc/_bulk
{"update":{"_id": "1"}}
{"doc":{"test":"123"}} <-- this will yield `result: noop`
{"update":{"_id": "1"}}
{"doc":{"test":"1234"}} <-- this will yield `result: updated`
{"update":{"_id": "2"}}
{"doc":{"test":"3456"}, "doc_as_upsert": true} <-- this will yield `result: created`
Result:
{
"took" : 6,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 2,
"result" : "noop", <-- see "noop"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"status" : 200
}
},
{
"update" : {
"_index" : "test",
"_type" : "doc",
"_id" : "1",
"_version" : 3,
"result" : "updated", <-- see "updated"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
},
{
"_index" : "test",
"_type" : "doc",
"_id" : "2",
"_version" : 1,
"result" : "created", <-- see "created"
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
]
}
As you can see, when specifying doc_as_upsert: true
for document with id 2, the document will be created and the result
field value will be created
edited Nov 20 at 15:55
answered Nov 20 at 15:36
Val
101k6132169
101k6132169
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
|
show 3 more comments
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
What will the result field be in the case that the document doesn't already exist?
– bm1729
Nov 20 at 15:52
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
It depends on what you're sending in your bulk query (index, update, etc). Feel free to show your bulk query
– Val
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
Also, is it able to give you the diff (what was there and what you have changed it to)?
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
So far we've been using index.
– bm1729
Nov 20 at 15:53
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
No ES will not give you the diff. See my updated answer to see if it fits your use case.
– Val
Nov 20 at 15:56
|
show 3 more comments
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53396315%2fdetect-changes-during-bulk-indexing%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown