File Lock in Scala
How can I lock a .txt file at an HDFS path using Scala, so that other processes cannot access the file until it is unlocked? This needs to be implemented in Scala on HDFS. Can anyone help me with detailed code?
scala apache-spark locking hdfs
asked Nov 26 '18 at 7:49
vijay
16
AFAIK there is no such feature provided by HDFS. You can implement any form of distributed advisory locking on your own or move a file to another directory/change permissions to temporarily restrict access.
– yǝsʞǝlA
Nov 26 '18 at 7:56
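To illustrate the advisory-locking idea: file creation on HDFS is atomic, so an agreed-upon lock file can serve as the lock, and whichever process creates it first holds the lock. A minimal Scala sketch (the lock path and the 1-second poll interval are assumptions, not anything from the question):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsAdvisoryLock {
  private val fs = FileSystem.get(new Configuration())

  // Acquire: createNewFile is atomic on HDFS, so only one
  // process can succeed in creating the lock file.
  def tryLock(lockPath: Path): Boolean = fs.createNewFile(lockPath)

  // Block until the lock is acquired, polling every second.
  def lock(lockPath: Path): Unit =
    while (!tryLock(lockPath)) Thread.sleep(1000)

  // Release: delete the lock file so the next process can acquire it.
  def unlock(lockPath: Path): Unit = fs.delete(lockPath, false)
}
```

Note this is advisory only: processes that do not check the lock file can still read the data file directly.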
HDFS doesn't support file locks - what exactly are you trying to achieve? If you have Hadoop, you also have ZooKeeper installed, and you can use that for locking.
– Arnon Rotem-Gal-Oz
Nov 26 '18 at 10:00
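For the ZooKeeper route, the Apache Curator client packages a distributed-mutex recipe. A sketch, assuming Curator is on the classpath and ZooKeeper runs at localhost:2181 (both assumptions):

```scala
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.framework.recipes.locks.InterProcessMutex
import org.apache.curator.retry.ExponentialBackoffRetry

object ZkFileLock {
  // Run `body` while holding a distributed mutex at `zkPath`.
  def withLock[T](zkPath: String)(body: => T): T = {
    val client = CuratorFrameworkFactory.newClient(
      "localhost:2181", new ExponentialBackoffRetry(1000, 3))
    client.start()
    val mutex = new InterProcessMutex(client, zkPath)
    mutex.acquire() // blocks until the lock is held
    try body
    finally {
      mutex.release()
      client.close()
    }
  }
}

// Usage: ZkFileLock.withLock("/locks/maxid") { /* read & rewrite the file */ }
```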
1 Answer
If you understand the Hadoop design, it is optimized for write-once, read-many (WORM) operation: many clients may read the same file concurrently, so conceptually it is not possible to lock a file. Moreover, while a file is being written to HDFS, the NameNode (master node) takes care of consistency, so there is no need to lock files. I hope this answers your question; if not, please elaborate on your overall problem statement so we can get better insight into the right solution approach or platform.

answered Nov 26 '18 at 10:30
H Roy
The file Maxid.txt contains the max batch id. I want to read the id and increment the value; at the same time, another process should not increment it. When my process completes, I need to unlock the file so the other process can access it. In short, I need to take a lock on the file I am reading so that other concurrent processes cannot read it, wait until the max value is written back to the file, and then release the lock so the next process can gain access.
– vijay
Nov 26 '18 at 11:00
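For that read-increment-write cycle, the lock-file sketch above can wrap the whole critical section. A sketch under the assumption that the counter lives at /data/Maxid.txt with a sibling lock file (both paths are placeholders):

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import scala.io.Source

object MaxIdCounter {
  private val fs       = FileSystem.get(new Configuration())
  private val dataPath = new Path("/data/Maxid.txt")
  private val lockPath = new Path("/data/Maxid.txt.lock")

  def incrementMaxId(): Long = {
    // Spin until we create the lock file (creation is atomic on HDFS).
    while (!fs.createNewFile(lockPath)) Thread.sleep(500)
    try {
      // Read the current id, increment it, and rewrite the file.
      val in = fs.open(dataPath)
      val id =
        try Source.fromInputStream(in).mkString.trim.toLong
        finally in.close()
      val next = id + 1
      val out  = fs.create(dataPath, true) // overwrite existing content
      try out.write(next.toString.getBytes(StandardCharsets.UTF_8))
      finally out.close()
      next
    } finally fs.delete(lockPath, false) // release the lock
  }
}
```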
As I indicated, HDFS follows a WORM approach, so you have to build exclusive locking yourself. The Hadoop ecosystem projects that do the computation each generate a Job ID, and you need to ensure that while one Job ID is working on the file, another cannot access it. You can maintain a lock flag in some global context (for example an RDBMS), and your program can check whether the flag is free before accessing the file; if a job is still running, the new job does not start.
– H Roy
Nov 26 '18 at 12:09
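One way to keep such a lock flag in an RDBMS is a single-row lock table claimed by an atomic UPDATE, where the update count tells you whether the lock was won. A JDBC sketch (the table schema, connection URL, and credentials are all assumptions):

```scala
import java.sql.DriverManager

object DbLockFlag {
  // Assumes a table created up front:
  //   CREATE TABLE job_lock (resource VARCHAR PRIMARY KEY, holder VARCHAR NULL);
  //   INSERT INTO job_lock VALUES ('maxid', NULL);
  private val url = "jdbc:postgresql://db:5432/jobs" // placeholder

  // Try to claim the lock for this job id; the UPDATE only
  // succeeds if no other job currently holds the lock.
  def tryAcquire(jobId: String): Boolean = {
    val conn = DriverManager.getConnection(url, "user", "pass")
    try {
      val st = conn.prepareStatement(
        "UPDATE job_lock SET holder = ? WHERE resource = 'maxid' AND holder IS NULL")
      st.setString(1, jobId)
      st.executeUpdate() == 1 // 1 row updated => we hold the lock
    } finally conn.close()
  }

  // Release the lock, but only if we are the holder.
  def release(jobId: String): Unit = {
    val conn = DriverManager.getConnection(url, "user", "pass")
    try {
      val st = conn.prepareStatement(
        "UPDATE job_lock SET holder = NULL WHERE resource = 'maxid' AND holder = ?")
      st.setString(1, jobId)
      st.executeUpdate()
    } finally conn.close()
  }
}
```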
@vijay If you intend to modify the data, some kind of database would be better than plain files. HBase or Cassandra may already provide a built-in lock.
– Luis Miguel Mejía Suárez
Nov 26 '18 at 15:24
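With HBase specifically, the lock could be avoided altogether, since incrementColumnValue is atomic on the region server: each process just asks for the next id. A sketch (the table, column family, and qualifier names are assumptions):

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes

object HBaseMaxId {
  def nextId(): Long = {
    val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
    try {
      val table = conn.getTable(TableName.valueOf("counters"))
      try
        // Atomic server-side increment: no client-side locking needed.
        table.incrementColumnValue(
          Bytes.toBytes("maxid"), Bytes.toBytes("c"), Bytes.toBytes("value"), 1L)
      finally table.close()
    } finally conn.close()
  }
}
```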