Java: How to get a file path from MySQL and load the corresponding file from the directory?



























I have a Java method that needs to scan a MySQL table for file paths.



Here is a sample table filequeue:



UniqueID   FilePath              Status
1          C:\Folder1\abc.pdf    Active
2          C:\Folder1\def.pdf    Active
3          C:\Folder1\efg.pdf    Error


I would like to scan the table for rows with Status = Active, read the file path from each row, locate the actual file at that location, and then process it (extract its text).



I am new to Java, and this is what I have so far:



// Needs java.io.InputStream, java.sql.* and java.util.Properties
public void doScan_DB() throws Exception {

    // Load the JDBC settings from db.properties on the classpath
    Properties props = new Properties();
    InputStream in = getClass().getResourceAsStream("/db.properties");
    props.load(in);
    in.close();

    // Load the driver class explicitly if one is configured
    String driver = props.getProperty("jdbc.driver");
    if (driver != null) {
        Class.forName(driver);
    }

    String url = props.getProperty("jdbc.url");
    String username = props.getProperty("jdbc.username");
    String password = props.getProperty("jdbc.password");

    // try-with-resources closes the result set, statement and connection
    try (Connection con = DriverManager.getConnection(url, username, password);
         Statement statement = con.createStatement();
         ResultSet rs = statement.executeQuery("select * from filequeue where Status='Active'")) {

        while (rs.next()) {
            // grab those files and call index()
        }
    }
}
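
For readers following along, the method above expects a db.properties file on the classpath with the four keys it reads. The sketch below shows what such a file might contain and how java.util.Properties parses it. Every value in SAMPLE (driver class, URL, user, password) is a placeholder assumed for illustration, and DbSettingsExample and parse are hypothetical names that do not appear in the original post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class DbSettingsExample {

    // Hypothetical contents of db.properties; all values are placeholders.
    static final String SAMPLE =
            "jdbc.driver=com.mysql.cj.jdbc.Driver\n"
          + "jdbc.url=jdbc:mysql://localhost:3306/mydb\n"
          + "jdbc.username=appuser\n"
          + "jdbc.password=secret\n";

    // Parses properties text the same way props.load(in) does for a stream.
    static Properties parse(String text) {
        Properties props = new Properties();
        try (StringReader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        // These are the keys doScan_DB() looks up.
        System.out.println(props.getProperty("jdbc.driver"));
        System.out.println(props.getProperty("jdbc.url"));
        System.out.println(props.getProperty("jdbc.username"));
    }
}
```

Only the first '=' on each line separates key from value, so the colons and slashes inside the JDBC URL need no escaping.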


From here, how do I load each file and then call my index function to extract the text from it?



Also, do let me know if my way of doing it is wrong.



EDIT:
Here is my other function, which extracts the text from PDFs:



public void doScan() throws Exception {

    File folder = new File("D:\\PDF1");
    File[] listOfFiles = folder.listFiles();

    for (File file : listOfFiles) {
        if (file.isFile()) {
            // HashSet<String> uniqueWords = new HashSet<>();
            ArrayList<String> list = new ArrayList<String>();
            String path = "D:\\PDF1\\" + file.getName();
            try (PDDocument document = PDDocument.load(new File(path))) {

                if (!document.isEncrypted()) {

                    PDFTextStripper tStripper = new PDFTextStripper();
                    String pdfFileInText = tStripper.getText(document);
                    String[] lines = pdfFileInText.split("\\r?\\n");
                    for (String line : lines) {
                        String[] words = line.split(" ");

                        for (String word : words) {
                            // remove one or more special characters at the end of the word
                            // or at the beginning of the word
                            list.add(word.replaceAll("([\\W]+$)|(^[\\W]+)", ""));
                            // uniqueWords.add(word.replaceAll("([\\W]+$)|(^[\\W]+)", ""));
                        }
                    }
                }
            } catch (IOException e) {
                System.err.println("Exception while trying to read pdf document - " + e);
            }

            String[] words1 = list.toArray(new String[list.size()]);
            // String[] words2 = uniqueWords.toArray(new String[uniqueWords.size()]);

            // MysqlAccessIndex connection = new MysqlAccessIndex();

            index(words1, path);

            System.out.println("Completed");
        }
    }
}
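
The word-cleaning step inside doScan() can be tried on its own, without PDFBox or a database. The following is a minimal sketch of that step, assuming the same split patterns and the same regular expression as above. WordCleaner and cleanWords are hypothetical names, and skipping words that become empty after stripping is an addition that the original loop does not make.

```java
import java.util.ArrayList;
import java.util.List;

final class WordCleaner {

    // Splits text into lines, then into space-separated words, and strips
    // non-word characters from the start and end of each word.
    static List<String> cleanWords(String text) {
        List<String> words = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            for (String raw : line.split(" ")) {
                String word = raw.replaceAll("(\\W+$)|(^\\W+)", "");
                // Assumption: tokens made only of punctuation are dropped.
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Hello, world, foo, bar]
        System.out.println(cleanWords("Hello, world!\r\n(foo) -- bar."));
    }
}
```

Punctuation inside a word (for example the apostrophe in "don't") is left alone, because the pattern is anchored to the start and end of each token.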









      java mysql














      asked Nov 23 '18 at 7:57 by Daredevil, edited Nov 23 '18 at 8:23
























          1 Answer














          You can read the path from the result set and load the file like this:



              while (rs.next()) {
                  // Column 2 of filequeue is FilePath
                  String path = rs.getString(2);
                  // Create a PdfDocument instance
                  PdfDocument doc = new PdfDocument();
                  try {
                      // Load an existing document
                      doc.load(path);
                      // Get page count and display it on console output
                      System.out.println(
                          "Number of pages in " + path + " is " +
                          doc.getPageCount());
                      // Close document
                      doc.close();
                  } catch (IOException | PdfException e) {
                      e.printStackTrace();
                  }
              }


          You will need additional JARs that provide ready-made methods for working with PDFs.



          Visit this link for more information:



          https://www.gnostice.com/nl_article.asp?id=101&t=How_to_Read_and_Write_PDF_Files_in_Java






          answered Nov 23 '18 at 8:15 by Ayush













          • I already have the JAR files for a PDF library called PDFBox

            – Daredevil
            Nov 23 '18 at 8:15











          • But how does this get the file from that directory and do something to it?

            – Daredevil
            Nov 23 '18 at 8:16











          • There are many JARs available; please visit the link and give it a read :)

            – Ayush
            Nov 23 '18 at 8:16











          • PdfDocument doc = new PdfDocument(); creates an instance. doc.load(path) loads the file into that instance, which makes "doc" ready for further actions. You can then work with the variable "doc" and call your methods on it.

            – Ayush
            Nov 23 '18 at 8:19











          • So it locates and loads the PDF file from the directory, is that correct? Then I can just call whatever method or function I need to extract text from that PDF, right?

            – Daredevil
            Nov 23 '18 at 8:20
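
On the question raised in the comments about how the file is actually found: the FilePath value is only a string, and wrapping it in java.io.File (or passing it to a loader such as PDFBox's PDDocument.load) is what resolves it on disk. The sketch below is one possible way to handle the paths read from the table, using only the standard library. QueuedFileLoader and processExisting are hypothetical names, and the handler stands in for whatever extraction and index(...) call is used. Collecting the missing paths is an assumed design choice, for example so that those rows could be marked with Status = Error.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class QueuedFileLoader {

    // Calls handler for every path that points to a readable regular file
    // and returns the paths that could not be found or read.
    static List<String> processExisting(List<String> filePaths, Consumer<File> handler) {
        List<String> missing = new ArrayList<>();
        for (String filePath : filePaths) {
            File file = new File(filePath);
            if (file.isFile() && file.canRead()) {
                // With PDFBox this is where PDDocument.load(file) and
                // new PDFTextStripper().getText(document) would be called.
                handler.accept(file);
            } else {
                missing.add(filePath);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // In the real loop these strings would come from rs.getString("FilePath").
        List<String> fromDb = List.of("C:\\Folder1\\abc.pdf", "C:\\Folder1\\def.pdf");
        List<String> missing = processExisting(fromDb,
                file -> System.out.println("Would process " + file.getPath()));
        System.out.println("Not found: " + missing);
    }
}
```

Reading the column by name, rs.getString("FilePath"), instead of by index keeps the loop working if the column order of filequeue ever changes.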


















