Split a string into a list of tokens

2 votes

I have the following assignment, which I succeeded in solving, but the code is very inefficient. I would appreciate it if someone could show me a more efficient way, perhaps with substring. Note that I am not allowed to use imports or regexes, or to add more functions.

/**
 * Separates a given string into tokens, which are the "words" that are
 * separated by one or more occurrences of the given separator character.
 * Returns the tokens as an array of String values.
 */
public static String[] tokenize (String str, char separator) {
    // Removes all the occurrences of the separator at the beginning and end of str
    String source = trim(str, separator);
    String[] tokens = new String[charRunCount (source,separator)+1];
    String tmp = ""; // a string in order to take a word, then run over this string
    int j = 0;
    int i = 0;
    while (i < tokens.length) {
        if ( source.charAt (j) != separator ) {
            do {
                tmp += source.charAt (j);
                if ( j >= source.length () - 1 ) {
                    break;
                }
                else { // so that we match the source length
                    j++;
                }
            } while (source.charAt (j) != separator);
        }
        if ( source.charAt (j) == separator ) {
            j++;
            while (source.charAt (j) == separator) {
                j++;
            }
        }
        tokens[i] = tmp; // taking the token into place
        tmp = ""; // resetting the token so we can begin anew

        i++;
    }
    return tokens;
}

The charRunCount() function:

public static int charRunCount(String str, char c){
    char last = 0;
    int counter = 0;
    for (int i = 0; i < str.length(); i++) {
        // whenever a run starts.
        if (last != c && str.charAt(i) == c) {
            counter++;
        }
        last = str.charAt(i);
    }
    return counter;
}
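
To make the array sizing concrete, here is a small sketch of how charRunCount() relates to the token count. The class name CharRunCountDemo is only for illustration, and the method body is copied from the question. It assumes the input has already been trimmed of leading and trailing separators, as tokenize() does.

```java
class CharRunCountDemo {
    // Same logic as charRunCount() in the question: counts maximal runs of c in str.
    static int charRunCount(String str, char c) {
        char last = 0;
        int counter = 0;
        for (int i = 0; i < str.length(); i++) {
            // A run starts where c appears and the previous character was not c.
            if (last != c && str.charAt(i) == c) {
                counter++;
            }
            last = str.charAt(i);
        }
        return counter;
    }

    public static void main(String[] args) {
        // "a,,b,c" contains two runs of ',' and therefore three tokens: a, b, c.
        System.out.println(charRunCount("a,,b,c", ',') + 1); // prints 3
        // An empty string has no runs, so the formula still reports one token.
        System.out.println(charRunCount("", ',') + 1);       // prints 1
    }
}
```

The second call shows that the "runs + 1" formula overcounts when the trimmed string is empty, which is worth keeping in mind if trim() can return an empty string.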

edited Nov 6 at 9:37

asked Nov 5 at 16:32

Yuki1112

112

Welcome to Code Review! I changed the title so that it describes what the code does per site goals: "State what your code does in your title, not your main concerns about it." Please check that I haven't misrepresented your code, and correct it if I have.
– Toby Speight
Nov 5 at 16:49

Does Java have string views? That is, a non-owning view onto a string? If it does, you should consider using them. You marked your question with the performance tag so using string views rather than allocating a string for each token will considerably improve performance.
– Kerndog73
Nov 6 at 7:26
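
As a hedged aside on the string view question: Java has no type called a string view, and in current JDKs substring() copies the characters. The closest standard-library tool is probably java.nio.CharBuffer.wrap(CharSequence, int, int), which returns a read-only view over part of a string without copying. It needs an import, so it would not satisfy the assignment's restrictions; the sketch below (class and method names are made up for illustration) only shows what such a view looks like.

```java
import java.nio.CharBuffer;

class TokenViewSketch {
    // Returns a read-only view of source[start, end) without copying characters.
    static CharSequence view(String source, int start, int end) {
        return CharBuffer.wrap(source, start, end);
    }

    public static void main(String[] args) {
        String source = "alpha,beta";
        CharSequence first = view(source, 0, 5);
        CharSequence second = view(source, 6, 10);
        System.out.println(first.length()); // 5
        System.out.println(second);         // beta (a String is only built when printing)
    }
}
```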

java performance strings parsing reinventing-the-wheel

4 Answers

2 votes

tmp is never a good name for a variable. In this case, you should call it token instead, or perhaps word. And you rightly complain that building strings using repeated += operations is inefficient, and correctly suggest that .substring() would be better.

Logically, then, you need to find the starting and ending indexes of each token! So, let's define those helper functions (as private static functions):

/**
 * Considering str starting at startIndex, find the index at which the
 * next token starts.
 *
 * @return The index of the start of a token (or str.length() if no more
 *         tokens).
 */
private static int start(String str, char sep, int startIndex) {
    int i;
    // Skip over any run of separator characters.
    for (i = startIndex; i < str.length() && str.charAt(i) == sep; i++);
    return i;
}

/**
 * Considering str starting at startIndex, find the index at which the
 * current token ends.
 *
 * @return The index just beyond the end of a token (the index of a
 *         sep character, or str.length() if this is the last token)
 */
private static int end(String str, char sep, int startIndex) {
    // Precondition: startIndex points at the first character of a token.
    assert str.charAt(startIndex) != sep;
    int i;
    for (i = startIndex; i < str.length() && str.charAt(i) != sep; i++);
    return i;
}

Then, we can use them in tokenize():

public static String[] tokenize(String str, char sep) {
    // First pass: count the tokens so the array can be sized exactly.
    int tokenCount = 0;
    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); e = end(str, sep, s)) {
        tokenCount++;
    }
    String[] tokens = new String[tokenCount];
    // Second pass: extract each token with a single substring() call.
    tokenCount = 0;
    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); ) {
        tokens[tokenCount++] = str.substring(s, e = end(str, sep, s));
    }
    assert tokens.length == tokenCount;
    return tokens;
}

Notice that now you can take advantage of the helper functions to predetermine the size of the array. Also, a lot of the repetitiveness of your conditions and loops is eliminated.

All of the analysis is done using string indexes, so there is no string manipulation other than just the essential .substring() calls — and even the trim() call is gone!
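
If helper functions really are off limits, the same index-based idea can be written inline. The following is only a sketch under that assumption, not the answer's original code; the class name InlineTokenizeSketch is invented for illustration. It keeps the two passes (count, then extract) and still needs no trim() and no += concatenation.

```java
class InlineTokenizeSketch {
    static String[] tokenize(String str, char sep) {
        int len = str.length();

        // Pass 1: count tokens. A token starts at a non-separator character
        // that is either first in the string or preceded by a separator.
        int count = 0;
        for (int i = 0; i < len; i++) {
            if (str.charAt(i) != sep && (i == 0 || str.charAt(i - 1) == sep)) {
                count++;
            }
        }

        // Pass 2: cut each token out with a single substring() call.
        String[] tokens = new String[count];
        int n = 0;
        int i = 0;
        while (i < len) {
            while (i < len && str.charAt(i) == sep) i++; // skip separators
            if (i == len) break;
            int start = i;
            while (i < len && str.charAt(i) != sep) i++; // scan to the token's end
            tokens[n++] = str.substring(start, i);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Leading, trailing and repeated separators are all ignored.
        for (String t : tokenize(",,a,,bc,d,", ',')) {
            System.out.println(t); // a, bc, d on separate lines
        }
    }
}
```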

answered Nov 6 at 7:43

200_success

127k15148410

Thank you for your answer. I agree it's more efficient to separate the functions; however, we are not allowed to do that in this assignment.
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52


0 votes

I'm not sure what you mean by "I am not allowed to use imports or regexes or add more functions", but if you can use String#indexOf this can be greatly simplified:

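import java.util.LinkedList;
import java.util.List;

public class Tokenizer {
    private final char separator;

    public Tokenizer(char separator) {
        this.separator = separator;
    }

    public List<String> tokenize(String string) {
        List<String> tokens = new LinkedList<>();
        int start = 0, end = 0;
        while (start < string.length() && (end = string.indexOf(separator, start)) > -1) {
            // Skip empty tokens so that a run of separators counts as one.
            if (end > start) {
                tokens.add(string.substring(start, end));
            }
            start = end + 1;
        }
        // Add the trailing token, if there is one.
        if (start < string.length()) {
            tokens.add(string.substring(start));
        }
        return tokens;
    }
}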

answered Nov 7 at 7:17

gervais.b

96139

List would be imported from java.util.
– 200_success
Nov 7 at 13:11

Indeed. Then we should either loop twice (once to count the occurrences of the separator and create a correctly sized array, then again to populate it) or manage the growth of the array ourselves.
– gervais.b
Nov 7 at 22:03
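
The second option mentioned in this comment, growing the array by hand, could look roughly like the sketch below. It is an illustration under assumptions, not code from the thread: the class name, the initial capacity of 4 and the doubling strategy are arbitrary choices. It uses only String#indexOf and System.arraycopy, both of which are available without imports.

```java
class GrowingArraySketch {
    static String[] tokenize(String str, char sep) {
        String[] tokens = new String[4]; // initial capacity is an arbitrary choice
        int count = 0;
        int start = 0;
        while (start < str.length()) {
            int end = str.indexOf(sep, start);
            if (end < 0) end = str.length();  // no more separators: token runs to the end
            if (end > start) {                // skip empty tokens between separators
                if (count == tokens.length) { // array is full: double its capacity
                    String[] bigger = new String[tokens.length * 2];
                    System.arraycopy(tokens, 0, bigger, 0, count);
                    tokens = bigger;
                }
                tokens[count++] = str.substring(start, end);
            }
            start = end + 1;
        }
        // Copy into an array of exactly the right size before returning.
        String[] result = new String[count];
        System.arraycopy(tokens, 0, result, 0, count);
        return result;
    }
}
```

Doubling keeps the total copying work proportional to the number of tokens, at the cost of one final copy to trim the array; the two-pass approach avoids that copy but scans the input twice.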


0 votes

I'd like to suggest doing it in a way that doesn't have two loops that basically do the same thing, and that is more Java-esque.

Instead of storing the result in an array, which requires you to first count the potential tokens, store the tokens in a LinkedList.

Normally one would then just return that list instead of an array, because in Java arrays are usually used as an implementation detail hidden inside objects. If you still need to return an array, LinkedList has a toArray method that copies the list into an array.

My solution would look like this:

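import java.util.Collection;
import java.util.LinkedList;

public static String[] tokenize(String str, char separator) {

    Collection<String> result = new LinkedList<>();

    int len = str.length();
    int pos = 0;
    int start = 0;

    while (pos < len) {
        // Inside a token: keep scanning.
        if (str.charAt(pos) != separator) {
            pos++;
            continue;
        }
        // Hit a separator: emit the token that just ended, if any.
        if (pos > start) {
            result.add(str.substring(start, pos));
        }
        // Skip the whole run of separators.
        do {
            pos++;
            start = pos;
        } while (pos < len && str.charAt(pos) == separator);
    }

    // Emit the last token if the string did not end with a separator.
    if (pos > start) {
        result.add(str.substring(start, pos));
    }

    return result.toArray(new String[result.size()]);
}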

EDIT: Here's a version without additional imports :)

static class LinkedList {

    private class Node {
        Node(String value) {
            this.value = value;
        }

        String value;
        Node next;
    }

    private Node first = null;
    private Node last = null;
    private int size = 0;

    public void add(String value) {
        Node node = new Node(value);
        size++;
        if (first == null) {
            first = last = node;
            return;
        }
        last.next = node;
        last = node;
    }

    public String[] toArray() {
        String[] array = new String[size];

        int i = 0;
        Node current = first;
        while (current != null) {
            array[i] = current.value;
            i++;
            current = current.next;
        }

        return array;
    }

}

public static String[] tokenize(String str, char separator) {

    LinkedList result = new LinkedList();

    int len = str.length();
    int pos = 0;
    int start = 0;

    while (pos < len) {
        if (str.charAt(pos) != separator) {
            pos++;
            continue;
        }
        if (pos > start) {
            result.add(str.substring(start, pos));
        }
        do {
            pos++;
            start = pos;
        } while (pos < len && str.charAt(pos) == separator);
    }

    if (pos > start) {
        result.add(str.substring(start, pos));
    }

    return result.toArray();
}

edited Nov 7 at 7:57

answered Nov 6 at 9:36

RoToRa

6,1761236

Collection would be imported from java.util.
– 200_success
Nov 6 at 13:53

@200_success Ah, I missed that requirement. I can't use LinkedList then anyway. But in that case I'd just write a simple linked list class myself, which isn't against the rules :)
– RoToRa
Nov 7 at 7:38


0 votes

I found a simple way for beginners:

public static String[] tokenize (String str, char separator) {
    // Removes all the occurrences of the separator at the beginning and end of str
    String source = trim(str, separator);
    // In the following statement, replace the 0 with the correct number of tokens, and complete the missing code.
    String[] tokens = new String[charRunCount (source,separator)+1]; //since we start from 0.
    int i = 0;
    int j = 0;
    int sourceLen = source.length ();
    while (i < sourceLen) {
        String tmp = "";
        while (i < sourceLen && source.charAt (i) != separator) { // add each character in the source until you hit separator
            tmp += source.charAt (i++);
        }
        if ( !(tmp.equals ("")) ) { //so that if i=separator don't go into the token
            tokens[j++] = tmp;
        }
        i++;
    }
    return tokens;
}

answered yesterday

Yuki1112

112

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

up vote
2
down vote

Logically, then, you need to find the starting and ending indexes of each token! So, let's define those helper functions (as private static functions):

/**

 * Considering str starting at startIndex, find the index at which the 

 * next token starts.

 *

 * @return The index of the start of a token (or str.length() if no more

 *         tokens).

 */

private static int start(String str, char sep, int startIndex) {

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) == sep; i++);

    return i;

}



/**

 * Considering str starting at startIndex, find the index at which the 

 * current token ends.

 *

 * @return The index just beyond the end of a token (the index of a

 *         sep character, or str.length() if this is the last token)

 */

private static int end(String str, char sep, int startIndex) {

    assert(str.charAt(sep) != sep);

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) != sep; i++);

    return i;

}

Then, we can use them in tokenize():

public static String tokenize(String str, char sep) {

    int tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); e = end(str, sep, s)) {

        tokenCount++;

    }

    String tokens = new String[tokenCount];

    tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); ) {

        tokens[tokenCount++] = str.substring(s, e = end(str, sep, s));

    }

    assert(tokens.length == tokenCount);

    return tokens;

}

Notice that now you can take advantage of the helper functions to predetermine the size of the array. Also, a lot of the repetitiveness of your conditions and loops is eliminated.

All of the analysis is done using string indexes, so there is no string manipulation other than just the essential .substring() calls — and even the trim() call is gone!

answered Nov 6 at 7:43

200_success

127k15148410

thank you for your answer, I agree It's more efficient to separate the functions, however we are not allowed to do that in this assignment
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52

add a comment |

up vote
2
down vote

Logically, then, you need to find the starting and ending indexes of each token! So, let's define those helper functions (as private static functions):

/**

 * Considering str starting at startIndex, find the index at which the 

 * next token starts.

 *

 * @return The index of the start of a token (or str.length() if no more

 *         tokens).

 */

private static int start(String str, char sep, int startIndex) {

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) == sep; i++);

    return i;

}



/**

 * Considering str starting at startIndex, find the index at which the 

 * current token ends.

 *

 * @return The index just beyond the end of a token (the index of a

 *         sep character, or str.length() if this is the last token)

 */

private static int end(String str, char sep, int startIndex) {

    assert(str.charAt(sep) != sep);

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) != sep; i++);

    return i;

}

Then, we can use them in tokenize():

public static String tokenize(String str, char sep) {

    int tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); e = end(str, sep, s)) {

        tokenCount++;

    }

    String tokens = new String[tokenCount];

    tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); ) {

        tokens[tokenCount++] = str.substring(s, e = end(str, sep, s));

    }

    assert(tokens.length == tokenCount);

    return tokens;

}

Notice that now you can take advantage of the helper functions to predetermine the size of the array. Also, a lot of the repetitiveness of your conditions and loops is eliminated.

All of the analysis is done using string indexes, so there is no string manipulation other than just the essential .substring() calls — and even the trim() call is gone!

answered Nov 6 at 7:43

200_success

127k15148410

thank you for your answer, I agree It's more efficient to separate the functions, however we are not allowed to do that in this assignment
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52

add a comment |

up vote
2
down vote

Logically, then, you need to find the starting and ending indexes of each token! So, let's define those helper functions (as private static functions):

/**

 * Considering str starting at startIndex, find the index at which the 

 * next token starts.

 *

 * @return The index of the start of a token (or str.length() if no more

 *         tokens).

 */

private static int start(String str, char sep, int startIndex) {

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) == sep; i++);

    return i;

}



/**

 * Considering str starting at startIndex, find the index at which the 

 * current token ends.

 *

 * @return The index just beyond the end of a token (the index of a

 *         sep character, or str.length() if this is the last token)

 */

private static int end(String str, char sep, int startIndex) {

    assert(str.charAt(sep) != sep);

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) != sep; i++);

    return i;

}

Then, we can use them in tokenize():

public static String tokenize(String str, char sep) {

    int tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); e = end(str, sep, s)) {

        tokenCount++;

    }

    String tokens = new String[tokenCount];

    tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); ) {

        tokens[tokenCount++] = str.substring(s, e = end(str, sep, s));

    }

    assert(tokens.length == tokenCount);

    return tokens;

}

Notice that now you can take advantage of the helper functions to predetermine the size of the array. Also, a lot of the repetitiveness of your conditions and loops is eliminated.

All of the analysis is done using string indexes, so there is no string manipulation other than just the essential .substring() calls — and even the trim() call is gone!

answered Nov 6 at 7:43

200_success

127k15148410

Logically, then, you need to find the starting and ending indexes of each token! So, let's define those helper functions (as private static functions):

/**

 * Considering str starting at startIndex, find the index at which the 

 * next token starts.

 *

 * @return The index of the start of a token (or str.length() if no more

 *         tokens).

 */

private static int start(String str, char sep, int startIndex) {

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) == sep; i++);

    return i;

}



/**

 * Considering str starting at startIndex, find the index at which the 

 * current token ends.

 *

 * @return The index just beyond the end of a token (the index of a

 *         sep character, or str.length() if this is the last token)

 */

private static int end(String str, char sep, int startIndex) {

    assert(str.charAt(sep) != sep);

    int i;

    for (i = startIndex; i < str.length() && str.charAt(i) != sep; i++);

    return i;

}

Then, we can use them in tokenize():

public static String tokenize(String str, char sep) {

    int tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); e = end(str, sep, s)) {

        tokenCount++;

    }

    String tokens = new String[tokenCount];

    tokenCount = 0;

    for (int s, e = 0; (s = start(str, sep, e)) < str.length(); ) {

        tokens[tokenCount++] = str.substring(s, e = end(str, sep, s));

    }

    assert(tokens.length == tokenCount);

    return tokens;

}

Notice that now you can take advantage of the helper functions to predetermine the size of the array. Also, a lot of the repetitiveness of your conditions and loops is eliminated.

All of the analysis is done using string indexes, so there is no string manipulation other than just the essential .substring() calls — and even the trim() call is gone!

answered Nov 6 at 7:43

200_success

127k15148410

answered Nov 6 at 7:43

200_success

127k15148410

answered Nov 6 at 7:43

200_success

127k15148410

answered Nov 6 at 7:43

200_success

127k15148410

thank you for your answer, I agree It's more efficient to separate the functions, however we are not allowed to do that in this assignment
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52

add a comment |

thank you for your answer, I agree It's more efficient to separate the functions, however we are not allowed to do that in this assignment
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52

thank you for your answer, I agree It's more efficient to separate the functions, however we are not allowed to do that in this assignment
– Yuki1112
Nov 6 at 9:36

What, exactly, are you not allowed to do, and why didn't you state the restrictions up front in your question? Didn't you define a charRunCount() helper function in your own solution?
– 200_success
Nov 6 at 13:52

add a comment |

up vote
0
down vote

Not sure of what you mean with "I am not allowed to use imports or regexes or add more functions." but if you can use String#indexOf this can be greatly simplified:

public class Tokenizer {

    private final char separator;



    public Tokenizer(char separator) {

        this.separator = separator;

    }



    public List<String> tokenize(String string) {

        List<String> tokens = new LinkedList<>();

        int start = 0, end = 0;

        while ( start < string.length() && (end = string.indexOf(separator, start))>-1 ) {

            tokens.add(string.substring(start, end));

            start = end+1;

        }

        tokens.add(string.substring(start));

        return tokens;

    }

}

answered Nov 7 at 7:17

gervais.b

96139

1

List would be imported from java.util.
– 200_success
Nov 7 at 13:11

Indeed. Then we should either loop twice to count the occurrence of "separator" and create a good sized array then loop to populate it. Or manage the increase of the array size by ourselves.
– gervais.b
Nov 7 at 22:03


up vote
0
down vote

I'd suggest doing it in a way that doesn't have two loops that basically do the same thing and is more Java-esque.

Instead of storing the result in an array, which requires you to first count the potential tokens, store the tokens in a LinkedList.

My solution would look like this:

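import java.util.Collection;
import java.util.LinkedList;

public static String[] tokenize(String str, char separator) {

    Collection<String> result = new LinkedList<>();

    int len = str.length();
    int pos = 0;   // current scan position
    int start = 0; // start of the current token

    while (pos < len) {
        // Inside a token: keep scanning
        if (str.charAt(pos) != separator) {
            pos++;
            continue;
        }
        // Hit a separator: store the token that just ended, if any
        if (pos > start) {
            result.add(str.substring(start, pos));
        }
        // Skip the whole run of separators
        do {
            pos++;
            start = pos;
        } while (pos < len && str.charAt(pos) == separator);
    }

    // Store the last token if the string does not end with a separator
    if (pos > start) {
        result.add(str.substring(start, pos));
    }

    return result.toArray(new String[result.size()]);
}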

EDIT: Here's a version without any additional imports :)

// Minimal singly linked list of strings, so that java.util is not needed
static class LinkedList {

    private class Node {
        Node(String value) {
            this.value = value;
        }

        String value;
        Node next;
    }

    private Node first = null;
    private Node last = null;
    private int size = 0;

    // Appends a value at the end of the list
    public void add(String value) {
        Node node = new Node(value);
        size++;
        if (first == null) {
            first = last = node;
            return;
        }
        last.next = node;
        last = node;
    }

    // Copies the values into an array, in insertion order
    public String[] toArray() {
        String[] array = new String[size];

        int i = 0;
        Node current = first;
        while (current != null) {
            array[i] = current.value;
            i++;
            current = current.next;
        }

        return array;
    }

}

public static String[] tokenize(String str, char separator) {

    LinkedList result = new LinkedList();

    int len = str.length();
    int pos = 0;   // current scan position
    int start = 0; // start of the current token

    while (pos < len) {
        // Inside a token: keep scanning
        if (str.charAt(pos) != separator) {
            pos++;
            continue;
        }
        // Hit a separator: store the token that just ended, if any
        if (pos > start) {
            result.add(str.substring(start, pos));
        }
        // Skip the whole run of separators
        do {
            pos++;
            start = pos;
        } while (pos < len && str.charAt(pos) == separator);
    }

    // Store the last token if the string does not end with a separator
    if (pos > start) {
        result.add(str.substring(start, pos));
    }

    return result.toArray();

}

edited Nov 7 at 7:57

answered Nov 6 at 9:36

RoToRa

6,1761236

1

Collection would be imported from java.util.
– 200_success
Nov 6 at 13:53

@200_success Ah, I missed that requirement. I can't use LinkedList then anyway. But in that case I'd just write a simple linked list class myself, which isn't against the rules :)
– RoToRa
Nov 7 at 7:38
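
Another import-free route, in the spirit of "manage the array size ourselves", is a plain array that doubles when it fills up. The sketch below is only an illustration and not code posted by anyone in this thread: the class name GrowingArrayTokenizer and the initial capacity of 4 are arbitrary choices. It relies only on String#indexOf, String#substring and System.arraycopy, all of which live in java.lang.

```java
public class GrowingArrayTokenizer {

    // Sketch only: collects tokens in an array that doubles when it is full.
    public static String[] tokenize(String str, char separator) {
        String[] buffer = new String[4]; // arbitrary initial capacity
        int size = 0;                    // number of tokens stored so far
        int len = str.length();
        int pos = 0;

        while (pos < len) {
            // Skip separators, including whole runs of them.
            if (str.charAt(pos) == separator) {
                pos++;
                continue;
            }
            // pos is at the start of a token; indexOf returns -1 when no
            // separator follows, in which case the token runs to the end.
            int end = str.indexOf(separator, pos);
            if (end < 0) {
                end = len;
            }
            // Double the buffer when it is full.
            if (size == buffer.length) {
                String[] bigger = new String[buffer.length * 2];
                System.arraycopy(buffer, 0, bigger, 0, size);
                buffer = bigger;
            }
            buffer[size++] = str.substring(pos, end);
            pos = end;
        }

        // Copy into an array of exactly the right length.
        String[] tokens = new String[size];
        System.arraycopy(buffer, 0, tokens, 0, size);
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("one  two three four five ", ' ')) {
            System.out.println(token);
        }
    }
}
```

Compared with a hand-written linked list this allocates fewer objects, at the cost of one final copy to trim the array to size.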


up vote
0
down vote

I found a simple way for beginners:

public static String[] tokenize (String str, char separator) {
    // Removes all the occurrences of the separator at the beginning and end of str
    String source = trim(str, separator);
    // An empty source has no tokens; otherwise n separator runs delimit n + 1 tokens
    int tokenCount = source.length () == 0 ? 0 : charRunCount (source, separator) + 1;
    String[] tokens = new String[tokenCount];
    int i = 0; // position in source
    int j = 0; // index of the next token
    int sourceLen = source.length ();
    while (i < sourceLen) {
        String tmp = "";
        // add each character in the source until you hit the separator
        while (i < sourceLen && source.charAt (i) != separator) {
            tmp += source.charAt (i++);
        }
        // tmp is empty when i was already on a separator; don't store it as a token
        if ( !(tmp.equals ("")) ) {
            tokens[j++] = tmp;
        }
        i++; // step over the separator
    }
    return tokens;
}

answered yesterday

Yuki1112

112
