Finding duplicates in two sorted arrays

up vote
6
down vote

favorite

This is an interview question from here. Specifically, the second asked for a function that took in two sorted arrays of integers with no duplicate values within a single array and which returned an array of the duplicates between the two arrays.

Here is my solution (with tests):

public class ArrayDuplicates {

     public static void main(String args) {

          int sorted1 = {1, 2, 3, 5, 7};

          int sorted2 = {2, 4, 5, 6};

          Integer duplicates = duplicates(sorted1, sorted2);

          for(Integer d: duplicates) {

            System.out.println(d.intValue());

           }

      }

     public static Integer duplicates(int sorted1, int sorted2) {

            List<Integer> duplicates = new ArrayList<Integer>();

            for(int count = 0; count < sorted1.length; count ++) {

                  for(int counter = 0; counter < sorted2.length; counter ++) {

                       if(sorted1[count] == sorted2[counter]) {

                            duplicates.add(sorted1[count]);

                       } else if(sorted1[count] < sorted2[counter]) {

                          break;

                       }

                   }    

             }

    return duplicates.toArray(new Integer[duplicates.size()]);

    }

}

The code runs fine but I am trying to make it more efficient. I did a runtime analysis of the duplicates method and found the runtime to be in $O(N^2)$ (general nested for loop, etc). To me, there always seems to be some optimization to get the algorithm to $O(n)$. For this problem, the only optimization, I found was to break out of the while loop if the value in the second array is greater than the first (because you know it's sorted, so there's no way that value is present in the second array).

Is there another optimization that could get this code down to $O(n)$?

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

add a comment |

up vote
6
down vote

favorite

Here is my solution (with tests):

public class ArrayDuplicates {

     public static void main(String args) {

          int sorted1 = {1, 2, 3, 5, 7};

          int sorted2 = {2, 4, 5, 6};

          Integer duplicates = duplicates(sorted1, sorted2);

          for(Integer d: duplicates) {

            System.out.println(d.intValue());

           }

      }

     public static Integer duplicates(int sorted1, int sorted2) {

            List<Integer> duplicates = new ArrayList<Integer>();

            for(int count = 0; count < sorted1.length; count ++) {

                  for(int counter = 0; counter < sorted2.length; counter ++) {

                       if(sorted1[count] == sorted2[counter]) {

                            duplicates.add(sorted1[count]);

                       } else if(sorted1[count] < sorted2[counter]) {

                          break;

                       }

                   }    

             }

    return duplicates.toArray(new Integer[duplicates.size()]);

    }

}

Is there another optimization that could get this code down to $O(n)$?

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

add a comment |

up vote
6
down vote

favorite

Here is my solution (with tests):

public class ArrayDuplicates {

     public static void main(String args) {

          int sorted1 = {1, 2, 3, 5, 7};

          int sorted2 = {2, 4, 5, 6};

          Integer duplicates = duplicates(sorted1, sorted2);

          for(Integer d: duplicates) {

            System.out.println(d.intValue());

           }

      }

     public static Integer duplicates(int sorted1, int sorted2) {

            List<Integer> duplicates = new ArrayList<Integer>();

            for(int count = 0; count < sorted1.length; count ++) {

                  for(int counter = 0; counter < sorted2.length; counter ++) {

                       if(sorted1[count] == sorted2[counter]) {

                            duplicates.add(sorted1[count]);

                       } else if(sorted1[count] < sorted2[counter]) {

                          break;

                       }

                   }    

             }

    return duplicates.toArray(new Integer[duplicates.size()]);

    }

}

Is there another optimization that could get this code down to $O(n)$?

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

Here is my solution (with tests):

public class ArrayDuplicates {

     public static void main(String args) {

          int sorted1 = {1, 2, 3, 5, 7};

          int sorted2 = {2, 4, 5, 6};

          Integer duplicates = duplicates(sorted1, sorted2);

          for(Integer d: duplicates) {

            System.out.println(d.intValue());

           }

      }

     public static Integer duplicates(int sorted1, int sorted2) {

            List<Integer> duplicates = new ArrayList<Integer>();

            for(int count = 0; count < sorted1.length; count ++) {

                  for(int counter = 0; counter < sorted2.length; counter ++) {

                       if(sorted1[count] == sorted2[counter]) {

                            duplicates.add(sorted1[count]);

                       } else if(sorted1[count] < sorted2[counter]) {

                          break;

                       }

                   }    

             }

    return duplicates.toArray(new Integer[duplicates.size()]);

    }

}

Is there another optimization that could get this code down to $O(n)$?

java optimization algorithm

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

edited Jan 23 '15 at 13:51

Jamal♦

30.2k11115226

asked Jan 23 '15 at 8:54

committedandroider

291312

asked Jan 23 '15 at 8:54

committedandroider

291312

asked Jan 23 '15 at 8:54

committedandroider

291312

add a comment |

3 Answers
3

active

oldest

votes

up vote
10
down vote

accepted

You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).

However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.

int innerIndex = 0;

for(int count = 0; count < sorted1.length; count ++) {

      for(; innerIndex < sorted2.length; innerIndex++) {

           if(sorted1[count] == sorted2[innerIndex]) {

                duplicates.add(sorted1[count]);

           } else if(sorted1[count] < sorted2[innerIndex]) {

              break;

           }

       }    

 }

Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:

int index1 = 0, index2

while(index1 < sorted1.length && index2 < sorted2.length){

    if(sorted1[index1] < sorted2[index2]){

        index1++;

    }else if(sorted1[index1] > sorted2[index2]){

        index2++;

    }else{

        duplicates.add(sorted1[index1]);

        index1++;

        index2++;

    }

}

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

add a comment |

up vote
5
down vote

To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.

That is, you start with index i=0 in the first array and j=0 in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i] with b[j]. If a[i] is greater than b[j], you bump j by 1. If it is smaller, you bump i by 1. If a[i] is equal to b[j], you bump both i and j and record a[i] as a duplicate.

That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

add a comment |

up vote
0
down vote

Are you allowed to use built-in functions? If that is the case you can use the retainAll() method.

Integer sorted1 = { 1, 2, 3, 5, 7 };

Integer sorted2 = { 2, 4, 5, 6 };



List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));

duplicates.retainAll(Arrays.asList(sorted2));



System.out.println("Duplicates: " + duplicates);

//Output: Duplicates: [2, 5]

answered Jan 23 '15 at 9:29

Abbas

5,2631538

2

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f78397%2ffinding-duplicates-in-two-sorted-arrays%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

up vote
10
down vote

accepted

You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).

int innerIndex = 0;

for(int count = 0; count < sorted1.length; count ++) {

      for(; innerIndex < sorted2.length; innerIndex++) {

           if(sorted1[count] == sorted2[innerIndex]) {

                duplicates.add(sorted1[count]);

           } else if(sorted1[count] < sorted2[innerIndex]) {

              break;

           }

       }    

 }

Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:

int index1 = 0, index2

while(index1 < sorted1.length && index2 < sorted2.length){

    if(sorted1[index1] < sorted2[index2]){

        index1++;

    }else if(sorted1[index1] > sorted2[index2]){

        index2++;

    }else{

        duplicates.add(sorted1[index1]);

        index1++;

        index2++;

    }

}

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

add a comment |

up vote
10
down vote

accepted

You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).

int innerIndex = 0;

for(int count = 0; count < sorted1.length; count ++) {

      for(; innerIndex < sorted2.length; innerIndex++) {

           if(sorted1[count] == sorted2[innerIndex]) {

                duplicates.add(sorted1[count]);

           } else if(sorted1[count] < sorted2[innerIndex]) {

              break;

           }

       }    

 }

Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:

int index1 = 0, index2

while(index1 < sorted1.length && index2 < sorted2.length){

    if(sorted1[index1] < sorted2[index2]){

        index1++;

    }else if(sorted1[index1] > sorted2[index2]){

        index2++;

    }else{

        duplicates.add(sorted1[index1]);

        index1++;

        index2++;

    }

}

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

add a comment |

up vote
10
down vote

accepted

You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).

int innerIndex = 0;

for(int count = 0; count < sorted1.length; count ++) {

      for(; innerIndex < sorted2.length; innerIndex++) {

           if(sorted1[count] == sorted2[innerIndex]) {

                duplicates.add(sorted1[count]);

           } else if(sorted1[count] < sorted2[innerIndex]) {

              break;

           }

       }    

 }

Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:

int index1 = 0, index2

while(index1 < sorted1.length && index2 < sorted2.length){

    if(sorted1[index1] < sorted2[index2]){

        index1++;

    }else if(sorted1[index1] > sorted2[index2]){

        index2++;

    }else{

        duplicates.add(sorted1[index1]);

        index1++;

        index2++;

    }

}

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).

int innerIndex = 0;

for(int count = 0; count < sorted1.length; count ++) {

      for(; innerIndex < sorted2.length; innerIndex++) {

           if(sorted1[count] == sorted2[innerIndex]) {

                duplicates.add(sorted1[count]);

           } else if(sorted1[count] < sorted2[innerIndex]) {

              break;

           }

       }    

 }

Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:

int index1 = 0, index2

while(index1 < sorted1.length && index2 < sorted2.length){

    if(sorted1[index1] < sorted2[index2]){

        index1++;

    }else if(sorted1[index1] > sorted2[index2]){

        index2++;

    }else{

        duplicates.add(sorted1[index1]);

        index1++;

        index2++;

    }

}

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

edited 7 mins ago

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

answered Jan 23 '15 at 9:30

ratchet freak

11.6k1341

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

add a comment |

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

Should it be else if(sorted1[index1] > sorted2[index2]) { index2++ }?
– h.j.k.
Jan 23 '15 at 16:44

@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57

add a comment |

up vote
5
down vote

To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.

That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

add a comment |

up vote
5
down vote

To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.

That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

add a comment |

up vote
5
down vote

To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.

That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.

That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

edited Jan 23 '15 at 11:13

Pimgd

21.1k556142

answered Jan 23 '15 at 10:58

user63638

511

answered Jan 23 '15 at 10:58

user63638

511

answered Jan 23 '15 at 10:58

user63638

511

add a comment |

up vote
0
down vote

Are you allowed to use built-in functions? If that is the case you can use the retainAll() method.

Integer sorted1 = { 1, 2, 3, 5, 7 };

Integer sorted2 = { 2, 4, 5, 6 };



List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));

duplicates.retainAll(Arrays.asList(sorted2));



System.out.println("Duplicates: " + duplicates);

//Output: Duplicates: [2, 5]

answered Jan 23 '15 at 9:29

Abbas

5,2631538

2

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

add a comment |

up vote
0
down vote

Are you allowed to use built-in functions? If that is the case you can use the retainAll() method.

Integer sorted1 = { 1, 2, 3, 5, 7 };

Integer sorted2 = { 2, 4, 5, 6 };



List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));

duplicates.retainAll(Arrays.asList(sorted2));



System.out.println("Duplicates: " + duplicates);

//Output: Duplicates: [2, 5]

answered Jan 23 '15 at 9:29

Abbas

5,2631538

2

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

add a comment |

up vote
0
down vote

Are you allowed to use built-in functions? If that is the case you can use the retainAll() method.

Integer sorted1 = { 1, 2, 3, 5, 7 };

Integer sorted2 = { 2, 4, 5, 6 };



List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));

duplicates.retainAll(Arrays.asList(sorted2));



System.out.println("Duplicates: " + duplicates);

//Output: Duplicates: [2, 5]

answered Jan 23 '15 at 9:29

Abbas

5,2631538

Are you allowed to use built-in functions? If that is the case you can use the retainAll() method.

Integer sorted1 = { 1, 2, 3, 5, 7 };

Integer sorted2 = { 2, 4, 5, 6 };



List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));

duplicates.retainAll(Arrays.asList(sorted2));



System.out.println("Duplicates: " + duplicates);

//Output: Duplicates: [2, 5]

answered Jan 23 '15 at 9:29

Abbas

5,2631538

answered Jan 23 '15 at 9:29

Abbas

5,2631538

answered Jan 23 '15 at 9:29

Abbas

5,2631538

answered Jan 23 '15 at 9:29

Abbas

5,2631538

2

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

add a comment |

2

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47

I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Nsryjdtyk