Finding duplicates in two sorted arrays
up vote
6
down vote
favorite
This is an interview question from here. Specifically, the second asked for a function that took in two sorted arrays of integers with no duplicate values within a single array and which returned an array of the duplicates between the two arrays.
Here is my solution (with tests):
public class ArrayDuplicates {
public static void main(String args) {
int sorted1 = {1, 2, 3, 5, 7};
int sorted2 = {2, 4, 5, 6};
Integer duplicates = duplicates(sorted1, sorted2);
for(Integer d: duplicates) {
System.out.println(d.intValue());
}
}
public static Integer duplicates(int sorted1, int sorted2) {
List<Integer> duplicates = new ArrayList<Integer>();
for(int count = 0; count < sorted1.length; count ++) {
for(int counter = 0; counter < sorted2.length; counter ++) {
if(sorted1[count] == sorted2[counter]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[counter]) {
break;
}
}
}
return duplicates.toArray(new Integer[duplicates.size()]);
}
}
The code runs fine but I am trying to make it more efficient. I did a runtime analysis of the duplicates method and found the runtime to be in $O(N^2)$ (general nested for
loop, etc). To me, there always seems to be some optimization to get the algorithm to $O(n)$. For this problem, the only optimization, I found was to break out of the while loop if the value in the second array is greater than the first (because you know it's sorted, so there's no way that value is present in the second array).
Is there another optimization that could get this code down to $O(n)$?
java optimization algorithm
add a comment |
up vote
6
down vote
favorite
This is an interview question from here. Specifically, the second asked for a function that took in two sorted arrays of integers with no duplicate values within a single array and which returned an array of the duplicates between the two arrays.
Here is my solution (with tests):
public class ArrayDuplicates {
public static void main(String args) {
int sorted1 = {1, 2, 3, 5, 7};
int sorted2 = {2, 4, 5, 6};
Integer duplicates = duplicates(sorted1, sorted2);
for(Integer d: duplicates) {
System.out.println(d.intValue());
}
}
public static Integer duplicates(int sorted1, int sorted2) {
List<Integer> duplicates = new ArrayList<Integer>();
for(int count = 0; count < sorted1.length; count ++) {
for(int counter = 0; counter < sorted2.length; counter ++) {
if(sorted1[count] == sorted2[counter]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[counter]) {
break;
}
}
}
return duplicates.toArray(new Integer[duplicates.size()]);
}
}
The code runs fine but I am trying to make it more efficient. I did a runtime analysis of the duplicates method and found the runtime to be in $O(N^2)$ (general nested for
loop, etc). To me, there always seems to be some optimization to get the algorithm to $O(n)$. For this problem, the only optimization, I found was to break out of the while loop if the value in the second array is greater than the first (because you know it's sorted, so there's no way that value is present in the second array).
Is there another optimization that could get this code down to $O(n)$?
java optimization algorithm
add a comment |
up vote
6
down vote
favorite
up vote
6
down vote
favorite
This is an interview question from here. Specifically, the second asked for a function that took in two sorted arrays of integers with no duplicate values within a single array and which returned an array of the duplicates between the two arrays.
Here is my solution (with tests):
public class ArrayDuplicates {
public static void main(String args) {
int sorted1 = {1, 2, 3, 5, 7};
int sorted2 = {2, 4, 5, 6};
Integer duplicates = duplicates(sorted1, sorted2);
for(Integer d: duplicates) {
System.out.println(d.intValue());
}
}
public static Integer duplicates(int sorted1, int sorted2) {
List<Integer> duplicates = new ArrayList<Integer>();
for(int count = 0; count < sorted1.length; count ++) {
for(int counter = 0; counter < sorted2.length; counter ++) {
if(sorted1[count] == sorted2[counter]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[counter]) {
break;
}
}
}
return duplicates.toArray(new Integer[duplicates.size()]);
}
}
The code runs fine but I am trying to make it more efficient. I did a runtime analysis of the duplicates method and found the runtime to be in $O(N^2)$ (general nested for
loop, etc). To me, there always seems to be some optimization to get the algorithm to $O(n)$. For this problem, the only optimization, I found was to break out of the while loop if the value in the second array is greater than the first (because you know it's sorted, so there's no way that value is present in the second array).
Is there another optimization that could get this code down to $O(n)$?
java optimization algorithm
This is an interview question from here. Specifically, the second asked for a function that took in two sorted arrays of integers with no duplicate values within a single array and which returned an array of the duplicates between the two arrays.
Here is my solution (with tests):
public class ArrayDuplicates {
public static void main(String args) {
int sorted1 = {1, 2, 3, 5, 7};
int sorted2 = {2, 4, 5, 6};
Integer duplicates = duplicates(sorted1, sorted2);
for(Integer d: duplicates) {
System.out.println(d.intValue());
}
}
public static Integer duplicates(int sorted1, int sorted2) {
List<Integer> duplicates = new ArrayList<Integer>();
for(int count = 0; count < sorted1.length; count ++) {
for(int counter = 0; counter < sorted2.length; counter ++) {
if(sorted1[count] == sorted2[counter]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[counter]) {
break;
}
}
}
return duplicates.toArray(new Integer[duplicates.size()]);
}
}
The code runs fine but I am trying to make it more efficient. I did a runtime analysis of the duplicates method and found the runtime to be in $O(N^2)$ (general nested for
loop, etc). To me, there always seems to be some optimization to get the algorithm to $O(n)$. For this problem, the only optimization, I found was to break out of the while loop if the value in the second array is greater than the first (because you know it's sorted, so there's no way that value is present in the second array).
Is there another optimization that could get this code down to $O(n)$?
java optimization algorithm
java optimization algorithm
edited Jan 23 '15 at 13:51
Jamal♦
30.2k11115226
30.2k11115226
asked Jan 23 '15 at 8:54
committedandroider
291312
291312
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
up vote
10
down vote
accepted
You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).
However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.
int innerIndex = 0;
for(int count = 0; count < sorted1.length; count ++) {
for(; innerIndex < sorted2.length; innerIndex++) {
if(sorted1[count] == sorted2[innerIndex]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[innerIndex]) {
break;
}
}
}
Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:
int index1 = 0, index2
while(index1 < sorted1.length && index2 < sorted2.length){
if(sorted1[index1] < sorted2[index2]){
index1++;
}else if(sorted1[index1] > sorted2[index2]){
index2++;
}else{
duplicates.add(sorted1[index1]);
index1++;
index2++;
}
}
Should it beelse if(sorted1[index1] > sorted2[index2]) { index2++ }
?
– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
add a comment |
up vote
5
down vote
To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.
That is, you start with index i=0
in the first array and j=0
in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i]
with b[j]
. If a[i]
is greater than b[j]
, you bump j
by 1. If it is smaller, you bump i
by 1. If a[i]
is equal to b[j]
, you bump both i
and j
and record a[i]
as a duplicate.
That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.
add a comment |
up vote
0
down vote
Are you allowed to use built-in functions? If that is the case you can use the retainAll()
method.
Integer sorted1 = { 1, 2, 3, 5, 7 };
Integer sorted2 = { 2, 4, 5, 6 };
List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));
duplicates.retainAll(Arrays.asList(sorted2));
System.out.println("Duplicates: " + duplicates);
//Output: Duplicates: [2, 5]
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
10
down vote
accepted
You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).
However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.
int innerIndex = 0;
for(int count = 0; count < sorted1.length; count ++) {
for(; innerIndex < sorted2.length; innerIndex++) {
if(sorted1[count] == sorted2[innerIndex]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[innerIndex]) {
break;
}
}
}
Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:
int index1 = 0, index2
while(index1 < sorted1.length && index2 < sorted2.length){
if(sorted1[index1] < sorted2[index2]){
index1++;
}else if(sorted1[index1] > sorted2[index2]){
index2++;
}else{
duplicates.add(sorted1[index1]);
index1++;
index2++;
}
}
Should it beelse if(sorted1[index1] > sorted2[index2]) { index2++ }
?
– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
add a comment |
up vote
10
down vote
accepted
You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).
However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.
int innerIndex = 0;
for(int count = 0; count < sorted1.length; count ++) {
for(; innerIndex < sorted2.length; innerIndex++) {
if(sorted1[count] == sorted2[innerIndex]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[innerIndex]) {
break;
}
}
}
Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:
int index1 = 0, index2
while(index1 < sorted1.length && index2 < sorted2.length){
if(sorted1[index1] < sorted2[index2]){
index1++;
}else if(sorted1[index1] > sorted2[index2]){
index2++;
}else{
duplicates.add(sorted1[index1]);
index1++;
index2++;
}
}
Should it beelse if(sorted1[index1] > sorted2[index2]) { index2++ }
?
– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
add a comment |
up vote
10
down vote
accepted
up vote
10
down vote
accepted
You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).
However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.
int innerIndex = 0;
for(int count = 0; count < sorted1.length; count ++) {
for(; innerIndex < sorted2.length; innerIndex++) {
if(sorted1[count] == sorted2[innerIndex]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[innerIndex]) {
break;
}
}
}
Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:
int index1 = 0, index2
while(index1 < sorted1.length && index2 < sorted2.length){
if(sorted1[index1] < sorted2[index2]){
index1++;
}else if(sorted1[index1] > sorted2[index2]){
index2++;
}else{
duplicates.add(sorted1[index1]);
index1++;
index2++;
}
}
You are looking for values in a sorted array, first thing that comes to mind is binary search (improves to $O(n log n)$).
However because the values you look for are also sorted then you know that the next value will be after the value you found in the last iteration. Keep the index where you broke out of the inner loop and start there during the next iteration.
int innerIndex = 0;
for(int count = 0; count < sorted1.length; count ++) {
for(; innerIndex < sorted2.length; innerIndex++) {
if(sorted1[count] == sorted2[innerIndex]) {
duplicates.add(sorted1[count]);
} else if(sorted1[count] < sorted2[innerIndex]) {
break;
}
}
}
Otherwise you can pretend to merge the arrays. It's essentially the same algorithm but makes it clearer what is happening:
int index1 = 0, index2
while(index1 < sorted1.length && index2 < sorted2.length){
if(sorted1[index1] < sorted2[index2]){
index1++;
}else if(sorted1[index1] > sorted2[index2]){
index2++;
}else{
duplicates.add(sorted1[index1]);
index1++;
index2++;
}
}
edited 7 mins ago
answered Jan 23 '15 at 9:30
ratchet freak
11.6k1341
11.6k1341
Should it beelse if(sorted1[index1] > sorted2[index2]) { index2++ }
?
– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
add a comment |
Should it beelse if(sorted1[index1] > sorted2[index2]) { index2++ }
?
– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
Should it be
else if(sorted1[index1] > sorted2[index2]) { index2++ }
?– h.j.k.
Jan 23 '15 at 16:44
Should it be
else if(sorted1[index1] > sorted2[index2]) { index2++ }
?– h.j.k.
Jan 23 '15 at 16:44
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
@h.j.k. think you're right there. if the first array's value is greater, you have to increment index2 because there is no way for that second array's value to be in the first array
– committedandroider
Jan 23 '15 at 16:57
add a comment |
up vote
5
down vote
To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.
That is, you start with index i=0
in the first array and j=0
in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i]
with b[j]
. If a[i]
is greater than b[j]
, you bump j
by 1. If it is smaller, you bump i
by 1. If a[i]
is equal to b[j]
, you bump both i
and j
and record a[i]
as a duplicate.
That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.
add a comment |
up vote
5
down vote
To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.
That is, you start with index i=0
in the first array and j=0
in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i]
with b[j]
. If a[i]
is greater than b[j]
, you bump j
by 1. If it is smaller, you bump i
by 1. If a[i]
is equal to b[j]
, you bump both i
and j
and record a[i]
as a duplicate.
That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.
add a comment |
up vote
5
down vote
up vote
5
down vote
To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.
That is, you start with index i=0
in the first array and j=0
in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i]
with b[j]
. If a[i]
is greater than b[j]
, you bump j
by 1. If it is smaller, you bump i
by 1. If a[i]
is equal to b[j]
, you bump both i
and j
and record a[i]
as a duplicate.
That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.
To make this $O(n)$, the basic algorithm is to walk both arrays in one loop in parallel, in a merge-like manner.
That is, you start with index i=0
in the first array and j=0
in the second array, and you stop when one of your indexes moved past the end of its array. At each step in the single loop, you compare a[i]
with b[j]
. If a[i]
is greater than b[j]
, you bump j
by 1. If it is smaller, you bump i
by 1. If a[i]
is equal to b[j]
, you bump both i
and j
and record a[i]
as a duplicate.
That's all. As each step results in at least one array element that will never be looked at again, this is $O(n)$.
edited Jan 23 '15 at 11:13
Pimgd
21.1k556142
21.1k556142
answered Jan 23 '15 at 10:58
user63638
511
511
add a comment |
add a comment |
up vote
0
down vote
Are you allowed to use built-in functions? If that is the case you can use the retainAll()
method.
Integer sorted1 = { 1, 2, 3, 5, 7 };
Integer sorted2 = { 2, 4, 5, 6 };
List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));
duplicates.retainAll(Arrays.asList(sorted2));
System.out.println("Duplicates: " + duplicates);
//Output: Duplicates: [2, 5]
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
add a comment |
up vote
0
down vote
Are you allowed to use built-in functions? If that is the case you can use the retainAll()
method.
Integer sorted1 = { 1, 2, 3, 5, 7 };
Integer sorted2 = { 2, 4, 5, 6 };
List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));
duplicates.retainAll(Arrays.asList(sorted2));
System.out.println("Duplicates: " + duplicates);
//Output: Duplicates: [2, 5]
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
add a comment |
up vote
0
down vote
up vote
0
down vote
Are you allowed to use built-in functions? If that is the case you can use the retainAll()
method.
Integer sorted1 = { 1, 2, 3, 5, 7 };
Integer sorted2 = { 2, 4, 5, 6 };
List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));
duplicates.retainAll(Arrays.asList(sorted2));
System.out.println("Duplicates: " + duplicates);
//Output: Duplicates: [2, 5]
Are you allowed to use built-in functions? If that is the case you can use the retainAll()
method.
Integer sorted1 = { 1, 2, 3, 5, 7 };
Integer sorted2 = { 2, 4, 5, 6 };
List<Integer> duplicates = new ArrayList<Integer>(Arrays.asList(sorted1));
duplicates.retainAll(Arrays.asList(sorted2));
System.out.println("Duplicates: " + duplicates);
//Output: Duplicates: [2, 5]
answered Jan 23 '15 at 9:29
Abbas
5,2631538
5,2631538
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
add a comment |
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
2
2
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
This is still O(n^2) because retainAll doesn't check whether the arrays are sorted so it needs to check all against all.
– ratchet freak
Jan 23 '15 at 9:47
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
I didn't know that as I am not that familiar with Java, thanks for the comment! ;)
– Abbas
Jan 23 '15 at 9:55
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f78397%2ffinding-duplicates-in-two-sorted-arrays%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown