Bash - testing if a string is 'greater' than another - how does it work internally?
20 votes
In Bash I can write the following test:
[[ "f" > "a" ]]
which returns 0, i.e. true. How does Bash actually perform this string comparison? From my understanding, > does an integer comparison. Does it compare the ASCII values of the operands?
bash
asked Aug 16 '12 at 14:28
helpermethod
Yeah I suspect it has a polymorphic deal going on where > means one thing with two strings, and another with two numbers. However I'm not an experienced bash coder.
– VoronoiPotato
Aug 16 '12 at 14:31
3 Answers
Accepted answer, 11 votes
From help test:
STRING1 > STRING2
True if STRING1 sorts after STRING2 lexicographically.
Internally, bash uses either strcoll() or strcmp() for that:
else if ((op[0] == '>' || op[0] == '<') && op[1] == '\0')
  {
    if (shell_compatibility_level > 40 && flags & TEST_LOCALE)
      return ((op[0] == '>') ? (strcoll (arg1, arg2) > 0) : (strcoll (arg1, arg2) < 0));
    else
      return ((op[0] == '>') ? (strcmp (arg1, arg2) > 0) : (strcmp (arg1, arg2) < 0));
  }
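The latter compares the strings byte by byte, which for ASCII text means comparing ASCII codes. The former, used when the compatibility level is above 40 and the TEST_LOCALE flag is set, compares according to the collation rules of the current locale, which makes it suitable for sorting in that locale.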
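To see the effect of the locale on this comparison, here is a minimal sketch. The helper name sorts_after is made up for illustration, and the sketch assumes that the locale passed in is installed on the system. Only the C locale is exercised, because there collation is defined to follow byte values, so the result is predictable.

```bash
#!/usr/bin/env bash
# Sketch only: sorts_after is a hypothetical helper, not part of bash.
# Usage: sorts_after LOCALE STRING1 STRING2
# Succeeds if STRING1 sorts after STRING2 under the given locale.
sorts_after() {
    local LC_ALL=$1   # assumption: the named locale is installed
    [[ "$2" > "$3" ]]
}

# In the C locale the comparison follows byte values: 'a' (97) > 'B' (66).
if sorts_after C a B; then
    echo "C locale: a sorts after B"
fi
```

Replacing C with a language locale listed by locale -a (for example en_US.UTF-8, if present) may reverse the result, since such locales typically order letters alphabetically before considering case.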
edited Aug 17 '12 at 8:15
answered Aug 16 '12 at 14:38
Michał Górny
7 votes
It's an alphabetical comparison (as I understand it, the sort order may be influenced by the current locale). It compares the first character of each string: if the one on the left has a higher value the test is true, if lower it is false; if they're the same, it compares the second character, and so on.
This is not the same as integer comparison; for that you use [[ 2 -gt 1 ]] or (( 2 > 1 )). To illustrate the difference between string and integer comparison, consider that all of the following are "true":
[[ 2 > 10 ]] # because "2" comes after "1" in ASCII sort order
[[ 10 -gt 2 ]] # because 10 is a larger number than 2
(( 10 > 2 )) # ditto
Here are some more tests that are true as string comparisons but would be false as integer comparisons:
[[ 05 < 5 ]] # Because "0" comes before "5"
[[ +5 < 0 ]] # Because "+" comes before the digits
[[ -0 < 0 ]] # Because "-" comes before the digits
[[ -1 < -2 ]] # Because "-" doesn't change how the second character is compared
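The contrast between the two kinds of comparison can be checked with a small script. This is only a sketch: describe is a hypothetical helper, and it pins the locale to C so that the string results follow byte order regardless of the environment.

```bash
#!/usr/bin/env bash
# Sketch only: describe is a hypothetical helper for illustration.
# It reports how two operands compare as strings and as integers.
describe() {
    local LC_ALL=C   # assumption: force byte-order collation for stable results
    local a=$1 b=$2 as_string as_integer

    # String comparison: < and > inside [[ ]] compare lexicographically.
    if [[ "$a" > "$b" ]]; then as_string=greater
    elif [[ "$a" < "$b" ]]; then as_string=less
    else as_string=equal
    fi

    # Integer comparison: (( )) evaluates both operands arithmetically.
    if (( a > b )); then as_integer=greater
    elif (( a < b )); then as_integer=less
    else as_integer=equal
    fi

    printf '%s vs %s: string=%s integer=%s\n' "$a" "$b" "$as_string" "$as_integer"
}

describe 2 10
describe 05 5
describe -1 -2
```

Note that (( )) treats a leading 0 as an octal prefix, so an operand such as 08 would be rejected there even though it is fine as a string.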
edited Nov 19 at 1:17
answered Aug 16 '12 at 14:38
Gordon Davisson
1 vote
Yes, it compares the ASCII values and, if they are equal, repeats the comparison with the next character.
/* Copyright (C) 1991, 1996, 1997, 2003 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111-1307 USA. */
#include <string.h>
#include <memcopy.h>
#undef strcmp
/* Compare S1 and S2, returning less than, equal to or
greater than zero if S1 is lexicographically less than,
equal to or greater than S2. */
int
strcmp (p1, p2)
     const char *p1;
     const char *p2;
{
  register const unsigned char *s1 = (const unsigned char *) p1;
  register const unsigned char *s2 = (const unsigned char *) p2;
  unsigned reg_char c1, c2;

  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      /* End of S1: the result depends on whether S2 ended too.  */
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);

  /* The first differing byte decides the result.  */
  return c1 - c2;
}
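The same loop can be sketched in Bash to make the method easy to try interactively. The function name strcmp_sketch is hypothetical, and the sketch assumes ASCII-only input; it is an illustration of the byte-by-byte method, not what bash itself runs.

```bash
#!/usr/bin/env bash
# Sketch only: strcmp_sketch is a hypothetical name, and the function
# assumes ASCII-only input. It prints a negative number, zero, or a
# positive number, mirroring the return value of strcmp().
strcmp_sketch() {
    local LC_ALL=C            # treat the strings as sequences of bytes
    local s1=$1 s2=$2 i c1 c2

    for (( i = 0; ; i++ )); do
        # Past the end of a string, use 0 like C's terminating '\0'.
        c1=0; c2=0
        (( i < ${#s1} )) && printf -v c1 '%d' "'${s1:i:1}"
        (( i < ${#s2} )) && printf -v c2 '%d' "'${s2:i:1}"

        # Stop at the first difference or at the end of the first string.
        if (( c1 != c2 || c1 == 0 )); then
            echo $(( c1 - c2 ))
            return
        fi
    done
}

strcmp_sketch f a      # 'f' is 102 and 'a' is 97, so this prints 5
```

A positive result corresponds to [[ "$1" > "$2" ]] being true in the C locale.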
answered Aug 16 '12 at 14:41
olivecoder
I don't see why you're pasting some mostly irrelevant strcmp() function. bash supports locales, and in this case it uses strcoll() instead to perform a comparison suitable for a particular charset.
– Michał Górny
Aug 17 '12 at 8:12
You're right, but strcmp illustrates string comparison in a simpler form. The purpose is to show how strings are compared in general, not the specific bash implementation. The method is the same in bash, Python, Perl, PHP, C, Java...
– olivecoder
Aug 17 '12 at 10:05
Moreover, the question is not about the name of the function but about the method.
– olivecoder
Aug 17 '12 at 10:12