Find records with same string with extra character
OK, so I have a Microsoft SQL Server 2014 database table called owner
with around 90,000 records with owner information, another called vehicle
with vehicle information
Owner_Name owner_id V_name owner_id exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9
I'm trying to find all records that have more than one exemption on a vehicle
( H0
means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX
, but this wouldn't work in my scenario.
SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1
Is there a solution that would allow me to return records like so, not knowing which owner_name
's may be similar? Basically trying to get the server to search through the owner_name
column and if there's a similarity such as
JACOB JAMISON & JESSICA
and JACOB JAMISON M & JESSICA B
then it'll return those records like so:
Owner_Name xNameAppears ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2
Thank you in advance!
sql-server sql-server-2014 full-text-search string-searching
add a comment |
OK, so I have a Microsoft SQL Server 2014 database table called owner
with around 90,000 records with owner information, another called vehicle
with vehicle information
Owner_Name owner_id V_name owner_id exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9
I'm trying to find all records that have more than one exemption on a vehicle
( H0
means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX
, but this wouldn't work in my scenario.
SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1
Is there a solution that would allow me to return records like so, not knowing which owner_name
's may be similar? Basically trying to get the server to search through the owner_name
column and if there's a similarity such as
JACOB JAMISON & JESSICA
and JACOB JAMISON M & JESSICA B
then it'll return those records like so:
Owner_Name xNameAppears ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2
Thank you in advance!
sql-server sql-server-2014 full-text-search string-searching
1
SureSOUNDEX
is suitable? The following all return the same valueSOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22
add a comment |
OK, so I have a Microsoft SQL Server 2014 database table called owner
with around 90,000 records with owner information, another called vehicle
with vehicle information
Owner_Name owner_id V_name owner_id exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9
I'm trying to find all records that have more than one exemption on a vehicle
( H0
means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX
, but this wouldn't work in my scenario.
SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1
Is there a solution that would allow me to return records like so, not knowing which owner_name
's may be similar? Basically trying to get the server to search through the owner_name
column and if there's a similarity such as
JACOB JAMISON & JESSICA
and JACOB JAMISON M & JESSICA B
then it'll return those records like so:
Owner_Name xNameAppears ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2
Thank you in advance!
sql-server sql-server-2014 full-text-search string-searching
OK, so I have a Microsoft SQL Server 2014 database table called owner
with around 90,000 records with owner information, another called vehicle
with vehicle information
Owner_Name owner_id V_name owner_id exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9
I'm trying to find all records that have more than one exemption on a vehicle
( H0
means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX
, but this wouldn't work in my scenario.
SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1
Is there a solution that would allow me to return records like so, not knowing which owner_name
's may be similar? Basically trying to get the server to search through the owner_name
column and if there's a similarity such as
JACOB JAMISON & JESSICA
and JACOB JAMISON M & JESSICA B
then it'll return those records like so:
Owner_Name xNameAppears ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2
Thank you in advance!
sql-server sql-server-2014 full-text-search string-searching
sql-server sql-server-2014 full-text-search string-searching
asked Nov 21 '18 at 14:46
MindXpertMindXpert
183
183
1
SureSOUNDEX
is suitable? The following all return the same valueSOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22
add a comment |
1
SureSOUNDEX
is suitable? The following all return the same valueSOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22
1
1
Sure
SOUNDEX
is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22
Sure
SOUNDEX
is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22
add a comment |
1 Answer
1
active
oldest
votes
The SOUNDEX function can be applied to a column as well.
But since
there's thousands like that
I wouldn't suggest just writing a query to join on a function to do that.
This will likely not perform very well on larger tables:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;
I'd rather do something that will make finding this easier in the long-term.
Here's an example:
CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);
I'm going to add a computed column based on the function, and then add an index to aid my query.
ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);
CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);
Validate that everything looks good...
SELECT *
FROM dbo.vehicle AS v
Use a query like this to find imprecise matches:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;
You sir, are the man. I didn't think to useSOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
The SOUNDEX function can be applied to a column as well.
But since
there's thousands like that
I wouldn't suggest just writing a query to join on a function to do that.
This will likely not perform very well on larger tables:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;
I'd rather do something that will make finding this easier in the long-term.
Here's an example:
CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);
I'm going to add a computed column based on the function, and then add an index to aid my query.
ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);
CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);
Validate that everything looks good...
SELECT *
FROM dbo.vehicle AS v
Use a query like this to find imprecise matches:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;
You sir, are the man. I didn't think to useSOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
add a comment |
The SOUNDEX function can be applied to a column as well.
But since
there's thousands like that
I wouldn't suggest just writing a query to join on a function to do that.
This will likely not perform very well on larger tables:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;
I'd rather do something that will make finding this easier in the long-term.
Here's an example:
CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);
I'm going to add a computed column based on the function, and then add an index to aid my query.
ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);
CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);
Validate that everything looks good...
SELECT *
FROM dbo.vehicle AS v
Use a query like this to find imprecise matches:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;
You sir, are the man. I didn't think to useSOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
add a comment |
The SOUNDEX function can be applied to a column as well.
But since
there's thousands like that
I wouldn't suggest just writing a query to join on a function to do that.
This will likely not perform very well on larger tables:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;
I'd rather do something that will make finding this easier in the long-term.
Here's an example:
CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);
I'm going to add a computed column based on the function, and then add an index to aid my query.
ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);
CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);
Validate that everything looks good...
SELECT *
FROM dbo.vehicle AS v
Use a query like this to find imprecise matches:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;
The SOUNDEX function can be applied to a column as well.
But since
there's thousands like that
I wouldn't suggest just writing a query to join on a function to do that.
This will likely not perform very well on larger tables:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;
I'd rather do something that will make finding this easier in the long-term.
Here's an example:
CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);
I'm going to add a computed column based on the function, and then add an index to aid my query.
ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);
CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);
Validate that everything looks good...
SELECT *
FROM dbo.vehicle AS v
Use a query like this to find imprecise matches:
SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;
answered Nov 21 '18 at 15:30
Erik DarlingErik Darling
21.2k1264104
21.2k1264104
You sir, are the man. I didn't think to useSOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
add a comment |
You sir, are the man. I didn't think to useSOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.
– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
You sir, are the man. I didn't think to use
SOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.– MindXpert
Nov 21 '18 at 16:12
You sir, are the man. I didn't think to use
SOUNDEX
in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.– MindXpert
Nov 21 '18 at 16:12
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
@user1227080 happy to help!
– Erik Darling
Nov 21 '18 at 16:13
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Sure
SOUNDEX
is suitable? The following all return the same valueSOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')
– Martin Smith
Nov 21 '18 at 17:22