Find records with same string with extra character












3















OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question


















  • 1





    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

    – Martin Smith
    Nov 21 '18 at 17:22


















3















OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question


















  • 1





    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

    – Martin Smith
    Nov 21 '18 at 17:22
















3












3








3


2






OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!










share|improve this question














OK, so I have a Microsoft SQL Server 2014 database table called owner with around 90,000 records with owner information, another called vehicle with vehicle information



Owner_Name                   owner_id       V_name     owner_id    exempt
------------------------------------- ------------------------------
JACOB JAMISON & JESSICA 35 Civic 35 H3
JACOB JAMISON M & JESSICA B 39 Accord 39 H3
BLACKSON BARRINGTON 56 Bugatti 56 H6
BLACKSON BARRINGTON H 98 SSC 98 H7
BRUSTER MICHAEL 107 Corvette 107 H9


I'm trying to find all records that have more than one exemption on a vehicle
( H0 means no exemption). This code below has worked well, as long as the name is exactly the same. However, if there's a variation, such as an extra letter or if it's entered backwards, then it won't return those records. I've looked at things like SOUNDEX, but this wouldn't work in my scenario.



SELECT Owner_name
, COUNT(Owner_name) AS 'xNameAppears'
, COUNT(v.exempt) AS 'ExemptionCount'
FROM owner o
INNER JOIN vehicle V ON V.owner_id = o.owner_id
WHERE v.exempt <> 'H0'
GROUP BY O.owner_name
HAVING COUNT(v.exempt) > 1


Is there a solution that would allow me to return records like so, not knowing which owner_name's may be similar? Basically trying to get the server to search through the owner_name column and if there's a similarity such as
JACOB JAMISON & JESSICA and JACOB JAMISON M & JESSICA B then it'll return those records like so:



Owner_Name                      xNameAppears      ExemptCount
-------------------------------------------------------------
JACOB JAMISON & JESSICA 2 2
JACOB JAMISON M & JESSICA B 2 2
BLACKSON BARRINGTON 2 2
BLACKSON BARRINGTON H 2 2


Thank you in advance!







sql-server sql-server-2014 full-text-search string-searching






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 21 '18 at 14:46









MindXpertMindXpert

183




183








  • 1





    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

    – Martin Smith
    Nov 21 '18 at 17:22
















  • 1





    Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

    – Martin Smith
    Nov 21 '18 at 17:22










1




1





Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

– Martin Smith
Nov 21 '18 at 17:22







Sure SOUNDEX is suitable? The following all return the same value SOUNDEX('JACOB JAMISON & JESSICA '), SOUNDEX('Jacob Zuma'), SOUNDEX('Jacob Willekens'), SOUNDEX('Jacob S. Coxey, Sr.'), SOUNDEX('Juegief zzzzzzzzzzzz')

– Martin Smith
Nov 21 '18 at 17:22












1 Answer
1






active

oldest

votes


















5














The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer
























  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

    – MindXpert
    Nov 21 '18 at 16:12











  • @user1227080 happy to help!

    – Erik Darling
    Nov 21 '18 at 16:13











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









5














The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer
























  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

    – MindXpert
    Nov 21 '18 at 16:12











  • @user1227080 happy to help!

    – Erik Darling
    Nov 21 '18 at 16:13
















5














The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer
























  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

    – MindXpert
    Nov 21 '18 at 16:12











  • @user1227080 happy to help!

    – Erik Darling
    Nov 21 '18 at 16:13














5












5








5







The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;





share|improve this answer













The SOUNDEX function can be applied to a column as well.



But since




there's thousands like that




I wouldn't suggest just writing a query to join on a function to do that.



This will likely not perform very well on larger tables:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON SOUNDEX(v2.Owner_Name) = SOUNDEX(v.Owner_Name)
AND v2.Owner_Name <> v.Owner_Name;


I'd rather do something that will make finding this easier in the long-term.



Here's an example:



CREATE TABLE dbo.vehicle (Owner_Name VARCHAR(50));
INSERT dbo.vehicle ( Owner_Name )
SELECT *
FROM (
VALUES
('JACOB JAMISON & JESSICA'),
('JACOB JAMISON M & JESSICA B'),
('BLACKSON BARRINGTON'),
('BLACKSON BARRINGTON H'),
('BRUSTER MICHAEL')
) AS x (Owner_Name);


I'm going to add a computed column based on the function, and then add an index to aid my query.



ALTER TABLE dbo.vehicle ADD Owner_Soundex AS SOUNDEX(Owner_Name);

CREATE INDEX ix_whatever ON dbo.vehicle (Owner_Soundex, Owner_Name);


Validate that everything looks good...



SELECT *
FROM dbo.vehicle AS v


Use a query like this to find imprecise matches:



SELECT *
FROM dbo.vehicle AS v
JOIN dbo.vehicle AS v2
ON v2.Owner_Soundex = v.Owner_Soundex
AND v2.Owner_Name <> v.Owner_Name;






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 21 '18 at 15:30









Erik DarlingErik Darling

21.2k1264104




21.2k1264104













  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

    – MindXpert
    Nov 21 '18 at 16:12











  • @user1227080 happy to help!

    – Erik Darling
    Nov 21 '18 at 16:13



















  • You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

    – MindXpert
    Nov 21 '18 at 16:12











  • @user1227080 happy to help!

    – Erik Darling
    Nov 21 '18 at 16:13

















You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

– MindXpert
Nov 21 '18 at 16:12





You sir, are the man. I didn't think to use SOUNDEX in that manner. This works very well! Indeed that second way is more efficient performance wise. Thank you.

– MindXpert
Nov 21 '18 at 16:12













@user1227080 happy to help!

– Erik Darling
Nov 21 '18 at 16:13





@user1227080 happy to help!

– Erik Darling
Nov 21 '18 at 16:13


















draft saved

draft discarded




















































Thanks for contributing an answer to Database Administrators Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f223108%2ffind-records-with-same-string-with-extra-character%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Costa Masnaga

Fotorealismo

Sidney Franklin