vlookup on csv using all 4 columns from file2 to File1 using awk











up vote
-1
down vote

favorite
1












Trying to do a vlookup on 4 columns on a large dataset of csv file



F1 : File1



TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5


F2



AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result on 17th on F1 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result after execuring code on 18th F2 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA


Code



awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1


Result is not correct with above code










share|improve this question
























  • Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
    – RavinderSingh13
    2 days ago















up vote
-1
down vote

favorite
1












Trying to do a vlookup on 4 columns on a large dataset of csv file



F1 : File1



TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5


F2



AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result on 17th on F1 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result after execuring code on 18th F2 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA


Code



awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1


Result is not correct with above code










share|improve this question
























  • Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
    – RavinderSingh13
    2 days ago













up vote
-1
down vote

favorite
1









up vote
-1
down vote

favorite
1






1





Trying to do a vlookup on 4 columns on a large dataset of csv file



F1 : File1



TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5


F2



AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result on 17th on F1 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result after execuring code on 18th F2 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA


Code



awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1


Result is not correct with above code










share|improve this question















Trying to do a vlookup on 4 columns on a large dataset of csv file



F1 : File1



TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5


F2



AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result on 17th on F1 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA


Desired Result after execuring code on 18th F2 : Input File1



TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA


Code



awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1


Result is not correct with above code







unix awk ksh






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited yesterday

























asked 2 days ago









J.RAM

13




13












  • Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
    – RavinderSingh13
    2 days ago


















  • Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
    – RavinderSingh13
    2 days ago
















Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago




Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago












1 Answer
1






active

oldest

votes

















up vote
1
down vote













Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.



awk -F, -v DAT=$(date +'%Y-%m-%d') '
FNR!=NR && FNR==1{
print $0","DAT
next
}
FNR==NR{
a[$1,$2,$3,$4]=$0
next
}
{
$0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"
}
1
' Input_file2 Input_file1


Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.






share|improve this answer























  • yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
    – J.RAM
    2 days ago












  • @J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
    – RavinderSingh13
    2 days ago












  • @J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
    – RavinderSingh13
    2 days ago










  • if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
    – J.RAM
    2 days ago












  • @J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
    – RavinderSingh13
    2 days ago











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53350310%2fvlookup-on-csv-using-all-4-columns-from-file2-to-file1-using-awk%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
1
down vote













Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.



awk -F, -v DAT=$(date +'%Y-%m-%d') '
FNR!=NR && FNR==1{
print $0","DAT
next
}
FNR==NR{
a[$1,$2,$3,$4]=$0
next
}
{
$0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"
}
1
' Input_file2 Input_file1


Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.






share|improve this answer























  • yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
    – J.RAM
    2 days ago












  • @J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
    – RavinderSingh13
    2 days ago












  • @J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
    – RavinderSingh13
    2 days ago










  • if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
    – J.RAM
    2 days ago












  • @J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
    – RavinderSingh13
    2 days ago















up vote
1
down vote













Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.



awk -F, -v DAT=$(date +'%Y-%m-%d') '
FNR!=NR && FNR==1{
print $0","DAT
next
}
FNR==NR{
a[$1,$2,$3,$4]=$0
next
}
{
$0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"
}
1
' Input_file2 Input_file1


Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.






share|improve this answer























  • yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
    – J.RAM
    2 days ago












  • @J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
    – RavinderSingh13
    2 days ago












  • @J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
    – RavinderSingh13
    2 days ago










  • if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
    – J.RAM
    2 days ago












  • @J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
    – RavinderSingh13
    2 days ago













up vote
1
down vote










up vote
1
down vote









Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.



awk -F, -v DAT=$(date +'%Y-%m-%d') '
FNR!=NR && FNR==1{
print $0","DAT
next
}
FNR==NR{
a[$1,$2,$3,$4]=$0
next
}
{
$0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"
}
1
' Input_file2 Input_file1


Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.






share|improve this answer














Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.



awk -F, -v DAT=$(date +'%Y-%m-%d') '
FNR!=NR && FNR==1{
print $0","DAT
next
}
FNR==NR{
a[$1,$2,$3,$4]=$0
next
}
{
$0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"
}
1
' Input_file2 Input_file1


Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.







share|improve this answer














share|improve this answer



share|improve this answer








edited yesterday

























answered 2 days ago









RavinderSingh13

23.8k41437




23.8k41437












  • yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
    – J.RAM
    2 days ago












  • @J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
    – RavinderSingh13
    2 days ago












  • @J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
    – RavinderSingh13
    2 days ago










  • if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
    – J.RAM
    2 days ago












  • @J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
    – RavinderSingh13
    2 days ago


















  • yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
    – J.RAM
    2 days ago












  • @J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
    – RavinderSingh13
    2 days ago












  • @J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
    – RavinderSingh13
    2 days ago










  • if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
    – J.RAM
    2 days ago












  • @J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
    – RavinderSingh13
    2 days ago
















yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago






yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago














@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago






@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago














@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago




@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago












if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago






if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago














@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago




@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53350310%2fvlookup-on-csv-using-all-4-columns-from-file2-to-file1-using-awk%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Costa Masnaga

Fotorealismo

Sidney Franklin