vlookup on csv using all 4 columns from file2 to File1 using awk

up vote
-1
down vote

favorite

Trying to do a vlookup on 4 columns on a large dataset of csv file

F1 : File1

TSM,TYPE,NODE,SCHED

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result on 17th on F1 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result after execuring code on 18th F2 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA

Code

awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1

Result is not correct with above code

edited yesterday

asked 2 days ago

J.RAM

Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago

add a comment |

up vote
-1
down vote

favorite

Trying to do a vlookup on 4 columns on a large dataset of csv file

F1 : File1

TSM,TYPE,NODE,SCHED

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result on 17th on F1 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result after execuring code on 18th F2 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA

Code

awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1

Result is not correct with above code

edited yesterday

asked 2 days ago

J.RAM

Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago

add a comment |

up vote
-1
down vote

favorite

Trying to do a vlookup on 4 columns on a large dataset of csv file

F1 : File1

TSM,TYPE,NODE,SCHED

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result on 17th on F1 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result after execuring code on 18th F2 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA

Code

awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1

Result is not correct with above code

edited yesterday

asked 2 days ago

J.RAM

Trying to do a vlookup on 4 columns on a large dataset of csv file

F1 : File1

TSM,TYPE,NODE,SCHED

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result on 17th on F1 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired Result after execuring code on 18th F2 : Input File1

TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO

AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES

AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA

Code

awk -F, -v date=$(date +'%Y-%m-%d') ' BEGIN   { OFS = FS } FNR==NR { a[$1] = $5; next } FNR==1  { n1 = n = NF + 1; $n = date; print; next } { $n1 = ($1 in a) ? a[$1] : "NA"; print }' f2 f1 > t && mv -f t f1

Result is not correct with above code

unix awk ksh

edited yesterday

asked 2 days ago

J.RAM

edited yesterday

asked 2 days ago

J.RAM

edited yesterday

asked 2 days ago

J.RAM

asked 2 days ago

J.RAM

asked 2 days ago

J.RAM

Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago

add a comment |

Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago

Could you please confirm if your last line is NOT actual result? Since it looks like they DO NOT have 4 fields common in both the files, please confirm once.
– RavinderSingh13
2 days ago

add a comment |

1 Answer
1

active

oldest

votes

up vote
1
down vote

Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.

awk -F, -v DAT=$(date +'%Y-%m-%d') '

FNR!=NR && FNR==1{

  print $0","DAT

  next

}

FNR==NR{

  a[$1,$2,$3,$4]=$0

  next

}

{

  $0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"

}

1

'  Input_file2   Input_file1

Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

|
show 15 more comments

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53350310%2fvlookup-on-csv-using-all-4-columns-from-file2-to-file1-using-awk%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.

awk -F, -v DAT=$(date +'%Y-%m-%d') '

FNR!=NR && FNR==1{

  print $0","DAT

  next

}

FNR==NR{

  a[$1,$2,$3,$4]=$0

  next

}

{

  $0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"

}

1

'  Input_file2   Input_file1

Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

|
show 15 more comments

up vote
1
down vote

Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.

awk -F, -v DAT=$(date +'%Y-%m-%d') '

FNR!=NR && FNR==1{

  print $0","DAT

  next

}

FNR==NR{

  a[$1,$2,$3,$4]=$0

  next

}

{

  $0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"

}

1

'  Input_file2   Input_file1

Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

|
show 15 more comments

up vote
1
down vote

Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.

awk -F, -v DAT=$(date +'%Y-%m-%d') '

FNR!=NR && FNR==1{

  print $0","DAT

  next

}

FNR==NR{

  a[$1,$2,$3,$4]=$0

  next

}

{

  $0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"

}

1

'  Input_file2   Input_file1

Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

Looks like your last line shown in your expected output doesn't look to be following your rule shown, could you please try following.

awk -F, -v DAT=$(date +'%Y-%m-%d') '

FNR!=NR && FNR==1{

  print $0","DAT

  next

}

FNR==NR{

  a[$1,$2,$3,$4]=$0

  next

}

{

  $0=(($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:$0"," $NF ",NA"

}

1

'  Input_file2   Input_file1

Append > temp_file && mv temp_file Input_file in case you want to save output into Input_file itself.

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

edited yesterday

answered 2 days ago

RavinderSingh13

23.8k41437

answered 2 days ago

RavinderSingh13

23.8k41437

answered 2 days ago

RavinderSingh13

23.8k41437

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

|
show 15 more comments

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

yes , but last line is skipped using the code in results AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA how to use code search caseinsensitive .
– J.RAM
2 days ago

@J.RAM, why you need it? I am not getting since its 4 fields are not common, please explain? Which file you want to save output too?
– RavinderSingh13
2 days ago

@J.RAM, AGGREGATE is there in file1 and AGGREGATE4 is in file2 how they are common? Please explain it once.
– RavinderSingh13
2 days ago

if 4 fields are not common for any row on f1 , how to adjust code show as NA for unmatched row in results , Yes output is redirected to Input_file1 , please help understand how its printing $5 from Inputfile2 , in code $5 is not used .
– J.RAM
2 days ago

@J.RAM, try my edited code now and let me know? I will add non-one liner form of solution too now.
– RavinderSingh13
2 days ago

|
show 15 more comments

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Nsryjdtyk