Editing a GTF file

I need to extract only the ENSEMBL non-chromosomal pseudogenes from a given GTF file, add an additional attribute field "filtered" with the value "manually" to each of the annotated pseudogenes, and save the result as a new file. In other words, I need to keep the lines that contain "ENSEMBL" and "pseudogene" and do not contain "chr", append the extra attribute (filtered: manually) to the last column, and write the result to a new file. How can I do this, preferably using awk or sed?

##description: evidence-based annotation of the human genome (GRCh38), version 29 (Ensembl 94)
##provider: GENCODE
##contact: gencode-help@ebi.ac.uk
##format: gtf
##date: 2018-08-30
chr1    HAVANA  gene    11869   14409   .       +       .       gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; level 2; havana_gene "OTTHUMG00000000961.2";
chr1    HAVANA  transcript      11869   14409   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1    HAVANA  exon    11869   12227   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 1; exon_id "ENSE00002234944.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1    HAVANA  exon    12613   12721   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 2; exon_id "ENSE00003582793.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1    HAVANA  exon    13221   14409   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 3; exon_id "ENSE00002312635.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
chr1    HAVANA  transcript      12010   13670   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
chr1    HAVANA  exon    12010   12057   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; exon_number 1; exon_id "ENSE00001948541.1"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
chr1    HAVANA  exon    12179   12227   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; exon_number 2; exon_id "ENSE00001671638.2"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
chr1    HAVANA  exon    12613   12697   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unp

edited Nov 19 at 14:29

zx8754


asked Nov 19 at 13:25

Sergei

What have you already tried?
– Didier Trosset
Nov 19 at 13:46


What is the expected output?
– zx8754
Nov 19 at 14:30

Which lines in the example describe an ENSEMBL non-chromosomal pseudogene, and why (what are the related strings)?
– Jay jargot
Nov 19 at 14:35

These are lines that match the patterns: ENSEMBL exon 169224 169502 . - . gene_id "ENSG00000284215.2"; transcript_id "ENST00000639764.2"; gene_type "pseudogene"; gene_name "AC245056.4"; transcript_type "pseudogene"; transcript_name "AC245056.4-201"; exon_number 2; exon_id "ENSE00003804365.1"; level 3; tag "basic"; Filtered: manually;
– Sergei
Nov 19 at 15:05

Actually, I have managed to do this, but maybe there is a better solution using only awk?
– Sergei
Nov 19 at 15:05

regex bash awk sed bioinformatics

1 Answer
accepted

If you are using Awk anyway, you don't need grep at all.

Also, less crucially, modifying $0 is mildly wasteful. print lets you specify precisely what you want to print.

awk '!/##/ && !/chr/ && /pseudogene/ && /ENSEMBL/ {
       print $0" Filtered: manually;"}' gencode.v29.chr_patch_hapl_scaff.basic.annotation.gtf > gencode.v29.filtered.gtf
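
A possible tightening of the command above, offered as a sketch rather than as the answerer's method: whole-line matches such as /chr/ or /ENSEMBL/ can also hit text inside the attribute column, so this variant tests individual tab-separated columns instead. It assumes that the source name sits in column 2, that "non-chromosomal" means the sequence name in column 1 does not start with chr, and that a pseudogene is recognised by its gene_type attribute. It also writes the new attribute in the usual GTF key "value"; form. The file names sample.gtf and filtered.gtf and the scaffoldA sequence name are invented for the illustration.

```shell
# Build a tiny, hypothetical GTF file (real tabs between the nine columns).
# Only the coordinates and IDs of the second record come from the thread;
# everything else here is made up for the example.
printf '%s\n' '##format: gtf' > sample.gtf
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr1 HAVANA gene 11869 14409 . + . \
  'gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene";' \
  scaffoldA ENSEMBL exon 169224 169502 . - . \
  'gene_id "ENSG00000284215.2"; gene_type "pseudogene";' \
  scaffoldA ENSEMBL exon 10 20 . + . \
  'gene_id "X1"; gene_type "protein_coding";' \
  chr2 ENSEMBL gene 5 50 . + . \
  'gene_id "X2"; gene_type "pseudogene";' >> sample.gtf

# Skip header lines, then keep records whose
#   column 1 (sequence name) does not start with "chr",
#   column 2 (source) is exactly ENSEMBL,
#   column 9 (attributes) has a gene_type ending in "pseudogene",
# and append the extra attribute to the end of the line.
awk -F'\t' '
  /^#/ { next }
  $1 !~ /^chr/ && $2 == "ENSEMBL" && $9 ~ /gene_type "[^"]*pseudogene"/ {
    print $0 " filtered \"manually\";"
  }' sample.gtf > filtered.gtf
```

With the four sample records, only the scaffoldA pseudogene exon should end up in filtered.gtf, with filtered "manually"; appended to its attribute column. If the Filtered: manually; spelling from the thread is what downstream tools expect, only the string inside print needs to change.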

answered Nov 19 at 15:36

tripleee


Thanks, yes this is much better)
– Sergei
Nov 19 at 16:25
