How to find 10 most frequent words in the file in Unix/Linux

I tried using this command in Unix:

$ sort file.txt | uniq -c | sort -nr | head -10

However, I am not sure whether it is correct, that is, whether it really shows the 10 most frequent words in a large file.

asked Nov 19 at 15:30

rex7991

Is file.txt just one word per line? Or are there multiple words per line?
– dawg
Nov 19 at 15:35

Yes, it has one word per line.
– rex7991
Nov 19 at 15:37

awk '{cnt[$1]++} END{for (e in cnt) printf "%s\t%s\n", cnt[e], e}' file.txt | sort -nr | head -n 10
– dawg
Nov 19 at 15:39

Please add example input and desired output.
– dawg
Nov 19 at 15:53

Why do you think your result is wrong?
– kvantour
Nov 19 at 16:06
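With one word per line, as confirmed in the comments above, the pipeline from the question should give the right result: sort places identical lines next to each other, and uniq -c counts only adjacent duplicates. A minimal sketch to check this on a small input follows. The helper name top_words and the sample data are hypothetical, and the comparison is case-sensitive, so "The" and "the" count as different words.

```shell
#!/bin/sh
# top_words FILE [N]: print the N most frequent lines (words) of FILE.
# Hypothetical helper; assumes one word per line.
top_words() {
    # sort groups identical words, uniq -c prefixes each group with its count,
    # sort -nr orders by that count (descending), head keeps the first N lines
    sort "$1" | uniq -c | sort -nr | head -n "${2:-10}"
}

# Made-up sample data for a quick sanity check
sample=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$sample"
top_words "$sample" 2
rm -f "$sample"
```

Here apple (3 occurrences) is listed first, followed by banana (2). The width of the count column printed by uniq -c may differ between implementations.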

Tags: linux, unix

1 Answer
Here is a shell script that handles your problem, even if the file has more than one word per line.

wordcount.sh

#!/bin/bash

# filename: wordcount.sh
# usage: ./wordcount.sh filename

# handle positional arguments
if [ $# -ne 1 ]
then
    echo "Usage: $0 filename"
    exit 1
fi

# count the words and print the 10 most frequent ones
printf "%-14s %s\n" "Word" "Count"

# lowercase everything, then emit one word per line
cat "$1" | tr 'A-Z' 'a-z' |
egrep -o "\b[[:alpha:]]+\b" |
awk '{ count[$0]++ }
END {
    for (ind in count)
        { printf("%-14s %d\n", ind, count[ind]); }
}' | sort -k2 -n -r | head -n 10

To use it, just run ./wordcount.sh filename.txt

Explanation

The tr command converts all uppercase letters to lowercase, and egrep -o then extracts every word from the text and prints each one on its own line. Finally, awk counts the occurrences of each word in an associative array, sort orders the result by count in descending order, and head keeps the top 10.
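The stages described above can also be tried on a single line of text. This is only a sketch under a few assumptions: the function name count_words and the sample sentence are made up, the count is printed before the word so that a plain numeric sort on the first field works, and grep -oE stands in for egrep -o (recent GNU grep releases flag egrep as obsolescent).

```shell
#!/bin/sh
# count_words: read text on stdin, print "count word" lines, most frequent first.
# Hypothetical helper illustrating the tr | grep -o | awk | sort stages.
count_words() {
    tr '[:upper:]' '[:lower:]' |      # stage 1: fold everything to lowercase
    grep -oE '[[:alpha:]]+' |         # stage 2: emit one word per line
    awk '{ count[$0]++ }              # stage 3: tally words in an associative array
         END { for (w in count) printf "%d %s\n", count[w], w }' |
    sort -k1,1nr -k2                  # stage 4: by count (descending), then by word
}

# Made-up sample sentence with mixed case
printf '%s\n' 'The cat and the dog and THE bird' | count_words | head -n 2
```

For this input the first two lines are "3 the" and "2 and", since the three spellings of "the" are merged by the lowercase step.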

edited Nov 19 at 15:59

answered Nov 19 at 15:53

HbnKing
