Searching Text Using grep in Linux

How do you search for text in your files on Linux?

Terminology

  1. Linux standard input

    Standard input, often abbreviated stdin, is the source of input data for command line programs (i.e., all-text mode programs) on Linux.

What can I do with Linux grep?

  1. Search for text in a file and list the matching lines
  2. List the files that contain a keyword

About grep

  • grep is a powerful pattern-based text search tool on Linux
  • grep stands for “global regular expression print”
  • It processes text line by line and prints every line that matches a specified pattern
  • It reads standard input when no file is given, so it can filter the output of another command through a pipe
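
The last point is easiest to see side by side. The following is a minimal sketch, assuming a throwaway sample file (the file name and its contents are made up for illustration); it runs the same search against a named file, a pipe, and a redirect.

```shell
# Build a small sample file in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'green moss\nred brick\nmoss agate\n' > "$tmpdir/sample.txt"

# 1. grep opens the file named on the command line.
from_file=$(grep 'moss' "$tmpdir/sample.txt")

# 2. No file argument: grep reads standard input, fed here by a pipe.
from_pipe=$(cat "$tmpdir/sample.txt" | grep 'moss')

# 3. Standard input can also come from a redirect.
from_redirect=$(grep 'moss' < "$tmpdir/sample.txt")

# All three variables hold the same two matching lines.
printf '%s\n' "$from_pipe"

rm -r "$tmpdir"
```

The pipe form is the one used later in this post to filter the output of ls, ps and docker logs.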

Install grep

$ sudo apt-get install grep         # Debian/Ubuntu
$ sudo yum install grep             # RHEL/CentOS/Fedora
$ whereis grep                      # prints the install location, e.g. /usr/bin/grep

Parameters

  1. case insensitive -i

  2. pass in a pattern with -e (repeat it for several patterns), or use extended regular expressions with -E

     grep -E 'pattern1|pattern2' fileName
     grep -e pattern1 -e pattern2 fileName
    
  3. only show the count of matching lines -c

  4. only print the names of files that contain a match -l

  5. only print the names of files that contain no match -L

  6. match the whole word rather than part of a string -w

  7. search directories recursively -r

  8. show line numbers -n

  9. show the lines that don’t match -v

  10. show the N lines following each match -A N

  11. show the N lines preceding each match -B N

  12. display the matched pattern in colour --color
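
Several of these options are usually combined. As a rough illustration, assuming a made-up notes.txt with four lines, the sketch below shows how -i, -n, -w and -v change what is printed.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf 'Moss grows here\nmossy rock\nplain stone\nmoss\n' > "$tmpdir/notes.txt"

# -i ignores case and -n prefixes each matching line with its line number.
numbered=$(grep -in 'moss' "$tmpdir/notes.txt")

# -w keeps only lines where "moss" stands alone as a word ("mossy" is dropped).
whole_word=$(grep -iw 'moss' "$tmpdir/notes.txt")

# -v inverts the match and prints the lines without "moss".
inverted=$(grep -iv 'moss' "$tmpdir/notes.txt")

printf '%s\n' "$numbered"
rm -r "$tmpdir"
```

Single-letter options can be bundled (-in) or written separately (-i -n); both forms behave the same.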

Searching in one file

  1. Search for a string in a file

     $ grep 'moss' ./_site/Java-Bean-VS-Spring-Bean/index.html
    

    It returns all the lines that contain the given string.

  2. Search for a string in a file, ignoring case -i

     $ grep -i 'moss' ./_site/Java-Bean-VS-Spring-Bean/index.html

    It returns all the lines that contain the given string, regardless of case.

  3. Search for a string as a whole word -w

     $ grep -w -i 'moss' ./_site/Java-Bean-VS-Spring-Bean/index.html

    It returns only the lines in which the string appears as a whole word, so “moss” matches but “mossy” does not.

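  4. Count the matching lines -c

     $ grep -Fc 'moss' ./_site/Java-Bean-VS-Spring-Bean/index.html
     6

    grep is a line-based search utility. The -c option outputs the number of matching lines, not the number of pattern occurrences, so a line that contains the pattern twice is counted once. The -F option treats the pattern as a fixed string rather than a regular expression.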
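
To make the difference between lines and occurrences concrete, here is a small sketch on made-up data. It assumes a grep that supports -o (GNU and BSD grep both do), which prints every match on a line of its own.

```shell
# Hypothetical sample: the first line contains the pattern twice.
tmpdir=$(mktemp -d)
printf 'moss on moss\nbare rock\nmoss\n' > "$tmpdir/page.txt"

# -c counts matching lines: the first line counts once despite two hits.
line_count=$(grep -Fc 'moss' "$tmpdir/page.txt")

# -o prints each match on its own line, so counting those lines counts occurrences.
occurrence_count=$(grep -Fo 'moss' "$tmpdir/page.txt" | wc -l | tr -d ' ')

echo "lines: $line_count, occurrences: $occurrence_count"
rm -r "$tmpdir"
```

The tr call only strips the padding that some wc implementations add around the number.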

Search in Directories

  1. Filter the output of ls

     $ ls -lt | grep 'Feb 28 2046'
    
  2. In the current directory, find all .md files that contain ‘test’ and print the matching lines

     $ grep test *.md
     CHANGELOG.md:* Remove `base_path` include from `/test` pages.
     CHANGELOG.md:* Test strict Front Matter in `/test` site. [#1236](https://github.com/mmistakes/minimal-mistakes/pull/1236)
    
  3. Find all files that contain a pattern

     $ grep -Rl 'mossgu' ./

    The -l option suppresses the matching lines and prints only the names of the files that contain a match.

  4. List how many lines contain a string in each file (without -l, which would stop grep at the first match in each file)

     $ grep -RFc 'mossgu' ./
    
  5. Find all lines that don’t contain the given pattern

     $ grep -v test *.md
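
The directory searches above can be tried safely on a scratch tree. This is a sketch under assumptions: the directory layout and file contents are invented for the demo, and -r is used as the recursive flag.

```shell
# Hypothetical directory tree with three Markdown files.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/docs/sub"
printf 'mossgu wrote this\nmossgu again\n' > "$tmpdir/docs/a.md"
printf 'nothing here\n' > "$tmpdir/docs/b.md"
printf 'by mossgu\n' > "$tmpdir/docs/sub/c.md"

# -l: names of the files that contain the string.
matching_files=$(grep -rlF 'mossgu' "$tmpdir/docs" | sort)

# -L: names of the files that do not contain it.
non_matching_files=$(grep -rLF 'mossgu' "$tmpdir/docs" | sort)

# -c: one "file:count" entry per file, counting matching lines.
per_file_counts=$(grep -rcF 'mossgu' "$tmpdir/docs" | sort)

printf '%s\n' "$per_file_counts"
rm -r "$tmpdir"
```

The output is piped through sort only to make the order predictable; grep itself reports files in the order it walks the directory.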
    

Regular Expressions in grep

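  1. *: in a regular expression, matches zero or more repetitions of the preceding character. In a file name argument, * is a shell wildcard that matches any characters, including none.

     $ grep 'test' d* # search for 'test' in all files whose names start with d
    
  2. ^: matches the start of a line

     $ grep '^root' /etc/group # match 'root::0:root', not 'mail::6:root'
    
  3. $: matches the end of a line

     $ grep 'root$' /etc/group # match 'mail::6:root', not 'tty::7:root,tty,adm'
    
  4. \<: matches the start of a word # ‘\<moss’ matches words that start with ‘moss’

  5. \>: matches the end of a word # ‘moss\>’ matches words that end with ‘moss’

  6. \b: matches a word boundary # ‘\bmoss\b’ matches only the whole word ‘moss’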
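
The anchors can be checked against small sample files. The sketch below is illustrative only: the group-style lines are taken from the comments above, the word list is made up, and \<, \> and \b are assumed to be available (they are GNU grep extensions, not part of POSIX basic regular expressions).

```shell
# Hypothetical sample files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'root::0:root\nmail::6:root\ntty::7:root,tty,adm\n' > "$tmpdir/group"
printf 'man\nbatman\nmanic\n' > "$tmpdir/words"

# ^ anchors the pattern to the start of the line.
line_start=$(grep '^root' "$tmpdir/group")

# $ anchors the pattern to the end of the line.
line_end=$(grep 'root$' "$tmpdir/group")

# \< and \> anchor the pattern to the start and end of a word.
word_start=$(grep '\<man' "$tmpdir/words")
word_end=$(grep 'man\>' "$tmpdir/words")

# \b on both sides leaves only the standalone word.
whole_word=$(grep '\bman\b' "$tmpdir/words")

printf '%s\n' "$line_end"
rm -r "$tmpdir"
```

Single quotes around each pattern keep the shell from interpreting $, \ and * before grep sees them.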

Docker logs with grep

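A plain pipe can miss lines in the output of docker logs: docker logs nginx | grep "error". docker logs replays what the container wrote to both standard output and standard error, and a pipe carries only stdout, so the lines written to stderr bypass grep. Redirect stderr into stdout first:

$ docker logs nginx 2>&1 | grep "error"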
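
The same effect can be reproduced without Docker. In this sketch, emit_logs is a hypothetical stand-in for docker logs that writes one line to stdout and one to stderr.

```shell
# Hypothetical stand-in for a command that logs to both streams.
emit_logs() {
  echo 'info: started'
  echo 'error: on stdout'
  echo 'error: on stderr' >&2
}

# The pipe carries only stdout, so grep never sees the stderr line
# (stderr is discarded here just to keep the demo output quiet).
stdout_only=$(emit_logs 2>/dev/null | grep 'error')

# 2>&1 merges stderr into stdout before the pipe, so grep sees both lines.
both_streams=$(emit_logs 2>&1 | grep 'error')

printf '%s\n' "$both_streams"
```

The order matters: 2>&1 has to appear on the command to the left of the pipe, because that is the process whose stderr is being redirected.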

Scenarios

1. I have a very big log file and want to find the lines logged at a certain time

# -n shows the line number
$ grep -n '2019-10-24 00:01:11' *.log

2. Starting from root, find all the log files that contain ‘ERROR’

$ find / -type f -name "*.log" | xargs grep "ERROR"

3. Starting from the current directory, find all .in files that contain ‘tomcat’

$ find . -name "*.in" | xargs grep "tomcat"
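
A plain find | xargs pipeline splits file names on whitespace, so paths that contain spaces break it. The sketch below shows two common workarounds on a made-up directory tree; it assumes find and xargs support -print0 and -0, as the GNU and BSD versions do.

```shell
# Hypothetical tree with a space in both a directory and a file name.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/conf dir"
printf 'server=tomcat\n' > "$tmpdir/conf dir/app server.in"
printf 'server=jetty\n' > "$tmpdir/other.in"

# NUL-separated names survive spaces on their way through xargs.
with_xargs=$(find "$tmpdir" -type f -name '*.in' -print0 | xargs -0 grep -l 'tomcat')

# find can also run grep itself, passing many files per invocation with {} +.
with_exec=$(find "$tmpdir" -type f -name '*.in' -exec grep -l 'tomcat' {} +)

printf '%s\n' "$with_xargs"
rm -r "$tmpdir"
```

Both forms print the same path here; the -exec form avoids the extra process and the dependency on xargs.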

4. Match part of a word or a whole word

$ grep man *  # match batman, manic, man
$ grep '\<man' * # match manic, man, not batman
$ grep '\<man\>' * # match man, not batman or manic

5. Find all empty lines

$ grep -r '^$' /etc

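6. Find multiple patterns

$ grep -r -e 'hi' -e 'moss' ./ # match lines that contain either pattern
$ grep -rE 'hi|moss' ./ # match lines that contain either pattern
$ grep 'hi\|moss\|hello' file_path # match lines that contain any of the three (GNU grep)

# match lines that contain both patterns
$ grep 'hi.*moss' file_path # 'hi' followed by 'moss'
$ grep 'moss.*hi' file_path # 'moss' followed by 'hi'
$ grep 'hi' file_path | grep 'moss' # both, in either order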
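
The difference between matching either pattern and matching both is easy to verify on a few lines. This is a minimal sketch on invented sample text.

```shell
# Hypothetical sample file with five lines.
tmpdir=$(mktemp -d)
printf 'hi there\nmoss only\nhi moss\nmoss says hi\nhello\n' > "$tmpdir/chat.txt"

# OR: a line matches if it contains either pattern.
either=$(grep -E 'hi|moss' "$tmpdir/chat.txt")

# AND, any order: filter the file twice.
both=$(grep 'hi' "$tmpdir/chat.txt" | grep 'moss')

# AND, fixed order: "hi" must appear before "moss" on the line.
ordered=$(grep 'hi.*moss' "$tmpdir/chat.txt")

printf '%s\n' "$both"
rm -r "$tmpdir"
```

Chaining two greps scales to any number of required patterns, at the cost of one extra process per pattern.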

7. Show the matching line and the following 100 lines

$ grep -r -A100 Error ./

8. Show the matching line and the 100 lines before it

$ grep -r -B100 Error ./
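
With a smaller window the effect of the context options is easier to see. The sketch below uses a made-up five-line log and also shows -C, which prints context on both sides of the match.

```shell
# Hypothetical log file with a single error in the middle.
tmpdir=$(mktemp -d)
printf 'line 1\nline 2\nError: disk full\nline 4\nline 5\n' > "$tmpdir/app.log"

# -A1: the match plus one line after it.
after=$(grep -A1 'Error' "$tmpdir/app.log")

# -B1: the match plus one line before it.
before=$(grep -B1 'Error' "$tmpdir/app.log")

# -C1: one line of context on each side.
around=$(grep -C1 'Error' "$tmpdir/app.log")

printf '%s\n' "$around"
rm -r "$tmpdir"
```

When several matches are far apart, grep separates the context groups with a line containing two dashes.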

9. List the running processes and find useful info

$ ps -ef | grep -i 'postgres'

10. In git blame output, find line 400

$ git blame main.js | grep ' 400)'

11. Find the large filesystems on the server (df reports disk usage per filesystem, not per file)

$ df -h
Filesystem      Size   Used  Avail Capacity iused               ifree %iused  Mounted on
/dev/disk1s1   234Gi  186Gi   45Gi    81% 2232606 9223372036852543201    0%   /
devfs          346Ki  346Ki    0Bi   100%    1197                   0  100%   /dev
/dev/disk1s4   234Gi  2.0Gi   45Gi     5%       2 9223372036854775805    0%   /private/var/vm
map -hosts       0Bi    0Bi    0Bi   100%       0                   0  100%   /net
map auto_home    0Bi    0Bi    0Bi   100%       0                   0  100%   /home

$ df -h | grep -n 'G'
2:/dev/disk1s1   234Gi  186Gi   45Gi    81% 2232606 9223372036852543201    0%   /
4:/dev/disk1s4   234Gi  2.0Gi   45Gi     5%       2 9223372036854775805    0%   /private/var/vm

last update: Aug 2020