The grep, awk, sed family of tools
Grep
Global Regular Expression Print
With grep you can run a plain-text or regular-expression search against the files you pass as arguments, or against input piped in on standard input.
grep takes a single pattern argument, and every argument after the pattern is treated as a filename. To search for several alternatives at once, join them inside that one pattern with the pipe symbol |.
Use the -E flag to enable extended regular expression notation, i.e. you don't have to escape the characters ?, +, {}, and | in the regular expression to get their special meaning.
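As a small sketch of the difference -E makes, the snippet below searches a throwaway file for two alternatives, first in basic mode and then in extended mode. The file contents and variable names are made up for illustration, and the escaped \| form in basic mode is assumed to be available (it is in GNU grep).

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'error: disk full\nwarning: low memory\ninfo: all good\n' > "$sample"

# Basic mode: the alternation bar has to be escaped to be special
# (assumption: a grep that accepts \| in basic mode, such as GNU grep).
basic_matches=$(grep 'error\|warning' "$sample")

# Extended mode: the bar is special without a backslash.
extended_matches=$(grep -E 'error|warning' "$sample")

# Both searches select the "error" and "warning" lines.
echo "$extended_matches"

rm -f "$sample"
```

Note that the whole alternation is still one quoted pattern argument; anything after it would be read as a filename.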
Usages
grep <word> filename | Search the entire file and print every line where the word occurs
grep -i <word> filename | Perform a case-insensitive search on the entire file
grep -R <word> . | Search all files in the current directory and its subdirectories
grep -c <word> filename | Count the number of matching lines
grep -A <n> <word> filename (likewise -B <n>, -C <n>) | Print <n> lines of context around each match: -A for lines after, -B for lines before, -C for both before and after
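The flags above can be tried out on a scratch file. This is only a sketch: the directory, the file name notes.txt, and its contents are invented for the example.

```shell
# Hypothetical scratch directory and file, created only for this illustration.
dir=$(mktemp -d)
printf 'alpha\nBeta\ngamma beta\ndelta\nbeta again\n' > "$dir/notes.txt"

# Case-sensitive search: selects "gamma beta" and "beta again".
plain=$(grep beta "$dir/notes.txt")

# Case-insensitive search: also selects "Beta".
insensitive=$(grep -i beta "$dir/notes.txt")

# Count the matching lines instead of printing them.
count=$(grep -c beta "$dir/notes.txt")

# Print each match of "Beta" plus one line of context after it.
after=$(grep -A 1 Beta "$dir/notes.txt")

# Search every file under the directory, including subdirectories.
recursive=$(grep -R beta "$dir")

echo "matching lines: $count"
echo "$after"

rm -rf "$dir"
```

The context flags take a number, so -A 1 means one line after each match.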
Awk
The awk tool splits each line of the given text into fields, somewhat like columns in pandas in Python. You can then access each column individually, which lets you perform some level of numerical analysis.
There are two ways to invoke awk. Either pass the program as a single-quoted string, awk -F <delimiter> '<awk program>' <file to process>, or read it from a file, awk -F <delimiter> -f <awk program file> <file to process>.
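A minimal sketch of both invocation styles follows. The comma-separated data, the temporary file names, and the variable names are assumptions made up for the example.

```shell
# Hypothetical comma-separated data, created only for this illustration.
data=$(mktemp)
printf 'apple,3\nbanana,5\ncherry,7\n' > "$data"

# Inline program: -F sets the field separator, $1 is the first field.
names=$(awk -F ',' '{ print $1 }' "$data")

# Numeric fields can be used in arithmetic; sum starts out as 0.
total=$(awk -F ',' '{ sum += $2 } END { print sum }' "$data")

# Program file: the same kind of program, read with -f instead.
prog=$(mktemp)
printf '{ print $2 }\n' > "$prog"
quantities=$(awk -F ',' -f "$prog" "$data")

echo "$names"
echo "total: $total"

rm -f "$data" "$prog"
```

The single quotes around the inline program keep the shell from expanding $1 and $2 before awk sees them.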
Basic Structure of Awk
BEGIN {
# Runs once, before any input line is read
}
/regex/ {
# Applied to only lines that match the regex
}
$1 ~ /regex/ {
# Applied only to lines whose first field matches the regex
}
{
# Applied to every line
}
END {
# Runs once, after every input line has been processed
}
An awk program runs as a sequence of these pattern blocks:
The BEGIN and END pattern blocks are special: BEGIN runs once before any input line is read, and END runs once after every line has been processed. If you need setup work before processing the lines, e.g. initializing variables, do it in BEGIN; wrap-up work such as printing a total goes in END.
Then you can have a regex-based pattern block, so that the code inside runs only for lines that match the regex.
You can also leave the pattern out, so that the code runs for every line.
If several pattern blocks match a line, they all run, in the order they appear in the program. This is unlike an if/else chain, where the remaining branches are skipped once one branch is true.
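The ordering rules can be seen in a small sketch. The two-line input and the labels printed by each block are invented for illustration.

```shell
# Hypothetical two-line input, created only for this illustration.
input=$(mktemp)
printf 'error one\nok two\n' > "$input"

result=$(awk '
BEGIN      { print "start" }               # once, before any input
/error/    { print "regex: " $0 }          # lines containing "error"
$1 ~ /^e/  { print "field: " $0 }          # lines whose first field starts with "e"
           { print "every: " $0 }          # all lines
END        { print "done, " NR " lines" }  # once, after all input
' "$input")

echo "$result"
# start
# regex: error one
# field: error one
# every: error one
# every: ok two
# done, 2 lines

rm -f "$input"
```

The first input line matches three blocks and triggers all three in program order, while the second line only reaches the pattern-less block.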
Array In Awk
If you want to use an array in awk, just use it; you don't need to declare it first. Simply start inserting elements into the variable you want to use as an array.
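As a sketch of this, the program below counts how often each value of the first field appears. The log contents and the names count and user are hypothetical. The output is piped through sort because the order in which awk walks an array with for (... in ...) is not specified.

```shell
# Hypothetical log data, created only for this illustration.
log=$(mktemp)
printf 'alice login\nbob login\nalice logout\nalice login\n' > "$log"

# count is never declared; the first count[$1]++ creates the array
# and the element, which starts out as 0.
summary=$(awk '
{ count[$1]++ }
END { for (user in count) print user, count[user] }
' "$log" | sort)

echo "$summary"
# alice 3
# bob 1

rm -f "$log"
```

Array subscripts here are strings, so the first field can be used directly as the key.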