Grep, sed, awk – 3 magic words

Today a coworker watched me work at a terminal and commented that sometimes he feels like he’s Bilbo and I’m Gandalf. I’ll take the compliment! It’s not the first time I’ve heard something like that, and I think it’s because I’m comfortable with basic system tools that a lot of younger developers never picked up. I intend for this to be a crash course on pragmatic uses of three such tools: grep, sed, and awk.

Grep, sed, and awk are all command line tools for processing text in different ways. Broadly speaking, grep searches text for patterns, sed is kind of like “find and replace” on meth, and awk is grep all grown up, back from the war with scars that tell stories. In this article we’re just going to cover the most basic, pragmatic use of these tools without taking a deep dive. My goal is exposing these wonderful, cursed tools to seekers on the path to *nix nirvana.

What is a grep and what’s it do?

grep simply searches through a bunch of text for a particular word or pattern. A common use of it might be something like this:

ls -lha | grep php

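In this example, we’re piping the output of ls -lha (which lists the files in the current directory, hidden ones included, one per line with details) to grep, and telling grep to keep only the lines that contain “php”. In practice that shows the files with “php” somewhere in the name, though grep checks the whole line, so a match in the owner or group column would count too.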
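If you want to poke at this without hunting for a directory full of PHP files, here is a minimal sketch. The file names are made up for illustration, and it uses plain ls so each line is just a file name. It also shows two things the text above doesn't: -i makes the match case-insensitive, and anchoring the pattern with $ limits it to names that end in ".php".

```shell
# Build a scratch directory with a few made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/index.php" "$demo_dir/README.txt" \
      "$demo_dir/php_notes.md" "$demo_dir/Legacy.PHP"

# Plain match: any line containing "php" anywhere.
plain=$(ls "$demo_dir" | grep php)

# -i ignores case, so Legacy.PHP is included as well.
nocase=$(ls "$demo_dir" | grep -i php)

# Escape the dot and anchor with $ to match only names ending in ".php".
ext_only=$(ls "$demo_dir" | grep '\.php$')

printf 'plain:\n%s\n\nignore case:\n%s\n\nextension only:\n%s\n' \
    "$plain" "$nocase" "$ext_only"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

The single quotes around the last pattern keep the shell from touching the backslash and the dollar sign before grep sees them.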

grep "Failed password" /var/log/auth.log | grep "Oct 26"

The above example would search /var/log/auth.log for any lines containing the string “Failed password”, and output them. Except, we pipe the output to grep again with “Oct 26”, so we’ll just get a list of lines that contain both “Failed password” and “Oct 26”.
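Not everyone can read /var/log/auth.log, so here is the same two-step filter run against a scratch file. The log lines, host name, and IP addresses are invented for illustration. The sketch also uses -c, which prints how many lines matched instead of the lines themselves.

```shell
# A few invented auth.log-style lines, so the example runs anywhere.
log=$(mktemp)
cat > "$log" <<'EOF'
Oct 25 23:58:01 thor sshd[811]: Failed password for root from 203.0.113.5 port 4242 ssh2
Oct 26 00:03:17 thor sshd[812]: Failed password for invalid user admin from 203.0.113.5 port 4250 ssh2
Oct 26 00:04:40 thor sshd[813]: Accepted password for jason from 198.51.100.7 port 5022 ssh2
Oct 26 09:15:02 thor sshd[901]: Failed password for jason from 198.51.100.7 port 5100 ssh2
EOF

# Same chain as in the text: filter by message first, then by date.
matches=$(grep "Failed password" "$log" | grep "Oct 26")
printf '%s\n' "$matches"

# -c counts matching lines instead of printing them.
count=$(grep "Failed password" "$log" | grep -c "Oct 26")
echo "failed logins on Oct 26: $count"

rm "$log"
```

Each grep only narrows what the previous one let through, so the order of the two filters doesn't change the result here.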

There’s a lot more we can cover about grep, but like I said, this is only meant to expose seekers to these tools. We’re not going to cover everything else grep can do just yet.

Sed what now?

Sed can be thought of as industrial grade “search and replace”. However, using it effectively starts to require understanding what a “regex” or “regular expression” is, which is akin to learning whatever Swahili’s version of pig-latin is called. For now, we’ll stick to the shallow end of the pool.

jason@thor > echo "Sed went to school and gave the teacher an apple" | sed -e "s/apple/ass kicking/g"
Sed went to school and gave the teacher an ass kicking

The basic use of sed works like the above. You pass in some text, and then apply a replacement. Sed outputs the text with the replacements. You can use it on files directly as well, like this:

sed -e "s/red/blue/g" ./somefile.txt > somefile_fixed_colors.txt

The last example would open somefile.txt, replace all instances of “red” with “blue”, and then put the modified text into a file called “somefile_fixed_colors.txt”.
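Two details are easy to miss here, so a small sketch on a scratch file (its contents are made up): the g at the end of the expression means "every match on the line", and sed writes to standard output rather than changing the file it reads.

```shell
work=$(mktemp -d)
printf 'red fish, red boat\n' > "$work/somefile.txt"

# With the trailing g, every "red" on the line is replaced.
all=$(sed -e "s/red/blue/g" "$work/somefile.txt")

# Without g, only the first "red" on each line is replaced.
first_only=$(sed -e "s/red/blue/" "$work/somefile.txt")

# sed printed its result; the input file is still what it was.
original=$(cat "$work/somefile.txt")

echo "with g:    $all"
echo "without g: $first_only"
echo "on disk:   $original"

rm -r "$work"
```

That is why the example above redirects into a second file. GNU sed also has an -i option for editing in place, but it behaves differently on BSD and macOS, so check your man page before leaning on it.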

jason@thor > echo "Sed had 1 special idea, 45 apples, 25 chickens, and 10 bottles of booze" | sed -e "s/[0-9][0-9] /way too many /g" -e "s/apples/assholes/g" -e "s/[A-Z]/Y/g"
Yed had 1 special idea, way too many assholes, way too many chickens, and way too many bottles of booze

Dipping our toes into the warm waters of hell with this useless example, we can start to see the power of sed. In this example, we’re telling sed to do the following:

  • Replace any two digits in a row followed by a space with “way too many “. That is why the lone “1” is left alone.
  • Replace apples with assholes.
  • Replace every capital letter from A to Z with a Y.
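The steps above run left to right, and each one works on whatever the previous one produced. Here is a rough sketch of what that means in practice, using shortened sample sentences of my own. It also shows a quirk of the two-digit pattern: it doesn't know where a number starts, so a three-digit number gets chewed in half.

```shell
text="Sed had 45 apples"

# Left to right: the second expression sees the first one's output.
forward=$(echo "$text" | sed -e "s/[0-9][0-9] /way too many /g" -e "s/way too many/a few/g")
echo "$forward"
# → Sed had a few apples

# Reversed: "way too many" doesn't exist yet when the first expression runs.
reversed=$(echo "$text" | sed -e "s/way too many/a few/g" -e "s/[0-9][0-9] /way too many /g")
echo "$reversed"
# → Sed had way too many apples

# The pattern matches two digits plus a space anywhere, even inside "100 ".
mangled=$(echo "1 idea, 100 ideas" | sed -e "s/[0-9][0-9] /way too many /g")
echo "$mangled"
# → 1 idea, 1way too many ideas
```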

For a more useful example, but maybe even harder to understand:

sed -e 's/[[:blank:]]*$//' ./somefile.source > somefile.clean.source

This example tells sed to remove trailing spaces or tabs at the end of lines in “somefile.source” and then store the results in “somefile.clean.source”. The [[:blank:]] class matches a single space or tab, the * stretches that to the whole run of them, and the $ pins the match to the end of the line. Hopefully, you can now understand why I call these cursed tools. The syntax to use them has been known to summon Cthulhu.
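Trailing whitespace is invisible, which makes it hard to tell whether a cleanup worked. Here is one way to check it on a scratch file, assuming a sed and grep that understand the POSIX [[:blank:]] class (GNU and BSD versions both do). The file contents are made up.

```shell
# Three made-up lines: one ends in spaces, one in a tab and a space, one is clean.
dirty=$(mktemp)
printf 'first line   \nsecond line\t \nthird line\n' > "$dirty"

# [[:blank:]] is a space or tab, * takes the whole run, $ pins it to end of line.
sed -e 's/[[:blank:]]*$//' "$dirty" > "$dirty.clean"

# Count lines that end in a blank, before and after.
# grep exits non-zero when nothing matches, hence the "|| true".
before=$(grep -c '[[:blank:]]$' "$dirty" || true)
after=$(grep -c '[[:blank:]]$' "$dirty.clean" || true)
echo "lines with trailing whitespace: before=$before after=$after"

rm "$dirty" "$dirty.clean"
```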

Awk requires alcohol.

No joke, I knew a professor at NKU who poured himself a tall rye, lamenting “I can only talk awk when I’m drunk”. Personally, I think he was driving at the fact our profession requires a certain level of “hold my beer”. Awk is actually its own language. If you enjoy eating raw ghost peppers, this might caress the iceberg: https://github.com/iadnah/gawk-stealth/blob/master/rgawkcmd-0.2.gawk. If you want to go deeper, look at the source code for “Durandal’s Backdoor”.

On a more practical level, we can think of awk like a more intelligent grep that has some control over output, or maybe even as an output formatting tool.

ps aux | grep nano | awk '{print $2}'

The above example would list all running processes with some added info, pass that to grep to select only lines containing the string “nano”, and then tell awk to only output the 2nd column of those lines, which in ps aux output is the process ID (PID).
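Real ps output changes every time you run it, so this sketch feeds awk a canned, trimmed-down imitation instead. The users, PIDs, and column layout are invented, and real ps aux has more columns. It shows the grep plus awk pipeline, then awk doing the matching on its own, then awk as a formatting tool.

```shell
# Invented, shortened ps-aux-style output so the result is predictable.
ps_sample='USER       PID %CPU %MEM COMMAND
jason     4021  0.0  0.1 nano notes.txt
root       812  0.0  0.2 sshd
jason     4100  0.3  0.1 nano todo.txt'

# grep + awk, same shape as the pipeline in the text.
pids=$(printf '%s\n' "$ps_sample" | grep nano | awk '{print $2}')

# awk can match by itself: /nano/ picks the lines, the braces say what to print.
pids_awk_only=$(printf '%s\n' "$ps_sample" | awk '/nano/ {print $2}')

# As a formatter: choose columns and lay them out with printf.
report=$(printf '%s\n' "$ps_sample" | awk '/nano/ {printf "%s runs pid %s\n", $1, $2}')

printf '%s\n' "$pids"
printf '%s\n' "$report"
```

On a live system the grep process itself often shows up in the list, because its own command line contains “nano”. The canned input avoids that here.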

This article will most likely be updated in the future, but I’m publishing it now because it at least contains some value (I think).
