ZNHOO Whatever you are, be a good one!

sed awk

These tools do a big favor when editing streams. Both programs use regular expressions for selecting and processing text.

For instance, a configuration file contains lots of comment lines starting with # and blank lines. When applying the configuration to systems, you might need to remove those lines for a clean and tidy configuration file.

sed '/^ *#/d' nginx_with_comments.conf | sed '/^$/d' > nginx.conf

Use sed(stream editor, take pipes as input) command to delete d specific pattern lines. The 1st part removes comment lines while the 2nd remove blank lines.

awk considers each file as a set of records, which by default are the lines in the file. "awk" enables you to create a condition and action pair, and for each record that matches the condition, the action will fire. Most "awk" commands follow this pattern – you have a condition, followed by an action in curly braces.

ls -al ~/ | awk '/zachary/{print $1, $9}'

This command will first file lines contain zachary and then print the 1st and 8th items/columns - to print permissions and filenames. awk is really useful when dealing with columns. You can use other conditions; for example, checking whether or not a certain field is above or below a given threshold (i.e. $5 > 2), or checking whether a record has a certain column.

ls -al | awk '$9 ~/^\./ {gsub(/zachary/, "awkSub"); print;}'

akw check wheter the 9th item (filename) starts wity dot. Then use gsub (global substitute) to substitue every occurence of zachary with awkSub.

awk VS sed

Both are tools that transform text. BUT awk can do more things besides just manipulating text. Its a programming language by itself with most of the things you learn in programming, like arrays, loops, if/else flow control etc You can "program" in sed as well, but you won't want to maintain the code written in it.

sed is a stream editor. It works with streams of characters on a per-line basis. It has a primitive programming language that includes goto-style loops and simple conditionals (in addition to pattern matching and address matching). There are essentially only two "variables": pattern space and hold space. Readability of scripts can be difficult. Mathematical operations are extraordinarily awkward at best.

awk is oriented toward delimited fields on a per-line basis. It has much more robust programming constructs including if/else, while, do/while and for (C-style and array iteration). There is complete support for variables and single-dimension associative arrays plus (IMO) kludgey multi-dimension arrays. Mathematical operations resemble those in C. It has printf and functions. The "K" in "AWK" stands for "Kernighan" as in "Kernighan and Ritchie" of the book "C Programming Language" fame (not to forget Aho and Weinberger). One could conceivably write a detector of academic plagiarism using awk.

I would tend to use sed where there are patterns in the text. For example, you could replace all the negative numbers in some text that are in the form "minus-sign followed by a sequence of digits" (e.g. "-231.45") with the "accountant's brackets" form (e.g. "(231.45)") using this (which has room for improvement):

echo "-123.456 -345.678" sed 's/-\([0-9.]\+\)/(\1)/g'

I would use awk when the text looks more like rows and columns or, as awk refers to them "records" and "fields" - matrix. Refer to the example above.

Use sed for very simple text parsing. Anything beyond that, awk is better. In fact, you can ditch sed altogether and just use awk. Since their functions overlap and awk can do more, just use awk. You will reduce your learning curve as well.

总之,sed能实现的awk也能实现;简单的模式匹配用awk;有矩阵行列特征用awk。