Code on screen - featured

When you run a command or a script that runs for a long time, it may be helpful to have all its output prefixed with a timestamp. For example, we want to change this:

foo
bar
baz

to this:

[2020-12-13 12:20:38] foo
[2020-12-13 12:21:32] bar
[2020-12-13 12:22:20] baz

Here’s how you can do it.

Piping to Awk

The key is to pipe the output to another tool that reads everything from standard input, and prints the same thing but with a prefix added.

What tool do we use? Awk is a perfect candidate because it’s installed pretty much everywhere. Awk is a tool for processing streams of texts.

Usage in Single Commands

Here’s an example involving a single command:

command | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'

However, note that the above only adds timestamp prefixes to text written to standard output. If you want text written to standard error to be timestamp-prefixed too, then make sure you redirect standard error to standard output, like this:

command 2>&1 | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'

The magic part is 2>&1. It means: redirect file descriptor number 2 (standard error) to file descriptor number 1 (standard output).

Usage in Bash Scripts

Bash scripts usually invoke multiple commands. You don’t want to manually pipe every command to awk: the script would turn awkward pretty quickly. ?

#!/bin/bash
set -e
set -o pipefail

# Yeah, no...
apt-get update 2>&1 | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
apt-get install -y foo 2>&1 | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
systemctl restart foo.service 2>&1 | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'

Can we do better? Yes: you can tell Bash that all of its own output (which includes all output by commands run from Bash) should be redirected to another command:

exec > >(awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }') 2>&1

What bizarre magic is this?!?!

  • A naked exec statement, without a further command given, means: we don’t want to actually execute anything, we only want to manipulate some file descriptors (pipes) in the current Bash process.
  • > means: redirect standard output (which now includes standard error) to the given file.
  • >(some command) is a process substitution. It means: create a new pipe, then run some command with its standard input connected to that pipe, then return a filename to that pipe.So > >(some command) means: redirect standard output to the pipe created by the process substitution.
  • 2>&1 means: in the current Bash process, redirect standard error to standard output.

The pipe created by the process substituion is temporary, and is closed as soon as that line ends. Here you can see it in action:

$ echo The named pipe filename is >(true)
The named pipe filename is /dev/fd/63

But if you try to access it immediately thereafter, you will see that it’s already gone:

$ ls /dev/fd/63
ls: /dev/fd/63: Bad file descriptor

So putting it all together, the awkward script above becomes:

#!/bin/bash
set -e

exec > >(awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }') 2>&1

apt-get update
apt-get install -y foo
systemctl restart foo.service

Caveat: this is Bash, not sh

sh is not Bash. sh is a shell whose history is older than Bash. Although /bin/sh may point to the same executable as Bash, when Bash is run as sh it will execute in a sort of “legacy mode” and disable some features. So if your shell scripts use the trick described in this blog post, make sure your shebang line is #!/bin/bash and not #!/bin/sh.

Caveat: Block Buffering

Commands behave a little bit differently if their output is redirected to a pipe.

Normally, commands use line buffering, which means that every time they’ve printed a complete line, that line will be written to the terminal.

But when the output is redirected to something which is not a terminal, then commands will use block buffering, which means that they don’t actually write out what they intend to print; until a threshold as been reached, or until the command exits. This threshold is around 16 KB. You can see block buffering in action here:

python -c 'import time; print("x" * (1024 * 15)); time.sleep(10)' | cat

This Python command prints 15 KB of data to standard output, then sleeps for 10 seconds. The output is piped to cat. Note that during the 10-second sleep, nothing is printed. Only after the 10 seconds does it print something.

If you increase 15 to 16, then you’ll see the output immediately.

If this behavior is problematic for you, then there are various ways to force line buffering:

  • On Linux and FreeBSD, you can use stdbuf. Prefix it to any command for which you want to force line buffering:stdbuf -oL python -c 'import time; print("x" * (1024 * 15)); time.sleep(10)' | cat
  • On macOS, you can install stdbuf from Homebrew. It’s part of coreutils.brew install coreutils /usr/local/opt/coreutils/libexec/gnubin/stdbuf -oL python -c 'import time; print("x" * (1024 * 15)); time.sleep(10)' | cat
  • For Python scripts, you can set the environment variable PYTHONUNBUFFERED=1, which tells Python to disable any sort of output buffering entirely.

Caveats of, Alternatives to Awk

That there are multiple implementations of Awk, with different features. On Linux, you would usually be using the GNU implementation. On macOS, you would be using a BSD implementation. If you intent to write shell scripts that run on more than just one OS, then be sure to test things on all platforms you intend to support.

In fact, the Awk implementation on macOS does not support the strftime function!

$ echo hi | awk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
/usr/bin/awk: calling undefined function strftime
 input record number 1, file
 source line number 1

You could use Perl as an alternative. Perl is also widely installed.

$ echo hi | perl -pe 'use POSIX strftime; print strftime "[%Y-%m-%d %H:%M:%S%z] ", localtime; STDOUT->flush()'
[2020-04-27 15:13:33+0200] hi

Or you could explicitly install the GNU implementation and use that. The GNU implementation is also called gawk:

$ brew install gawk
$ echo hi | gawk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'
[2020-04-27 15:15:51] hi

More About Awk

Awk is surprisingly powerful, yet concise! Here’s an Awk tutorial. For more in-depth information, check the GNU Awk User’s Guide.

Good luck, and don’t shy away from being awkward.

Hongli is a software engineer & consultant that solves complex problems beyond tech.
May 07, 2024 | BLOG | 6 MINUTES

8 questions you were afraid to ask about Talos answerd

Talos is a minimal Kubernetes OS that's quickly gaining popularity because of its ease of use and strong focus on security by default. It has already been …

April 30, 2024 | BLOG | 9 MINUTES

12 Factor: 13 years later

How can we make applications easy to operate? The 12-factor methodology is about 13 years old. How did it age in the cloud-native era? Do we need a 13th …

April 25, 2024 | BLOG | 5 MINUTES

Build your own Python Kubernetes Operator

Yes, you read it right – build a K8s operator in Python! I often get reactions like, "But doesn't it have to be in Golang?" Fortunately, that's not …