From 33582e0b96985202cc1bf0c380247747af529c4a Mon Sep 17 00:00:00 2001
From: Boris Kolpackov
Date: Thu, 14 Jan 2021 09:30:09 +0200
Subject: Update Bash guide with new pipefail recommendations

---
 doc/bash-style.cli | 161 +++++++++++++++++++++++++++++------------------------
 1 file changed, 89 insertions(+), 72 deletions(-)

diff --git a/doc/bash-style.cli b/doc/bash-style.cli
index a76053a..c3e79d4 100644
--- a/doc/bash-style.cli
+++ b/doc/bash-style.cli
@@ -15,7 +15,7 @@
 \h1#intro|Introduction|
 
 Bash works best for simple tasks. Needing arrays, arithmetic, and so on, is
-usually a good indication that the task at hand is too complex for Bash.
+usually a good indication that the task at hand may be too complex for Bash.
 
 Most of the below rules can be broken if there is a good reason for it.
 Besides making things consistent, rules free you from having to stop and think
@@ -31,8 +31,11 @@ the former provides a lot more rationale compared to this guide.
 \h1#style|Style|
 
 Don't use any extensions for your scripts. That is, call it just \c{foo}
-rather than \c{foo.sh} or \c{foo.bash}. Use lower-case letters and dash
-to separate words, for example \c{foo-bar}.
+rather than \c{foo.sh} or \c{foo.bash} (though we do use the \c{.bash}
+extension for
+\l{https://build2.org/build2/doc/build2-build-system-manual.xhtml#module-bash
+Bash modules}). Use lower-case letters and dash to separate words, for example
+\c{foo-bar}.
 
 Indentation is two spaces (not tabs). Maximum line length is 79 characters
 (excluding newline). Use blank lines between logical blocks to improve
@@ -46,13 +49,16 @@ is written on the same line after a semicolon, for example:
 
 \
 if [ ... ]; then
+  ...
 fi
 
 for x in ...; do
+  ...
 done
 \
 
-Do use \c{elif} instead of nested \c{else} and \c{if}.
+Do use \c{elif} instead of nested \c{else} and \c{if} (and consider if
+\c{case} can be used instead).
 
 For \c{if} use \c{[ ]} for basic tests and \c{[[ ]]} if the previous form is
 not sufficient or hairy. In particular, \c{[[ ]]} results in cleaner code
@@ -82,7 +88,8 @@ usage=\"usage: $0 \"
 owd=\"$(pwd)\"
 trap \"{ cd '$owd'; exit 1; }\" ERR
 set -o errtrace # Trap in functions and subshells.
-shopt -s lastpipe # Execute last pipeline command in current shell.
+set -o pipefail # Fail if any pipeline command fails.
+shopt -s lastpipe # Execute last pipeline command in the current shell.
 
 function info () { echo \"$*\" 1>&2; }
 function error () { info \"$*\"; exit 1; }
@@ -416,8 +423,8 @@ function dist()
 A function can return data in two primary ways: exit code and stdout.
 Normally, exit code 0 means success and exit code 1 means failure though
 additional codes can be used to distinguish between different kinds of
-failures, signify special conditions, etc., see \l{#error-handing Error
-Handling} for details.
+failures (for example, \"hard\" and \"soft\" failures), signify special
+conditions, etc., see \l{#error-handing Error Handling} for details.
 
 A function can also write to stdout with the result available to the caller in
 the same way as from programs (command substitution, pipeline, etc). If a
@@ -426,28 +433,14 @@ with newlines with the caller using the \c{readarray} builtin to read them
 into an indexed array, for example:
 
 \
-function foo ()
+function func ()
 {
   echo one
   echo two
   echo three
 }
 
-foo | readarray -t r
-\
-
-In this case, if the function can fail, then the failure should be explicitly
-checked for (either by examining \c{PIPESTATUS} or via the lack of the
-result), since the \c{ERR} trap will not be triggered (unless the \c{pipefail}
-shell option is set; see \l{#error-handing Error Handling} for details). For
-example:
-
-\
-foo | readarray -t r
-
-if [ \"${PIPESTATUS[0]}\" -ne 0 ]; then
-  exit 1
-fi
+func | readarray -t r
 \
 
 \N|The use of the newline as a separator means that values may not contain
@@ -455,23 +448,19 @@ newlines. While \c{readarray} supports specifying a custom separator with the
 \c{-d} option, including a \c{NUL} separator, this support is only available
 since Bash 4.4.|
 
-This technique can also be extended to return an associative array by
+This technique can also be extended to return an associative array by
 first returning the values as an indexed array and then converting them to an
 associative array with \c{eval}, for example:
 
 \
-function foo ()
+function func ()
 {
   echo \"[a]=one\"
   echo \"[b]=two\"
   echo \"[c]=three\"
 }
 
-foo | readarray -t ia
-
-if [ \"${PIPESTATUS[0]}\" -ne 0 ]; then
-  exit 1
-fi
+func | readarray -t ia
 
 eval declare -A aa=(\"${ia[@]}\")
 \
@@ -480,7 +469,7 @@ Note that if a key or a value contains whitespaces, then it must be quoted.
 The recommendation is to always quote both, for example:
 
 \
-function foo ()
+function func ()
 {
   echo \"['a']='one ONE'\"
   echo \"['b']='two'\"
@@ -491,7 +480,7 @@ function foo ()
 Or, if returning a local array:
 
 \
-function foo ()
+function func ()
 {
   declare -A a=([a]='one ONE' [b]=two [c]=three)
 
@@ -508,30 +497,52 @@ For more information on returning data from functions, see
 \h1#error-handing|Error Handling|
 
 Our scripts use the \c{ERR} trap to automatically terminate the script in case
-any command fails. This is also propagated to functions and subshells by
-specifying the \c{errtrace} shell option.
-
-\N|While the \c{pipefail} and \c{nounset} options may also seem like a good
-idea, they have subtle, often latent pitfalls that make them more trouble than
-they are worth (see \l{https://mywiki.wooledge.org/BashPitfalls#pipefail
-\c{pipefail} pitfalls}, \l{https://mywiki.wooledge.org/BashPitfalls#nounset
-\c{nounset} pitfalls}).
-
-In particular, without \c{pipefail}, a non-zero exit of any command in the
-pipeline except the last is ignored. As a result, the pipeline needs to be
-designed to work correctly in such cases, normally by relying on the input (or
-lack thereof) to the last command to convey the failure. Alternatively, the
-exit status of the pipeline commands can be explicitly checked using the
-\c{PIPESTATUS} array.|
+any command fails. This semantics is also propagated to functions and subshells
+by specifying the \c{errtrace} shell option and to all the commands of a
+pipeline by specifying the \c{pipefail} option.
+
+\N|Without \c{pipefail}, a non-zero exit of any command in the pipeline except
+the last is ignored. The \c{pipefail} shell option is inherited by functions
+and subshells.|
+
+\N|While the \c{nounset} option may also seem like a good idea, it has
+subtle, often latent pitfalls that make it more trouble than it's worth (see
+\l{https://mywiki.wooledge.org/BashPitfalls#nounset \c{nounset} pitfalls}).|
+
+The \c{pipefail} semantics is not without pitfalls which should be kept in
+mind. In particular, if a command in a pipeline exits before reading the
+preceding command's output in its entirety, such a command may exit with a
+non-zero exit status (see \l{https://mywiki.wooledge.org/BashPitfalls#pipefail
+\c{pipefail} pitfalls} for details).
+
+\N|Note that in such a situation the preceding command may exit with zero
+status not only because it gracefully handled \c{SIGPIPE} but also because all
+of its output happened to fit into the pipe buffer.|
+
+For example, these are the two common pipelines that may exhibit this issue:
+
+\
+prog | head -n 1
+prog | grep -q foo
+\
+
+In these two cases, the simplest (though not the most efficient) way to work
+around this issue is to reimplement \c{head} with \c{sed} and to get rid of
+\c{-q} in \c{grep}, for example:
+
+\
+prog | sed -n -e '1p'
+prog | grep foo >/dev/null
+\
 
 If you need to check the exit status of a command, use \c{if}, for example:
 
 \
-if grep \"foo\" /tmp/bar; then
+if grep -q \"foo\" /tmp/bar; then
   info \"found\"
 fi
 
-if ! grep \"foo\" /tmp/bar; then
+if ! grep -q \"foo\" /tmp/bar; then
   info \"not found\"
 fi
 \
@@ -579,41 +590,41 @@ even if the \c{cd} command has failed.
 
 Note, however, that notwithstanding the above statement from the Bash manual,
 the \c{ERR} trap is executed inside all the subshell commands of a pipeline
-provided the \c{errtrace} option is specified. As a result, the above code
-can be made to work using the pipe trick:
+provided the \c{errtrace} option is specified. As a result, the above code can
+be made to work by temporarily disabling \c{pipefail} and reimplementing it as
+a pipeline:
 
 \
+set +o pipefail
 cleanup /no/such/dir | cat
+r=\"${PIPESTATUS[0]}\"
+set -o pipefail
 
-if [ \"${PIPESTATUS[0]}\" -ne 0 ]; then
+if [ \"$r\" -ne 0 ]; then
   ...
 fi
 \
 
-\N|If \c{cleanup}'s \c{cd} fails, the \c{ERR} trap will be executed in the
-+subshell, causing it to exit with an error status which the parent shell then
-+makes available in \c{PIPESTATUS}.
-
-If the \c{pipefail} shell option is set, then the explicit \c{PIPESTATUS}
-check is not necessary since the function failure will trigger the \c{ERR}
-trap in the current shell.|
+\N|Here, if \c{cleanup}'s \c{cd} fails, the \c{ERR} trap will be executed in
+the subshell, causing it to exit with an error status, which the parent shell
+then makes available in \c{PIPESTATUS}.|
 
 The recommendation is then to avoid calling functions in contexts where the
-\c{ERR} trap is ignored resorting to the pipe trick where that's not possible.
-And to be mindful of the potential ambiguity between the true/false result and
-failure for other commands. The use of the \c{&&} and \c{||} command
-expressions is best left to the interactive shell.
+\c{ERR} trap is ignored resorting to the above pipe trick where that's not
+possible. And to be mindful of the potential ambiguity between the true/false
+result and failure for other commands. The use of the \c{&&} and \c{||}
+command expressions is best left to the interactive shell.
 
 \N|The pipe trick cannot be used if the function needs to modify the global
-state. Such a function, however, can return the exit status also as part of
-the global state. The pipe trick can also be used to ignore the exit status
-of a command (provided \c{pipefail} is not set).|
+state. Such a function, however, might as well return the exit status also as
+part of the global state. The pipe trick can also be used to ignore the exit
+status of a command.|
 
 The pipe trick can also be used to distinguish between different exit codes,
 for example:
 
 \
-function foo()
+function func()
 {
   bar # If this command fails, the function returns 1.
 
@@ -622,9 +633,12 @@ function foo()
   fi
 }
 
-foo | cat
+set +o pipefail
+func | cat
+r=\"${PIPESTATUS[0]}\"
+set -o pipefail
 
-case \"${PIPESTATUS[0]}\" in
+case \"$r\" in
   0)
     ;;
   1)
@@ -643,7 +657,7 @@ This technique can be further extended to implement functions that both
 return multiple exit codes and produce output, for example:
 
 \
-function foo()
+function func()
 {
   bar # If this command fails, the function returns 1.
 
@@ -654,9 +668,12 @@ function foo()
   echo result
 }
 
-foo | readarray -t r
+set +o pipefail
+func | readarray -t r
+r=\"${PIPESTATUS[0]}\"
+set -o pipefail
 
-case \"${PIPESTATUS[0]}\" in
+case \"$r\" in
   0)
     echo \"${r[0]}\"
     ;;
--
cgit v1.1
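The last pattern in the patch (a function that both produces output and returns one of several exit codes) can be tried in isolation with the minimal sketch below. It is an illustration under stated assumptions, not part of the guide: the names `func`, `out`, and `status` are hypothetical, the function always reports a made-up "soft" failure code 2, and the guide's `ERR` trap and `errtrace` setup are left out for brevity. The sketch stores the exit status in a variable distinct from the output array because, in Bash, assigning a plain string to the name of an indexed array overwrites its element 0.

```shell
#!/usr/bin/env bash

# Hypothetical names throughout: func, out, and status are illustrative only.
shopt -s lastpipe # Run the last pipeline command in the current shell.
set -o pipefail   # Pipeline fails if any of its commands fails.

# Write one line of output, then report an assumed "soft" failure (code 2).
function func ()
{
  echo result
  return 2
}

# Temporarily disable pipefail so that the function's exit status can be
# examined via PIPESTATUS rather than failing the pipeline as a whole.
set +o pipefail
func | readarray -t out
status="${PIPESTATUS[0]}" # Must be saved immediately after the pipeline.
set -o pipefail

case "$status" in
  0)
    echo "success: ${out[0]}"
    ;;
  2)
    echo "soft failure, output so far: ${out[0]}"
    ;;
  *)
    echo "hard failure" 1>&2
    ;;
esac
```

Note that `lastpipe` only takes effect when job control is inactive, which is normally the case in a non-interactive script; in an interactive shell `readarray` would run in a subshell and `out` would stay unset.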