Categories
bash

Process substitutions are pipes, NOT files

(Originally published on Reddit, archived here in lightly edited form for posterity.)

A recent question led to a suggestion to use process substitution to emulate the effects of a proper shebang invocation, but the OP eventually discovered that 8 characters of padding had to be added to the substitution’s output to get it working.

Here’s why: SBCL seems to scan the first 8 characters in an input script file to determine what sort of file it is, then rewinds to the beginning and parses the file anew.

But pipes are not seekable, so with a process substitution, this approach fails miserably: SBCL fails to rewind, so it starts parsing from the 9th character onwards, leading to very odd errors.
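The effect can be imitated in bash itself. This is only a rough sketch, not what SBCL actually does: peek_then_reread is a made-up stand-in that "sniffs" 8 bytes and then reopens its input instead of seeking, and it assumes a Linux-style bash where process substitutions show up as /dev/fd/N.

```bash
#!/usr/bin/env bash
# peek_then_reread FILE: hypothetical stand-in for a program that inspects
# the first 8 bytes of its input and then starts over from the beginning.
peek_then_reread() {
  local magic
  IFS= read -r -N 8 magic < "$1"   # "sniff" the first 8 bytes
  cat "$1"                         # "rewind" by reopening, then read it all
}

tmp=$(mktemp)
printf '0123456789abcdef\n' > "$tmp"

peek_then_reread "$tmp"                           # regular file: whole content
peek_then_reread <(printf '0123456789abcdef\n')   # pipe: the sniffed bytes are gone

# The substitution really is a pipe as far as the -p file test is concerned
[[ -p <(:) ]] && echo "process substitution is a pipe"

rm -f -- "$tmp"
```

On a regular file the second read starts at the beginning again; on the pipe, whatever was consumed by the first read is simply gone, which is the same symptom the OP ran into.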

In addition, there’s one broad class of process substitutions that can be expected to fail: those that feed ZIP/JAR archives to a command, e.g. streamed via curl from a remote server. The problem stems from the ZIP file structure: the central directory, which serves as the archive’s index, sits at the end of the file, so commands expect to seek backwards through an archive. That’s just not gonna happen on a pipe.

So the next time you think “hey, I can use a process substitution for this”, and find that your chosen command chokes on it, but works just fine on a file with the exact same contents, it’s almost certainly seeking in its input, or doing something else that works with files but not with pipes.
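When that happens, the usual way out is to spool the data into a real file first. Here is a minimal sketch of that idea; run_on_tempfile is a hypothetical helper name, not something from the original discussion.

```bash
#!/usr/bin/env bash
# run_on_tempfile CMD [ARGS...]   (hypothetical helper)
# Spools stdin into a temporary file, then runs: CMD ARGS... <tempfile>
run_on_tempfile() {
  local tmp status
  tmp=$(mktemp) || return
  cat > "$tmp"          # drain the pipe into a real, seekable file
  "$@" "$tmp"
  status=$?
  rm -f -- "$tmp"       # clean up, but keep the command's exit status
  return "$status"
}

# tail -c on a regular file is free to seek straight to the end
printf '0123456789' | run_on_tempfile tail -c 3; echo
```

For the ZIP case, something along the lines of curl -fsSL "$url" | run_on_tempfile unzip -l should then behave like it does on a downloaded file, with $url standing in for whatever archive you are fetching.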


Shifting bash arrays

(Originally published on Reddit, archived here in lightly edited form for posterity.)

Someone just asked me how to shift a bash array. He knew how to shift positional parameters (the familiar shift N command), but when he tried:

shift arr N

he of course got the error: bash: shift: arr: numeric argument required

After reminding him to RTFbashM, I gave him the magic incantation:

arr=("${arr[@]:N}")

bash helpfully extends the ${parameter:offset:length} substring expansion syntax to array slicing as well, so the above simply says “take the contents of arr from index N onwards, create a new array out of it, then overwrite arr with this new value”.
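A quick sanity check of the incantation, as a minimal sketch (the array contents and N=2 are made up for illustration):

```bash
#!/usr/bin/env bash
arr=(one two three "four five" six)
N=2

# Keep everything from index N onwards; the quotes keep "four five" in one piece
arr=("${arr[@]:N}")

echo "${#arr[@]}"   # 3
echo "${arr[0]}"    # three
echo "${arr[1]}"    # four five
```

Note that the new array is indexed from 0 again, just as the positional parameters are renumbered after a shift.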

But is it significant that I use @ instead of * as the array index, and are those double quotes really necessary?

YES. Here’s why:

a=(This are a "test with spaces")  # grammatically incorrect to avoid later confusion

echo ${a[0]:2}    => "is"  # substring of a[0]
echo ${a:2}       => "is"  # $a is shorthand for ${a[0]}, so we get the same result
echo ${a[1]:2}    => "e"   # substring of a[1]

echo ${a[@]:2}    => "a" "test" "with" "spaces"  # elements of a[2:end], each then word-split
echo ${a[*]:2}    => "a" "test" "with" "spaces"  # ditto, but...
echo "${a[@]:2}"  => "a" "test with spaces"      # individual elements of a[2:end]
echo "${a[*]:2}"  => "a test with spaces"        # a[2:end] as a concatenated string

And here’s what happens when we try to assign the above array slice attempts to actual arrays:

$ b1=(${a[@]:2})            # expand as individual elements, bash then word-splits each one
$ c1=(${a[*]:2})            # ditto
$ b2=("${a[@]:2}")          # expand as individual elements, no bash word-splitting
$ c2=("${a[*]:2}")          # expand as single string, no bash word-splitting

$ declare -p b1 c1 b2 c2
declare -a b1=([0]="a" [1]="test" [2]="with" [3]="spaces")
declare -a c1=([0]="a" [1]="test" [2]="with" [3]="spaces")
declare -a b2=([0]="a" [1]="test with spaces")
declare -a c2=([0]="a test with spaces")
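Since echo hides where one word ends and the next begins, counting arguments makes the same point more directly. The sketch below uses a throwaway helper, count_args, invented for this illustration; it also shows that the "concatenated string" form joins elements with the first character of IFS, which is a space by default.

```bash
#!/usr/bin/env bash
a=(This are a "test with spaces")

# count_args is a hypothetical helper that reports how many words it received
count_args() { echo "$#"; }

count_args ${a[@]:2}     # 4
count_args "${a[@]:2}"   # 2
count_args "${a[*]:2}"   # 1

# "${a[*]}" joins elements with the first character of IFS
joined=$(IFS=,; echo "${a[*]:2}")
echo "$joined"           # a,test with spaces
```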

And, with the help of bash namerefs (available since bash 4.3), we can actually create a shift_array function that mimics the shift command for indexed arrays:

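# shift_array <arr_name> [<n>]
# Requires bash 4.3+ for namerefs. <arr_name> must not itself be "arr",
# otherwise the nameref below becomes circular.
shift_array() {
  # Create nameref to real array
  local -n arr="$1"
  # Number of elements to drop; defaults to 1, like shift
  local n="${2:-1}"
  # Rebuild the array from index n onwards
  arr=("${arr[@]:${n}}")
}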

$ a=(This is a "test with spaces")
$ shift_array a 2
$ declare -p a
declare -a a=([0]="a" [1]="test with spaces")
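If you want something a little closer to how shift itself behaves, here is one possible variant. Treat it as a sketch under my own assumptions rather than the published function: the name shift_array_checked and the internal nameref name are invented, the nameref is given an unlikely name so that an array called arr can be passed in, and the argument checks copy shift's habit of failing on a non-numeric or too-large count.

```bash
#!/usr/bin/env bash
# shift_array_checked <arr_name> [<n>]   (hypothetical variant, bash 4.3+)
shift_array_checked() {
  # Long, unlikely name so the nameref doesn't collide with the caller's array
  local -n __shift_array_ref="$1"
  local n="${2:-1}"

  # Like shift, insist on a non-negative integer
  if [[ ! $n =~ ^[0-9]+$ ]]; then
    echo "shift_array_checked: $n: numeric argument required" >&2
    return 1
  fi
  n=$((10#$n))   # force base 10 so e.g. "08" isn't parsed as octal

  # Like shift, fail and leave the array untouched if n is too large
  (( n <= ${#__shift_array_ref[@]} )) || return 1

  __shift_array_ref=("${__shift_array_ref[@]:n}")
}

arr=(This is a "test with spaces")
shift_array_checked arr 2
declare -p arr   # should now hold two elements: "a" and "test with spaces"
```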

How to use the length parameter as a substring/slice length is left as an exercise for the reader.

NOTE: I’ve published shift_array in my bash_functions GitHub repo.