Some thoughts on a new shell grammar, because this is a convenient spot :)
Basics
- Parsing order is: word splitting, expansion left to right, redirect processing, execution.
- Variables are scoped; scopes nest.
- No implicit subshells. Probably requires MT impl, but that's OK.
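A rough model of "scopes nest", sketched in Python rather than in the shell itself. It assumes that assigning inside a block shadows the outer variable instead of overwriting it, which the notes above do not settle; the variable names are made up.

```python
from collections import ChainMap

# Hypothetical model: each scope is a dict, and lookups search innermost first.
global_scope = ChainMap({"foo": "outer"})

# Entering a block pushes a fresh innermost scope on top of the outer one.
block_scope = global_scope.new_child()
block_scope["foo"] = "inner"   # assumption: shadows the outer foo, does not overwrite it
block_scope["bar"] = "local"   # exists only inside the block

inner_foo = block_scope["foo"]            # "inner"
outer_foo = global_scope["foo"]           # "outer"
bar_visible_outside = "bar" in global_scope   # False
```

Leaving the block just means dropping block_scope; the outer foo is untouched.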
Word splitting
Words in command lines ("outer expressions") are:
- Barewords: command names, globs, switches, parameters, ...
- Literal globs (below)
- Quoted strings
- Outer operators: =, redirections
- Parenthesized expressions
- Variable references, e.g., $foo or @bar[0]
- Separators and delimiters: semi, braces, && || &, ternary ?? ::
- Keywords: for in do while until loop next function if elif end select case time
Command substitution
- Command substitution: $(outer_expression), because it's familiar
Inner expressions
Expressions in parens are "inner" expressions. They are where substitutions happen, except that variable expansion can also happen in outer expressions.
Operators are C, plus
- :- := :? :+ from bash - take action if the lhs is unset or empty.
- // //= //? //+ Analogous, but defined-or like Perl - take action if the lhs is unset
- ?: Elvis - test truth of the lhs.
- So $foo // "hi" is hi unless $foo is defined, and $foo ?: "Hi" is Hi if $foo is either undefined or empty, since empty strings are considered false.
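The difference between the three operator families is easy to mix up, so here is a small Python sketch of the string case only. UNSET and the function names are stand-ins invented for this illustration, and numeric truthiness is deliberately left out.

```python
UNSET = None  # stand-in for an undefined variable (assumption for this sketch)

def colon_dash(lhs, default):
    """Like :- from bash: use the default if lhs is unset or empty."""
    return default if lhs is UNSET or lhs == "" else lhs

def defined_or(lhs, default):
    """Like // : use the default only if lhs is unset."""
    return default if lhs is UNSET else lhs

def elvis(lhs, default):
    """Like ?: : use the default if lhs is false (unset or empty string here)."""
    truthy = lhs is not UNSET and lhs != ""
    return lhs if truthy else default
```

With an empty string on the left, defined_or("", "hi") keeps the empty string, while elvis("", "Hi") and colon_dash("", "Hi") both give Hi. For strings the :- family and ?: agree; they could only differ for types whose truth is not the same as "non-empty".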
- ~ regex match (or glob), or sub returning new value
- =~ regex sub in place
- -~ and -~= regex or glob substring removal - suffix (in place with =)
- ~- and =~- likewise, but prefixes.
- Pattern is always to the left of the string for prefix removal and to the right of the string for suffix removal. For replacements, anchor regexes or use qg.
(inner expression not starting with !)
Data types
- Scalars
- Text, number, file name, file descriptor, boolean, glob, regex
- Expansion within "" as usual, but the result becomes one word.
- No expansion in ''; only the escapes \' and \\ are recognized.
- qr// for regex literals
- qs/// for regex subs
- qg// for glob literals. In a qg, ^ and $ work at the start and end of globs.
- qt for transliteration: qt/// for specified charsets; qt^ and ^^ , ,, ~ ~~ for case changes
- Bools: true and false. True is 0; false is any nonzero, in both outer and inner expressions. Strings are true iff non-empty. Undefined vars are false in a bool context.
- Arrays - only numeric indices
- Hashes - only text indices. Numbers are converted to string keys per convfmt or something similar.
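To make "numbers are converted to string keys per convfmt" concrete, here is a Python sketch that borrows awk's rule: integral values print as integers, everything else goes through CONVFMT, whose awk default is %.6g. Whether this shell would copy awk exactly is still open, and to_key is a made-up name.

```python
CONVFMT = "%.6g"  # awk's default; assumed here, the notes only say "or something similar"

def to_key(index):
    """Turn a hash index into the string key it would be stored under."""
    if isinstance(index, str):
        return index                      # text indices are used as-is
    if isinstance(index, int) or float(index).is_integer():
        return str(int(index))            # integral values convert as integers
    return CONVFMT % index                # other numbers go through CONVFMT

same_key = to_key(0.1 + 0.2) == to_key(0.3)   # both become "0.3"
```

One consequence worth noticing: two numbers that differ only past six significant digits land on the same key.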
Sigils are $ for scalars, @ for arrays, and % for hashes. The sigil used is that of the variable, so a hash element is written %foo[bar]. Indexing is always []. Braces can be used after the sigil to disambiguate.
You can't have both $foo and %foo. Only one type per name!
Array or hash elements can be any scalar type; containers can include values of different types. TODO allow nested containers?
Sigils are used with var references for both lvalue and rvalue, so $foo=42, not foo=42.
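The "only one type per name" rule amounts to a symbol table keyed by the bare name, with the sigil recorded as the kind. A minimal Python sketch, with SymbolTable and declare as hypothetical names and TypeError as an arbitrary choice of error:

```python
SIGIL_KINDS = {"$": "scalar", "@": "array", "%": "hash"}

class SymbolTable:
    """Tracks which kind each bare name was first used as (illustrative only)."""

    def __init__(self):
        self._kinds = {}  # bare name -> "scalar" | "array" | "hash"

    def declare(self, ref):
        sigil, name = ref[0], ref[1:]
        kind = SIGIL_KINDS[sigil]
        # The first use fixes the kind; a later use with another sigil is an error.
        existing = self._kinds.setdefault(name, kind)
        if existing != kind:
            raise TypeError(f"{name} is already a {existing}; cannot use it as a {kind}")
        return kind
```

After declare("$foo"), a call to declare("%foo") raises, while declare("@bar") is fine because bar is a different name.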
Contexts
The result of a term depends on its context. E.g. echo $(foo) prints the standard output of foo, but if $(foo) tests whether foo succeeded.
Contexts are the same as the scalar types.
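Here is the echo $(foo) versus if $(foo) distinction modeled in Python. This is a sketch under assumptions: the context names "text" and "boolean" are taken from the scalar type list, stripping the trailing newline copies familiar shell behavior rather than anything decided here, and a small Python child process stands in for foo.

```python
import subprocess
import sys

def command_substitution(argv, context):
    """Evaluate a $(...)-style term in the given context (hypothetical helper)."""
    done = subprocess.run(argv, capture_output=True, text=True)
    if context == "text":
        return done.stdout.rstrip("\n")   # like: echo $(foo)
    if context == "boolean":
        return done.returncode == 0       # like: if $(foo)
    raise ValueError(f"unsupported context: {context}")

# Stand-in for foo: prints one line and exits successfully.
foo = [sys.executable, "-c", "print('hello')"]
as_text = command_substitution(foo, "text")
as_bool = command_substitution(foo, "boolean")
```

The same term yields the string hello in a text context and true in a boolean context; the other scalar types would each need their own branch.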
Casting/accessors
<selector>`<expr> returns the result of <expr> indicated by <selector>. Multiple selectors can be given, separated by `. Selectors are:
- ?: the exit status of a program
- A non-negative integer: that file descriptor.
To-do: specify whether you want the name or the pipe of a descriptor.
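A sketch of how selectors might pick parts out of a finished command, in Python. CommandResult and select are invented for illustration, and returning the picked values as a list in selector order is an assumption; the notes do not say what multiple selectors produce.

```python
from dataclasses import dataclass, field

@dataclass
class CommandResult:
    status: int                                   # exit status of the program
    streams: dict = field(default_factory=dict)   # descriptor number -> captured text

def select(selectors, result):
    """Apply a backtick-separated selector list, e.g. '?`2', to a result."""
    picked = []
    for sel in selectors.split("`"):
        if sel == "?":
            picked.append(result.status)
        elif sel.isascii() and sel.isdigit():
            picked.append(result.streams.get(int(sel), ""))
        else:
            raise ValueError(f"unknown selector: {sel!r}")
    return picked

result = CommandResult(status=1, streams={1: "", 2: "no such file\n"})
status_and_stderr = select("?`2", result)
```

Here status_and_stderr is [1, "no such file\n"]: the exit status followed by whatever was captured on descriptor 2.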
Redirection
<, >, |, &> work as usual. They are shorthand for a more general mechanism: [fetch] cmd [stash] |; [fetch] cmd [stash] ... |; ...
|; is a "Mack" because it can carry a whole lot of data and is larger than a regular semi ;) . A fetch or stash is an expression involving the -> operator.
Stashes are:
- ->"foo" or ->'foo': output to file foo. Quote processing is as usual. The quotes are required (to-do: relax this?). By default, stdout is saved.
- ->&bar: stash all selectors into special variable &bar. In this form, &bar only exists in the pipeline.
Either of these can be preceded by a selector. E.g., 2->"foo.txt" saves stderr to foo.txt.
- <selector>->$bat: save "selector" to variable $bat, which does last outside the pipeline. The selector is required.
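The three stash forms are regular enough to recognize with one pattern. Below is a Python sketch of such a recognizer. Everything about its shape is an assumption: parse_stash is a made-up name, the result is a (selector, kind, target) tuple, "all" is a placeholder for "every selector", and quote processing inside "" is not modeled.

```python
import re

STASH_RE = re.compile(
    r"""(?P<selector>\?|\d+)?       # optional selector: ? or a descriptor number
        ->
        (?: "(?P<dq>[^"]*)"          # double-quoted file name
          | '(?P<sq>[^']*)'          # single-quoted file name
          | &(?P<pipe>\w+)           # pipeline-only special variable
          | \$(?P<var>\w+)           # ordinary variable
        )$""",
    re.VERBOSE,
)

def parse_stash(text):
    """Split a stash expression into (selector, kind, target)."""
    m = STASH_RE.match(text)
    if m is None:
        raise ValueError(f"not a stash: {text!r}")
    selector = m.group("selector")
    if m.group("var") is not None:
        if selector is None:
            raise ValueError("a selector is required when stashing to a $variable")
        return (selector, "variable", m.group("var"))
    if m.group("pipe") is not None:
        return (selector or "all", "pipeline", m.group("pipe"))
    filename = m.group("dq") if m.group("dq") is not None else m.group("sq")
    return (selector or "1", "file", filename)   # stdout is saved by default
```

For example, parse_stash('2->"foo.txt"') gives ("2", "file", "foo.txt"), and parse_stash("->$bat") is rejected because the selector is missing.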