String Operations in Bash/Shell

Source 1)

Length Operator

There are two ways to get the length of a string. The simplest one is ${#varname}, which returns the length of the value of the variable as a character string. For example, if filename has the value fred.c, then ${#filename} would have the value 6.

The other operator (${#array[*]}) has to do with array variables.
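As an illustration of the two operators, the following sketch contrasts the string form with the array form. The array name files, its contents, and the helper functions array_count and string_length are made up for this example.

```shell
#!/bin/bash
# Hypothetical sample data, for illustration only.
files=(fred.c barney.h wilma.txt)

# Print the number of arguments, counted with the array form ${#name[*]}.
array_count() {
  local -a items=("$@")
  echo "${#items[*]}"
}

# Print the length in characters of a single string.
string_length() {
  local s=$1
  echo "${#s}"
}

array_count "${files[@]}"      # 3 (number of elements)
string_length "${files[0]}"    # 6 (length of "fred.c")
echo "${#files[1]}"            # 8 (length of "barney.h")
```

Note that ${#files} without a subscript gives the length of the first element, not the number of elements.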

String Length

${#string}  
  
 expr length $string 
  
 expr "$string" : '.*' 
 

stringZ=abcABC123ABCabc

echo ${#stringZ}                 # 15
echo `expr length $stringZ`      # 15
echo `expr "$stringZ" : '.*'`    # 15
Length of Matching Substring at Beginning of String
 expr match "$string" '$substring' 
$substring is a  regular expression.
 expr "$string" : '$substring' 
$substring is a regular expression.
 
stringZ=abcABC123ABCabc
#       |------|

echo `expr match "$stringZ" 'abc[A-Z]*.2'`   # 8
echo `expr "$stringZ" : 'abc[A-Z]*.2'`       # 8
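A rough Bash-only counterpart, assuming Bash 3 or later, can be sketched with the [[ =~ ]] operator and the BASH_REMATCH array. The function name match_length is hypothetical. One difference to keep in mind: expr uses basic regular expressions, while [[ =~ ]] uses extended ones, so the two only agree for patterns that mean the same in both dialects.

```shell
#!/bin/bash
# match_length STRING REGEX  (hypothetical helper)
# Prints the length of the match of REGEX anchored at the start of STRING,
# or 0 when there is no match, which mirrors what expr prints.
match_length() {
  local string=$1 regex=$2
  if [[ $string =~ ^($regex) ]]; then
    echo "${#BASH_REMATCH[1]}"
  else
    echo 0
  fi
}

match_length abcABC123ABCabc 'abc[A-Z]*.2'   # 8
match_length abcABC123ABCabc 'xyz'           # 0
```

This avoids starting a separate expr process for each test.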
Index
expr index $string $substring 

Numerical position in $string of first character in $substring that matches.

 
stringZ=abcABC123ABCabc
echo `expr index "$stringZ" C12`             # 6
                                             # C position.

echo `expr index "$stringZ" 1c`              # 3
# 'c' (in #3 position) matches before '1'.

This is the near equivalent of strchr() in C.

Substr

Shell uses zero-based indexing. When substring expansion of the form ${param:offset[:length]} is used, an `offset' that evaluates to a number less than zero counts back from the end of the expanded value of $param.

When a negative `offset' begins with a minus sign, however, unexpected things can happen. Consider

 a=12345678
 echo ${a:-4}

intending to print the last four characters of $a. The problem is that ${param:-word} already has a well-defined meaning: expand to word if the expanded value of param is unset or null, and $param otherwise.

To use negative offsets that begin with a minus sign, separate the minus sign and the colon with a space.
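The pitfall and both workarounds can be seen side by side in this short sketch. The helper last_chars is a hypothetical name used only here.

```shell
#!/bin/bash
a=12345678

echo "${a:-4}"      # 12345678 (default-value operator; a is set, so its value is used)
echo "${a: -4}"     # 5678     (space between colon and minus)
echo "${a:(-4)}"    # 5678     (parentheses work as well)
echo "${a: -4:2}"   # 56       (negative offset combined with a length)

# last_chars STRING N: print the last N characters of STRING.
last_chars() {
  local s=$1 n=$2
  echo "${s: -$n}"
}

last_chars "$a" 3   # 678
```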

${string:position} 

Extracts substring from $string at $position.

If the $string parameter is "*" or "@", then this extracts the positional parameters, [1] starting at $position.

${string:position:length}

Extracts $length characters of substring from $string at $position.

stringZ=abcABC123ABCabc
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                            # abcABC123ABCabc
echo ${stringZ:1}                            # bcABC123ABCabc
echo ${stringZ:7}                            # 23ABCabc

echo ${stringZ:7:3}                          # 23A

# Three characters of substring.

If the $string parameter is "*" or "@", then this extracts a maximum of $length positional parameters, starting at $position.

 
echo ${*:2}          # Echoes second and following positional parameters.
echo ${@:2}          # Same as above.

echo ${*:2:3}        # Echoes three positional parameters, starting at second.
expr substr $string $position $length 

Extracts $length characters from $string starting at $position.

stringZ=abcABC123ABCabc
#       123456789......
#       1-based indexing.

echo `expr substr $stringZ 1 2`              # ab
echo `expr substr $stringZ 4 3`              # ABC

Pattern Matching

There are two kinds of pattern matching available: matching from the left and matching from the right. The operators, with their functions and an example, are shown in the following table:

Operator  Function  Example

${foo#t*is}
Deletes the shortest possible match from the left.
Example:
export foo="this is a test"
echo ${foo#t*is}
is a test

${foo##t*is}
Deletes the longest possible match from the left.
Example:
export foo="this is a test"
echo ${foo##t*is}
a test

${foo%t*st}
Deletes the shortest possible match from the right.
Example:
export foo="this is a test"
echo ${foo%t*st}
this is a

${foo%%t*st}
Deletes the longest possible match from the right.
Example:
export foo="this is a test"
echo ${foo%%t*st}
(prints an empty line, because the whole string matches)
NOTE

While the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key on the keyboard and operates from the left. The % key is on the right of the $ key and operates from the right.

These operators can be used to do a variety of things. For example, the following script changes the extension of all .html files to .htm.

#!/bin/bash
# quickly convert html filenames for use on a dossy system
# only handles file extensions, not filenames

for i in *.html; do
  if [ -f "${i%l}" ]; then
    echo "${i%l} already exists"
  else
    mv "$i" "${i%l}"
  fi
done

Substitution

Another kind of variable mangling you might want to employ is substitution. There are four substitution operators in bash, as shown in the following table.

Operator  Function  Example

${foo:-bar}
If $foo exists and is not null, return $foo. If it doesn't exist or is null, return bar.
Example:
export foo=""
echo ${foo:-one}
one
echo $foo
(prints an empty line, because foo itself is unchanged)

${foo:=bar}
If $foo exists and is not null, return $foo. If it doesn't exist or is null, set $foo to bar and return bar.
Example:
export foo=""
echo ${foo:=one}
one
echo $foo
one

${foo:+bar}
If $foo exists and is not null, return bar. If it doesn't exist, or is null, return a null.
Example:
export foo="this is a test"
echo ${foo:+bar}
bar

${foo:?"error message"}
If $foo exists and is not null, return its value. If it doesn't exist or is null, print the error message. If no error message is given, print parameter null or not set. Note: In a non-interactive shell, this will abort the current script. In an interactive shell, this will just print the error message.
Example:
export foo="one"
for i in foo bar baz; do
  eval echo \${$i:?}
done
one
bash: bar: parameter null or not set
bash: baz: parameter null or not set
NOTE

The colon (:) in the above operators can be omitted. Doing so changes the behavior of the operator to test only for existence of the variable. This will cause the creation of a variable in the case of ${foo=bar}.

These operators can be used in a variety of ways. A good example would be to give a default value to a variable normally read from the command-line arguments, when no such arguments are given. This is shown in the following script:

#!/bin/bash

export INFILE=${1-"infile"}

export OUTFILE=${2-"outfile"}

cat $INFILE $OUTFILE
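The effect of leaving out the colon is easiest to see when an argument is present but empty. The two helper functions below are hypothetical and exist only to make that comparison.

```shell
#!/bin/bash
# With the colon: an unset OR empty first argument yields the default.
with_colon() {
  echo "${1:-default}"
}

# Without the colon: only an unset first argument yields the default.
without_colon() {
  echo "${1-default}"
}

with_colon            # default  (argument missing)
without_colon         # default  (argument missing)
with_colon ""         # default  (empty counts as missing)
without_colon ""      # prints an empty line (empty counts as present)
with_colon value      # value
```

For a script like the one above, this means that calling it with an explicitly empty first argument leaves INFILE empty rather than falling back to "infile".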

Hopefully, this gives you something to think about and to play with until the next article. If you're interested in more hints about bash (or other stuff I've written about), please take a look at my home page. If you've got questions or comments, please drop me a line.

KSH Pattern-matching Operators

Korn shell's pattern-matching operators.

Operator  Meaning

${variable#pattern}
If the pattern matches the beginning of the variable's value, delete the shortest part that matches and return the rest.
${variable##pattern}
If the pattern matches the beginning of the variable's value, delete the longest part that matches and return the rest.
${variable%pattern}
If the pattern matches the end of the variable's value, delete the shortest part that matches and return the rest.
${variable%%pattern}
If the pattern matches the end of the variable's value, delete the longest part that matches and return the rest.
These can be hard to remember, so here's a handy mnemonic device: # matches the front because number signs precede numbers; % matches the rear because percent signs follow numbers.

The expression ${DIRSTACK:-$PWD} evaluates to $DIRSTACK if it is non-null or $PWD (the current directory) if it is null.

Table 4.1: Substitution Operators

Operator  Substitution

${varname:-word}
If varname exists and isn't null, return its value; otherwise return word.
Purpose:
Returning a default value if the variable is undefined.
Example:
${count:-0} evaluates to 0 if count is undefined.
 ${varname:=word}
If varname exists and isn't null, return its value; otherwise set it to word and then return its value.[7]
Purpose:
Setting a variable to a default value if it is undefined.
Example:
${count:=0} sets count to 0 if it is undefined.
 ${varname:?message}
If varname exists and isn't null, return its value; otherwise print varname: followed by message, and abort the current command or script. Omitting message produces the default message parameter null or not set.
Purpose:
Catching errors that result from variables being undefined.
Example:
${count:?" undefined!"} prints "count: undefined!" and exits if count is undefined.
 ${varname:+word}
If varname exists and isn't null, return word; otherwise return null.
Purpose:
Testing for the existence of a variable.
Example:
${count:+1} returns 1 (which could mean "true") if count is defined.
The first two of these operators are ideal for setting defaults for command-line arguments in case the user omits them. We'll use the first one in our first programming task.
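The := and :? entries of the table can be tried out with the sketch below. The function names ensure_count and require_name are hypothetical, and running the :? check inside a subshell is a choice made here so that a failure does not end the calling script.

```shell
#!/bin/bash
# Give count a default of 0 unless it already has a non-null value.
# The ":" command does nothing itself; it only forces the expansion.
ensure_count() {
  : "${count:=0}"
  echo "count=$count"
}

# Fail if name is unset or null. The parentheses make the function body
# a subshell, so a failed check ends only that subshell.
require_name() (
  : "${name:?is undefined}"
  echo "name=$name"
)

unset count
ensure_count          # count=0
count=7
ensure_count          # count=7

unset name
require_name 2>/dev/null || echo "name check failed"
```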

If we used #*/ instead of ##*/, the expression would have the incorrect value dave/pete/fred/bob, because the shortest instance of "anything followed by a slash" at the beginning of the string is just a slash (/).

The construct ${variable##*/} is actually equivalent to the UNIX utility basename(1). basename takes a pathname as argument and returns the filename only; it is meant to be used with the shell's command substitution mechanism (see below). basename is less efficient than ${variable##*/} because it runs in its own separate process rather than within the shell. Another utility, dirname(1), does essentially the opposite of basename: it returns the directory prefix only. It is equivalent to the Korn shell expression ${variable%/*} and is less efficient for the same reason.

bash String Manipulations

By Jim Dennis, jimd@starshine.org

The bash shell has many features that are sufficiently obscure you almost never see them used. One of the problems is that the man page offers no examples. Here I'm going to show how to use some of these features to do the sorts of simple string manipulations that are commonly needed on file and path names.

Background

In traditional Bourne shell programming you might see references to the basename and dirname commands. These perform simple string manipulations on their arguments. You'll also see many uses of sed and awk or perl -e to perform simple string manipulations. Often these machinations must be performed on lists of filenames and paths. There are many specialized programs that are conventionally included with Unix to perform these sorts of utility functions: tr, cut, paste, and join.

Given a filename like /home/myplace/a.data.directory/a.filename.txt which we'll call $f you could use commands like:

      dirname $f 
      basename $f 
      basename $f .txt
      

… to see output like:

      /home/myplace/a.data.directory
      a.filename.txt
      a.filename 

Notice that the GNU version of basename takes an optional parameter. This is handy for specifying a filename "extension" like .tar.gz which will be stripped off of the output.

Note that basename and dirname don't verify that these parameters are valid filenames or paths. They simply perform simple string operations on a single argument. You shouldn't use wild cards with them – since dirname takes exactly one argument (and complains if given more) and basename takes one argument and an optional one which is not a filename.

Despite their simplicity these two commands are used frequently in shell programming because most shells don't have any built-in string handling functions – and we frequently need to refer to just the directory or just the file name parts of a given full file specification.

Usually these commands are used within the "back tick" shell operators like TARGETDIR=`dirname $1`. The "back tick" operators are equivalent to the $(...) construct. This latter construct is valid in Korn shell and bash – and I find it easier to read (since I don't have to squint at my screen wondering which direction the "tick" is slanted).

A Better Way

Although the basename and dirname commands embody the "small is beautiful" spirit of Unix – they may push the envelope towards the "too simple to be worth a separate program" end of simplicity.

Naturally you can call on sed, awk, TCL or perl for more flexible and complete string handling. However this can be overkill – and a little ungainly.

So, bash (which long ago abandoned the "small is beautiful" principle and went the way of emacs) has some built in syntactical candy for doing these operations. Since bash is the default shell on Linux systems then there is no reason not to use these features when writing scripts for Linux.

If you're concerned about portability to other shells and systems – you may want to stick with dirname, basename, and sed.

The bash Man Page

The bash man page is huge. It contains a complete reference to the "readline" libraries and how to write a .inputrc file (which I think should all go in a separate man page) – and a run down of all the csh "history" or bang! operators (which I think should be replaced with a simple statement like: "Most of the bang! tricks that work in csh work the same way in bash").

However, buried in there is a section on Parameter Substitution which tells us that $foo is really a shorthand for ${foo} which is really the simplest case of several ${foo:operators} and similar constructs.

Are you confused, yet?

Here's where a few examples would have helped. To understand the man page I simply experimented with the echo command and several shell variables. This is what it all means:

Given:
  foo=/tmp/my.dir/filename.tar.gz
We can use these expressions:
  path=${foo%/*}
    To get: /tmp/my.dir (like dirname)
  file=${foo##*/}
    To get: filename.tar.gz (like basename)
  base=${file%%.*}
    To get: filename
  ext=${file#*.}
    To get: tar.gz

Note that the last two depend on the assignment made in the second one.

Here we notice two different "operators" being used inside the parameters (curly braces). Those are the # and the % operators. We also see them used as single characters and in pairs. This gives us four combinations for trimming patterns off the beginning or end of a string:

${variable%pattern}
  Trim the shortest match from the end
${variable##pattern}
  Trim the longest match from the beginning
${variable%%pattern}
  Trim the longest match from the end
${variable#pattern}
  Trim the shortest match from the beginning

It's important to understand that these use shell "globbing" rather than "regular expressions" to match these patterns.
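The four expressions can be wrapped into a small function for experimenting. This is only a sketch: the function name split_path is hypothetical, and the variable names follow the ones used in the text.

```shell
#!/bin/bash
# split_path PATHNAME: print directory, file name, base name and
# extension of PATHNAME, one per line, using only parameter expansion.
split_path() {
  local foo=$1
  local path=${foo%/*}     # drop the shortest "/..." from the end   (like dirname)
  local file=${foo##*/}    # drop the longest ".../" from the front  (like basename)
  local base=${file%%.*}   # drop the longest ".*" from the end
  local ext=${file#*.}     # drop the shortest "*." from the front
  printf '%s\n' "$path" "$file" "$base" "$ext"
}

split_path /tmp/my.dir/filename.tar.gz
# /tmp/my.dir
# filename.tar.gz
# filename
# tar.gz
```

Unlike dirname, ${foo%/*} leaves a name without any slash unchanged instead of returning a dot, so the two are not interchangeable for bare file names.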
Naturally a simple string like "txt" will match sequences of exactly those three characters in that sequence – so the difference between "shortest" and "longest" only applies if you are using a shell wild card in your pattern.

A simple example of using these operators comes in the common question of copying or renaming all the *.txt to change the .txt to .bak (in MS-DOS' COMMAND.COM that would be REN *.TXT *.BAK).

This is complicated in Unix/Linux because of a fundamental difference in the programming APIs. In most Unix shells the expansion of a wild card pattern into a list of filenames (called "globbing") is done by the shell – before the command is executed. Thus the command normally sees a list of filenames (like "foo.txt bar.txt etc.txt") where DOS (COMMAND.COM) hands external programs a pattern like *.TXT.

Under Unix shells, if a pattern doesn't match any filenames the parameter is usually left on the command line literally. Under bash this is a user-settable option. In fact, under bash you can disable shell "globbing" if you like – there's a simple option to do this. It's almost never used – because commands like mv, and cp won't work properly if their arguments are passed to them in this manner.

However here's a way to accomplish a similar result:

  for i in *.txt; do cp $i ${i%.txt}.bak; done

... obviously this is more typing. If you tried to create a shell function or alias for it – you have to figure out how to pass it parameters. Certainly the following seems simple enough:

  function cp-pattern { for i in $1; do cp $i ${i%$1}$2; done; }

... but that doesn't work like most Unix users would expect. You'd have to pass this command a pair of specially chosen, and quoted arguments like:

  cp-pattern '*.txt' .bak

... note how the second pattern has no wild cards and how the first is quoted to prevent any shell globbing. That's fine for something you might just use yourself – if you remember to quote it right.
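One way to sidestep the quoting problem is to pass the directory and the two extensions as separate arguments, so that no wild card ever appears on the command line. The following is a sketch under that assumption, not the author's function; the name backup_ext and its argument order are made up here.

```shell
#!/bin/bash
# backup_ext DIR OLD NEW  (hypothetical helper)
# Copy every DIR/*.OLD to the same name with the extension NEW.
backup_ext() {
  local dir=$1 old=$2 new=$3 f
  for f in "$dir"/*."$old"; do
    # If nothing matched, the pattern is left as a literal string: skip it.
    [ -e "$f" ] || continue
    # Quoting "$old" inside the expansion makes it match literally.
    cp -- "$f" "${f%."$old"}.$new"
  done
}

# Example call (copies ./foo.txt to ./foo.bak, and so on):
#   backup_ext . txt bak
```

Because every expansion is quoted, file names containing spaces are handled as well.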
It's easy enough to add a check for the number of arguments and to ensure that there is at least one file that exists in the $1 pattern. However it becomes much harder to make this command reasonably safe and robust. Inevitably it becomes less "unix-like" and thus more difficult to use with other Unix tools.

I generally just take a whole different approach. Rather than trying to use cp to make a backup of each file under a slightly changed name I might just make a directory (usually using the date and my login ID as a template) and use a simple cp command to copy all my target files into the new directory.

Another interesting thing we can do with these "parameter expansion" features is to iterate over a list of components in a single variable. For example, you might want to do something to traverse over every directory listed in your path – perhaps to verify that everything listed therein is really a directory and is accessible to you.

Here's a command that will echo each directory named on your path on its own line:

  p=$PATH
  until [ "$p" = "$d" ]; do d=${p%%:*}; p=${p#*:}; echo $d; done

... obviously you can replace the echo $d part of this command with anything you like.

Another case might be where you'd want to traverse a list of directories that were all part of a path. Here's a command pair that echoes each directory from the root down to the "current working directory":

  p=$(pwd)
  until [ $p = $d ]; do p=${p#*/}; d=${p [...]

[...] on your system and you may be surprised at how many Bourne and C shell scripts there are in there.

In conclusion I'll just provide a sampler of some other bash parameter expansions:

${parameter:-word}
Provide a default if parameter is unset or null.
Example: echo ${1:-"default"}
Note: this would have to be used from within a function or shell script – the point is to show that some of the parameter substitutions can be used with shell numbered arguments.
In this case the string "default" would be returned if the function or script was called with no $1 (or if all of the arguments had been shifted out of existence).

${parameter:=word}
Assign a value to parameter if it was previously unset or null.
Example: echo ${HOME:="/home/.nohome"}

${parameter:?word}
Generate an error if parameter is unset or null by printing word to stderr.
Example: : ${HOME:="/home/.nohome"} ${TMP:?"Error: Must have a valid Temp Variable Set"}
This one just uses the shell "null command" (the : command) to evaluate the expression. If the variable doesn't exist or has a null value – this will print the string to the standard error file handle and exit the script with a return code of one.

Oddly enough – while it is easy to redirect the standard error of processes under bash – there doesn't seem to be an easy portable way to explicitly generate a message or redirect output to stderr. The best method I've come up with is to use the /proc/ filesystem (process table) like so:

  function error { echo "$*" > /proc/self/fd/2; }

... self is always a set of entries that refers to the current process – and self/fd/ is a directory full of the currently open file descriptors. Under Unix and DOS every process is given the following pre-opened file descriptors: stdin, stdout, and stderr. (The redirection >&2, as in echo "$*" >&2, has the same effect in any Bourne-style shell and does not depend on /proc.)

${parameter:+word}
Alternative value. ${TMP:+"/mnt/tmp"} use /mnt/tmp instead of $TMP but do nothing if TMP was unset.
This is a weird one that I can't ever see myself using. But it is a logical complement to the ${var:-value} we saw above.

${#variable}
Return the length of the variable in characters.
Example: echo The length of your PATH is ${#PATH}

9.2. Manipulating Strings

Bash supports a surprising number of string manipulation operations. Unfortunately, these tools lack a unified focus. Some are a subset of parameter substitution, and others fall under the functionality of the UNIX expr command. This results in inconsistent command syntax and overlap of functionality, not to mention confusion.

expr match "$string" '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

expr "$string" : '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

stringZ=abcABC123ABCabc

echo `expr match "$stringZ" '\(.[b-c]*[A-Z]..[0-9]\)'`   # abcABC1
echo `expr "$stringZ" : '\(.[b-c]*[A-Z]..[0-9]\)'`       # abcABC1
echo `expr "$stringZ" : '\(.......\)'`                   # abcABC1
# All of the above forms give an identical result.

expr match "$string" '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

expr "$string" : '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

stringZ=abcABC123ABCabc

echo `expr match "$stringZ" '.*\([A-C][A-C][A-C][a-c]*\)'`    # ABCabc
echo `expr "$stringZ" : '.*\(......\)'`                       # ABCabc

Substring Removal

${string#substring}

Strips shortest match of $substring from front of $string.

${string##substring}

Strips longest match of $substring from front of $string.

stringZ=abcABC123ABCabc
#       |----|
#       |----------|

echo ${stringZ#a*C}      # 123ABCabc
# Strip out shortest match between 'a' and 'C'.

echo ${stringZ##a*C}     # abc
# Strip out longest match between 'a' and 'C'.

${string%substring}

Strips shortest match of $substring from back of $string.

${string%%substring}

Strips longest match of $substring from back of $string.

stringZ=abcABC123ABCabc
#                    ||
#        |------------|

echo ${stringZ%b*c}      # abcABC123ABCa
# Strip out shortest match between 'b' and 'c', from back of $stringZ.

echo ${stringZ%%b*c}     # a
# Strip out longest match between 'b' and 'c', from back of $stringZ.

Example 9-10. Converting graphic file formats, with filename change

#!/bin/bash
#  cvt.sh:
#  Converts all the MacPaint image files in a directory to "pbm" format.

#  Uses the "macptopbm" binary from the "netpbm" package,
#+ which is maintained by Brian Henderson (bryanh@giraffe-data.com).
#  Netpbm is a standard part of most Linux distros.

OPERATION=macptopbm
SUFFIX=pbm          # New filename suffix.

if [ -n "$1" ]
then
  directory=$1      # If directory name given as a script argument...
else
  directory=$PWD    # Otherwise use current working directory.
fi

#  Assumes all files in the target directory are MacPaint image files,
#+ with a ".mac" suffix.

for file in $directory/*    # Filename globbing.
do
  filename=${file%.*c}      #  Strip ".mac" suffix off filename
                            #+ ('.*c' matches everything
                            #+ between '.' and 'c', inclusive).
  $OPERATION $file > $filename.$SUFFIX
                            # Redirect conversion to new filename.
  rm -f $file               # Delete original files after converting.
  echo "$filename.$SUFFIX"  # Log what is happening to stdout.
done

exit 0

Substring Replacement

${string/substring/replacement}

Replace first match of $substring with $replacement.

${string//substring/replacement}

Replace all matches of $substring with $replacement.

stringZ=abcABC123ABCabc

echo ${stringZ/abc/xyz}           # xyzABC123ABCabc
                                  # Replaces first match of 'abc' with 'xyz'.

echo ${stringZ//abc/xyz}          # xyzABC123ABCxyz
                                  # Replaces all matches of 'abc' with 'xyz'.

${string/#substring/replacement}

If $substring matches front end of $string, substitute $replacement for $substring.

${string/%substring/replacement}

If $substring matches back end of $string, substitute $replacement for $substring.

stringZ=abcABC123ABCabc

echo ${stringZ/#abc/XYZ}          # XYZABC123ABCabc
                                  # Replaces front-end match of 'abc' with 'XYZ'.

echo ${stringZ/%abc/XYZ}          # abcABC123ABCXYZ
                                  # Replaces back-end match of 'abc' with 'XYZ'.
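As a practical use of the replace-all and back-end forms, here is a small sketch that tidies up a file name. The function name tidy_name and the chosen rules (spaces to underscores, a trailing .html to .htm) are only an illustration.

```shell
#!/bin/bash
# tidy_name NAME  (hypothetical helper)
tidy_name() {
  local name=$1
  name=${name// /_}           # replace every space with an underscore
  name=${name/%.html/.htm}    # rewrite ".html" only when it ends the name
  echo "$name"
}

tidy_name "my home page.html"   # my_home_page.htm
tidy_name "notes.html.txt"      # notes.html.txt
```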

9.2.1. Manipulating strings using awk

A Bash script may invoke the string manipulation facilities of awk as an alternative to using its built-in operations.

Example 9-11. Alternate ways of extracting substrings

#!/bin/bash
# substring-extraction.sh

String=23skidoo1
#      012345678    Bash
#      123456789    awk
# Note different string indexing system:
# Bash numbers first character of string as '0'.
# Awk  numbers first character of string as '1'.

echo ${String:2:4} # position 3 (0-1-2), 4 characters long
                                         # skid

# The awk equivalent of ${string:pos:length} is substr(string,pos,length).
echo | awk '
{ print substr("'"${String}"'",3,4)      # skid
}
'
#  Piping an empty "echo" to awk gives it dummy input,
#+ and thus makes it unnecessary to supply a filename.

exit 0
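A variation on the awk call above, offered as a sketch: the shell value can be handed to awk with -v and printed from a BEGIN block, which needs neither the quote splicing nor the dummy input. The function name awk_substr is hypothetical.

```shell
#!/bin/bash
# awk_substr STRING POS LEN  (hypothetical helper)
# Print LEN characters of STRING starting at 1-based position POS.
awk_substr() {
  local string=$1 pos=$2 len=$3
  # A BEGIN-only program reads no input, so no file or pipe is needed.
  awk -v s="$string" -v p="$pos" -v n="$len" 'BEGIN { print substr(s, p, n) }'
}

awk_substr 23skidoo1 3 4    # skid
```

One caveat: awk interprets backslash escapes in values assigned with -v, so strings containing backslashes need extra care.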

 9.2.2. Further Discussion

For more on string manipulation in scripts, refer to Section 9.3 and the relevant section of the expr command listing. For script examples, see:

Example 12-6
Example 9-14
Example 9-15
Example 9-16
Example 9-18

Notes

[1] This applies to either command line arguments or parameters passed to a function.

Old News ;-)

David Korn Tells All

# More to the point, thanks to the way ksh works, you can do this:

# make an array, words, local to the current function
typeset -A words

# read a full line
read line

# split the line into words
echo "$line" | read -A words

# Now you can access the line either word-wise or string-wise - useful if you want to, say, check for a command as the Nth parameter,
# but also keep formatting of the other parameters...

Function
Returns true if filename is not the empty string and is a file.

mbfl_file_is_readable filename   Function
Returns true if filename is not the empty string, is a file and is readable. 
mbfl_file_is_writable filename   Function
Returns true if filename is not the empty string, is a file and is writable. 
mbfl_file_is_directory directory   Function
Returns true if directory is not the empty string and is a directory. 
mbfl_file_directory_is_readable  directory   Function
Returns true if directory is not the empty string, is a directory and is readable. 
mbfl_file_directory_is_writable  directory   Function
Returns true if directory is not the empty string, is a directory and is writable. 
mbfl_file_is_symlink pathname   Function
Returns true if pathname is not the empty string and is a symbolic link. 
 Miscellaneous commands

mbfl_cd dirname ?…? Function

Changes directory to dirname. Optional flags to cd may be appended. 
Parsing command line options

The getopt module defines a set of procedures to be used to process command line arguments with the following format:

-a brief option a with no value;

-a123 brief option a with value 123;

--bianco long option bianco with no value;

--color=bianco long option color with value bianco.

Requires the message module (Message for details).

Arguments

The module contains, at the root level, a block of code like the following:

ARGC=0
declare -a ARGV ARGV1

for ((ARGC1=0; $# > 0; ++ARGC1)); do
  ARGV1[$ARGC1]="$1"
  shift
done

This block is executed when the script is evaluated. Its purpose is to store command line arguments in the global array ARGV1 and the number of command line arguments in the global variable ARGC1. The global array ARGV and the global variable ARGC are predefined and should be used by the mbfl_getopts functions to store non-option command line arguments.

Example:

$ script --gulp wo --gasp=123 wa

If the script makes use of the library, the strings wo and wa will go into ARGV and ARGC will be set to 2. The option arguments are processed and some action is performed to register them.

We can access the non-option arguments with the following code:

for ((i=0; i < ARGC; ++i)); do
  :   # do something with ${ARGV[$i]}
done

Using the module

To use this module we have to declare a set of script options; we declare a new script option with the function mbfl_declare_option. Options declaration should be done at the beginning of the script, before doing anything; for example: right after the MBFL library code.

In the main block of the script: options are parsed by invoking mbfl_getopts_parse: this function will update a global variable and invoke a script function for each option on the command line.

Examples

Example of option declaration:

mbfl_declare_option ALPHA no a alpha noarg "enable alpha option"

This code declares an option with no argument and properties:

global variable script_option_ALPHA, which will be set to no by default and to yes if the option is used;
brief flag -a;
long flag --alpha;
description enable alpha option, to be shown in the usage output.

If the option is used: the function script_option_update_alpha is invoked (if it exists) with no arguments, after the variable script_option_ALPHA has been set to yes.

Valid option usages are:

$ script.sh -a
$ script.sh --alpha

Another example:

mbfl_declare_option BETA 123 b beta witharg "select beta value"

This code declares an option with argument and properties:

global variable script_option_BETA, which will be set to 123 by default and to the value selected on the command line if the option is used;
brief flag -b;
long flag --beta;
description select beta value, to be shown in the usage output.

If the option is used: the function script_option_update_beta is invoked (if it exists) with no arguments, after the variable script_option_BETA has been set to the selected value.

Valid option usages are:

$ script.sh -b456
$ script.sh --beta=456
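To make the four option forms concrete, the sketch below tells them apart with a case statement and the trimming operators from the earlier sections. It is plain Bash written for illustration and is not how MBFL itself parses options; the function name classify_option and its output format are assumptions.

```shell
#!/bin/bash
# classify_option ARG  (hypothetical helper, not an MBFL function)
# Prints "brief NAME [VALUE]", "long NAME [VALUE]" or "argument ARG".
classify_option() {
  local arg=$1 rest
  case $arg in
    --*=*)   rest=${arg#--}                       # long option with value
             echo "long ${rest%%=*} ${rest#*=}" ;;
    --?*)    echo "long ${arg#--}" ;;             # long option, no value
    -[!-])   echo "brief ${arg#-}" ;;             # brief option, no value
    -[!-]?*) rest=${arg#-}                        # brief option with value
             echo "brief ${rest:0:1} ${rest:1}" ;;
    *)       echo "argument $arg" ;;              # anything else
  esac
}

for arg in -a -a123 --bianco --color=bianco wo; do
  classify_option "$arg"
done
# brief a
# brief a 123
# long bianco
# long color bianco
# argument wo
```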

Predefined options

A set of predefined options is recognised by the library and not handed to the user defined functions.

--encoded-args Signals to the library that the non-option arguments and the option values are encoded in hexadecimal strings. Encoding is useful to avoid quoting problems when invoking a script from another one. If this option is used: the values are decoded by mbfl_getopts_parse before storing them in the ARGV array and before being stored in the option's specific global variables.

-v --verbose Turns on verbose messages. The function mbfl_option_verbose returns true (Message, for details).

--silent Turns off verbose messages. The function mbfl_option_verbose returns false.

--verbose-program If used the --verbose option is added to the command line of external programs that support it. The function mbfl_option_verbose_program returns true or false depending on the state of this option.

--show-program Prints the command line of executed external programs.

--debug Turns on debugging messages (Message, for details).

--test Turns on test execution (Program Testing, for details).

--null Signals to the script that it has to use the null character to separate values, instead of the common newline. The global variable mbfl_option_NULL is set to yes.

-f --force Signals to the script that it does not have to query the user before doing dangerous operations, like overwriting files. The global variable mbfl_option_INTERACTIVE is set to no.

-i --interactive Signals to the script that it does have to query the user before doing dangerous operations, like overwriting files. The global variable mbfl_option_INTERACTIVE is set to yes.

--validate-programs Validates the existence of all the programs needed by the script; then exits. The exit code is zero if all the programs were found, one otherwise.

--version Prints to the standard output of the script the contents of the global variable mbfl_message_VERSION, then exits with code zero. The variable makes use of the service variables (Service Variables, for details).

--version-only Prints to the standard output of the script the contents of the global variable script_VERSION, then exits with code zero.

--license Prints to the standard output of the script the contents of one of the global variables mbfl_message_LICENSE_*, then exits with code zero. The variable makes use of the service variables (Service Variables, for details).

-h --help --usage Prints to the standard output of the script: the contents of the global variable script_USAGE; a newline; the string options:; a newline; an automatically generated string describing the options declared with mbfl_declare_option; a string describing the MBFL default options. Then exits with code zero.

The following functions may be used to set, unset and query the state of the predefined options.

mbfl_option_encoded_args Function

mbfl_set_option_encoded_args    Function
mbfl_unset_option_encoded_args    Function
Queries/sets/unsets the encoded arguments option. 
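As a rough illustration of why hexadecimal encoding avoids quoting problems, the following sketch encodes and decodes an argument in plain Bash. The names hex_encode and hex_decode are hypothetical helpers, not MBFL functions; the exact encoding MBFL expects is not specified here, and the sketch assumes ASCII input.

```bash
# Hypothetical helpers (not part of MBFL); ASCII input is assumed.

# Print each character of $1 as two hexadecimal digits.
hex_encode() {
    local s=$1 out='' i ch
    for ((i = 0; i < ${#s}; i++)); do
        # A leading quote makes printf use the character's numeric code.
        printf -v ch '%02x' "'${s:i:1}"
        out+=$ch
    done
    printf '%s' "$out"
}

# Turn pairs of hexadecimal digits back into characters.
hex_decode() {
    local h=$1 out='' i
    for ((i = 0; i < ${#h}; i += 2)); do
        out+="\\x${h:i:2}"
    done
    printf '%b' "$out"
}
```

An encoded value contains only the characters 0-9 and a-f, so it can be passed between scripts without any quoting.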
 

mbfl_option_verbose Function

mbfl_set_option_verbose    Function
mbfl_unset_option_verbose    Function
Queries/sets/unsets the verbose messages option. 
 

mbfl_option_verbose_program Function

mbfl_set_option_verbose_program    Function
mbfl_unset_option_verbose_program    Function
Queries/sets/unsets verbose execution for external programs. 

This option, of course, is supported only for programs that are known by MBFL (like rm): if a program is executed with mbfl_program_exec, it is the responsibility of the caller to use the option.

mbfl_option_show_program Function

mbfl_set_option_show_program    Function
mbfl_unset_option_show_program    Function
Prints the command line of executed external programs. This does not disable program execution, it just prints the command line before executing it. 
 

mbfl_option_test Function

mbfl_set_option_test   Function
mbfl_unset_option_test   Function
Queries/sets/unsets the test execution option. 
 

mbfl_option_debug Function

mbfl_set_option_debug   Function
mbfl_unset_option_debug   Function
Queries/sets/unsets the debug messages option. 
 

mbfl_option_null Function

mbfl_set_option_null   Function
mbfl_unset_option_null   Function
Queries/sets/unsets the null list separator option. 
 

mbfl_option_interactive Function

mbfl_set_option_interactive   Function
mbfl_unset_option_interactive    Function
Queries/sets/unsets the interactive execution option. 

Interface functions

mbfl_declare_option keyword default brief long hasarg description Function

Declares a new option. Arguments description follows. 

keyword A string identifying the option; internally it is used to build a function name and a variable name. It is safer to limit this string to the letters in the range a-z and underscores.

default The default value for the option. For an option with argument it can be anything; for an option with no argument: it must be yes or no.

brief The brief option selector: a single character. It is safer to choose a single letter (lower or upper case) in the ASCII standard.

long The long option selector: a string. It is safer to choose a sequence of letters in the ASCII standard, separated by underscores or dashes.

hasarg Either witharg or noarg: declares if the option requires an argument or not.

description A one-line string describing the option briefly.

mbfl_getopts_parse   Function
Parses a set of command line options. The options are handed to user defined functions. The global array ARGV1 and the global variable ARGC1 are supposed to hold the command line arguments and the number of command line arguments. Non-option arguments are left in the global array ARGV, the global variable ARGC holds the number of elements in ARGV. 
mbfl_getopts_islong string varname   Function
Verifies if a string is a long option without argument.  string is the string to validate, varname is the optional name of a variable that's set to the option name, without the leading dashes. 

Returns with code zero if the string is a long option without argument, else returns with code one. An option must be of the form --option; only characters in the ranges A-Z, a-z, 0-9 and the characters - and _ are allowed in the option name.

mbfl_getopts_islong_with string optname varname Function

Verifies if a string is a long option with argument. Arguments: 

string the string to validate;

optname optional name of a variable that's set to the option name, without the leading dashes;

varname optional name of a variable that's set to the option value.

Returns with code zero if the string is a long option with argument, else returns with code one. An option must be of the form --option=value; only characters in the ranges A-Z, a-z, 0-9 and the characters - and _ are allowed in the option name. If the argument is not an option with value, the variable names are ignored.
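The check described above can be sketched with a Bash regular expression. The function name sketch_is_long_option_with is hypothetical and this is not MBFL's implementation; accepting an empty value after the equal sign is an assumption.

```bash
# Hypothetical sketch (not MBFL code): recognise --option=value.
sketch_is_long_option_with() {
    local string=$1 optname=${2:-} varname=${3:-}
    # Option name: letters, digits, dashes and underscores only.
    [[ $string =~ ^--([A-Za-z0-9_-]+)=(.*)$ ]] || return 1
    # Store the pieces only in the variables whose names were given.
    [[ -n $optname ]] && printf -v "$optname" '%s' "${BASH_REMATCH[1]}"
    [[ -n $varname ]] && printf -v "$varname" '%s' "${BASH_REMATCH[2]}"
    return 0
}
```

For example, sketch_is_long_option_with --output=file.txt name value returns zero and leaves output in name and file.txt in value, while a plain --verbose is rejected.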

mbfl_getopts_isbrief string varname Function

Verifies if a string is a brief option without argument. Arguments: string is the string to validate, varname is the optional name of a variable that's set to the option name, without the leading dash. 

Returns with code zero if the argument is a brief option without argument, else returns with code one. A brief option must be of the form -a, only characters in the ranges A-Z, a-z, 0-9 are allowed as option letters.

mbfl_getopts_isbrief_with string optname valname Function

Verifies if a string is a brief option with argument. Arguments: 

string the string to validate;

optname optional name of a variable that's set to the option name, without the leading dash;

valname optional name of a variable that's set to the option value.

Returns with code zero if the argument is a brief option with argument, else returns with code one. A brief option must be of the form -aV (a is the option, V is the value); only characters in the ranges A-Z, a-z, 0-9 are allowed as option letters.

mbfl_wrong_num_args required present Function

Validates the number of arguments. required is the required number of arguments, present is the given number of arguments on the command line. If the number of arguments is different from the required one: prints an error message and returns with code one; else returns with code zero. 
 

mbfl_argv_from_stdin Function

If the ARGC global variable is set to zero: fills the global variable ARGV with lines from stdin. If the global variable mbfl_option_NULL is set to yes: lines are read using the null character as terminator, else they are read using the standard newline as terminator. 

This function may block waiting for input.
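A minimal sketch of this behaviour in plain Bash follows. The function name sketch_argv_from_stdin is hypothetical; ARGV, ARGC and mbfl_option_NULL are the global names mentioned above, and the loop is only an assumption about how such a reader could be written.

```bash
# Hypothetical sketch (not MBFL code): fill ARGV from stdin when empty.
sketch_argv_from_stdin() {
    local item
    # Do nothing if arguments were given on the command line.
    (( ARGC == 0 )) || return 0
    ARGV=()
    if [[ ${mbfl_option_NULL:-no} == yes ]]; then
        # Items are terminated by the null character.
        while IFS= read -r -d '' item; do ARGV+=("$item"); done
    else
        # Items are terminated by a newline.
        while IFS= read -r item; do ARGV+=("$item"); done
    fi
    ARGC=${#ARGV[@]}
}
```

With the null terminator an item may itself contain newlines, which is why --null is useful when reading file names.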

mbfl_argv_all_files Function

Checks that all the arguments in ARGV are names of existing files. Returns with code zero if there are no errors, else prints an error message and returns with code 1. 
 Querying Options

Some features and behaviours of the library are configured by the return values of the following set of functions. All of these functions are defined by the Getopts module, but they can be redefined by the script.

mbfl_option_encoded_args Function

Returns true if the option --encoded-args was used on the command line. 
mbfl_option_verbose   Function
Returns true if the option --verbose was used on the command line after all the occurrences of --silent. Returns false if the option --silent was used on the command line after all the occurrences of --verbose. 
mbfl_option_test   Function
Returns true if the option --test was used on the command line. 
mbfl_option_debug   Function
Returns true if the option --debug was used on the command line. 
mbfl_option_null   Function
Returns true if the option --null was used on the command line. 
mbfl_option_interactive   Function
Returns true if the option --interactive was used on the command line after all the occurrences of --force. Returns false if the option --force was used on the command line after all the occurrences of --interactive. 
Printing messages to the console

This module allows one to print messages on an output channel. Various forms of message are supported. All the function names are prefixed with mbfl_message_. All the messages will have one of the forms:

<progname>: <message>
<progname>: [error|warning]: <message>

The following global variables are declared:

mbfl_message_PROGNAME must be initialised with the name of the script that'll be displayed at the beginning of each message;

mbfl_message_VERBOSE yes if verbose messages should be displayed, else no.

mbfl_message_set_program PROGNAME Function

Sets the script official name to put at the beginning of messages. 
mbfl_message_set_channel channel   Function
Selects the channel to be used to output messages. 
mbfl_message_string string   Function
Outputs a message to the selected channel. Echoes a string composed of: the content of the mbfl_message_PROGNAME global variable; a colon; a space; the provided message. 

A newline character is NOT appended to the message. Escape characters are allowed in the message.

mbfl_message_verbose string   Function
Outputs a message to the selected channel, but only if the evaluation of the function/alias mbfl_option_verbose returns true. 

Echoes a string composed of: the content of the mbfl_message_PROGNAME global variable; a colon; a space; the provided message. A newline character is NOT appended to the message. Escape characters are allowed in the message.

mbfl_message_verbose_end string   Function
Outputs a message to the selected channel, but only if the evaluation of the function/alias mbfl_option_verbose returns true. 

Echoes the string. A newline character is NOT appended to the message. Escape characters are allowed in the message.

mbfl_message_debug string   Function
Outputs a message to the selected channel, but only if the evaluation of the function/alias mbfl_option_debug returns true. 

Echoes a string composed of: the content of the mbfl_message_PROGNAME global variable; a colon; a space; the provided message. A newline character is NOT appended to the message. Escape characters are allowed in the message.

mbfl_message_warning string   Function
Outputs a warning message to the selected channel. Echoes a string composed of: the content of the mbfl_message_PROGNAME global variable; a colon; a space; the string warning; a colon; a space; the provided message. 

A newline character IS appended to the message. Escape characters are allowed in the message.

mbfl_message_error string Function

Outputs an error message to the selected channel. Echoes a string composed of: the content of the mbfl_message_PROGNAME global variable; a colon; a space; the string error; a colon; a space; the provided message. 

A newline character IS appended to the message. Escape characters are allowed in the message.
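The message formats described in this module can be sketched as follows. All names starting with sketch_ are hypothetical stand-ins, and treating the channel as a file descriptor number (2 by default) is an assumption, not something stated above.

```bash
# Hypothetical stand-ins for the message module (not MBFL code).
sketch_PROGNAME='myscript'   # plays the role of mbfl_message_PROGNAME
sketch_CHANNEL=2             # assumed: the channel is a file descriptor

# "<progname>: <message>", no trailing newline, escapes interpreted.
sketch_message_string() {
    printf '%b' "${sketch_PROGNAME}: $1" >&"$sketch_CHANNEL"
}

# "<progname>: warning: <message>" followed by a newline.
sketch_message_warning() {
    printf '%b\n' "${sketch_PROGNAME}: warning: $1" >&"$sketch_CHANNEL"
}

# "<progname>: error: <message>" followed by a newline.
sketch_message_error() {
    printf '%b\n' "${sketch_PROGNAME}: error: $1" >&"$sketch_CHANNEL"
}
```

Because sketch_message_string adds no newline, a caller can print a progress message and complete the line later, which is the pattern the verbose and verbose_end pair appears to support.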

Using external programs

This module declares a set of global variables all prefixed with mbfl_program_. We have to look at the module's code to see which ones are declared.

Program Testing: Testing a script and running programs.
Program Checking: Checking programs existence.
Program Executing: Executing a program.
Program Declaring: Declaring the intention to use a program.

Testing a script and running programs

MBFL allows a script to execute a "dry run", that is: do not perform any operation on the system, just print messages describing what will happen if the script is executed with the selected options. This implies, in the MBFL model, that no external program is executed. When this feature is turned on: mbfl_program_exec does not execute the program, instead it prints the command line on standard error and returns true.

mbfl_set_option_test Function

Enables the script test option. After this a script should not do anything on the system, just print messages describing the operations. This function is invoked when the predefined option --test is used on the command line. 
 

mbfl_unset_option_test Function

Disables the script test option. After this a script should perform normal operations. 
 

mbfl_option_test Function

Returns true if test execution is enabled, else returns false. 
 


Checking programs existence

The simplest way to test the availability of a program is to look for it just before it is used. The following function should be used at the beginning of a function that makes use of external programs.

mbfl_program_check program ?program …? Function

Checks the availability of programs. All the pathnames on the command line are checked: if one is not executable an error message is printed on stderr. Returns false if a program can't be found, true otherwise. 
 

mbfl_program_find program Function

A wrapper for: 

type -ap program

that looks for a program in the current search path: prints the full pathname of the program found, or prints an empty string if nothing is found.

 Executing a program

mbfl_program_exec arg … Function

Evaluates a command line. 

If the function mbfl_option_test returns true: instead of evaluation, the command line is sent to stderr. If the function mbfl_option_show_program returns true: the command line is sent to stderr, then it is executed.
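The two modes can be sketched like this. The names are hypothetical, plain variables stand in for the mbfl_option_test and mbfl_option_show_program functions, and the sketch runs the arguments directly with "$@" instead of evaluating a command line as the description says.

```bash
# Hypothetical sketch (not MBFL code) of dry-run and show-program modes.
sketch_option_test=no
sketch_option_show_program=no

sketch_program_exec() {
    if [[ $sketch_option_test == yes ]]; then
        # Dry run: print the command line and report success.
        printf '%s\n' "$*" >&2
        return 0
    fi
    if [[ $sketch_option_show_program == yes ]]; then
        # Show the command line, then fall through and run it.
        printf '%s\n' "$*" >&2
    fi
    "$@"
}
```

Keeping the decision in one wrapper means the rest of a script never has to check the test option itself.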

Declaring the intention to use a program

To make a script model simpler, we assume that the unavailability of a program at the time of its execution is a fatal error. So if we need to execute a program and the executable is not there, the script must be aborted on the spot. Functions are available to test the availability of a program, so we can try to locate an alternative or terminate the process under the script control. On a system where executables may vanish from one moment to another, no matter how we test a program existence, there's always the possibility that the program is not "there" when we invoke it. If we just use mbfl_program_exec to invoke an external program, the function will try and fail if the executable is unavailable: the return code will be false. The vanishing of a program is a rare event: if it's there when we look for it, probably it will be there also a few moments later when we invoke it. For this reason, MBFL proposes a set of functions with which we can declare the intention of a script to use a set of programs; a command line option is predefined to let the user test the availability of all the declared programs before invoking the script.

mbfl_declare_program program Function

Registers program as the name of a program required by the script. The return value is always zero. 
 

mbfl_program_validate_declared Function

Validates the existence of all the declared programs. The return value is zero if all the programs are found, one otherwise. 

This function is invoked by mbfl_getopts_parse when the --validate-programs option is used on the command line. It is a good idea to invoke this function at the beginning of a script, just before starting the real work, for example:

mbfl_program_validate_declared || mbfl_exit_program_not_found

If verbose messages are enabled: a brief summary is echoed to stderr; from the command line the option --verbose must be used before --validate-programs.

mbfl_program_found program Function

Prints the pathname of the previously declared program. Returns zero if the program was found, otherwise prints an error message and exits the script by invoking mbfl_exit_program_not_found. 

This function should be used to retrieve the pathname of the program to be used as first argument to mbfl_program_exec.

mbfl_exit_program_not_found Function

Terminates the script with exit code 20. This function may be redefined by a script to make use of a different exit code; it may even be redefined to execute arbitrary code and then exit. 
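The declare-then-validate idea can be sketched in a few lines. The sketch_ names are hypothetical and the lookup with type -P is an assumption; MBFL's own bookkeeping is not shown in this text.

```bash
# Hypothetical sketch (not MBFL code): declare programs, validate later.
sketch_PROGRAMS=()

# Remember the name of a required program; always succeeds.
sketch_declare_program() {
    sketch_PROGRAMS+=("$1")
    return 0
}

# Return zero if every declared program is found in PATH, one otherwise.
sketch_validate_declared() {
    local program status=0
    for program in "${sketch_PROGRAMS[@]}"; do
        if ! type -P "$program" >/dev/null; then
            printf 'program not found: %s\n' "$program" >&2
            status=1
        fi
    done
    return "$status"
}
```

All missing programs are reported in one pass, so a user sees the complete list instead of fixing them one at a time.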
Catching signals

MBFL provides an interface to the trap builtin that allows the execution of more than one function when a signal is received; this may sound useless, but that is it.

mbfl_signal_map_signame_to_signum sigspec Function

Converts sigspec to the corresponding signal number, then prints the number. 
 

mbfl_signal_attach sigspec handler Function

Appends handler to the list of functions that are executed whenever sigspec is received. 
 

mbfl_signal_invoke_handlers signum Function

Invokes all the handlers registered for signum. This function is not meant to be used during normal script execution, but it may be useful to debug a script. 
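One way to run several functions for one signal is to install a single trap that walks a list of handlers. The sketch below uses hypothetical sketch_ names, relies on the Bash builtin kill -l to map a signal name to its number, and assumes handler names contain no whitespace; it is not MBFL's implementation.

```bash
# Hypothetical sketch (not MBFL code): several handlers per signal.
sketch_HANDLERS=()   # indexed by signal number, space separated names

# Print the signal number for a name such as USR1 or SIGUSR1.
sketch_signame_to_signum() {
    if [[ $1 =~ ^[0-9]+$ ]]; then
        printf '%s\n' "$1"
    else
        kill -l "$1"
    fi
}

# Append a handler; install the trap the first time a signal is used.
sketch_signal_attach() {
    local signum
    signum=$(sketch_signame_to_signum "$1") || return 1
    if [[ -z ${sketch_HANDLERS[signum]:-} ]]; then
        trap "sketch_signal_invoke_handlers $signum" "$signum"
    fi
    sketch_HANDLERS[signum]+="$2 "
}

# Call every handler registered for the given signal number, in order.
sketch_signal_invoke_handlers() {
    local handler
    for handler in ${sketch_HANDLERS[$1]:-}; do
        "$handler"
    done
}
```

Calling sketch_signal_invoke_handlers by hand with a signal number runs the handlers without sending a signal, which is convenient when debugging.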
Manipulating strings

String Quote: Quoted characters.
String Inspection: Inspecting a string.
String Splitting: Splitting a string.
String Case: Converting between upper and lower case.
String Class: Matching a string with a class.
String Misc: Miscellaneous functions.

Quoted characters

mbfl_string_is_quoted_char string position Function

Returns true if the character at position in string is quoted; else returns false. A character is considered quoted if it is preceded by an odd number of backslashes (\). position is a zero-based index. 
 

mbfl_string_is_equal_unquoted_char string position char Function

Returns true if the character at position in  string is equal to char and is not quoted (according to mbfl_string_is_quoted_char); else returns false. position is a zero-based index. 
 

mbfl_string_quote string Function

Prints string with quoted characters. All the occurrences of the backslash character, \, are substituted with a quoted backslash, \\. Returns true. 
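The "odd number of backslashes" rule can be sketched by counting backwards from the position. The function name is hypothetical and this is only one possible way to implement the rule.

```bash
# Hypothetical sketch (not MBFL code): is the character at POSITION quoted?
sketch_is_quoted_char() {
    local string=$1 position=$2 count=0
    local i=$((position - 1))
    # Count the backslashes immediately before the position.
    while (( i >= 0 )) && [[ ${string:i:1} == '\' ]]; do
        count=$((count + 1))
        i=$((i - 1))
    done
    # An odd count means the last backslash quotes the character.
    (( count % 2 == 1 ))
}
```

In the string a\\n (two backslashes) the n is not quoted, because the first backslash quotes the second one.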
 


Inspecting a string

mbfl_string_index string index Function

Selects a character from a string. Echoes to stdout the selected character. If the index is out of range: the empty string is echoed to stdout. 
 

mbfl_string_first string char ?begin? Function

Searches characters in a string. Arguments: string, the target string; char, the character to look for; begin, optional, the index of the character in the target string from which the search begins (defaults to zero). 

Prints an integer representing the index of the first occurrence of char in string. If the character is not found: nothing is sent to stdout.

mbfl_string_last string char ?begin? Function

Searches characters in a string starting from the end. Arguments: string, the target string; char, the character to look for; begin, optional, the index of the character in the target string from which the search begins (defaults to zero). 

Prints an integer representing the index of the last occurrence of char in string. If the character is not found: nothing is sent to stdout.

mbfl_string_range string begin end Function

Extracts a range of characters from a string. Arguments: string, the source string; begin, the index of the first character in the range; end, optional, the index of the character next to the last in the range (this character is not extracted). end defaults to the end of the string; if it is equal to the literal string end, the end of the range is also the end of the string. Echoes to stdout the selected range of characters. 
mbfl_string_equal_substring string position pattern   Function
Returns true if the substring starting at position in string is equal to pattern; else returns false. If position plus the length of pattern is greater than the length of string: the return value is false, always. 
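The inspection functions map closely onto Bash substring expansion. The sketch below uses hypothetical names; returning non-zero when the character is not found and reading end as "up to the end of the string" are assumptions.

```bash
# Hypothetical sketches (not MBFL code) built on ${string:offset:length}.

# Print the index of the first CHAR at or after BEGIN (default 0).
sketch_string_first() {
    local string=$1 char=$2 i
    for ((i = ${3:-0}; i < ${#string}; i++)); do
        if [[ ${string:i:1} == "$char" ]]; then
            printf '%d\n' "$i"
            return 0
        fi
    done
    return 1   # assumed: nothing printed, non-zero status
}

# Print the characters from BEGIN up to, but not including, END.
sketch_string_range() {
    local string=$1 begin=$2 end=${3:-end}
    [[ $end == end ]] && end=${#string}
    printf '%s\n' "${string:begin:end-begin}"
}
```

For example, sketch_string_range 'hello world' 0 5 prints hello and sketch_string_first hello l prints 2.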

Splitting a string

mbfl_string_chars string Function

Splits a string into characters. Fills an array named  SPLITFIELD with the characters from the string; the number of elements in the array is stored in a variable named SPLITCOUNT. Both SPLITFIELD and SPLITCOUNT may be declared local in the scope of the caller. 

The difference between this function and using: ${STRING:$i:1}, is that this function detects backslash characters, \, and treats them as part of the following character. So, for example, the sequence \n is treated as a single char.

Example of usage for mbfl_string_chars: 

string="abcde\nfghilm"
mbfl_string_chars "${string}"
# Now:
# 12 = $SPLITCOUNT   (\n counts as one element)
# a = "${SPLITFIELD[0]}"
# b = "${SPLITFIELD[1]}"
# c = "${SPLITFIELD[2]}"
# d = "${SPLITFIELD[3]}"
# e = "${SPLITFIELD[4]}"
# \n = "${SPLITFIELD[5]}"
# f = "${SPLITFIELD[6]}"
# g = "${SPLITFIELD[7]}"
# h = "${SPLITFIELD[8]}"
# i = "${SPLITFIELD[9]}"
# l = "${SPLITFIELD[10]}"
# m = "${SPLITFIELD[11]}"
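The backslash handling can be sketched as a loop that takes two characters at a time whenever it meets a backslash. The function name is hypothetical; keeping a trailing lone backslash as its own element is an assumption.

```bash
# Hypothetical sketch (not MBFL code): split a string into characters,
# keeping a backslash together with the character that follows it.
sketch_string_chars() {
    local string=$1 i=0 ch
    SPLITFIELD=()
    while (( i < ${#string} )); do
        ch=${string:i:1}
        if [[ $ch == '\' ]] && (( i + 1 < ${#string} )); then
            # Take the backslash and the next character as one element.
            ch=${string:i:2}
            i=$((i + 1))
        fi
        SPLITFIELD+=("$ch")
        i=$((i + 1))
    done
    SPLITCOUNT=${#SPLITFIELD[@]}
}
```

Applied to the 13-character string abcde\nfghilm this yields 12 elements, with \n stored at index 5.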

mbfl_string_split string separator Function

Splits string into fields using separator. Fills an array named SPLITFIELD with the fields from the string; the number of elements in the array is stored in a variable named SPLITCOUNT. Both SPLITFIELD and SPLITCOUNT may be declared local in the scope of the caller. 
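Splitting on an arbitrary separator string can be sketched with prefix and suffix removal. The function name is hypothetical; keeping empty fields and rejecting an empty separator are assumptions about the intended behaviour.

```bash
# Hypothetical sketch (not MBFL code): split STRING on SEPARATOR.
sketch_string_split() {
    local string=$1 separator=$2
    [[ -n $separator ]] || return 1
    SPLITFIELD=()
    while [[ $string == *"$separator"* ]]; do
        # Everything before the first separator is the next field.
        SPLITFIELD+=("${string%%"$separator"*}")
        # Drop that field and the separator, then continue.
        string=${string#*"$separator"}
    done
    SPLITFIELD+=("$string")
    SPLITCOUNT=${#SPLITFIELD[@]}
}
```

Unlike word splitting with IFS, this approach works with separators longer than one character, such as -- or a comma followed by a space.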
 


Converting between upper and lower case

mbfl_string_toupper string Function

Outputs string with all the occurrences of lower case ASCII characters (no accents) turned into upper case. 
 

mbfl_string_tolower string Function

Outputs string with all the occurrences of upper case ASCII characters (no accents) turned into lower case. 
 


Matching a string with a class

mbfl-string-is-alpha-char char Function

Returns true if char is in one of the ranges: a-z, A-Z. 
 

mbfl-string-is-digit-char char Function

Returns true if char is in one of the ranges: 0-9. 
 

mbfl-string-is-alnum-char char Function

Returns true if mbfl-string-is-alpha-char || mbfl-string-is-digit-char returns true when acting on char. 
 

mbfl-string-is-noblank-char char Function

Returns true if char is none of the characters: space, \n, \r, \f, \t. char is meant to be the unquoted version of the non-blank characters: the one obtained with: 

$'char'

mbfl-string-is-name-char char Function

Returns true if mbfl-string-is-alnum-char returns true when acting upon char or char is an underscore, _. 
 

mbfl-string-is-alpha string Function

mbfl-string-is-digit string   Function
mbfl-string-is-alnum string   Function
mbfl-string-is-noblank string   Function
mbfl-string-is-name string   Function
Return true if the associated char function returns true for each character in string. As an additional constraint: mbfl-string-is-name returns false if mbfl-string-is-digit returns true when acting upon the first character of string. 
 


Miscellaneous functions

mbfl_string_replace string pattern ?subst? Function

Replaces all the occurrences of pattern in string with subst; prints the result. If not used, subst defaults to the empty string. 
 

mbfl_sprintf varname format … Function

Makes use of printf to format the string format with the additional arguments, then stores the result in varname: if this name is local in the scope of the caller, this has the effect of filling the variable in that scope. 
 

mbfl_string_skip string varname char Function

Skips all the characters in a string equal to char. varname is the name of a variable in the scope of the caller: its value is the offset of the first character to test in string. The offset is incremented until a character different from char is found, then the value of varname is updated to the position of that character. If the initial value of the offset already corresponds to a character different from char, the variable is left untouched. Returns true. 
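Both mbfl_sprintf and mbfl_string_skip hand a result back through a variable whose name is passed as an argument. The sketch below shows that pattern with hypothetical names, using printf -v and indirect expansion; it assumes char is a single non-empty character.

```bash
# Hypothetical sketches (not MBFL code). Local names carry a sketch_
# prefix so they are unlikely to shadow the caller's variable.

# Format the arguments and store the result in the named variable.
sketch_sprintf() {
    local sketch_varname=$1 sketch_format=$2
    shift 2
    printf -v "$sketch_varname" "$sketch_format" "$@"
}

# Advance the offset held in the named variable past every CHAR.
sketch_string_skip() {
    local sketch_string=$1 sketch_varname=$2 sketch_char=$3
    local sketch_pos=${!sketch_varname}
    while [[ ${sketch_string:sketch_pos:1} == "$sketch_char" ]]; do
        sketch_pos=$((sketch_pos + 1))
    done
    printf -v "$sketch_varname" '%d' "$sketch_pos"
    return 0
}
```

With offset=0, the call sketch_string_skip '   abc' offset ' ' leaves offset equal to 3, the index of the a.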
 


Interacting with the user

mbfl_dialog_yes_or_no string ?progname? Function

Prints the question string on the standard output and waits for the user to type yes or no in the standard input. Returns true if the user has typed  yes, false if the user has typed no. 

The optional parameter progname is used as prefix for the prompt; if not given: defaults to the value of script_PROGNAME (Service Variables for details).

mbfl_dialog_ask_password prompt Function

Prints prompt followed by a colon and a space, then reads a password from the terminal. Prints the password. 
 


Manipulating variables


Manipulating arrays

mbfl_variable_find_in_array element Function

Searches the array mbfl_FIELDS for a value equal to element. If it is found: prints the index and returns true; else prints nothing and returns false. 

mbfl_FIELDS must be filled with elements having subsequent indexes starting at zero.

mbfl_variable_element_is_in_array element Function

A wrapper for mbfl_variable_find_in_array that does not print anything. 
 


Manipulating colon variables

mbfl_variable_colon_variable_to_array varname Function

Reads varname's value, a colon separated list of strings, and stores each string in the array mbfl_FIELDS, starting with a base index of zero. 

mbfl_variable_array_to_colon_variable varname   Function
Stores each value in the array mbfl_FIELDS in varname as a colon separated list of strings. 
 

mbfl_variable_colon_variable_drop_duplicate varname Function

Reads varname's value, a colon separated list of strings, and removes duplicates. 
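Colon variables such as PATH can be handled with read and a temporary IFS. The sketch below uses hypothetical function names together with the mbfl_FIELDS array named above; keeping the first occurrence of each duplicate and assuming a single-line value are choices made for the example.

```bash
# Hypothetical sketches (not MBFL code) for colon separated variables.

# Split the value of the named variable into the array mbfl_FIELDS.
sketch_colon_variable_to_array() {
    IFS=: read -r -a mbfl_FIELDS <<< "${!1}"
}

# Rewrite the named variable without repeated entries.
sketch_colon_variable_drop_duplicate() {
    local varname=$1 item result=''
    local -a items
    IFS=: read -r -a items <<< "${!varname}"
    for item in "${items[@]}"; do
        case ":$result:" in
            *":$item:"*) ;;                           # already present
            *) result=${result:+$result:}$item ;;     # first occurrence
        esac
    done
    printf -v "$varname" '%s' "$result"
}
```

Wrapping both the accumulated list and the candidate in colons makes the comparison match whole entries only, so /bin does not match /usr/bin.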
 


Main function

MBFL declares a function to drive the execution of the script; its purpose is to make use of the other modules to reduce the size of scripts depending on MBFL. All the code blocks in the script, with the exception of global variables declaration, should be enclosed in functions.

mbfl_main Function

Must be the last line of code in the script. Does the following. 

Registers the value of the variable script_PROGNAME in the message module using the function mbfl_message_set_progname.

If it exists: invokes the function script_before_parsing_options.

Parses command line options with mbfl_getopts_parse.

If it exists: invokes the function script_after_parsing_options.

Invokes the function whose name is stored in the global variable mbfl_main_SCRIPT_FUNCTION, if it exists, with no arguments; if its return value is non-zero: exits the script with the same code. The default value is main.

Exits the script with the return code of the action function or zero.

mbfl_invoke_script_function funcname Function

If funcname is the name of an existing function: it is invoked with no arguments; the return value is the one of the function. The existence test is performed with: 

test "$(type -t FUNCNAME)" = function
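The existence test can be wrapped in a small helper, sketched below under a hypothetical name. Returning zero when the function does not exist is an assumption.

```bash
# Hypothetical sketch (not MBFL code): call FUNCNAME only if it is a
# shell function; the return value is the one of the function.
sketch_invoke_script_function() {
    local funcname=$1
    if [[ $(type -t "$funcname") == function ]]; then
        "$funcname"
    fi
}
```

This is what makes hooks such as script_before_parsing_options optional: a script defines them only when it needs them.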

mbfl_main_set_main funcname Function

Selects the main function storing funcname into mbfl_main_SCRIPT_FUNCTION. 
 


Building test suites

MBFL comes with a little library of functions that may be used to build test suites; it is aimed at building tests for bash functions/commands/scripts. The ideas at the base of this library are taken from the tcltest package distributed with the TCL core; this package had contributions from the following people/entities: Sun Microsystems, Inc.; Scriptics Corporation; Ajuba Solutions; Don Porter, NIST; probably many many others. The library tries to do as much as possible using functions and aliases, not variables; this is an attempt to let the user redefine functions to his taste.

Testing Intro: A way to organise a test suite.
Testing Config: Configuring the package.
Testing Running: Running tests.
Testing Compare: Validating results by comparing.
Testing Output: Validating results by output.
Testing Messages: Printing messages from test functions.
Testing Files: Handling files in tests.


A way to organise a test suite

A useful way to organise a test suite is to split it into a set of files: one for each module to be tested. The file mbfltest.sh must be sourced at the beginning of each test file. The function dotest should be invoked at the end of each module in the test suite; each module should define functions starting with the same prefix. A module should be stored in a file, and should look like the following:

# mymodule.test --

source mbfltest.sh
source module.sh

function module-featureA-1.1 () { … }
function module-featureA-1.2 () { … }
function module-featureA-2.1 () { … }
function module-featureB-1.1 () { … }
function module-featureB-1.2 () { … }

dotest module-

### end of file

the file should be executed with:

$ bash mymodule.test

To test just „feature A“:

$ TESTMATCH=module-featureA bash mymodule.test

Remember that the source builtin will look for files in the directories selected by the PATH environment variable, so we may want to do:

$ PATH="path/to/modules:${PATH}" \
TESTMATCH=module-featureA bash mymodule.test

It is better to put such stuff in a Makefile, with GNU make:

top_srcdir      = …
builddir        = …
BASHPROG        = bash
MODULES         = moduleA moduleB

testdir         = $(top_srcdir)/tests
test_FILES      = $(foreach f, $(MODULES), $(testdir)/$(f).test)
test_TARGETS    = test-modules

test_ENV        = PATH=$(builddir):$(testdir):$(PATH) TESTMATCH=$(TESTMATCH)
test_CMD        = $(test_ENV) $(BASHPROG)

.PHONY: test-modules

test-modules:
ifneq ($(strip $(test_FILES)),)
	@$(foreach f, $(test_FILES), $(test_CMD) $(f);)
endif


Configuring the package

dotest-set-verbose Function

dotest-unset-verbose   Function
Set or unset verbose execution. If verbose mode is on: some commands output messages on stderr describing what is going on. Examples: files and directories creation/removal. 
 

dotest-option-verbose Function

Returns true if verbose mode is on, false otherwise. 
 

dotest-set-test Function

dotest-unset-test   Function
Set or unset test execution. If test mode is on: external commands (like rm and mkdir) are not executed, the command line is sent to stderr. Test mode is meant to be used to debug the test library functions. 
 

dotest-option-test Function

Returns true if test mode is on, false otherwise. 
 

dotest-set-report-start Function

dotest-unset-report-start   Function
Set or unset printing a message upon starting a function. 
 

dotest-option-report-start Function

Returns true if start function reporting is on; otherwise returns false. 
 

dotest-set-report-success Function

dotest-unset-report-success   Function
Set or unset printing a message when a function execution succeeds. Failed tests always cause a message to be printed. 
 

dotest-option-report-success Function

Returns true if success function reporting is on; otherwise returns false. 
 


Running test functions

dotest pattern Function

Run all the functions matching pattern. Usually pattern is the first part of the name of the functions to be executed; the function names are selected with the following code: 

compgen -A function "$pattern"

There's no constraint on function names, but they must be one-word names. Before running a test function: the current process working directory is saved, and it is restored after the execution is terminated. The return value of the test functions is used as result of the test: true, the test succeeded; false, the test failed. Remembering that the return value of a function is the return value of its last executed command, the functions dotest-equal and dotest-output, and of course the test command, may be used to return the correct value.

Messages are printed before and after the execution of each function, according to the mode selected with: dotest-set-report-success, dotest-set-report-start, ... (Testing Config for details). 

The following environment variables may configure the behaviour of dotest.

TESTMATCH Overrides the value selected with pattern.

TESTSTART If yes: it is equivalent to invoking dotest-set-report-start; if no: it is equivalent to invoking dotest-unset-report-start.

TESTSUCCESS If yes: it is equivalent to invoking dotest-set-report-success; if no: it is equivalent to invoking dotest-unset-report-success.
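The selection and execution loop described for dotest can be sketched as follows. The function name sketch_dotest and the message texts are hypothetical, and only the TESTMATCH variable is honoured; the real function also handles the reporting options.

```bash
# Hypothetical sketch (not the MBFL test library): run every function
# whose name starts with the pattern and report the results.
sketch_dotest() {
    local pattern=${TESTMATCH:-$1} name saved_pwd=$PWD failed=0
    for name in $(compgen -A function -- "$pattern"); do
        if "$name"; then
            printf '%s: success\n' "$name" >&2
        else
            printf '%s: FAILED\n' "$name" >&2
            failed=$((failed + 1))
        fi
        # Restore the working directory after each test function.
        cd "$saved_pwd" || return 1
    done
    # Zero if every selected function returned true, one otherwise.
    return $(( failed > 0 ))
}
```

Because compgen matches by prefix, naming test functions after the module and feature they cover is what makes selection with TESTMATCH work.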


Validating results by comparing

dotest-equal expected got Function

Compares the two parameters and returns true if they are equal; returns false otherwise. In the latter case prints a message showing the expected value and the wrong one. Must be used as last command in a function, so that its return value is equal to that of the function. 
Example: 

function my-func () {
  echo $(($1 + $2))
}

function mytest-1.1 () {
  dotest-equal 5 `my-func 2 3`
}

dotest mytest-

Another example:

function my-func () {
  echo $(($1 + $2))
}

function mytest-1.1 () {
  dotest-equal 5 `my-func 2 3` && \
    dotest-equal 5 `my-func 1 4` && \
    dotest-equal 5 `my-func 3 2`
}

dotest mytest-
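To make the contract of dotest-equal concrete, here is a minimal sketch of a comparison helper with the same calling convention. It is not the library's implementation: the name sketch_equal, the underscore-style function names, and the message format are assumptions made for illustration only.

```shell
# sketch_equal EXPECTED GOT
# Hypothetical stand-in for dotest-equal: returns true when both
# arguments are equal, otherwise prints both values and returns false.
sketch_equal() {
  if [ "$1" = "$2" ]; then
    return 0
  fi
  echo "expected: '$1'  got: '$2'"
  return 1
}

# Function under test: echoes the sum of its two arguments.
my_func() {
  echo $(($1 + $2))
}

# The comparison is the last command, so its status becomes the
# return value of the test function.
mytest_sum() {
  sketch_equal 5 "`my_func 2 3`"
}
```

Calling mytest_sum returns true here; changing the expected value to 6 would print both values and return false.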


Validating results by output

dotest-output ?string? Function

Reads all the available lines from stdin accumulating them into a local variable, separated by \n; then compares the input with string, or the empty string if string is not present, and returns true if they are equal, false otherwise. 
Example of test for a function that echoes its three parameters: 

function my-lib-function () {
  echo $1 $2 $3
}

function mytest-1.1 () {
  my-lib-function a b c | dotest-output 'a b c'
}

dotest mytest

Example of test for a function that is supposed to print nothing:

function my-lib-function () {
  test "$1" != "$2" && echo error
}

function mytest-1.1 () {
  my-lib-function a a | dotest-output
}

dotest mytest
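The behaviour described for dotest-output (read every line from stdin, join the lines with newlines, compare with the expected string) can be sketched as follows. This is an illustration under assumptions, not the library's code; sketch_output is a hypothetical name.

```shell
# sketch_output ?EXPECTED?
# Hypothetical stand-in for dotest-output: accumulates stdin into a
# variable, one newline between lines, then compares it with the
# expected string (empty when no argument is given).
sketch_output() {
  expected=${1-}
  got=
  first=yes
  while IFS= read -r line; do
    if [ "$first" = yes ]; then
      got=$line
      first=no
    else
      got="$got
$line"
    fi
  done
  [ "$got" = "$expected" ]
}
```

For example, `echo a b c | sketch_output 'a b c'` returns true, and a command that prints nothing passes when sketch_output is called without arguments.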

Validating input

Here is a small script that asks for a first name then a second name:

$ pg func2
#!/bin/sh
# func2
echo -n "What is your first name :"
read F_NAME
echo -n "What is your surname :"
read S_NAME

The task is to make sure that the characters entered in both variables contain letters only. To do this without functions would duplicate a lot of code. Using a function cuts this duplication down. To test for characters only, we can use awk. Here's the function to test if we only get upper or lower case characters.

char_name()
{
  # char_name
  # to call: char_name string
  # assign the argument across to new variable
  _LETTERS_ONLY=$1
  # use awk to test for characters only !
  _LETTERS_ONLY=`echo $1|awk '{if($0~/[^a-zA-Z]/) print "1"}'`
  if [ "$_LETTERS_ONLY" != "" ]
  then
    # oops errors
    return 1
  else
    # contains only chars
    return 0
  fi
}

We first assign the $1 variable to a more meaningful name. Awk is then used to test if the whole record passed contains only characters. The output of this command, which is 1 for non-letters and null for OK, is held in the variable _LETTERS_ONLY. A test on the variable is then carried out. If it holds any value then it's an error, but if it holds no value then it's OK. A return code is then executed based on this test. Using the return code enables the script to look cleaner when the test is done on the function on the calling part of the script. To test the outcome of the function we can use this format of the if statement if we wanted:

if char_name $F_NAME; then
  echo "OK"
else
  echo "ERRORS"
fi
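The same letters-only test can also be written without awk, using a shell case pattern. This is a sketch rather than the author's method; the name letters_only is hypothetical, and unlike char_name it also rejects an empty string, which is an assumption about the desired behaviour.

```shell
# letters_only STRING
# Returns 0 when STRING is non-empty and contains only a-z or A-Z.
letters_only() {
  case $1 in
    '' | *[!a-zA-Z]*)
      # empty, or at least one character that is not a letter
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

It is used in the same way as char_name, for example `if letters_only "$F_NAME"; then echo "OK"; fi`.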

If there is an error we can create another function to echo the error out to the screen:

name_error()
# name_error
# display an error message
{
  echo " $@ contains errors, it must contain only letters"
}

The function name_error is used to echo an error message for any invalid entry. Using the special variable $@ allows all arguments to be echoed. In this case it's the value of either F_NAME or S_NAME. Here's what the finished script now looks like, using the functions:

$ pg func2
#!/bin/sh
char_name()
# char_name
# to call: char_name string
# check if $1 does indeed contain only characters a-z,A-Z
{
  # assign the argument across to new variable
  _LETTERS_ONLY=$1
  _LETTERS_ONLY=`echo $1|awk '{if($0~/[^a-zA-Z]/) print "1"}'`
  if [ "$_LETTERS_ONLY" != "" ]
  then
    # oops errors
    return 1
  else
    # contains only chars
    return 0
  fi
}

name_error()
# display an error message
{
  echo " $@ contains errors, it must contain only letters"
}

while :
do
  echo -n "What is your first name :"
  read F_NAME
  if char_name $F_NAME
  then
    # all ok breakout
    break
  else
    name_error $F_NAME
  fi
done

while :
do
  echo -n "What is your surname :"
  read S_NAME
  if char_name $S_NAME
  then
    # all ok breakout
    break
  else
    name_error $S_NAME
  fi
done

Notice a while loop for each of the inputs; this makes sure that we will continue prompting until a correct value is input, then we break out of the loop. Of course, on a working script, an option would be given for the user to quit this cycle, and proper cursor controls would be used, as would checking for zero length fields. Here's what the output looks like when the script is run:

$ func2
What is your first name :Davi2d
 Davi2d contains errors, it must contain only letters
What is your first name :David
What is your surname :Tansley1
 Tansley1 contains errors, it must contain only letters
What is your surname :Tansley

Reading a single character

When navigating menus, one of the most frustrating tasks is having to keep hitting the return key after every selection, or when a 'press any key to continue' prompt appears. A command that can help us with not having to hit return to send a key sequence is the dd command. The dd command is used mostly for conversions and interrogating problems with data on tapes or normal tape archiving tasks, but it can also be used to create fixed length files. Here a 1-megabyte file is created with the filename myfile.

dd if=/dev/zero of=myfile count=512 bs=2048

The dd command can interpret what is coming in from your keyboard, and can be used to accept so many characters. In this case we only want one character. The command dd needs to chop off the new line; this control character gets attached when the user hits return. dd will also send out only one character. Before any of that can happen the terminal must first be set into raw mode using the stty command. We save the settings before dd is invoked and then restore them after dd has finished. Here's the function:

read_a_char()
# read_a_char
{
  # save the settings
  SAVEDSTTY=`stty -g`
  # set terminal raw please
  stty cbreak
  # read and output only one character
  dd if=/dev/tty bs=1 count=1 2> /dev/null
  # restore terminal and restore stty
  stty -cbreak
  stty $SAVEDSTTY
}

To call the function and return the character typed in, use command substitution. Here's an example.

echo -n "Hit Any Key To Continue"
character=`read_a_char`
echo " In case you are wondering you pressed $character"
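The one-byte read at the heart of read_a_char can be tried without touching any terminal settings by feeding dd from a pipe instead of /dev/tty. The helper name first_byte is hypothetical; the dd options are the ones used in the function above.

```shell
# first_byte: copy exactly one byte from stdin to stdout.
first_byte() {
  dd bs=1 count=1 2> /dev/null
}

# Simulate a user typing "yes" followed by return.
character=`printf 'yes\n' | first_byte`
echo "You pressed $character"   # prints: You pressed y
```

Only the first byte is passed on; the rest of the input, including the newline, is discarded when the pipe closes.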

Testing for the presence of a directory

Testing for the presence of directories is a fairly common task when copying files around. This function will test the filename passed to the function to see if it is a directory. Because we are using the return command with a succeed or failure value, the if statement becomes the most obvious choice in testing the result. Here's the function.

isdir()
{
  # is_it_a_directory
  if [ $# -lt 1 ]; then
    echo "isdir needs an argument"
    return 1
  fi
  # is it a directory ?
  _DIRECTORY_NAME=$1
  if [ ! -d $_DIRECTORY_NAME ]; then
    # no it is not
    return 1
  else
    # yes it is
    return 0
  fi
}

Getting information from a login ID

When you are on a big system, and you want to contact one of the users who is logged in, don't you just hate it when you have forgotten the person's full name? Many a time I have seen users locking up a process, but their user ID means nothing to me, so I have to grep the passwd file to get their full name. Then I can get on with the nice part where I can ring them up to give the user a telling off. Here's a function that can save you from grepping the /etc/passwd file to see the user's full name. On my system the user's full name is kept in field 5 of the passwd file; yours might be different, so you will have to change the field number to suit your passwd file. The function is passed a user ID or many IDs, and the function just greps the passwd file. Here's the function:

whois()
# whois
# to call: whois userid
{
  # check we have the right params
  if [ $# -lt 1 ]; then
    echo "whois : need user id's please"
    return 1
  fi

  for loop
  do
    # match the login name field only, then print field 5 (the full name)
    _USER_NAME=`grep "^$loop:" /etc/passwd | awk -F: '{print $5}'`
    if [ "$_USER_NAME" = "" ]; then
      echo "whois: Sorry cannot find $loop"
    else
      echo "$loop is $_USER_NAME"
    fi
  done
}

The whois function can be called like this:

$ whois dave peters superman
dave is David Tansley - admin accts
peters is Peter Stromer - customer services
whois: Sorry cannot find superman
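The lookup can also be done by awk alone, comparing the login name against field 1 exactly so that a user ID never matches part of another entry. This is a sketch, not the author's function: lookup_name is a hypothetical name, and the file to search is passed as an argument so the sketch can be tried on a sample file in passwd format.

```shell
# lookup_name PASSWD_FILE LOGIN
# Prints field 5 (the full name) of the entry whose first field is
# exactly LOGIN; returns non-zero when no such entry exists.
lookup_name() {
  awk -F: -v user="$2" '
    $1 == user { print $5; found = 1 }
    END        { exit !found }
  ' "$1"
}
```

On a real system it would be called as `lookup_name /etc/passwd dave`.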

Line numbering a text file

When you are in vi you can number your lines, which is great for debugging, but if you want to print out some files with line numbers then you have to use the command nl. Here is a function that does what nl does best: numbering the lines in a file. The original file is not overwritten.

number_file()
# number_file
# to call: number_file filename
{
  _FILENAME=$1
  # check we have the right params
  if [ $# -ne 1 ]; then
    echo "number_file: I need a filename to number"
    return 1
  fi

  loop=1
  while read LINE
  do
    echo "$loop: $LINE"
    loop=`expr $loop + 1`
  done < $_FILENAME
}

String to upper case

You may need to convert text from lower to upper case sometimes, for example to create directories in a filesystem with upper case only, or to input data into a field you are validating that requires the text to be in upper case. Here is a function that will do it for you. No points for guessing it's tr.

str_to_upper ()
# str_to_upper
# to call: str_to_upper $1
{
  _STR=$1
  # check we have the right params
  if [ $# -ne 1 ]; then
    echo "str_to_upper: I need a string to convert please"
    return 1
  fi
  echo $@ |tr '[a-z]' '[A-Z]'
}

The variable UPPER holds the newly returned upper case string. Notice again the use of the special parameter $@ to pass all arguments. The str_to_upper function can be called in two ways. You can either supply the string in a script like this:

UPPER=`str_to_upper "documents.live"`
echo $UPPER

or supply an argument to the function instead of a string, like this:

UPPER=`str_to_upper $1`
echo $UPPER

Both of these examples use substitution to get the returned function results.

is_upper

The function str_to_upper does a case conversion, but sometimes you only need to know if a string is upper case before continuing with some processing, perhaps to write a field of text to a file. The is_upper function does just that. Using an if statement in the script will determine if the string passed is indeed upper case. Here is the function.

is_upper()
# is_upper
# to call: is_upper $1
{
  # check we have the right params
  if [ $# -ne 1 ]; then
    echo "is_upper: I need a string to test OK"
    return 1
  fi
  # use awk to check we have only upper case
  _IS_UPPER=`echo $1|awk '{if($0~/[^A-Z]/) print "1"}'`
  if [ "$_IS_UPPER" != "" ]
  then
    # no, they are not all upper case
    return 1
  else
    # yes all upper case
    return 0
  fi
}

To call the function is_upper simply send it a string argument. Here's how it could be called.

echo -n "Enter the filename :"
read FILENAME
if is_upper $FILENAME; then
  echo "Great it's upper case"
  # let's create a file maybe ??
else
  echo "Sorry it's not upper case"
  # shall we convert it anyway using str_to_upper ???
fi

To test if a string is indeed lower case, just replace the existing awk statement with this one inside the function is_upper and call it is_lower.

_IS_LOWER=`echo $1|awk '{if($0~/[^a-z]/) print "1"}'`

String to lower case

Now I've done it. Because I have shown you the str_to_upper, I'd better show you its sister function str_to_lower. No guesses here please on how this one works.

str_to_lower ()
# str_to_lower
# to call: str_to_lower $1
{
  # check we have the right params
  if [ $# -ne 1 ]; then
    echo "str_to_lower: I need a string to convert please"
    return 1
  fi
  echo $@ |tr '[A-Z]' '[a-z]'
}

The variable LOWER holds the newly returned lower case string. Notice again the use of the special parameter $@ to pass all arguments. The str_to_lower function can be called in two ways. You can either supply the string in a script like this:

LOWER=`str_to_lower "documents.live"`
echo $LOWER

or supply an argument to the function instead of a string, like this:

LOWER=`str_to_lower $1`
echo $LOWER
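As a variation on the two tr-based functions, tr also accepts the POSIX character classes [:lower:] and [:upper:], which follow the current locale instead of the literal ranges a-z and A-Z. The sketch below uses hypothetical names (to_upper_sketch, to_lower_sketch) and is not a replacement for the functions above.

```shell
# Convert all arguments to upper case using POSIX character classes.
to_upper_sketch() {
  printf '%s\n' "$*" | tr '[:lower:]' '[:upper:]'
}

# Convert all arguments to lower case.
to_lower_sketch() {
  printf '%s\n' "$*" | tr '[:upper:]' '[:lower:]'
}

UPPER=`to_upper_sketch "documents.live"`
echo $UPPER   # prints: DOCUMENTS.LIVE
```

The functions are called through command substitution in the same way as str_to_upper and str_to_lower.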

Length of string

Validating input into a field is a common task in scripts. Validating can mean many things, whether it's numeric, character only, formats, or the length of the field. Suppose you had a script where the user enters data into a name field via an interactive screen. You will want to check that the field contains only a certain number of characters, say 20 for a person's name. It's easy for the user to input up to 50 characters into a field. This is what this next function will check. You pass the function two parameters, the actual string and the maximum length the string should be. Here's the function:

check_length()
# check_length
# to call: check_length string max_length_of_string
{
  _STR=$1
  _MAX=$2
  # check we have the right params
  if [ $# -ne 2 ]; then
    echo "check_length: I need a string and max length the string should be"
    return 1
  fi
  # check the length of the string
  _LENGTH=`echo $_STR |awk '{print length($0)}'`
  if [ "$_LENGTH" -gt "$_MAX" ]; then
    # length of string is too big
    return 1
  else
    # string is ok in length
    return 0
  fi
}

You could call the function check_length like this:

$ pg test_name
#!/bin/sh
# test_name
while :
do
  echo -n "Enter your FIRST name :"
  read NAME
  if check_length $NAME 10
  then
    break
    # do nothing fall through condition all is ok
  else
    echo "The name field is too long 10 characters max"
  fi
done

The loop will continue until the data input into the variable NAME is no longer than the maximum number of characters permitted, which in this case is ten; the break command then lets it drop out of the loop. Using the above piece of code this is how the output could look.

$ test_name
Enter your FIRST name :Pertererrrrrrrrrrrrrrr
The name field is too long 10 characters max
Enter your FIRST name :Peter

You could use the wc command to get the length of the string, but beware: there is a glitch when using wc on input taken from the keyboard. wc -c counts every character it is sent, including the newline that echo appends and any spaces that remain in the string, so it can report a length larger than the name itself. Awk's length($0) does not count the trailing newline. Here's an example of the wc glitch (or maybe it's a feature):

echo -n "name :"
read NAME
echo $NAME | wc -c

Running the above script segment:

name :Peter
     6
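The length operator ${#varname} described at the top of this page avoids both awk and wc, and does not count a trailing newline because none is added. Below is a minimal sketch of a length check built on it; the name length_ok is hypothetical.

```shell
# length_ok STRING MAX
# Returns 0 when STRING is at most MAX characters long.
length_ok() {
  [ "${#1}" -le "$2" ]
}

NAME="Peter"
echo "${#NAME}"   # prints: 5
```

In the test_name loop it could take the place of check_length, for example `if length_ok "$NAME" 10; then break; fi`.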

chop

The chop function chops off characters from the beginning of a string. The function chop is passed a string; you specify how many characters to chop off the string starting from the first character. Suppose you had the string MYDOCUMENT.DOC and you wanted the MYDOCUMENT part chopped, so that the function returned only .DOC. You would pass the following to the chop function:

MYDOCUMENT.DOC 10

Here's the function chop:

chop()
# chop
# to call: chop string how_many_chars_to_chop
{
  _STR=$1
  _CHOP=$2
  # check we have the right params
  if [ $# -ne 2 ]; then
    echo "chop: I need a string and how many characters to chop"
    return 1
  fi
  # check the length of the string first
  # we can't chop more than what's in the string !!
  _LENGTH=`echo $_STR |awk '{print length($0)}'`
  if [ "$_LENGTH" -lt "$_CHOP" ]; then
    echo "Sorry you have asked to chop more characters than there are in the string"
    return 1
  fi
  # awk's substr counts from 1, so we need to increment by one
  # to reflect when the user says (ie) 2 chars to be chopped it will be 2 chars off
  # and not 1
  _CHOP=`expr $_CHOP + 1`
  echo $_STR |awk '{print substr($1,'$_CHOP')}'
}

The newly chopped string is held in the variable CHOPPED. To call the function chop, you could use:

CHOPPED=`chop "Honeysuckle" 5`
echo $CHOPPED
suckle

or you could call this way:

echo -n "Enter the Filename :"
read FILENAME
CHOPPED=`chop $FILENAME 1`
# the first character would be chopped off !
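The same result can be had from cut, which selects characters by position counted from 1, so chopping N characters means printing from position N+1 onwards. This is a sketch under that assumption, not the author's function; chop_sketch is a hypothetical name and it performs none of the argument checks that chop does.

```shell
# chop_sketch STRING COUNT
# Prints STRING without its first COUNT characters.
chop_sketch() {
  printf '%s\n' "$1" | cut -c"$(($2 + 1))-"
}

CHOPPED=`chop_sketch "Honeysuckle" 5`
echo $CHOPPED   # prints: suckle
```

With MYDOCUMENT.DOC and 10 as arguments it prints .DOC, as in the example above.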

Months

When generating reports or creating screen displays, it is sometimes convenient to the programmer to have a quick way of displaying the full month. This function, called months, will accept the month number or month abbreviation and then return the full month. For example, passing 3 or 03 will return March. Here's the function.

months()
{
  # months
  _MONTH=$1
  # check we have the right params
  if [ $# -ne 1 ]; then
    echo "months: I need a number 1 to 12 "
    return 1
  fi

  case $_MONTH in
    1|01|Jan)_FULL="January" ;;
    2|02|Feb)_FULL="February" ;;
    3|03|Mar)_FULL="March";;
    4|04|Apr)_FULL="April";;
    5|05|May)_FULL="May";;
    6|06|Jun)_FULL="June";;
    7|07|Jul)_FULL="July";;
    8|08|Aug)_FULL="August";;
    9|09|Sep|Sept)_FULL="September";;
    10|Oct)_FULL="October";;
    11|Nov)_FULL="November";;
    12|Dec)_FULL="December";;
    *) echo "months: Unknown month"
       return 1
       ;;
  esac
  echo $_FULL
}

To call the function months you can use either of the following methods.

months 04

The above method will display the month April; or from a script:

MY_MONTH=`months 06`
echo "Generating the Report for Month End $MY_MONTH"
...

which would output the month June.

Calling functions inside a script

To use a function in a script, create the function, and make sure it is above the code that calls it. Here's a script that uses a couple of functions. We have seen the script before; it tests to see if a directory exists.

$ pg direc_check
#!/bin/sh
# function file
is_it_a_directory()
{
  # is_it_a_directory
  # to call: is_it_a_directory directory_name
  _DIRECTORY_NAME=$1
  if [ $# -lt 1 ]; then
    echo "is_it_a_directory: I need a directory name to check"
    return 1
  fi
  # is it a directory ?
  if [ ! -d $_DIRECTORY_NAME ]; then
    return 1
  else
    return 0
  fi
}
#--------------------------------------------------
error_msg()
{
  # error_msg
  # beeps; display message; beeps again!
  echo -e "\007"
  echo $@
  echo -e "\007"
  return 0
}

### END OF FUNCTIONS

echo -n "enter destination directory :"
read DIREC
if is_it_a_directory $DIREC
then :
else
  error_msg "$DIREC does not exist...creating it now"
  mkdir $DIREC > /dev/null 2>&1
  if [ $? != 0 ]
  then
    error_msg "Could not create directory:: check it out!"
    exit 1
  else :
  fi
fi # not a directory
echo "extracting files..."

In the above script two functions are declared at the top of the script and called from the main part of the script. All functions should go at the top of the script before any of the main scripting blocks begin. Notice the error message statement; the function error_msg is used, and all arguments passed to the function error_msg are just echoed out with a couple of bleeps.

Calling functions from a function file

We have already seen how to call functions from the command line; these types of functions are generally used for system reporting utilities. Let's use the above function again, but this time put it in a function file. We will call it functions.sh, the sh meaning shell scripts.

$ pg functions.sh
#!/bin/sh
# functions.sh
# main script functions
is_it_a_directory()
{
  # is_it_a_directory
  # to call: is_it_a_directory directory_name
  #
  if [ $# -lt 1 ]; then
    echo "is_it_a_directory: I need a directory name to check"
    return 1
  fi
  # is it a directory ?
  DIRECTORY_NAME=$1
  if [ ! -d $DIRECTORY_NAME ]; then
    return 1
  else
    return 0
  fi
}

#---------------------------------------------

error_msg()
{
  echo -e "\007"
  echo $@
  echo -e "\007"
  return 0
}

Now let's create the script that will use functions in the file functions.sh. We can then use these functions. Notice the functions file is sourced with the command format:

. /<path to file>

A subshell will not be created using this method; all functions stay in the current shell.
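A small self-contained demonstration of that point: the sketch below writes a one-function file to a temporary location, sources it with the dot command, and then calls the function from the current shell. The file contents and the function name greet are made up for the example.

```shell
# Write a tiny function file to a temporary location.
funcfile=`mktemp`
cat > "$funcfile" <<'EOF'
greet() {
  echo "hello $1"
}
EOF

# Source it: the definition is loaded into the current shell,
# so greet can be called directly afterwards.
. "$funcfile"
greet dave   # prints: hello dave

rm -f "$funcfile"
```

Had the file been executed as a separate script instead of sourced, greet would have been defined in a child process and lost when that process ended.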

$ pg direc_check
#!/bin/sh
# direc_check
# source the function file functions.sh
# that's a <dot><space><forward slash>
. /home/dave/bin/functions.sh

# now we can use the function(s)

echo -n "enter destination directory :"
read DIREC
if is_it_a_directory $DIREC
then :
else
  error_msg "$DIREC does not exist...creating it now"
  mkdir $DIREC > /dev/null 2>&1
  if [ $? != 0 ]
  then
    error_msg "Could not create directory:: check it out!"
    exit 1
  else :
  fi
fi # not a directory
echo "extracting files..."

When we run the above script we get the same output as if we had the function inside our script:

$ direc_check
enter destination directory :AUDIT
AUDIT does not exist...creating it now
extracting files...

Sourcing files is not only for functions

A sourced file does not have to contain only functions; it can also contain global variables that make up a configuration file. Suppose you had a couple of backup scripts that archived different parts of a system. It would be a good idea to share one common configuration file. All you need to do is to create your variables inside a file; then when one of the backup scripts kicks off it can load these variables in to see if the user wants to change any of the defaults before the archive actually begins. It may be the case that you want the archive to go to a different medium. Of course this approach can be used by any scripts that share a common configuration to carry out a process. Here's an example. The following configuration contains default environments that are shared by a few backup scripts I use. Here's the file.

$ pg backfunc
#!/bin/sh
# name: backfunc
# config file that holds the defaults for the archive systems
_CODE="comet"
_FULLBACKUP="yes"
_LOGFILE="/logs/backup/"
_DEVICE="/dev/rmt/0n"
_INFORM="yes"
_PRINT_STATS="yes"

The descriptions are clear. The first field _CODE holds a code word. To be able to view this and thus change the values the user must first enter a code that matches up with the value of _CODE, which is „comet“. Here's the script that prompts for a password then displays the default configuration:

$ pg readfunc
#!/bin/sh
# readfunc

if [ -r backfunc ]; then
  # source the file
  . ./backfunc
else
  echo "`basename $0` cannot locate backfunc file"
fi

echo -n "Enter the code name :"
read CODE
# does the code entered match the code from backfunc file ???
if [ "${CODE}" != "${_CODE}" ]; then
  echo "Wrong code...exiting..will use defaults"
  exit 1
fi

echo " The environment config file reports"
echo "Full Backup Required : $_FULLBACKUP"
echo "The Logfile Is : $_LOGFILE"
echo "The Device To Backup To is : $_DEVICE"
echo "You Are To Be Informed by Mail : $_INFORM"
echo "A Statistic Report To Be Printed: $_PRINT_STATS"

When the script is run, you are prompted for the code. If the code matches, you can view the defaults. A fully working script would then let the user change the defaults.

$ readfunc
Enter the code name :comet

The environment config file reports
Full Backup Required            : yes
The Logfile Is                  : /logs/backup/
The Device To Backup To is      : /dev/rmt/0n
You Are To Be Informed by Mail  : yes
A Statistic Report To Be Printed: yes
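One way a backup script might combine a sourced configuration file with built-in fallbacks is the ${VAR:-default} expansion: values set by the file win, and anything the file leaves unset falls back to a default. The sketch below is an illustration only; load_config is a hypothetical name, and the fallback values are simply the ones shown in backfunc.

```shell
# load_config FILE
# Sources FILE when it is readable, then fills in any setting the
# file did not provide. FILE should contain a slash (for example
# ./backfunc) so that the dot command does not search PATH for it.
load_config() {
  if [ -r "$1" ]; then
    . "$1"
  fi
  _DEVICE=${_DEVICE:-/dev/rmt/0n}
  _INFORM=${_INFORM:-yes}
}
```

A script would call `load_config ./backfunc` once near the top and then use $_DEVICE and $_INFORM as usual.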

Using functions will greatly reduce the time you spend scripting. Creating usable and reusable functions makes good sense; it also makes your main scripts less maintenance-prone. When you have got a set of functions you like, put them in a functions file, then other scripts can use the functions as well.

Last modified: August 15, 2009

 
bash/string_operations.txt · Last modified: 2015/06/08 18:22 (external edit)