An Introduction to Tcl

All CDL scripts are implemented as Tcl scripts, and are read in by running the data through a standard Tcl interpreter, extended with a small number of additional commands such as cdl_option and cdl_component. Often it is not necessary to know the full details of Tcl syntax. Instead it is possible to copy an existing script, perform some copy and paste operations, and make appropriate changes to names and to various properties. However there are also cases where an understanding of Tcl syntax is very desirable, for example:

cdl_option CYGDAT_UITRON_MEMPOOLFIXED_EXTERNS {
    display       "Externs for initialization"
    flavor        data
    default_value {"static char fpool1[ 2000 ], \\\n\
	                        fpool2[ 2000 ], \\\n\
	                        fpool3[ 2000 ];"}
    …
}

This causes the cdl_option command to be executed, which in turn evaluates its body in a recursive invocation of the Tcl interpreter. When the default_value property is encountered, the interpreter processes the braces around the value part, which stops it from doing any further processing of the braced contents (the one exception is a backslash at the end of a line, which is still handled specially). In particular, the braces prevent command substitution for [ 2000 ]. A single argument is passed to the default_value command, which expects a CDL expression, so the expression parsing code is passed the following:

"static char fpool1[ 2000 ], \\\n fpool2[ 2000 ], \\\n fpool3[ 2000 ];"

The CDL expression parsing code will treat this as a simple string constant, as opposed to a more complicated expression involving other options and various operators. The string parsing code will perform the usual backslash substitutions so the actual default value will be:

static char fpool1[ 2000 ], \
 fpool2[ 2000 ], \
 fpool3[ 2000 ];

If the user does not modify the option's value then the following will be generated in the appropriate configuration header file:

#define CYGDAT_UITRON_MEMPOOLFIXED_EXTERNS static char fpool1[ 2000 ], \
 fpool2[ 2000 ], \
 fpool3[ 2000 ];
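
The two processing stages can be reproduced approximately in a plain Tcl shell. The following is only a sketch: it shortens the value to two array declarations, and it uses Tcl's subst command as a stand-in for the CDL string parser, which is an assumption made for illustration since the real parser is separate code and may differ in detail. The variable names raw, inner and final are hypothetical.

```tcl
# Stage 1: the Tcl parser. Braces suppress all substitution except
# backslash-newline, which (together with the leading white space on
# the next line) becomes a single space.
set raw {"static char fpool1[ 2000 ], \\\n\
         fpool2[ 2000 ];"}

# Stage 2: stand-in for the CDL string parser (an assumption, not the
# real implementation). Strip the surrounding quotes, then apply
# backslash substitution only: a doubled backslash becomes one
# backslash and backslash-n becomes a newline.
set inner [string range $raw 1 end-1]
set final [subst -nocommands -novariables $inner]

puts $raw
puts $final
```

The first puts shows the text handed to the expression parser, still containing the quotes and the literal backslash sequences. The second shows a backslash at the end of the first line followed by the second declaration on a new line, which is the shape needed for a multi-line #define.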

Getting this desired result usually requires an understanding of both Tcl syntax and CDL expression syntax. Sometimes it is possible to substitute a certain amount of trial and error instead, but this may prove frustrating. It is also worth pointing out that many CDL scripts do not involve this level of complexity. On the other hand, some of the more advanced features of the CDL language involve fragments of Tcl code, for example the define_proc property. To use these, component writers will need to know the Tcl language as a whole, not just its syntax.

Although the current example may seem to suggest that Tcl is rather complicated, it is actually a very simple yet powerful scripting language: the syntax is defined by just eleven rules. On occasion this simplicity means that Tcl's behavior is subtly different from other languages, which can confuse newcomers.

When the Tcl interpreter is passed some data such as puts Hello, it splits this data into a command and its arguments. The command is terminated by a newline or by a semicolon, unless one of the quoting mechanisms is used. The command and each of its arguments are separated by white space. Consider the following example:

puts Hello
set x 42

This results in two separate commands being executed. The first command is puts and is passed a single argument, Hello. The second command is set and is passed two arguments, x and 42. The intervening newline character serves to terminate the first command. A semicolon could be used as the separator instead:

puts Hello;set x 42

Any white space surrounding the semicolon is simply ignored because it does not separate arguments.

Now consider the following:

set x Hello world

This is not valid Tcl. It is an attempt to invoke the set command with three arguments: x, Hello, and world. The set command takes at most two arguments, a variable name and a value, so it is necessary to combine the data into a single argument by quoting:

set x "Hello world"

When the Tcl interpreter encounters the first quote character it treats all subsequent data up to but not including the closing quote as part of the current argument. The quote marks are removed by the interpreter, so the second argument passed to the set command is just Hello world without the quote characters. This can be significant in the context of CDL scripts. For example:

cdl_option CYG_HAL_STARTUP {
    …
    default_value "RAM"
}

The Tcl interpreter strips off the quote marks so the CDL expression parsing code sees RAM instead of "RAM". It will treat this as a reference to some unknown option RAM rather than as a string constant, and the expression evaluation code will use a value of 0 when it encounters an option that is not currently loaded. Therefore the option CYG_HAL_STARTUP ends up with a default value of 0. Either braces or backslashes should be used to avoid this, for example default_value { "RAM" }.

Note: There are long-term plans to implement some sort of CDL validation utility cdllint which could catch common errors like this one.
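
The argument splitting and quote stripping described above can be observed directly in a Tcl shell. The following is a minimal sketch; count_args is a hypothetical helper written for this illustration and is not part of Tcl or CDL.

```tcl
# Hypothetical helper that reports how many arguments it received.
proc count_args {args} {
    return [llength $args]
}

set unquoted [count_args Hello world]      ;# two separate arguments
set quoted   [count_args "Hello world"]    ;# one argument
set a 1; set b 2                           ;# two commands on one line

# The quote marks are consumed by the interpreter, so the command
# sees the three characters RAM, not five.
set bare_length [string length "RAM"]

puts "$unquoted $quoted $bare_length"
```

The counts show that white space separates arguments unless quoting is used, and the length of 3 confirms that a command given "RAM" never sees the quote characters.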

A quoted argument continues until the closing quote character is encountered, which means that it can span multiple lines. Newline or semicolon characters do not terminate the current command in such cases. description properties usually make use of this:

cdl_package CYGPKG_ERROR {
    description   "
        This package contains the common list of error and
        status codes. It is held centrally to allow
        packages to interchange error codes and status
        codes in a common way, rather than each package
        having its own conventions for error/status
        reporting. The error codes are modelled on the
        POSIX style naming e.g. EINVAL etc. This package
        also provides the standard strerror() function to
        convert error codes to textual representation."
    …
}

The Tcl interpreter supports much the same forms of backslash substitution as other common programming languages. Some backslash sequences such as \n will be replaced by the appropriate character. The sequence \\ will be replaced by a single backslash. A backslash at the very end of a line will cause that backslash, the newline character, and any white space at the start of the next line to be replaced by a single space. Hence the following two Tcl commands are equivalent:

puts  "Hello\nworld\n"
puts \
"Hello
world
"

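The same behavior can be checked by comparing strings rather than printing them. This is a small sketch with hypothetical variable names.

```tcl
# Backslash sequences inside quotes are replaced by the interpreter.
set escaped "Hello\nworld"
set literal "Hello
world"

# Backslash-newline plus the leading white space on the next line
# collapses to one space.
set joined "one\
            two"

puts [string equal $escaped $literal]
puts $joined
```

The comparison succeeds because \n and a real newline produce the same character, and joined holds the two words separated by a single space regardless of how far the continuation line is indented.
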
If a description string needs to contain quote marks or other special characters then backslash escapes can be used. In addition to quote and backslash characters, the Tcl interpreter treats square brackets, the $ character, and braces specially. Square brackets are used for command substitution, for example:

puts "The answer is [expr 6 * 9]"

When the Tcl interpreter encounters the square brackets it will treat the contents as another command that should be executed first, and the result of executing that is used when continuing to process the script. In this case the Tcl interpreter will execute the command expr 6 * 9, yielding a result of 42 [1], and then the Tcl interpreter will execute puts "The answer is 42". It should be noted that the interpreter performs only one level of substitution: if the result of performing command substitution contains further special characters such as square brackets then these will not be treated specially.
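
The single level of substitution can be seen in a short sketch. The variable names are hypothetical, and a different expression is used so that the arithmetic is not in dispute.

```tcl
# Braces keep the brackets literal, so inner holds the text [expr 2 + 3].
set inner {[expr 2 + 3]}

# Command substitution runs "set inner" and inserts its result as-is.
# The brackets in that result are not evaluated a second time.
set once "value: [set inner]"

# Substituting the expr command directly does evaluate it.
set direct "value: [expr 2 + 3]"

puts $once
puts $direct
```

Here once still contains the bracketed text, while direct contains the computed number.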

Command substitution will not prove useful for many CDL scripts, except for e.g. a define_proc property which involves a fragment of Tcl code. Potentially there are some interesting uses, for example to internationalize display strings. However care does have to be taken to avoid unexpected command substitution, for example if an option description involves square brackets then typically these would require backslash-escapes.

The $ character is used in Tcl scripts to perform variable substitution:

set x [expr 6 * 9]
puts "The answer is $x"

Variable substitution, like command substitution, is unlikely to prove useful for many CDL scripts except in the context of Tcl fragments. If it is necessary to have a $ character then a backslash escape may have to be used.
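
As a brief sketch of the backslash escape mentioned above, using hypothetical variable names:

```tcl
set price 10

# Plain variable substitution.
set substituted "Cost: $price"

# A backslash makes the $ an ordinary character.
set escaped "Variable name: \$price"

# An escaped $ followed by a substituted variable.
set mixed "Cost: \$$price"

puts $substituted
puts $escaped
puts $mixed
```

Only the unescaped $ characters trigger substitution, so escaped keeps the text $price and mixed ends in $10.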

Braces are used to collect a sequence of characters into a single argument, just like quotes. The difference is that variable, command and backslash substitution do not occur inside braces (with the sole exception of backslash substitution at the end of a line). Therefore given a line in a CDL script such as:

default_value {"RAM"}

The braces are stripped off by the Tcl interpreter, leaving "RAM" which will be handled as a string constant by the expression parsing code. The same effect could be achieved using one of the following:

default_value \"RAM\"
default_value "\"RAM\""
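
The different quoting forms can be compared outside of CDL with a small sketch. show_arg is a hypothetical stand-in for a property command: it simply returns whatever single argument the interpreter hands to it.

```tcl
# Hypothetical stand-in for a property command: returns its argument.
proc show_arg {value} {
    return $value
}

set plain   [show_arg "RAM"]        ;# quotes stripped by the interpreter
set braced  [show_arg {"RAM"}]      ;# quotes kept
set escaped [show_arg \"RAM\"]      ;# quotes kept
set both    [show_arg "\"RAM\""]    ;# quotes kept

# Braces also suppress variable and command substitution.
set x 5
set quoted_text "$x [expr 1 + 1]"
set braced_text {$x [expr 1 + 1]}

puts "$plain $braced $escaped $both"
puts $quoted_text
puts $braced_text
```

The last three forms deliver the same five characters, including the quote marks, whereas the first delivers only RAM.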

Generally the use of braces is less confusing. At this stage it is worth noting that the basic format of CDL data makes use of braces:

cdl_option <name> {
     …
};

The cdl_option command is passed two arguments, a name and a body, where the body consists of everything inside the braces but not the braces themselves. This body can then be executed in a recursive invocation of the Tcl interpreter. If a CDL script contains mismatched braces then the interpreter is likely to get rather confused and the resulting diagnostics may be difficult to understand.
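
The idea of a command that receives a braced body and evaluates it can be sketched in plain Tcl. The procedures toy_option and toy_default_value and the array toy_props are hypothetical stand-ins invented for this illustration; the real cdl_option and default_value commands are provided by the CDL tools and do considerably more.

```tcl
# Hypothetical stand-in for a property command: records its argument.
proc toy_default_value {value} {
    set ::toy_props(default_value) $value
}

# Hypothetical stand-in for cdl_option.
proc toy_option {name body} {
    set ::toy_props(name) $name
    # The body arrives as one string with the outer braces removed.
    # Evaluating it runs each property line as an ordinary Tcl command.
    uplevel #0 $body
}

toy_option CYG_HAL_STARTUP {
    toy_default_value {"RAM"}
}

puts "$::toy_props(name) -> $::toy_props(default_value)"
```

Because the body is just a string evaluated as a script, everything said so far about quoting, braces and substitution applies inside it as well.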

Comments in Tcl scripts are introduced by a hash character #. However, a hash character only introduces a comment if it occurs where a command is expected. Consider the following:

# This is a comment
puts "Hello" # world

The first line is a valid comment, since the hash character occurs right at the start where a command name is expected. The second line does not contain a comment. Instead it is an attempt to invoke the puts command with three arguments: Hello, # and world. These are not valid arguments for the puts command so an error will be raised. If the second line were rewritten as:

puts "Hello"; # world

then this is a valid Tcl script. The semicolon identifies the end of the current command, so the hash character occurs at a point where the next command would start and hence it is interpreted as the start of a comment.
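
The same rule can be demonstrated without producing any output by using set and catch. This is a sketch with hypothetical variable names.

```tcl
set x 1 ;# fine: the semicolon ends the command first

# Without the semicolon the hash is just another argument, so set
# receives four arguments (x, 2, # and oops) and raises an error.
set failed [catch {set x 2 # oops} message]

puts "failed=$failed x=$x"
```

catch returns a non-zero value because the inner set command was rejected, and x keeps the value assigned by the first line.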

This handling of comments can lead to subtle behavior. Consider the following:

cdl_option WHATEVER {
# This is a comment }
    default_value 0
    …
}

Consider the way the Tcl interpreter processes this. The command name and the first argument do not pose any special difficulties. The opening brace is interpreted as the start of the next argument, which continues until a closing brace is encountered. In this case the closing brace occurs on the second line, so the second argument passed to cdl_option is \n    # This is a comment. This second argument is processed in a recursive invocation of the Tcl interpreter and does not contain any commands, just a comment. Top-level script processing then resumes, and the next command that is encountered is default_value. Since the parser is not currently processing a configuration option this is an error. Later on the Tcl interpreter would encounter a closing brace by itself, which is also an error.
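
The premature end of the body can be made visible with a small sketch. echo_body is a hypothetical procedure that returns the body it was given, trimmed of surrounding white space, instead of evaluating it.

```tcl
# Hypothetical stand-in that simply returns the body it was given.
proc echo_body {name body} {
    return [string trim $body]
}

# The brace at the end of the comment line closes the body argument.
set body_seen [echo_body WHATEVER {
# This is a comment }]

puts $body_seen
```

The body consists of nothing but the comment text, which confirms that braces are matched before any comment is recognized. Keeping braces inside comments balanced avoids the problem.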

For component writers who need more information about Tcl, especially about the language rather than the syntax, various resources are available. A reasonable starting point is the Scriptics developer web site.

Notes

[1]

It is possible that some versions of the Tcl interpreter will instead produce a result of 54 when asked to multiply six by nine. Appropriate reference documentation should be consulted for more information on why 42 is in fact the correct answer.